From elbueler at alaska.edu Tue Mar 1 00:39:57 2016 From: elbueler at alaska.edu (Ed Bueler) Date: Mon, 29 Feb 2016 21:39:57 -0900 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: References: Message-ID:

Barry -- Will try it.

> ... since, presumably, other more powerful IO tools exist that would be used for "real" problems?

I know there are tools for snapshotting from PETSc, e.g. VecView to .vtk. In fact petsc binary seems fairly convenient for that. On the other hand, I am not sure I've ever done anything "real". ;-)

Anyone out there: Are there good *convenient* tools for saving space/time-series (= movies) from PETSc TS? I want to add frames and movies from PETSc into slides, etc. I can think of NetCDF, but it seems not very convenient, and I am worried it is not well supported from PETSc. Is setting up TS with events (=TSSetEventMonitor()) and writing separate snapshot files the preferred scalable usage, despite the extra effort compared to "-ts_monitor_solution binary:foo.dat"?

Ed

On Mon, Feb 29, 2016 at 8:53 PM, Barry Smith wrote: > > Ed, > > I have added a branch barry/feature-ts-monitor-binary that supports > -ts_monitor binary:timesteps that will store in simple binary format each > of the time steps associated with each solution. This in conjunction with > -ts_monitor_solution binary:solutions will give you two files you can read > in. But note that timesteps is a simple binary file of double precision > numbers you should read in directly in python, you cannot use > PetscBinaryIO.py which is what you will use to read in the solutions file. > > Barry > > Currently PETSc has a binary file format where we can save Vec, Mat, IS, > each is marked with a type id for PetscBinaryIO.py to detect, we do not > have type ids for simple double precision numbers or arrays of numbers. > This is why I have no way of saving the time steps in a way that > PetscBinaryIO.py could read them in currently. I don't know how far we want > to go in "spiffing up" the PETSc binary format to do more elaborate things > since, presumably, other more powerful IO tools exist that would be used for > "real" problems? > > > > On Feb 29, 2016, at 3:24 PM, Ed Bueler wrote: > > > > Dear PETSc -- > > > > I have a short C ODE code that uses TS to solve y' = g(t,y) where y(t) > is a 2-dim'l vector. My code defaults to -ts_type rk so it does adaptive > time-stepping; thus using -ts_monitor shows times at stdout: > > > > $ ./ode -ts_monitor > > solving from t0 = 0.000 with initial time step dt = 0.10000 ... > > 0 TS dt 0.1 time 0. > > 1 TS dt 0.170141 time 0.1 > > 2 TS dt 0.169917 time 0.270141 > > 3 TS dt 0.171145 time 0.440058 > > 4 TS dt 0.173931 time 0.611203 > > 5 TS dt 0.178719 time 0.785134 > > 6 TS dt 0.0361473 time 0.963853 > > 7 TS dt 0.188252 time 1. > > error at tf = 1.000 : |y-y_exact|_inf = 0.000144484 > > > > I want to output the trajectory in PETSc binary and plot it in python > using bin/PetscBinaryIO.py. Clearly I need the times shown above to do > that. > > > > Note "-ts_monitor_solution binary:XX" gives me a binary file with only y > values in it, but not the corresponding times. > > > > My question is, how to get those times in either the same binary file > (preferred) or separate binary files? I have tried > > > > $ ./ode -ts_monitor binary:foo.dat # invalid > > $ ./ode -ts_monitor_solution binary:bar.dat # no t in file > > $ ./ode -ts_monitor_solution binary:baz.dat -ts_save_trajectory # no t > in file > > > > without success.
(I am not sure what the boolean option > -ts_save_trajectory does, by the way.) > > > > Thanks! > > > > Ed > > > > PS Sorry if this is a "RTFM" question, but so far I can't find the > documentation. > > > > > > -- > > Ed Bueler > > Dept of Math and Stat and Geophysical Institute > > University of Alaska Fairbanks > > Fairbanks, AK 99775-6660 > > 301C Chapman and 410D Elvey > > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > > -- Ed Bueler Dept of Math and Stat and Geophysical Institute University of Alaska Fairbanks Fairbanks, AK 99775-6660 301C Chapman and 410D Elvey 907 474-7693 and 907 474-7199 (fax 907 474-5394) -------------- next part -------------- An HTML attachment was scrubbed... URL:

From elbueler at alaska.edu Tue Mar 1 01:26:04 2016 From: elbueler at alaska.edu (Ed Bueler) Date: Mon, 29 Feb 2016 22:26:04 -0900 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: References: Message-ID:

Barry -- I am reading the resulting file successfully using

import struct
import sys

f = open('timesteps','rb')   # binary mode
while True:
    try:
        bytes = f.read(8)
    except:
        print "f.read() failed"
        sys.exit(1)
    if len(bytes) > 0:
        print struct.unpack('>d',bytes)[0]
    else:
        break
f.close()

However, was there a more elegant intended method? I am disturbed by the apparent need to specify big-endianness (= '>d') in the struct.unpack() call.

Ed

On Mon, Feb 29, 2016 at 9:39 PM, Ed Bueler wrote: > Barry -- > > Will try it. > > > ... since, presumably, other more powerful IO tools exist that would be > used for "real" problems? > > I know there are tools for snapshotting from PETSc, e.g. VecView to .vtk. > In fact petsc binary seems fairly convenient for that. On the other hand, > I am not sure I've ever done anything "real". ;-) > > Anyone out there: Are there good *convenient* tools for saving > space/time-series (= movies) from PETSc TS? I want to add frames and > movies from PETSc into slides, etc. I can think of NetCDF, but it seems > not very convenient, and I am worried it is not well supported from PETSc. Is > setting up TS with events (=TSSetEventMonitor()) and writing separate > snapshot files the preferred scalable usage, despite the extra effort > compared to "-ts_monitor_solution binary:foo.dat"? > > Ed > > > On Mon, Feb 29, 2016 at 8:53 PM, Barry Smith wrote: > >> >> Ed, >> >> I have added a branch barry/feature-ts-monitor-binary that supports >> -ts_monitor binary:timesteps that will store in simple binary format each >> of the time steps associated with each solution. This in conjunction with >> -ts_monitor_solution binary:solutions will give you two files you can read >> in. But note that timesteps is a simple binary file of double precision >> numbers you should read in directly in python, you cannot use >> PetscBinaryIO.py which is what you will use to read in the solutions file. >> >> Barry >> >> Currently PETSc has a binary file format where we can save Vec, Mat, IS, >> each is marked with a type id for PetscBinaryIO.py to detect, we do not >> have type ids for simple double precision numbers or arrays of numbers. >> This is why I have no way of saving the time steps in a way that >> PetscBinaryIO.py could read them in currently. I don't know how far we want >> to go in "spiffing up" the PETSc binary format to do more elaborate things >> since, presumably, other more powerful IO tools exist that would be used for >> "real" problems?
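A more compact alternative to the struct loop above, sketched on the assumptions that the timesteps file really is the bare stream of big-endian doubles Barry describes (PETSc binary output is written big-endian, which is why '>d' is needed at all) and that bin/PetscBinaryIO.py exposes a PetscBinaryIO class whose readBinaryFile() returns one object (a numpy array) per stored Vec; the file names are just the ones from -ts_monitor binary:timesteps and -ts_monitor_solution binary:solutions:

import numpy as np
import matplotlib.pyplot as plt
import PetscBinaryIO   # from $PETSC_DIR/bin; put that directory on PYTHONPATH

# 'timesteps' is a bare stream of 8-byte big-endian floats, one per step;
# '>f8' plays the role of '>d' in the struct version.
times = np.fromfile('timesteps', dtype='>f8')

# 'solutions' holds one typed Vec per step, so the helper script applies;
# each Vec is expected to come back as a numpy array (length 2 for this ODE).
io = PetscBinaryIO.PetscBinaryIO()
solutions = io.readBinaryFile('solutions')
assert len(times) == len(solutions)   # one time per saved solution

plt.plot(times, np.array(solutions))  # two curves: y_1(t) and y_2(t)
plt.show()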
>> >> >> > On Feb 29, 2016, at 3:24 PM, Ed Bueler wrote: >> > >> > Dear PETSc -- >> > >> > I have a short C ode code that uses TS to solve y' = g(t,y) where >> y(t) is a 2-dim'l vector. My code defaults to -ts_type rk so it does >> adaptive time-stepping; thus using -ts_monitor shows times at stdout: >> > >> > $ ./ode -ts_monitor >> > solving from t0 = 0.000 with initial time step dt = 0.10000 ... >> > 0 TS dt 0.1 time 0. >> > 1 TS dt 0.170141 time 0.1 >> > 2 TS dt 0.169917 time 0.270141 >> > 3 TS dt 0.171145 time 0.440058 >> > 4 TS dt 0.173931 time 0.611203 >> > 5 TS dt 0.178719 time 0.785134 >> > 6 TS dt 0.0361473 time 0.963853 >> > 7 TS dt 0.188252 time 1. >> > error at tf = 1.000 : |y-y_exact|_inf = 0.000144484 >> > >> > I want to output the trajectory in PETSc binary and plot it in python >> using bin/PetscBinaryIO.py. Clearly I need the times shown above to do >> that. >> > >> > Note "-ts_monitor_solution binary:XX" gives me a binary file with only >> y values in it, but not the corresponding times. >> > >> > My question is, how to get those times in either the same binary file >> (preferred) or separate binary files? I have tried >> > >> > $ ./ode -ts_monitor binary:foo.dat # invalid >> > $ ./ode -ts_monitor_solution binary:bar.dat # no t in file >> > $ ./ode -ts_monitor_solution binary:baz.dat -ts_save_trajectory # no >> t in file >> > >> > without success. (I am not sure what the boolean option >> -ts_save_trajectory does, by the way.) >> > >> > Thanks! >> > >> > Ed >> > >> > PS Sorry if this is a "RTFM" question, but so far I can't find the >> documentation. >> > >> > >> > -- >> > Ed Bueler >> > Dept of Math and Stat and Geophysical Institute >> > University of Alaska Fairbanks >> > Fairbanks, AK 99775-6660 >> > 301C Chapman and 410D Elvey >> > 907 474-7693 and 907 474-7199 (fax 907 474-5394) >> >> > > > -- > Ed Bueler > Dept of Math and Stat and Geophysical Institute > University of Alaska Fairbanks > Fairbanks, AK 99775-6660 > 301C Chapman and 410D Elvey > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > -- Ed Bueler Dept of Math and Stat and Geophysical Institute University of Alaska Fairbanks Fairbanks, AK 99775-6660 301C Chapman and 410D Elvey 907 474-7693 and 907 474-7199 (fax 907 474-5394) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Tue Mar 1 01:44:08 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 1 Mar 2016 10:44:08 +0300 Subject: [petsc-users] PETSc release soon, request for input on needed fixes or enhancements In-Reply-To: References: Message-ID: On 27 February 2016 at 23:36, Barry Smith wrote: > We are planning the PETSc release 3.7 shortly. If you know of any bugs that need to be fixed or enhancements added before the release please let us know. Barry, long time ago I reported a regression in PetIGA related to the new Vec assembly routines. I guess Jed didn't have a chance to look at it. Until that happens, I propose to not use the new routines as the default ones. 
-- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 4332 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 From zonexo at gmail.com Tue Mar 1 02:03:15 2016 From: zonexo at gmail.com (TAY wee-beng) Date: Tue, 1 Mar 2016 16:03:15 +0800 Subject: [petsc-users] Investigate parallel code to improve parallelism In-Reply-To: References: <56D06F16.9000200@gmail.com> <141087E6-A7C1-41FF-B3E8-74FF587E3535@mcs.anl.gov> <56D07D0C.7050109@gmail.com> <5FBB830C-4A5E-40B2-9845-E0DC68B02BD3@mcs.anl.gov> <56D3207C.6090808@gmail.com> <91716C74-7716-4C7D-98B5-1606A8A93766@mcs.anl.gov> <56D399F5.9030709@gmail.com> <65862BCD-10AB-4821-9816-981D1AFFE7DE@mcs.anl.gov> <56D3AC5C.7040208@gmail.com> Message-ID: <56D54CC3.80709@gmail.com> On 29/2/2016 11:21 AM, Barry Smith wrote: >> On Feb 28, 2016, at 8:26 PM, TAY wee-beng wrote: >> >> >> On 29/2/2016 9:41 AM, Barry Smith wrote: >>>> On Feb 28, 2016, at 7:08 PM, TAY Wee Beng wrote: >>>> >>>> Hi, >>>> >>>> I've attached the files for x cells running y procs. hypre is called natively I'm not sure if PETSc catches it. >>> So you are directly creating hypre matrices and calling the hypre solver in another piece of your code? >> Yes because I'm using the simple structure (struct) layout for Cartesian grids. It's about twice as fast compared to BoomerAMG > Understood > >> . I can't create PETSc matrix and use the hypre struct layout, right? >>> In the PETSc part of the code if you compare the 2x_y to the x_y you see that doubling the problem size resulted in 2.2 as much time for the KSPSolve. Most of this large increase is due to the increased time in the scatter which went up to 150/54. = 2.7777777777777777 but the amount of data transferred only increased by 1e5/6.4e4 = 1.5625 Normally I would not expect to see this behavior and would not expect such a large increase in the communication time. >>> >>> Barry >>> >>> >>> >> So ideally it should be 2 instead of 2.2, is that so? > Ideally > >> May I know where are you looking at? Because I can't find the nos. > The column labeled Avg len tells the average length of messages which increases from 6.4e4 to 1e5 while the time max increase by 2.77 (I took the sum of the VecScatterBegin and VecScatter End rows. > >> So where do you think the error comes from? > It is not really an error it is just that it is taking more time then one would hope it would take. >> Or how can I troubleshoot further? > > If you run the same problem several times how much different are the numerical timings for each run? Hi, I have re-done x_y and 2x_y again. I have attached the files with _2 for the 2nd run. They're exactly the same. Should I try running on another cluster? I also tried running the same problem with more cells and more time steps (to reduce start up effects) on another cluster. But I forgot to run it with -log_summary. Anyway, the results show: 1. Using 1.5 million cells with 48 procs and 3M with 96p took 65min and 69min. Using the weak scaling formula I attached earlier, it gives about 88% efficiency 2. Using 3 million cells with 48 procs and 6M with 96p took 114min and 121min. Using the weak scaling formula I attached earlier, it gives about 88% efficiency 3. 
Using 3.75 million cells with 48 procs and 7.5M with 96p took 134min and 143min. Using the weak scaling formula I attached earlier, it gives about 87% efficiency 4. Using 4.5 million cells with 48 procs and 9M with 96p took 160min and 176min (extrapolated). Using the weak scaling formula I attached earlier, it gives about 80% efficiency So it seems that I should run with 3.75 million cells with 48 procs and scale along this ratio. Beyond that, my efficiency decreases. Is that so? Maybe I should also run with -log_summary to get better estimate... Thanks. > > >> Thanks >>>> Thanks >>>> >>>> On 29/2/2016 1:11 AM, Barry Smith wrote: >>>>> As I said before, send the -log_summary output for the two processor sizes and we'll look at where it is spending its time and how it could possibly be improved. >>>>> >>>>> Barry >>>>> >>>>>> On Feb 28, 2016, at 10:29 AM, TAY wee-beng wrote: >>>>>> >>>>>> >>>>>> On 27/2/2016 12:53 AM, Barry Smith wrote: >>>>>>>> On Feb 26, 2016, at 10:27 AM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 26/2/2016 11:32 PM, Barry Smith wrote: >>>>>>>>>> On Feb 26, 2016, at 9:28 AM, TAY wee-beng wrote: >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I have got a 3D code. When I ran with 48 procs and 11 million cells, it runs for 83 min. When I ran with 96 procs and 22 million cells, it ran for 99 min. >>>>>>>>> This is actually pretty good! >>>>>>>> But if I'm not wrong, if I increase the no. of cells, the parallelism will keep on decreasing. I hope it scales up to maybe 300 - 400 procs. >>>>>> Hi, >>>>>> >>>>>> I think I may have mentioned this before, that is, I need to submit a proposal to request for computing nodes. In the proposal, I'm supposed to run some simulations to estimate the time it takes to run my code. Then an excel file will use my input to estimate the efficiency when I run my code with more cells. They use 2 mtds to estimate: >>>>>> >>>>>> 1. strong scaling, whereby I run 2 cases - 1st with n cells and x procs, then with n cells and 2x procs. From there, they can estimate my expected efficiency when I have y procs. The formula is attached in the pdf. >>>>>> >>>>>> 2. weak scaling, whereby I run 2 cases - 1st with n cells and x procs, then with 2n cells and 2x procs. From there, they can estimate my expected efficiency when I have y procs. The formula is attached in the pdf. >>>>>> >>>>>> So if I use 48 and 96 procs and get maybe 80% efficiency, by the time I hit 800 procs, I get 32% efficiency for strong scaling. They expect at least 50% efficiency for my code. To reach that, I need to achieve 89% efficiency when I use 48 and 96 procs. >>>>>> >>>>>> So now my qn is how accurate is this type of calculation, especially wrt to PETSc? >>>>>> >>>>>> Similarly, for weak scaling, is it accurate? >>>>>> >>>>>> Can I argue that this estimation is not suitable for PETSc or hypre? >>>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>>>>>> So it's not that parallel. I want to find out which part of the code I need to improve. Also if PETsc and hypre is working well in parallel. What's the best way to do it? >>>>>>>>> Run both with -log_summary and send the output for each case. This will show where the time is being spent and which parts are scaling less well. >>>>>>>>> >>>>>>>>> Barry >>>>>>>> That's only for the PETSc part, right? So for other parts of the code, including hypre part, I will not be able to find out. If so, what can I use to check these parts? 
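The ratios quoted above can be read directly off the two original -log_summary files attached below (the x_y and 2x_y runs on 24 processes); a small sketch of that arithmetic, with the numbers transcribed from the KSPSolve, VecScatterBegin/End, and Avg len entries of those logs:

# Transcribed from the attached logs: x_y run (79x137x141 grid) and
# 2x_y run (100x172x171 grid), both on 24 processes.
ksp_small, ksp_big = 4.1857e+02, 9.1973e+02    # KSPSolve max time (s)
scat_small = 9.0984e+00 + 4.4821e+01           # VecScatterBegin + VecScatterEnd (s)
scat_big = 1.9519e+01 + 1.3223e+02
len_small, len_big = 6.4e4, 1.0e5              # Avg len of scatter messages (bytes)

print(ksp_big / ksp_small)    # ~2.2: KSPSolve time for roughly double the cells
print(scat_big / scat_small)  # ~2.8: the scatter time grows faster still ...
print(len_big / len_small)    # ~1.6: ... than the data moved per message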
>>>>>>> You will still be able to see what percentage of the time is spent in hypre and if it increases with the problem size and how much. So the information will still be useful. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>>>>> I thought of doing profiling but if the code is optimized, I wonder if it still works well. >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Thank you. >>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>> >>>> -- >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> <2x_2y.txt><2x_y.txt><4x_2y.txt> -------------- next part -------------- 0.000000000000000E+000 0.600000000000000 17.5000000000000 120.000000000000 0.000000000000000E+000 0.250000000000000 1.00000000000000 0.400000000000000 0 -400000 AB,AA,BB -2.51050002424745 2.47300002246629 2.51050002424745 2.43950002087513 size_x,size_y,size_z 79 137 141 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 0 1 35 1 24 1 66360 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 1 36 69 1 24 66361 130824 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 2 70 103 1 24 130825 195288 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 3 104 137 1 24 195289 259752 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 4 1 35 25 48 259753 326112 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 5 36 69 25 48 326113 390576 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 6 70 103 25 48 390577 455040 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 7 104 137 25 48 455041 519504 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 8 1 35 49 72 519505 585864 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 9 36 69 49 72 585865 650328 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 10 70 103 49 72 650329 714792 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 11 104 137 49 72 714793 779256 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 12 1 35 73 95 779257 842851 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 13 36 69 73 95 842852 904629 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 14 70 103 73 95 904630 966407 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 15 104 137 73 95 966408 1028185 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 16 1 35 96 118 1028186 1091780 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 17 36 69 96 118 1091781 1153558 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 18 70 103 96 118 1153559 1215336 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 19 104 137 96 118 1215337 1277114 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 20 1 35 119 141 1277115 1340709 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 21 36 69 119 141 1340710 1402487 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 22 70 103 119 141 1402488 1464265 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 23 104 137 119 141 1464266 1526043 body_cg_ini 0.850000999999998 9.999999998273846E-007 6.95771875020604 3104 surfaces with wrong vertex ordering Warning - length difference between element and cell max_element_length,min_element_length,min_delta 7.847540176996057E-002 3.349995610000001E-002 4.700000000000000E-002 maximum ngh_surfaces and ngh_vertics are 47 22 minimum ngh_surfaces and ngh_vertics are 22 9 min IIB_cell_no 0 max IIB_cell_no 112 final initial IIB_cell_no 5600 min I_cell_no 0 max I_cell_no 96 final initial I_cell_no 4800 size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u) 5600 4800 5600 4800 IIB_I_cell_no_uvw_total1 1221 1206 1212 775 761 751 1 0.01904762 0.28410536 0.31610359 1.14440147 -0.14430869E+03 -0.13111542E+02 0.15251948E+07 2 0.01348578 0.34638018 0.42392119 1.23447223 -0.16528393E+03 -0.10238827E+02 0.15250907E+07 3 0.01252674 0.38305826 0.49569053 1.27891383 -0.16912542E+03 -0.95950253E+01 0.15250695E+07 4 0.01199639 0.41337279 0.54168038 
1.29584768 -0.17048065E+03 -0.94814301E+01 0.15250602E+07 5 0.01165251 0.43544137 0.57347276 1.30255981 -0.17129184E+03 -0.95170304E+01 0.15250538E+07 300 0.00236362 3.56353622 5.06727508 4.03923148 -0.78697893E+03 0.15046453E+05 0.15263125E+07 600 0.00253142 2.94537779 5.74258126 4.71794271 -0.38271069E+04 -0.49150195E+04 0.15289768E+07 900 0.00220341 3.10439489 6.70144317 4.01105348 -0.71943943E+04 0.13728311E+05 0.15320532E+07 1200 0.00245748 3.53496741 7.33163591 4.01935315 -0.85017750E+04 -0.77550358E+04 0.15350351E+07 1500 0.00244299 3.71751725 5.93463559 4.12005108 -0.95364451E+04 0.81223334E+04 0.15373061E+07 1800 0.00237474 3.49908653 5.20866314 4.69712853 -0.10382365E+05 -0.18966840E+04 0.15385160E+07 escape_time reached, so abort cd_cl_cs_mom_implicit1 -1.03894256791350 -1.53179673343374 6.737940408853320E-002 0.357464909626058 -0.103698436387821 -2.42688484514611 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./a.out on a petsc-3.6.3_static_rel named n12-09 with 24 processors, by wtay Sat Feb 27 16:09:41 2016 Using Petsc Release Version 3.6.3, Dec, 03, 2015 Max Max/Min Avg Total Time (sec): 2.922e+03 1.00001 2.922e+03 Objects: 2.008e+04 1.00000 2.008e+04 Flops: 1.651e+11 1.08049 1.582e+11 3.797e+12 Flops/sec: 5.652e+07 1.08049 5.414e+07 1.299e+09 MPI Messages: 8.293e+04 1.89333 6.588e+04 1.581e+06 MPI Message Lengths: 4.109e+09 2.03497 4.964e+04 7.849e+10 MPI Reductions: 4.427e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.9219e+03 100.0% 3.7965e+12 100.0% 1.581e+06 100.0% 4.964e+04 100.0% 4.427e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 3998 1.0 4.4655e+01 5.1 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 820 VecDotNorm2 1999 1.0 4.0603e+01 7.6 1.59e+09 1.1 0.0e+00 0.0e+00 2.0e+03 1 1 0 0 5 1 1 0 0 5 902 VecNorm 3998 1.0 3.0557e+01 6.2 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 1198 VecCopy 3998 1.0 4.4206e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 12002 1.0 9.3725e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPBYCZ 3998 1.0 9.1178e+00 1.5 3.18e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 8030 VecWAXPY 3998 1.0 9.3186e+00 1.5 1.59e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 3928 VecAssemblyBegin 3998 1.0 1.5680e+01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04 0 0 0 0 27 0 0 0 0 27 0 VecAssemblyEnd 3998 1.0 1.1443e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 16002 1.0 9.0984e+00 1.4 0.00e+00 0.0 1.2e+06 6.4e+04 0.0e+00 0 0 77100 0 0 0 77100 0 0 VecScatterEnd 16002 1.0 4.4821e+01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatMult 3998 1.0 1.4268e+02 1.3 6.05e+10 1.1 3.0e+05 1.1e+05 0.0e+00 4 37 19 43 0 4 37 19 43 0 9753 MatSolve 5997 1.0 2.0469e+02 1.4 8.84e+10 1.1 0.0e+00 0.0e+00 0.0e+00 6 53 0 0 0 6 53 0 0 0 9921 MatLUFactorNum 104 1.0 2.2332e+01 1.1 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6922 MatILUFactorSym 1 1.0 1.0867e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 1 1.0 3.8305e-02 1.9 7.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4603 MatAssemblyBegin 105 1.0 2.0776e+00 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 105 1.0 2.4702e+00 1.1 0.00e+00 0.0 1.5e+02 2.8e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 7.1249e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 105 1.0 9.8758e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1999 1.0 4.1857e+02 1.0 1.65e+11 1.1 3.0e+05 1.1e+05 1.0e+04 14100 19 43 23 14100 19 43 23 9070 PCSetUp 208 1.0 2.2440e+01 1.1 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6888 PCSetUpOnBlocks 1999 1.0 2.7087e-01 1.1 6.44e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5487 PCApply 5997 1.0 2.3123e+02 1.3 9.50e+10 1.1 0.0e+00 0.0e+00 0.0e+00 6 58 0 0 0 6 58 0 0 0 9444 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Vector 4032 4032 31782464 0 Vector Scatter 2010 15 3738624 0 Matrix 4 4 190398024 0 Distributed Mesh 2003 8 39680 0 Star Forest Bipartite Graph 4006 16 13696 0 Discrete System 2003 8 6784 0 Index Set 4013 4013 14715400 0 IS L to G Mapping 2003 8 2137148 0 Krylov Solver 2 2 2296 0 Preconditioner 2 2 1896 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 8.15392e-06 Average time for zero size MPI_Send(): 1.12454e-05 #PETSc Option Table entries: -log_summary #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1 ----------------------------------------- Libraries compiled on Thu Jan 7 04:05:35 2016 on hpc12 Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core Using PETSc directory: /home/wtay/Codes/petsc-3.6.3 Using PETSc arch: petsc-3.6.3_static_rel ----------------------------------------- Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -wd1572 -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include ----------------------------------------- Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90 Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl ----------------------------------------- -------------- next part -------------- 0.000000000000000E+000 0.600000000000000 17.5000000000000 120.000000000000 0.000000000000000E+000 0.250000000000000 1.00000000000000 0.400000000000000 0 -400000 AB,AA,BB -2.78150003711926 
2.76500003633555 2.78150003711926 2.70650003355695 size_x,size_y,size_z 100 172 171 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 0 1 29 1 43 1 124700 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 1 30 58 1 43 124701 249400 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 2 59 87 1 43 249401 374100 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 3 88 116 1 43 374101 498800 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 4 117 144 1 43 498801 619200 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 5 145 172 1 43 619201 739600 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 6 1 29 44 86 739601 864300 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 7 30 58 44 86 864301 989000 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 8 59 87 44 86 989001 1113700 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 9 88 116 44 86 1113701 1238400 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 10 117 144 44 86 1238401 1358800 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 11 145 172 44 86 1358801 1479200 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 12 1 29 87 129 1479201 1603900 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 13 30 58 87 129 1603901 1728600 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 14 59 87 87 129 1728601 1853300 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 15 88 116 87 129 1853301 1978000 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 16 117 144 87 129 1978001 2098400 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 17 145 172 87 129 2098401 2218800 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 18 1 29 130 171 2218801 2340600 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 19 30 58 130 171 2340601 2462400 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 20 59 87 130 171 2462401 2584200 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 21 88 116 130 171 2584201 2706000 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 22 117 144 130 171 2706001 2823600 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 23 145 172 130 171 2823601 2941200 body_cg_ini 0.850000999999998 9.999999998273846E-007 6.95771875020604 3104 surfaces with wrong vertex ordering Warning - length difference between element and cell max_element_length,min_element_length,min_delta 7.847540176996057E-002 3.349995610000001E-002 3.500000000000000E-002 maximum ngh_surfaces and ngh_vertics are 28 12 minimum ngh_surfaces and ngh_vertics are 14 5 min IIB_cell_no 0 max IIB_cell_no 229 final initial IIB_cell_no 11450 min I_cell_no 0 max I_cell_no 200 final initial I_cell_no 10000 size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u) 11450 10000 11450 10000 IIB_I_cell_no_uvw_total1 2230 2227 2166 1930 1926 1847 1 0.01411765 0.30104754 0.32529731 1.15440698 -0.30539502E+03 -0.29715696E+02 0.29394159E+07 2 0.00973086 0.41244573 0.45086899 1.22116550 -0.34890134E+03 -0.25062690E+02 0.29392110E+07 3 0.00918177 0.45383616 0.51179402 1.27757073 -0.35811483E+03 -0.25027396E+02 0.29391677E+07 4 0.00885764 0.47398774 0.55169119 1.31019526 -0.36250500E+03 -0.25910050E+02 0.29391470E+07 5 0.00872241 0.48832538 0.57967282 1.32679047 -0.36545763E+03 -0.26947216E+02 0.29391325E+07 300 0.00163886 4.27898628 6.83028522 3.60837060 -0.19609891E+04 0.43984454E+05 0.29435194E+07 600 0.00160193 3.91014241 4.97460210 5.10461274 -0.61092521E+03 0.18910563E+05 0.29467790E+07 900 0.00150521 3.27352854 5.85427996 4.49166453 -0.89281765E+04 -0.12171584E+05 0.29507471E+07 1200 0.00165280 3.05922213 7.37243530 5.16434634 -0.10954640E+05 0.22049957E+05 0.29575213E+07 1500 0.00153718 3.54908044 5.42918256 4.84940953 -0.16430153E+05 0.24407130E+05 0.29608940E+07 1800 0.00155455 3.30956962 8.35799538 4.50638757 -0.20003619E+05 -0.20349497E+05 0.29676102E+07 escape_time reached, so abort 
cd_cl_cs_mom_implicit1 -1.29348921431473 -2.44525665200003 -0.238725356553914 0.644444280391413 -3.056662699041206E-002 -2.91791118488116 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./a.out on a petsc-3.6.3_static_rel named n12-06 with 24 processors, by wtay Mon Feb 29 21:45:09 2016 Using Petsc Release Version 3.6.3, Dec, 03, 2015 Max Max/Min Avg Total Time (sec): 5.933e+03 1.00000 5.933e+03 Objects: 2.008e+04 1.00000 2.008e+04 Flops: 3.129e+11 1.06806 3.066e+11 7.360e+12 Flops/sec: 5.273e+07 1.06807 5.169e+07 1.241e+09 MPI Messages: 8.298e+04 1.89703 6.585e+04 1.580e+06 MPI Message Lengths: 6.456e+09 2.05684 7.780e+04 1.229e+11 MPI Reductions: 4.427e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 5.9326e+03 100.0% 7.3595e+12 100.0% 1.580e+06 100.0% 7.780e+04 100.0% 4.427e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 3998 1.0 1.0612e+02 2.0 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 665 VecDotNorm2 1999 1.0 9.4306e+01 2.1 2.99e+09 1.1 0.0e+00 0.0e+00 2.0e+03 1 1 0 0 5 1 1 0 0 5 748 VecNorm 3998 1.0 8.7330e+01 2.0 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 808 VecCopy 3998 1.0 7.4317e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 12002 1.0 1.1626e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPBYCZ 3998 1.0 1.7543e+01 1.4 5.98e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 8044 VecWAXPY 3998 1.0 1.6637e+01 1.4 2.99e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4241 VecAssemblyBegin 3998 1.0 3.0367e+01 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04 0 0 0 0 27 0 0 0 0 27 0 VecAssemblyEnd 3998 1.0 1.5386e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 16002 1.0 1.7833e+01 1.4 0.00e+00 0.0 1.2e+06 1.0e+05 0.0e+00 0 0 77100 0 0 0 77100 0 0 VecScatterEnd 16002 1.0 1.2689e+02 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatMult 3998 1.0 3.1700e+02 1.3 1.15e+11 1.1 3.0e+05 1.7e+05 0.0e+00 5 37 19 43 0 5 37 19 43 0 8482 MatSolve 5997 1.0 3.6841e+02 1.3 1.67e+11 1.1 0.0e+00 0.0e+00 0.0e+00 6 54 0 0 0 6 54 0 0 0 10707 MatLUFactorNum 104 1.0 4.3137e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 7016 MatILUFactorSym 1 1.0 3.5212e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 1 1.0 9.1592e-02 3.0 1.45e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3720 MatAssemblyBegin 105 1.0 5.1547e+00 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 105 1.0 4.7898e+00 1.1 0.00e+00 0.0 1.5e+02 4.3e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 2.0590e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 105 1.0 4.5063e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1999 1.0 9.1527e+02 1.0 3.13e+11 1.1 3.0e+05 1.7e+05 1.0e+04 15100 19 43 23 15100 19 43 23 8040 PCSetUp 208 1.0 4.3499e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6958 PCSetUpOnBlocks 1999 1.0 8.2526e-01 1.3 1.25e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3526 PCApply 5997 1.0 4.1644e+02 1.3 1.80e+11 1.1 0.0e+00 0.0e+00 0.0e+00 6 58 0 0 0 6 58 0 0 0 10192 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Vector 4032 4032 53827712 0 Vector Scatter 2010 15 7012720 0 Matrix 4 4 359683260 0 Distributed Mesh 2003 8 39680 0 Star Forest Bipartite Graph 4006 16 13696 0 Discrete System 2003 8 6784 0 Index Set 4013 4013 25819112 0 IS L to G Mapping 2003 8 3919440 0 Krylov Solver 2 2 2296 0 Preconditioner 2 2 1896 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 2.14577e-07 Average time for MPI_Barrier(): 1.03951e-05 Average time for zero size MPI_Send(): 1.83781e-06 #PETSc Option Table entries: -log_summary #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1 ----------------------------------------- Libraries compiled on Thu Jan 7 04:05:35 2016 on hpc12 Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core Using PETSc directory: /home/wtay/Codes/petsc-3.6.3 Using PETSc arch: petsc-3.6.3_static_rel ----------------------------------------- Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -wd1572 -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include ----------------------------------------- Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90 Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl ----------------------------------------- -------------- next part -------------- 0.000000000000000E+000 0.600000000000000 17.5000000000000 120.000000000000 0.000000000000000E+000 0.250000000000000 1.00000000000000 0.400000000000000 0 -400000 AB,AA,BB -2.51050002424745 
2.47300002246629 2.51050002424745 2.43950002087513 size_x,size_y,size_z 79 137 141 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 0 1 35 1 24 1 66360 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 1 36 69 1 24 66361 130824 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 2 70 103 1 24 130825 195288 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 3 104 137 1 24 195289 259752 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 4 1 35 25 48 259753 326112 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 5 36 69 25 48 326113 390576 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 6 70 103 25 48 390577 455040 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 7 104 137 25 48 455041 519504 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 8 1 35 49 72 519505 585864 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 9 36 69 49 72 585865 650328 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 10 70 103 49 72 650329 714792 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 11 104 137 49 72 714793 779256 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 12 1 35 73 95 779257 842851 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 13 36 69 73 95 842852 904629 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 14 70 103 73 95 904630 966407 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 15 104 137 73 95 966408 1028185 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 16 1 35 96 118 1028186 1091780 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 17 36 69 96 118 1091781 1153558 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 18 70 103 96 118 1153559 1215336 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 19 104 137 96 118 1215337 1277114 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 20 1 35 119 141 1277115 1340709 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 21 36 69 119 141 1340710 1402487 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 22 70 103 119 141 1402488 1464265 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 23 104 137 119 141 1464266 1526043 body_cg_ini 0.850000999999998 9.999999998273846E-007 6.95771875020604 3104 surfaces with wrong vertex ordering Warning - length difference between element and cell max_element_length,min_element_length,min_delta 7.847540176996057E-002 3.349995610000001E-002 4.700000000000000E-002 maximum ngh_surfaces and ngh_vertics are 47 22 minimum ngh_surfaces and ngh_vertics are 22 9 min IIB_cell_no 0 max IIB_cell_no 112 final initial IIB_cell_no 5600 min I_cell_no 0 max I_cell_no 96 final initial I_cell_no 4800 size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u) 5600 4800 5600 4800 IIB_I_cell_no_uvw_total1 1221 1206 1212 775 761 751 1 0.01904762 0.28410536 0.31610359 1.14440147 -0.14430869E+03 -0.13111542E+02 0.15251948E+07 2 0.01348578 0.34638018 0.42392119 1.23447223 -0.16528393E+03 -0.10238827E+02 0.15250907E+07 3 0.01252674 0.38305826 0.49569053 1.27891383 -0.16912542E+03 -0.95950253E+01 0.15250695E+07 4 0.01199639 0.41337279 0.54168038 1.29584768 -0.17048065E+03 -0.94814301E+01 0.15250602E+07 5 0.01165251 0.43544137 0.57347276 1.30255981 -0.17129184E+03 -0.95170304E+01 0.15250538E+07 300 0.00236362 3.56353622 5.06727508 4.03923148 -0.78697893E+03 0.15046453E+05 0.15263125E+07 600 0.00253142 2.94537779 5.74258126 4.71794271 -0.38271069E+04 -0.49150195E+04 0.15289768E+07 900 0.00220341 3.10439489 6.70144317 4.01105348 -0.71943943E+04 0.13728311E+05 0.15320532E+07 1200 0.00245748 3.53496741 7.33163591 4.01935315 -0.85017750E+04 -0.77550358E+04 0.15350351E+07 1500 0.00244299 3.71751725 5.93463559 4.12005108 -0.95364451E+04 0.81223334E+04 0.15373061E+07 1800 0.00237474 3.49908653 5.20866314 4.69712853 -0.10382365E+05 -0.18966840E+04 0.15385160E+07 escape_time reached, so abort cd_cl_cs_mom_implicit1 -1.03894256791350 
-1.53179673343374 6.737940408853320E-002 0.357464909626058 -0.103698436387821 -2.42688484514611 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./a.out on a petsc-3.6.3_static_rel named n12-06 with 24 processors, by wtay Mon Feb 29 20:55:15 2016 Using Petsc Release Version 3.6.3, Dec, 03, 2015 Max Max/Min Avg Total Time (sec): 2.938e+03 1.00001 2.938e+03 Objects: 2.008e+04 1.00000 2.008e+04 Flops: 1.651e+11 1.08049 1.582e+11 3.797e+12 Flops/sec: 5.620e+07 1.08049 5.384e+07 1.292e+09 MPI Messages: 8.293e+04 1.89333 6.588e+04 1.581e+06 MPI Message Lengths: 4.109e+09 2.03497 4.964e+04 7.849e+10 MPI Reductions: 4.427e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.9382e+03 100.0% 3.7965e+12 100.0% 1.581e+06 100.0% 4.964e+04 100.0% 4.427e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 3998 1.0 3.7060e+01 4.0 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 988 VecDotNorm2 1999 1.0 3.3165e+01 5.1 1.59e+09 1.1 0.0e+00 0.0e+00 2.0e+03 1 1 0 0 5 1 1 0 0 5 1104 VecNorm 3998 1.0 3.0081e+01 5.7 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 1217 VecCopy 3998 1.0 4.2268e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 12002 1.0 9.0293e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPBYCZ 3998 1.0 8.8463e+00 1.5 3.18e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 8276 VecWAXPY 3998 1.0 9.0856e+00 1.6 1.59e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4029 VecAssemblyBegin 3998 1.0 1.2290e+01 6.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04 0 0 0 0 27 0 0 0 0 27 0 VecAssemblyEnd 3998 1.0 1.1405e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 16002 1.0 9.0506e+00 1.4 0.00e+00 0.0 1.2e+06 6.4e+04 0.0e+00 0 0 77100 0 0 0 77100 0 0 VecScatterEnd 16002 1.0 4.8845e+01 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatMult 3998 1.0 1.3324e+02 1.1 6.05e+10 1.1 3.0e+05 1.1e+05 0.0e+00 4 37 19 43 0 4 37 19 43 0 10444 MatSolve 5997 1.0 1.9260e+02 1.4 8.84e+10 1.1 0.0e+00 0.0e+00 0.0e+00 6 53 0 0 0 6 53 0 0 0 10543 MatLUFactorNum 104 1.0 2.3135e+01 1.2 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6681 MatILUFactorSym 1 1.0 1.4099e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 1 1.0 4.5088e-02 2.6 7.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3911 MatAssemblyBegin 105 1.0 2.4788e+0011.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 105 1.0 2.4778e+00 1.1 0.00e+00 0.0 1.5e+02 2.8e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 7.9679e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 105 1.0 9.6669e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1999 1.0 3.9941e+02 1.0 1.65e+11 1.1 3.0e+05 1.1e+05 1.0e+04 14100 19 43 23 14100 19 43 23 9505 PCSetUp 208 1.0 2.3286e+01 1.2 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6638 PCSetUpOnBlocks 1999 1.0 3.7027e-01 1.3 6.44e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4014 PCApply 5997 1.0 2.1975e+02 1.4 9.50e+10 1.1 0.0e+00 0.0e+00 0.0e+00 6 58 0 0 0 6 58 0 0 0 9937 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Vector 4032 4032 31782464 0 Vector Scatter 2010 15 3738624 0 Matrix 4 4 190398024 0 Distributed Mesh 2003 8 39680 0 Star Forest Bipartite Graph 4006 16 13696 0 Discrete System 2003 8 6784 0 Index Set 4013 4013 14715400 0 IS L to G Mapping 2003 8 2137148 0 Krylov Solver 2 2 2296 0 Preconditioner 2 2 1896 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 7.20024e-06 Average time for zero size MPI_Send(): 2.08616e-06 #PETSc Option Table entries: -log_summary #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1 ----------------------------------------- Libraries compiled on Thu Jan 7 04:05:35 2016 on hpc12 Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core Using PETSc directory: /home/wtay/Codes/petsc-3.6.3 Using PETSc arch: petsc-3.6.3_static_rel ----------------------------------------- Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -wd1572 -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include ----------------------------------------- Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90 Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl ----------------------------------------- -------------- next part -------------- 0.000000000000000E+000 0.600000000000000 17.5000000000000 120.000000000000 0.000000000000000E+000 0.250000000000000 1.00000000000000 0.400000000000000 0 -400000 AB,AA,BB -2.78150003711926 
2.76500003633555 2.78150003711926 2.70650003355695 size_x,size_y,size_z 100 172 171 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 0 1 29 1 43 1 124700 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 1 30 58 1 43 124701 249400 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 2 59 87 1 43 249401 374100 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 3 88 116 1 43 374101 498800 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 4 117 144 1 43 498801 619200 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 5 145 172 1 43 619201 739600 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 6 1 29 44 86 739601 864300 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 7 30 58 44 86 864301 989000 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 8 59 87 44 86 989001 1113700 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 9 88 116 44 86 1113701 1238400 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 10 117 144 44 86 1238401 1358800 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 11 145 172 44 86 1358801 1479200 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 12 1 29 87 129 1479201 1603900 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 13 30 58 87 129 1603901 1728600 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 14 59 87 87 129 1728601 1853300 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 15 88 116 87 129 1853301 1978000 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 16 117 144 87 129 1978001 2098400 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 17 145 172 87 129 2098401 2218800 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 18 1 29 130 171 2218801 2340600 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 19 30 58 130 171 2340601 2462400 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 20 59 87 130 171 2462401 2584200 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 21 88 116 130 171 2584201 2706000 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 22 117 144 130 171 2706001 2823600 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 23 145 172 130 171 2823601 2941200 body_cg_ini 0.850000999999998 9.999999998273846E-007 6.95771875020604 3104 surfaces with wrong vertex ordering Warning - length difference between element and cell max_element_length,min_element_length,min_delta 7.847540176996057E-002 3.349995610000001E-002 3.500000000000000E-002 maximum ngh_surfaces and ngh_vertics are 28 12 minimum ngh_surfaces and ngh_vertics are 14 5 min IIB_cell_no 0 max IIB_cell_no 229 final initial IIB_cell_no 11450 min I_cell_no 0 max I_cell_no 200 final initial I_cell_no 10000 size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u) 11450 10000 11450 10000 IIB_I_cell_no_uvw_total1 2230 2227 2166 1930 1926 1847 1 0.01411765 0.30104754 0.32529731 1.15440698 -0.30539502E+03 -0.29715696E+02 0.29394159E+07 2 0.00973086 0.41244573 0.45086899 1.22116550 -0.34890134E+03 -0.25062690E+02 0.29392110E+07 3 0.00918177 0.45383616 0.51179402 1.27757073 -0.35811483E+03 -0.25027396E+02 0.29391677E+07 4 0.00885764 0.47398774 0.55169119 1.31019526 -0.36250500E+03 -0.25910050E+02 0.29391470E+07 5 0.00872241 0.48832538 0.57967282 1.32679047 -0.36545763E+03 -0.26947216E+02 0.29391325E+07 300 0.00163886 4.27898628 6.83028522 3.60837060 -0.19609891E+04 0.43984454E+05 0.29435194E+07 600 0.00160193 3.91014241 4.97460210 5.10461274 -0.61092521E+03 0.18910563E+05 0.29467790E+07 900 0.00150521 3.27352854 5.85427996 4.49166453 -0.89281765E+04 -0.12171584E+05 0.29507471E+07 1200 0.00165280 3.05922213 7.37243530 5.16434634 -0.10954640E+05 0.22049957E+05 0.29575213E+07 1500 0.00153718 3.54908044 5.42918256 4.84940953 -0.16430153E+05 0.24407130E+05 0.29608940E+07 1800 0.00155455 3.30956962 8.35799538 4.50638757 -0.20003619E+05 -0.20349497E+05 0.29676102E+07 escape_time reached, so abort 
cd_cl_cs_mom_implicit1 -1.29348921431473 -2.44525665200003 -0.238725356553914 0.644444280391413 -3.056662699041206E-002 -2.91791118488116 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./a.out on a petsc-3.6.3_static_rel named n12-09 with 24 processors, by wtay Sat Feb 27 16:58:01 2016 Using Petsc Release Version 3.6.3, Dec, 03, 2015 Max Max/Min Avg Total Time (sec): 5.791e+03 1.00001 5.791e+03 Objects: 2.008e+04 1.00000 2.008e+04 Flops: 3.129e+11 1.06806 3.066e+11 7.360e+12 Flops/sec: 5.402e+07 1.06807 5.295e+07 1.271e+09 MPI Messages: 8.298e+04 1.89703 6.585e+04 1.580e+06 MPI Message Lengths: 6.456e+09 2.05684 7.780e+04 1.229e+11 MPI Reductions: 4.427e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 5.7911e+03 100.0% 7.3595e+12 100.0% 1.580e+06 100.0% 7.780e+04 100.0% 4.427e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 3998 1.0 1.1437e+02 2.3 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 617 VecDotNorm2 1999 1.0 1.0442e+02 2.6 2.99e+09 1.1 0.0e+00 0.0e+00 2.0e+03 1 1 0 0 5 1 1 0 0 5 676 VecNorm 3998 1.0 8.5426e+01 2.2 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 826 VecCopy 3998 1.0 7.3321e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 12002 1.0 1.2399e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPBYCZ 3998 1.0 1.8118e+01 1.4 5.98e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 7788 VecWAXPY 3998 1.0 1.6979e+01 1.3 2.99e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4155 VecAssemblyBegin 3998 1.0 4.1001e+01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04 0 0 0 0 27 0 0 0 0 27 0 VecAssemblyEnd 3998 1.0 1.4657e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 16002 1.0 1.9519e+01 1.5 0.00e+00 0.0 1.2e+06 1.0e+05 0.0e+00 0 0 77100 0 0 0 77100 0 0 VecScatterEnd 16002 1.0 1.3223e+02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatMult 3998 1.0 3.0904e+02 1.3 1.15e+11 1.1 3.0e+05 1.7e+05 0.0e+00 5 37 19 43 0 5 37 19 43 0 8700 MatSolve 5997 1.0 3.9285e+02 1.4 1.67e+11 1.1 0.0e+00 0.0e+00 0.0e+00 6 54 0 0 0 6 54 0 0 0 10040 MatLUFactorNum 104 1.0 4.2097e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 7190 MatILUFactorSym 1 1.0 2.9875e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 1 1.0 1.3492e-01 3.3 1.45e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2525 MatAssemblyBegin 105 1.0 5.9000e+00 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 105 1.0 4.7665e+00 1.1 0.00e+00 0.0 1.5e+02 4.3e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 3.6001e-0518.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 1.6249e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 105 1.0 2.7945e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1999 1.0 9.1973e+02 1.0 3.13e+11 1.1 3.0e+05 1.7e+05 1.0e+04 16100 19 43 23 16100 19 43 23 8001 PCSetUp 208 1.0 4.2401e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 7138 PCSetUpOnBlocks 1999 1.0 7.2389e-01 1.2 1.25e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4020 PCApply 5997 1.0 4.4054e+02 1.3 1.80e+11 1.1 0.0e+00 0.0e+00 0.0e+00 6 58 0 0 0 6 58 0 0 0 9634 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Vector 4032 4032 53827712 0 Vector Scatter 2010 15 7012720 0 Matrix 4 4 359683260 0 Distributed Mesh 2003 8 39680 0 Star Forest Bipartite Graph 4006 16 13696 0 Discrete System 2003 8 6784 0 Index Set 4013 4013 25819112 0 IS L to G Mapping 2003 8 3919440 0 Krylov Solver 2 2 2296 0 Preconditioner 2 2 1896 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 1.90735e-07 Average time for MPI_Barrier(): 7.20024e-06 Average time for zero size MPI_Send(): 1.83781e-06 #PETSc Option Table entries: -log_summary #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1 ----------------------------------------- Libraries compiled on Thu Jan 7 04:05:35 2016 on hpc12 Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core Using PETSc directory: /home/wtay/Codes/petsc-3.6.3 Using PETSc arch: petsc-3.6.3_static_rel ----------------------------------------- Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -wd1572 -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include ----------------------------------------- Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90 Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl ----------------------------------------- From alena.kopanicakova13 at gmail.com Tue Mar 1 02:58:18 2016 From: alena.kopanicakova13 at gmail.com (alena kopanicakova) Date: Tue, 1 Mar 2016 09:58:18 +0100 Subject: [petsc-users] MOOSE_SNES_ISSUES Message-ID: Hello, I am 
developing my own nonlinear solver, which aims to solve simulations from MOOSE. Communication with moose is done via SNES interface. I am obtaining Jacobian and residual in following way: SNESComputeFunction(snes, X, R); SNESSetJacobian(snes, jac, jac, SNESComputeJacobianDefault, NULL); SNESComputeJacobian(snes, X, jac, jac); Unfortunately, by this setting it takes incredible amount of time to obtain Jacobian. Taking closer look at perf log, it seems that difference between mine and MOOSE solver is, that my executioner calls compute_residual() function ridiculously many times. I have no idea, what could be causing such a behavior. Do you have any suggestions, how to set up interface properly? or which change is needed for obtaining more-less same performance as MOOSE executioner? many thanks, alena -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- Framework Information: MOOSE version: git commit e79292e on 2016-01-05 PETSc Version: 3.6.1 Current Time: Tue Mar 1 00:51:45 2016 Executable Timestamp: Tue Mar 1 00:36:30 2016 Parallelism: Num Processors: 1 Num Threads: 1 Mesh: Distribution: serial Mesh Dimension: 3 Spatial Dimension: 3 Nodes: Total: 1331 Local: 1331 Elems: Total: 1000 Local: 1000 Num Subdomains: 1 Num Partitions: 1 Nonlinear System: Num DOFs: 3993 Num Local DOFs: 3993 Variables: { "disp_x" "disp_y" "disp_z" } Finite Element Types: "LAGRANGE" Approximation Orders: "FIRST" Execution Information: Executioner: Steady Solver Mode: NEWTON 0 Nonlinear |R| = 2.850000e-02 0 Linear |R| = 2.850000e-02 1 Linear |R| = 1.653670e-02 2 Linear |R| = 1.447168e-02 3 Linear |R| = 1.392965e-02 4 Linear |R| = 1.258440e-02 5 Linear |R| = 1.007181e-02 6 Linear |R| = 8.264315e-03 7 Linear |R| = 6.541897e-03 8 Linear |R| = 4.371900e-03 9 Linear |R| = 2.100406e-03 10 Linear |R| = 1.227539e-03 11 Linear |R| = 1.026286e-03 12 Linear |R| = 9.180101e-04 13 Linear |R| = 8.087598e-04 14 Linear |R| = 6.435247e-04 15 Linear |R| = 5.358688e-04 16 Linear |R| = 4.551657e-04 17 Linear |R| = 4.090276e-04 18 Linear |R| = 3.359290e-04 19 Linear |R| = 2.417643e-04 20 Linear |R| = 1.710452e-04 21 Linear |R| = 1.261996e-04 22 Linear |R| = 9.384052e-05 23 Linear |R| = 6.070637e-05 24 Linear |R| = 4.283233e-05 25 Linear |R| = 3.383792e-05 26 Linear |R| = 2.342289e-05 27 Linear |R| = 1.700855e-05 28 Linear |R| = 9.814278e-06 29 Linear |R| = 4.398519e-06 30 Linear |R| = 2.161205e-06 31 Linear |R| = 1.289206e-06 32 Linear |R| = 6.548007e-07 33 Linear |R| = 3.677894e-07 34 Linear |R| = 2.640006e-07 1 Nonlinear |R| = 2.400310e-02 0 Linear |R| = 2.400310e-02 1 Linear |R| = 9.102075e-03 2 Linear |R| = 5.017381e-03 3 Linear |R| = 3.840732e-03 4 Linear |R| = 2.990685e-03 5 Linear |R| = 1.990203e-03 6 Linear |R| = 1.085764e-03 7 Linear |R| = 4.657779e-04 8 Linear |R| = 3.049692e-04 9 Linear |R| = 1.625839e-04 10 Linear |R| = 1.124700e-04 11 Linear |R| = 7.764153e-05 12 Linear |R| = 5.698577e-05 13 Linear |R| = 4.581843e-05 14 Linear |R| = 4.262610e-05 15 Linear |R| = 3.792804e-05 16 Linear |R| = 3.404168e-05 17 Linear |R| = 2.536004e-05 18 Linear |R| = 1.577559e-05 19 Linear |R| = 9.099392e-06 20 Linear |R| = 6.140685e-06 21 Linear |R| = 5.083606e-06 22 Linear |R| = 4.521560e-06 23 Linear |R| = 3.601845e-06 24 Linear |R| = 2.776090e-06 25 Linear |R| = 2.252274e-06 26 Linear |R| = 1.898090e-06 27 Linear |R| = 1.620684e-06 28 Linear |R| = 1.395574e-06 29 Linear |R| = 1.157953e-06 30 Linear |R| = 9.540738e-07 31 Linear |R| = 8.487724e-07 32 Linear |R| = 7.634710e-07 33 
Linear |R| = 6.254549e-07 34 Linear |R| = 4.811588e-07 35 Linear |R| = 3.930739e-07 36 Linear |R| = 3.340577e-07 37 Linear |R| = 2.873430e-07 38 Linear |R| = 2.407606e-07 39 Linear |R| = 1.978818e-07 2 Nonlinear |R| = 4.185813e-04 0 Linear |R| = 4.185813e-04 1 Linear |R| = 1.406808e-04 2 Linear |R| = 7.266714e-05 3 Linear |R| = 5.734138e-05 4 Linear |R| = 4.524739e-05 5 Linear |R| = 3.025661e-05 6 Linear |R| = 1.946626e-05 7 Linear |R| = 1.005809e-05 8 Linear |R| = 7.639142e-06 9 Linear |R| = 6.668613e-06 10 Linear |R| = 6.070601e-06 11 Linear |R| = 5.496769e-06 12 Linear |R| = 4.388115e-06 13 Linear |R| = 2.966258e-06 14 Linear |R| = 1.838201e-06 15 Linear |R| = 9.709174e-07 16 Linear |R| = 6.743766e-07 17 Linear |R| = 5.531138e-07 18 Linear |R| = 4.649969e-07 19 Linear |R| = 3.982799e-07 20 Linear |R| = 3.662679e-07 21 Linear |R| = 3.309140e-07 22 Linear |R| = 2.652039e-07 23 Linear |R| = 1.728911e-07 24 Linear |R| = 1.005779e-07 25 Linear |R| = 5.747041e-08 26 Linear |R| = 4.185011e-08 27 Linear |R| = 3.394446e-08 28 Linear |R| = 2.788435e-08 29 Linear |R| = 2.046992e-08 30 Linear |R| = 1.231943e-08 31 Linear |R| = 8.724911e-09 32 Linear |R| = 6.390162e-09 33 Linear |R| = 5.060595e-09 34 Linear |R| = 4.216656e-09 35 Linear |R| = 2.969865e-09 3 Nonlinear |R| = 2.494055e-07 0 Linear |R| = 2.494055e-07 1 Linear |R| = 8.559637e-08 2 Linear |R| = 4.335101e-08 3 Linear |R| = 3.214303e-08 4 Linear |R| = 2.549409e-08 5 Linear |R| = 1.899624e-08 6 Linear |R| = 1.522624e-08 7 Linear |R| = 1.258408e-08 8 Linear |R| = 1.098545e-08 9 Linear |R| = 1.009481e-08 10 Linear |R| = 8.423983e-09 11 Linear |R| = 6.946144e-09 12 Linear |R| = 5.624875e-09 13 Linear |R| = 4.448760e-09 14 Linear |R| = 2.834320e-09 15 Linear |R| = 1.614722e-09 16 Linear |R| = 9.409384e-10 17 Linear |R| = 7.775851e-10 18 Linear |R| = 6.905971e-10 19 Linear |R| = 6.129201e-10 20 Linear |R| = 5.438935e-10 21 Linear |R| = 4.435519e-10 22 Linear |R| = 3.486621e-10 23 Linear |R| = 2.811928e-10 24 Linear |R| = 2.159800e-10 25 Linear |R| = 1.670940e-10 26 Linear |R| = 1.338889e-10 27 Linear |R| = 9.926067e-11 28 Linear |R| = 7.483221e-11 29 Linear |R| = 5.045662e-11 30 Linear |R| = 2.772335e-11 31 Linear |R| = 1.814968e-11 32 Linear |R| = 1.264268e-11 33 Linear |R| = 9.856586e-12 34 Linear |R| = 7.802757e-12 35 Linear |R| = 6.092276e-12 36 Linear |R| = 4.785005e-12 37 Linear |R| = 3.887554e-12 38 Linear |R| = 3.125756e-12 39 Linear |R| = 2.543989e-12 40 Linear |R| = 2.062100e-12 4 Nonlinear |R| = 2.522382e-12 ------------------------------------------------------------------------------------------------------------ | Whale Performance: Alive time=4.25229, Active time=4.12034 | ------------------------------------------------------------------------------------------------------------ | Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time | | w/o Sub w/o Sub With Sub With Sub w/o S With S | |------------------------------------------------------------------------------------------------------------| | | | | | Exodus | | output() 2 0.0126 0.006300 0.0126 0.006300 0.31 0.31 | | | | Setup | | computeAuxiliaryKernels() 2 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | | computeControls() 2 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | | computeUserObjects() 4 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | | | | Solve | | ComputeResidualThread 330 3.8288 0.011603 3.8288 0.011603 92.93 92.93 | | computeDiracContributions() 331 0.0002 0.000001 0.0002 0.000001 0.00 0.00 | | compute_jacobian() 1 0.1302 0.130242 0.1302 0.130243 3.16 3.16 | 
| compute_residual() 330 0.0394 0.000119 3.8724 0.011735 0.96 93.98 | | residual.close3() 330 0.0020 0.000006 0.0020 0.000006 0.05 0.05 | | residual.close4() 330 0.0019 0.000006 0.0019 0.000006 0.05 0.05 | | solve() 1 0.1052 0.105226 4.1079 4.107900 2.55 99.70 | ------------------------------------------------------------------------------------------------------------ | Totals: 1663 4.1203 100.00 | ------------------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------------------------------- | Setup Performance: Alive time=4.25186, Active time=0.060553 | ------------------------------------------------------------------------------------------------------------------------- | Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time | | w/o Sub w/o Sub With Sub With Sub w/o S With S | |-------------------------------------------------------------------------------------------------------------------------| | | | | | Setup | | Create Executioner 1 0.0003 0.000313 0.0003 0.000313 0.52 0.52 | | FEProblem::init::meshChanged() 1 0.0016 0.001626 0.0016 0.001626 2.69 2.69 | | Initial computeUserObjects() 1 0.0000 0.000005 0.0000 0.000005 0.01 0.01 | | Initial execMultiApps() 1 0.0000 0.000003 0.0000 0.000003 0.00 0.00 | | Initial execTransfers() 1 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | | Initial updateActiveSemiLocalNodeRange() 1 0.0004 0.000396 0.0004 0.000396 0.65 0.65 | | Initial updateGeomSearch() 2 0.0000 0.000002 0.0000 0.000002 0.00 0.00 | | NonlinearSystem::update() 1 0.0000 0.000036 0.0000 0.000036 0.06 0.06 | | Output Initial Condition 1 0.0107 0.010671 0.0107 0.010671 17.62 17.62 | | Prepare Mesh 1 0.0017 0.001737 0.0017 0.001737 2.87 2.87 | | copySolutionsBackwards() 1 0.0000 0.000023 0.0000 0.000023 0.04 0.04 | | eq.init() 1 0.0455 0.045469 0.0455 0.045469 75.09 75.09 | | getMinQuadratureOrder() 1 0.0000 0.000004 0.0000 0.000004 0.01 0.01 | | initial adaptivity 1 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | | maxQps() 1 0.0003 0.000263 0.0003 0.000263 0.43 0.43 | | reinit() after updateGeomSearch() 1 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | | | | ghostGhostedBoundaries | | eq.init() 1 0.0000 0.000002 0.0000 0.000002 0.00 0.00 | ------------------------------------------------------------------------------------------------------------------------- | Totals: 18 0.0606 100.00 | ------------------------------------------------------------------------------------------------------------------------- -------------- next part -------------- Framework Information: MOOSE version: git commit e79292e on 2016-01-05 PETSc Version: 3.6.1 Current Time: Tue Mar 1 00:41:08 2016 Executable Timestamp: Tue Mar 1 00:36:30 2016 Parallelism: Num Processors: 1 Num Threads: 1 Mesh: Distribution: serial Mesh Dimension: 3 Spatial Dimension: 3 Nodes: Total: 1331 Local: 1331 Elems: Total: 1000 Local: 1000 Num Subdomains: 1 Num Partitions: 1 Nonlinear System: Num DOFs: 3993 Num Local DOFs: 3993 Variables: { "disp_x" "disp_y" "disp_z" } Finite Element Types: "LAGRANGE" Approximation Orders: "FIRST" Execution Information: Executioner: PassoSteady Solver Mode: NEWTON In Function SNESCreate_passo_Newton_Solver it. || g || ------- ------------------- 1 0.0240028 2 0.000418569 3 2.49436e-07 4 1.52966e-12 Solver converged in 4 iterations. 
Outlier Variable Residual Norms: disp_z: 2.850000e-02 ------------------------------------------------------------------------------------------------------------ | Whale Performance: Alive time=199.422, Active time=199.285 | ------------------------------------------------------------------------------------------------------------ | Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time | | w/o Sub w/o Sub With Sub With Sub w/o S With S | |------------------------------------------------------------------------------------------------------------| | | | | | Exodus | | output() 2 0.0110 0.005503 0.0110 0.005503 0.01 0.01 | | | | Setup | | computeAuxiliaryKernels() 2 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | | computeControls() 2 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | | computeUserObjects() 4 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | | | | Solve | | ComputeResidualThread 15985 189.8671 0.011878 189.8671 0.011878 95.27 95.27 | | computeDiracContributions() 15986 0.0106 0.000001 0.0106 0.000001 0.01 0.01 | | compute_jacobian() 1 0.1265 0.126484 0.1265 0.126485 0.06 0.06 | | compute_residual() 15985 2.1702 0.000136 192.3009 0.012030 1.09 96.50 | | residual.close3() 15985 0.1200 0.000008 0.1200 0.000008 0.06 0.06 | | residual.close4() 15985 0.1263 0.000008 0.1263 0.000008 0.06 0.06 | | solve() 1 6.8537 6.853733 199.2830 199.282969 3.44 100.00 | ------------------------------------------------------------------------------------------------------------ | Totals: 79938 199.2854 100.00 | ------------------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------------------------------- | Setup Performance: Alive time=199.422, Active time=0.055161 | ------------------------------------------------------------------------------------------------------------------------- | Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time | | w/o Sub w/o Sub With Sub With Sub w/o S With S | |-------------------------------------------------------------------------------------------------------------------------| | | | | | Setup | | Create Executioner 1 0.0003 0.000315 0.0003 0.000315 0.57 0.57 | | FEProblem::init::meshChanged() 1 0.0016 0.001593 0.0016 0.001593 2.89 2.89 | | Initial computeUserObjects() 1 0.0000 0.000005 0.0000 0.000005 0.01 0.01 | | Initial execMultiApps() 1 0.0000 0.000003 0.0000 0.000003 0.01 0.01 | | Initial execTransfers() 1 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | | Initial updateActiveSemiLocalNodeRange() 1 0.0004 0.000380 0.0004 0.000380 0.69 0.69 | | Initial updateGeomSearch() 2 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | | NonlinearSystem::update() 1 0.0000 0.000034 0.0000 0.000034 0.06 0.06 | | Output Initial Condition 1 0.0091 0.009094 0.0091 0.009094 16.49 16.49 | | Prepare Mesh 1 0.0018 0.001766 0.0018 0.001766 3.20 3.20 | | copySolutionsBackwards() 1 0.0000 0.000025 0.0000 0.000025 0.05 0.05 | | eq.init() 1 0.0417 0.041668 0.0417 0.041668 75.54 75.54 | | getMinQuadratureOrder() 1 0.0000 0.000004 0.0000 0.000004 0.01 0.01 | | initial adaptivity 1 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | | maxQps() 1 0.0003 0.000270 0.0003 0.000270 0.49 0.49 | | reinit() after updateGeomSearch() 1 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | | | | ghostGhostedBoundaries | | eq.init() 1 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | 
------------------------------------------------------------------------------------------------------------------------- | Totals: 18 0.0552 100.00 | ------------------------------------------------------------------------------------------------------------------------- From patrick.sanan at gmail.com Tue Mar 1 03:11:07 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 1 Mar 2016 10:11:07 +0100 Subject: [petsc-users] MOOSE_SNES_ISSUES In-Reply-To: References: Message-ID: <20160301091107.GC69645@geop-332.ethz.ch> On Tue, Mar 01, 2016 at 09:58:18AM +0100, alena kopanicakova wrote: > Hello, > > I am developing my own nonlinear solver, which aims to solve simulations > from MOOSE. Communication with moose is done via SNES interface. > > I am obtaining Jacobian and residual in following way: > > SNESComputeFunction(snes, X, R); > > SNESSetJacobian(snes, jac, jac, SNESComputeJacobianDefault, As here, http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESComputeJacobianDefault.html , it seems like you are computing the Jacobian in a brute-force way with finite differences. Do you have an expression to compute the Jacobian? > NULL); > SNESComputeJacobian(snes, X, jac, jac); > > Unfortunately, by this setting it takes incredible amount of time to obtain > Jacobian. > Taking closer look at perf log, it seems that difference between mine and > MOOSE solver is, that my executioner calls compute_residual() function > ridiculously many times. > I have no idea, what could be causing such a behavior. > > Do you have any suggestions, how to set up interface properly? or which > change is needed for obtaining more-less same performance as MOOSE > executioner? > > > many thanks, > alena > > Framework Information: > MOOSE version: git commit e79292e on 2016-01-05 > PETSc Version: 3.6.1 > Current Time: Tue Mar 1 00:51:45 2016 > Executable Timestamp: Tue Mar 1 00:36:30 2016 > > Parallelism: > Num Processors: 1 > Num Threads: 1 > > Mesh: > Distribution: serial > Mesh Dimension: 3 > Spatial Dimension: 3 > Nodes: > Total: 1331 > Local: 1331 > Elems: > Total: 1000 > Local: 1000 > Num Subdomains: 1 > Num Partitions: 1 > > Nonlinear System: > Num DOFs: 3993 > Num Local DOFs: 3993 > Variables: { "disp_x" "disp_y" "disp_z" } > Finite Element Types: "LAGRANGE" > Approximation Orders: "FIRST" > > Execution Information: > Executioner: Steady > Solver Mode: NEWTON > > > > 0 Nonlinear |R| = 2.850000e-02 > 0 Linear |R| = 2.850000e-02 > 1 Linear |R| = 1.653670e-02 > 2 Linear |R| = 1.447168e-02 > 3 Linear |R| = 1.392965e-02 > 4 Linear |R| = 1.258440e-02 > 5 Linear |R| = 1.007181e-02 > 6 Linear |R| = 8.264315e-03 > 7 Linear |R| = 6.541897e-03 > 8 Linear |R| = 4.371900e-03 > 9 Linear |R| = 2.100406e-03 > 10 Linear |R| = 1.227539e-03 > 11 Linear |R| = 1.026286e-03 > 12 Linear |R| = 9.180101e-04 > 13 Linear |R| = 8.087598e-04 > 14 Linear |R| = 6.435247e-04 > 15 Linear |R| = 5.358688e-04 > 16 Linear |R| = 4.551657e-04 > 17 Linear |R| = 4.090276e-04 > 18 Linear |R| = 3.359290e-04 > 19 Linear |R| = 2.417643e-04 > 20 Linear |R| = 1.710452e-04 > 21 Linear |R| = 1.261996e-04 > 22 Linear |R| = 9.384052e-05 > 23 Linear |R| = 6.070637e-05 > 24 Linear |R| = 4.283233e-05 > 25 Linear |R| = 3.383792e-05 > 26 Linear |R| = 2.342289e-05 > 27 Linear |R| = 1.700855e-05 > 28 Linear |R| = 9.814278e-06 > 29 Linear |R| = 4.398519e-06 > 30 Linear |R| = 2.161205e-06 > 31 Linear |R| = 1.289206e-06 > 32 Linear |R| = 6.548007e-07 > 33 Linear |R| = 3.677894e-07 > 34 Linear |R| = 2.640006e-07 > 1 Nonlinear |R| = 2.400310e-02 > 
0 Linear |R| = 2.400310e-02 > 1 Linear |R| = 9.102075e-03 > 2 Linear |R| = 5.017381e-03 > 3 Linear |R| = 3.840732e-03 > 4 Linear |R| = 2.990685e-03 > 5 Linear |R| = 1.990203e-03 > 6 Linear |R| = 1.085764e-03 > 7 Linear |R| = 4.657779e-04 > 8 Linear |R| = 3.049692e-04 > 9 Linear |R| = 1.625839e-04 > 10 Linear |R| = 1.124700e-04 > 11 Linear |R| = 7.764153e-05 > 12 Linear |R| = 5.698577e-05 > 13 Linear |R| = 4.581843e-05 > 14 Linear |R| = 4.262610e-05 > 15 Linear |R| = 3.792804e-05 > 16 Linear |R| = 3.404168e-05 > 17 Linear |R| = 2.536004e-05 > 18 Linear |R| = 1.577559e-05 > 19 Linear |R| = 9.099392e-06 > 20 Linear |R| = 6.140685e-06 > 21 Linear |R| = 5.083606e-06 > 22 Linear |R| = 4.521560e-06 > 23 Linear |R| = 3.601845e-06 > 24 Linear |R| = 2.776090e-06 > 25 Linear |R| = 2.252274e-06 > 26 Linear |R| = 1.898090e-06 > 27 Linear |R| = 1.620684e-06 > 28 Linear |R| = 1.395574e-06 > 29 Linear |R| = 1.157953e-06 > 30 Linear |R| = 9.540738e-07 > 31 Linear |R| = 8.487724e-07 > 32 Linear |R| = 7.634710e-07 > 33 Linear |R| = 6.254549e-07 > 34 Linear |R| = 4.811588e-07 > 35 Linear |R| = 3.930739e-07 > 36 Linear |R| = 3.340577e-07 > 37 Linear |R| = 2.873430e-07 > 38 Linear |R| = 2.407606e-07 > 39 Linear |R| = 1.978818e-07 > 2 Nonlinear |R| = 4.185813e-04 > 0 Linear |R| = 4.185813e-04 > 1 Linear |R| = 1.406808e-04 > 2 Linear |R| = 7.266714e-05 > 3 Linear |R| = 5.734138e-05 > 4 Linear |R| = 4.524739e-05 > 5 Linear |R| = 3.025661e-05 > 6 Linear |R| = 1.946626e-05 > 7 Linear |R| = 1.005809e-05 > 8 Linear |R| = 7.639142e-06 > 9 Linear |R| = 6.668613e-06 > 10 Linear |R| = 6.070601e-06 > 11 Linear |R| = 5.496769e-06 > 12 Linear |R| = 4.388115e-06 > 13 Linear |R| = 2.966258e-06 > 14 Linear |R| = 1.838201e-06 > 15 Linear |R| = 9.709174e-07 > 16 Linear |R| = 6.743766e-07 > 17 Linear |R| = 5.531138e-07 > 18 Linear |R| = 4.649969e-07 > 19 Linear |R| = 3.982799e-07 > 20 Linear |R| = 3.662679e-07 > 21 Linear |R| = 3.309140e-07 > 22 Linear |R| = 2.652039e-07 > 23 Linear |R| = 1.728911e-07 > 24 Linear |R| = 1.005779e-07 > 25 Linear |R| = 5.747041e-08 > 26 Linear |R| = 4.185011e-08 > 27 Linear |R| = 3.394446e-08 > 28 Linear |R| = 2.788435e-08 > 29 Linear |R| = 2.046992e-08 > 30 Linear |R| = 1.231943e-08 > 31 Linear |R| = 8.724911e-09 > 32 Linear |R| = 6.390162e-09 > 33 Linear |R| = 5.060595e-09 > 34 Linear |R| = 4.216656e-09 > 35 Linear |R| = 2.969865e-09 > 3 Nonlinear |R| = 2.494055e-07 > 0 Linear |R| = 2.494055e-07 > 1 Linear |R| = 8.559637e-08 > 2 Linear |R| = 4.335101e-08 > 3 Linear |R| = 3.214303e-08 > 4 Linear |R| = 2.549409e-08 > 5 Linear |R| = 1.899624e-08 > 6 Linear |R| = 1.522624e-08 > 7 Linear |R| = 1.258408e-08 > 8 Linear |R| = 1.098545e-08 > 9 Linear |R| = 1.009481e-08 > 10 Linear |R| = 8.423983e-09 > 11 Linear |R| = 6.946144e-09 > 12 Linear |R| = 5.624875e-09 > 13 Linear |R| = 4.448760e-09 > 14 Linear |R| = 2.834320e-09 > 15 Linear |R| = 1.614722e-09 > 16 Linear |R| = 9.409384e-10 > 17 Linear |R| = 7.775851e-10 > 18 Linear |R| = 6.905971e-10 > 19 Linear |R| = 6.129201e-10 > 20 Linear |R| = 5.438935e-10 > 21 Linear |R| = 4.435519e-10 > 22 Linear |R| = 3.486621e-10 > 23 Linear |R| = 2.811928e-10 > 24 Linear |R| = 2.159800e-10 > 25 Linear |R| = 1.670940e-10 > 26 Linear |R| = 1.338889e-10 > 27 Linear |R| = 9.926067e-11 > 28 Linear |R| = 7.483221e-11 > 29 Linear |R| = 5.045662e-11 > 30 Linear |R| = 2.772335e-11 > 31 Linear |R| = 1.814968e-11 > 32 Linear |R| = 1.264268e-11 > 33 Linear |R| = 9.856586e-12 > 34 Linear |R| = 7.802757e-12 > 35 Linear |R| = 6.092276e-12 > 36 Linear |R| = 4.785005e-12 > 37 Linear |R| = 
3.887554e-12 > 38 Linear |R| = 3.125756e-12 > 39 Linear |R| = 2.543989e-12 > 40 Linear |R| = 2.062100e-12 > 4 Nonlinear |R| = 2.522382e-12 > > ------------------------------------------------------------------------------------------------------------ > | Whale Performance: Alive time=4.25229, Active time=4.12034 | > ------------------------------------------------------------------------------------------------------------ > | Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time | > | w/o Sub w/o Sub With Sub With Sub w/o S With S | > |------------------------------------------------------------------------------------------------------------| > | | > | | > | Exodus | > | output() 2 0.0126 0.006300 0.0126 0.006300 0.31 0.31 | > | | > | Setup | > | computeAuxiliaryKernels() 2 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | > | computeControls() 2 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > | computeUserObjects() 4 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > | | > | Solve | > | ComputeResidualThread 330 3.8288 0.011603 3.8288 0.011603 92.93 92.93 | > | computeDiracContributions() 331 0.0002 0.000001 0.0002 0.000001 0.00 0.00 | > | compute_jacobian() 1 0.1302 0.130242 0.1302 0.130243 3.16 3.16 | > | compute_residual() 330 0.0394 0.000119 3.8724 0.011735 0.96 93.98 | > | residual.close3() 330 0.0020 0.000006 0.0020 0.000006 0.05 0.05 | > | residual.close4() 330 0.0019 0.000006 0.0019 0.000006 0.05 0.05 | > | solve() 1 0.1052 0.105226 4.1079 4.107900 2.55 99.70 | > ------------------------------------------------------------------------------------------------------------ > | Totals: 1663 4.1203 100.00 | > ------------------------------------------------------------------------------------------------------------ > ------------------------------------------------------------------------------------------------------------------------- > | Setup Performance: Alive time=4.25186, Active time=0.060553 | > ------------------------------------------------------------------------------------------------------------------------- > | Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time | > | w/o Sub w/o Sub With Sub With Sub w/o S With S | > |-------------------------------------------------------------------------------------------------------------------------| > | | > | | > | Setup | > | Create Executioner 1 0.0003 0.000313 0.0003 0.000313 0.52 0.52 | > | FEProblem::init::meshChanged() 1 0.0016 0.001626 0.0016 0.001626 2.69 2.69 | > | Initial computeUserObjects() 1 0.0000 0.000005 0.0000 0.000005 0.01 0.01 | > | Initial execMultiApps() 1 0.0000 0.000003 0.0000 0.000003 0.00 0.00 | > | Initial execTransfers() 1 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > | Initial updateActiveSemiLocalNodeRange() 1 0.0004 0.000396 0.0004 0.000396 0.65 0.65 | > | Initial updateGeomSearch() 2 0.0000 0.000002 0.0000 0.000002 0.00 0.00 | > | NonlinearSystem::update() 1 0.0000 0.000036 0.0000 0.000036 0.06 0.06 | > | Output Initial Condition 1 0.0107 0.010671 0.0107 0.010671 17.62 17.62 | > | Prepare Mesh 1 0.0017 0.001737 0.0017 0.001737 2.87 2.87 | > | copySolutionsBackwards() 1 0.0000 0.000023 0.0000 0.000023 0.04 0.04 | > | eq.init() 1 0.0455 0.045469 0.0455 0.045469 75.09 75.09 | > | getMinQuadratureOrder() 1 0.0000 0.000004 0.0000 0.000004 0.01 0.01 | > | initial adaptivity 1 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | > | maxQps() 1 0.0003 0.000263 0.0003 0.000263 0.43 0.43 | > | reinit() after updateGeomSearch() 1 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > | | > | 
ghostGhostedBoundaries | > | eq.init() 1 0.0000 0.000002 0.0000 0.000002 0.00 0.00 | > ------------------------------------------------------------------------------------------------------------------------- > | Totals: 18 0.0606 100.00 | > ------------------------------------------------------------------------------------------------------------------------- > > Framework Information: > MOOSE version: git commit e79292e on 2016-01-05 > PETSc Version: 3.6.1 > Current Time: Tue Mar 1 00:41:08 2016 > Executable Timestamp: Tue Mar 1 00:36:30 2016 > > Parallelism: > Num Processors: 1 > Num Threads: 1 > > Mesh: > Distribution: serial > Mesh Dimension: 3 > Spatial Dimension: 3 > Nodes: > Total: 1331 > Local: 1331 > Elems: > Total: 1000 > Local: 1000 > Num Subdomains: 1 > Num Partitions: 1 > > Nonlinear System: > Num DOFs: 3993 > Num Local DOFs: 3993 > Variables: { "disp_x" "disp_y" "disp_z" } > Finite Element Types: "LAGRANGE" > Approximation Orders: "FIRST" > > Execution Information: > Executioner: PassoSteady > Solver Mode: NEWTON > > > > In Function SNESCreate_passo_Newton_Solver > > it. || g || > ------- ------------------- > 1 0.0240028 > 2 0.000418569 > 3 2.49436e-07 > 4 1.52966e-12 > > Solver converged in 4 iterations. > > Outlier Variable Residual Norms: > disp_z: 2.850000e-02 > > ------------------------------------------------------------------------------------------------------------ > | Whale Performance: Alive time=199.422, Active time=199.285 | > ------------------------------------------------------------------------------------------------------------ > | Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time | > | w/o Sub w/o Sub With Sub With Sub w/o S With S | > |------------------------------------------------------------------------------------------------------------| > | | > | | > | Exodus | > | output() 2 0.0110 0.005503 0.0110 0.005503 0.01 0.01 | > | | > | Setup | > | computeAuxiliaryKernels() 2 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > | computeControls() 2 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | > | computeUserObjects() 4 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > | | > | Solve | > | ComputeResidualThread 15985 189.8671 0.011878 189.8671 0.011878 95.27 95.27 | > | computeDiracContributions() 15986 0.0106 0.000001 0.0106 0.000001 0.01 0.01 | > | compute_jacobian() 1 0.1265 0.126484 0.1265 0.126485 0.06 0.06 | > | compute_residual() 15985 2.1702 0.000136 192.3009 0.012030 1.09 96.50 | > | residual.close3() 15985 0.1200 0.000008 0.1200 0.000008 0.06 0.06 | > | residual.close4() 15985 0.1263 0.000008 0.1263 0.000008 0.06 0.06 | > | solve() 1 6.8537 6.853733 199.2830 199.282969 3.44 100.00 | > ------------------------------------------------------------------------------------------------------------ > | Totals: 79938 199.2854 100.00 | > ------------------------------------------------------------------------------------------------------------ > ------------------------------------------------------------------------------------------------------------------------- > | Setup Performance: Alive time=199.422, Active time=0.055161 | > ------------------------------------------------------------------------------------------------------------------------- > | Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time | > | w/o Sub w/o Sub With Sub With Sub w/o S With S | > |-------------------------------------------------------------------------------------------------------------------------| > | | > | | > | Setup | > | Create 
Executioner 1 0.0003 0.000315 0.0003 0.000315 0.57 0.57 | > | FEProblem::init::meshChanged() 1 0.0016 0.001593 0.0016 0.001593 2.89 2.89 | > | Initial computeUserObjects() 1 0.0000 0.000005 0.0000 0.000005 0.01 0.01 | > | Initial execMultiApps() 1 0.0000 0.000003 0.0000 0.000003 0.01 0.01 | > | Initial execTransfers() 1 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | > | Initial updateActiveSemiLocalNodeRange() 1 0.0004 0.000380 0.0004 0.000380 0.69 0.69 | > | Initial updateGeomSearch() 2 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > | NonlinearSystem::update() 1 0.0000 0.000034 0.0000 0.000034 0.06 0.06 | > | Output Initial Condition 1 0.0091 0.009094 0.0091 0.009094 16.49 16.49 | > | Prepare Mesh 1 0.0018 0.001766 0.0018 0.001766 3.20 3.20 | > | copySolutionsBackwards() 1 0.0000 0.000025 0.0000 0.000025 0.05 0.05 | > | eq.init() 1 0.0417 0.041668 0.0417 0.041668 75.54 75.54 | > | getMinQuadratureOrder() 1 0.0000 0.000004 0.0000 0.000004 0.01 0.01 | > | initial adaptivity 1 0.0000 0.000000 0.0000 0.000000 0.00 0.00 | > | maxQps() 1 0.0003 0.000270 0.0003 0.000270 0.49 0.49 | > | reinit() after updateGeomSearch() 1 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > | | > | ghostGhostedBoundaries | > | eq.init() 1 0.0000 0.000001 0.0000 0.000001 0.00 0.00 | > ------------------------------------------------------------------------------------------------------------------------- > | Totals: 18 0.0552 100.00 | > ------------------------------------------------------------------------------------------------------------------------- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 473 bytes Desc: not available URL: From knepley at gmail.com Tue Mar 1 06:10:40 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 1 Mar 2016 06:10:40 -0600 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: References: Message-ID: On Tue, Mar 1, 2016 at 12:39 AM, Ed Bueler wrote: > Barry -- > > Will try it. > > > ... since, presumably, other more powerful IO tools exist that would be > used for "real" problems? > > I know there are tools for snapshotting from PETSc, e.g. VecView to .vtk. > In fact petsc binary seems fairly convenient for that. On the other hand, > I am not sure I've ever done anything "real". ;-) > > Anyone out there: Are there a good *convenient* tools for saving > space/time-series (= movies) from PETSc TS? I want to add frames and > movies from PETSc into slides, etc. I can think of NetCDF but it seems > not-very-convenient, and I am worried not well-supported from PETSc. Is > setting up TS with events (=TSSetEventMonitor()) and writing separate > snapshot files the preferred scalable usage, despite the extra effort > compared to "-ts_monitor_solution binary:foo.dat"? > I use HDF5, which is not ideal, but I see no alternative as yet. Matt > > Ed > > > On Mon, Feb 29, 2016 at 8:53 PM, Barry Smith wrote: > >> >> Ed, >> >> I have added a branch barry/feature-ts-monitor-binary that supports >> -ts_monitor binary:timesteps that will store in simple binary format each >> of the time steps associated with each solution. This in conjugation with >> -ts_monitor_solution binary:solutions will give you two files you can read >> in. But note that timesteps is a simple binary file of double precision >> numbers you should read in directly in python, you cannot use >> PetscBinaryIO.py which is what you will use to read in the solutions file. 
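(A minimal reading sketch for the two files described above; this is an illustration under assumptions, not PETSc-provided code. It expects files written with "-ts_monitor binary:timesteps -ts_monitor_solution binary:solutions", and PetscBinaryIO.py from the PETSc bin/ directory on the PYTHONPATH. The raw timesteps file is read as big-endian float64, because PETSc binary files are big-endian.)

    # sketch: pair each saved time with the corresponding solution Vec
    import numpy as np
    import PetscBinaryIO

    io = PetscBinaryIO.PetscBinaryIO()
    solutions = io.readBinaryFile('solutions')      # tuple of numpy arrays, one per saved Vec
    times = np.fromfile('timesteps', dtype='>f8')   # raw big-endian doubles, one per step

    for t, y in zip(times, solutions):
        print t, y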
>> >> Barry >> >> Currently PETSc has a binary file format where we can save Vec, Mat, IS, >> each is marked with a type id for PetscBinaryIO.py to detect, we do not >> have type ids for simple double precision numbers or arrays of numbers. >> This is why I have no way of saving the time steps in a way that >> PetscBinaryIO.py could read them in currently. I don't know how far we want >> to go in "spiffing up" the PETSc binary format to do more elaborate things >> since, presumably, other more power IO tools exist that would be used for >> "real" problems? >> >> >> > On Feb 29, 2016, at 3:24 PM, Ed Bueler wrote: >> > >> > Dear PETSc -- >> > >> > I have a short C ode code that uses TS to solve y' = g(t,y) where >> y(t) is a 2-dim'l vector. My code defaults to -ts_type rk so it does >> adaptive time-stepping; thus using -ts_monitor shows times at stdout: >> > >> > $ ./ode -ts_monitor >> > solving from t0 = 0.000 with initial time step dt = 0.10000 ... >> > 0 TS dt 0.1 time 0. >> > 1 TS dt 0.170141 time 0.1 >> > 2 TS dt 0.169917 time 0.270141 >> > 3 TS dt 0.171145 time 0.440058 >> > 4 TS dt 0.173931 time 0.611203 >> > 5 TS dt 0.178719 time 0.785134 >> > 6 TS dt 0.0361473 time 0.963853 >> > 7 TS dt 0.188252 time 1. >> > error at tf = 1.000 : |y-y_exact|_inf = 0.000144484 >> > >> > I want to output the trajectory in PETSc binary and plot it in python >> using bin/PetscBinaryIO.py. Clearly I need the times shown above to do >> that. >> > >> > Note "-ts_monitor_solution binary:XX" gives me a binary file with only >> y values in it, but not the corresponding times. >> > >> > My question is, how to get those times in either the same binary file >> (preferred) or separate binary files? I have tried >> > >> > $ ./ode -ts_monitor binary:foo.dat # invalid >> > $ ./ode -ts_monitor_solution binary:bar.dat # no t in file >> > $ ./ode -ts_monitor_solution binary:baz.dat -ts_save_trajectory # no >> t in file >> > >> > without success. (I am not sure what the boolean option >> -ts_save_trajectory does, by the way.) >> > >> > Thanks! >> > >> > Ed >> > >> > PS Sorry if this is a "RTFM" question, but so far I can't find the >> documentation. >> > >> > >> > -- >> > Ed Bueler >> > Dept of Math and Stat and Geophysical Institute >> > University of Alaska Fairbanks >> > Fairbanks, AK 99775-6660 >> > 301C Chapman and 410D Elvey >> > 907 474-7693 and 907 474-7199 (fax 907 474-5394) >> >> > > > -- > Ed Bueler > Dept of Math and Stat and Geophysical Institute > University of Alaska Fairbanks > Fairbanks, AK 99775-6660 > 301C Chapman and 410D Elvey > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From cyrill.von.planta at usi.ch Tue Mar 1 07:57:21 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Tue, 1 Mar 2016 13:57:21 +0000 Subject: [petsc-users] Example for FAS Message-ID: <0223DD4E-A7A7-4A00-B98D-6F1AD043E7E7@usi.ch> Dear PETSc User-list, I am trying to use the FAS from PETSc in version 3.6.3. For other solvers there would be usually an example or two on the documentation, but on http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFAS.html I couldn?t find one at the bottom. Does anybody have an example for FAS using different preconditioners etc.? 
Best regards Cyrill From jed at jedbrown.org Tue Mar 1 08:05:30 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Mar 2016 14:05:30 +0000 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: References: Message-ID: <87ziui70ph.fsf@jedbrown.org> Ed Bueler writes: > Barry -- > > I am reading the resulting file successfully using > > import struct > import sys > f = open('timesteps','r') > while True: > try: > bytes = f.read(8) > except: > print "f.read() failed" > sys.exit(1) It seems odd to catch the exception and convert to sys.exit(1). This just loses the stack trace (which might have more detailed information) and makes the debugger catch the wrong location. > if len(bytes) > 0: > print struct.unpack('>d',bytes)[0] > else: > break > f.close() > > However, was there a more elegant intended method? I would use numpy.fromfile, which is one line. > I am disturbed by the apparent need to specify big-endian-ness (= > '>d') in the struct.unpack() call. You need it because PETSc binary files are big-endian. There is no place to encode the byte order in these raw binary files, so some convention is required for the file to portable. HDF5 includes this information, though it is an annoying dependency. Numpy files (*.npy) are a simpler alternative that PETSc could decide to support. https://docs.scipy.org/doc/numpy/neps/npy-format.html -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Tue Mar 1 08:08:01 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Mar 2016 14:08:01 +0000 Subject: [petsc-users] Example for FAS In-Reply-To: <0223DD4E-A7A7-4A00-B98D-6F1AD043E7E7@usi.ch> References: <0223DD4E-A7A7-4A00-B98D-6F1AD043E7E7@usi.ch> Message-ID: <87wppm70la.fsf@jedbrown.org> Cyrill Vonplanta writes: > Dear PETSc User-list, > > I am trying to use the FAS from PETSc in version 3.6.3. For other solvers there would be usually an example or two on the documentation, but on http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFAS.html I couldn?t find one at the bottom. It's pure run-time options. See src/snes/examples/tutorials/makefile and search for '-snes_type fas'. ex5 and ex19 are good examples to check out. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Tue Mar 1 08:21:10 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 1 Mar 2016 08:21:10 -0600 Subject: [petsc-users] Example for FAS In-Reply-To: <87wppm70la.fsf@jedbrown.org> References: <0223DD4E-A7A7-4A00-B98D-6F1AD043E7E7@usi.ch> <87wppm70la.fsf@jedbrown.org> Message-ID: On Tue, Mar 1, 2016 at 8:08 AM, Jed Brown wrote: > Cyrill Vonplanta writes: > > > Dear PETSc User-list, > > > > I am trying to use the FAS from PETSc in version 3.6.3. For other > solvers there would be usually an example or two on the documentation, but > on > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFAS.html > I couldn?t find one at the bottom. > > It's pure run-time options. See src/snes/examples/tutorials/makefile > and search for '-snes_type fas'. ex5 and ex19 are good examples to > check out. 
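For concreteness, a sketch of what a run-time-only FAS setup might look like on SNES ex19 (illustrative; the exact option names are worth checking with -help, and the makefile targets mentioned above contain tested variants):

    ./ex19 -da_refine 4 -snes_type fas -snes_fas_type full \
           -fas_levels_snes_type newtonls -fas_levels_snes_max_it 1 \
           -fas_coarse_snes_type newtonls -fas_coarse_pc_type lu \
           -snes_monitor -snes_converged_reason

The -fas_levels_* and -fas_coarse_* prefixes configure the nonlinear smoother and the coarse-level solver (and their inner KSP/PC), which is where different preconditioners on the levels come in.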
> I also have examples from SNES ex12 if you want an unstructured grid: https://bitbucket.org/petsc/petsc/src/d076321ee94a992e029f0665fc86b0401d68c775/config/builder.py?at=master&fileviewer=file-view-default#builder.py-390 and here is a SNES ex5 example with FAS https://bitbucket.org/petsc/petsc/src/d076321ee94a992e029f0665fc86b0401d68c775/src/snes/examples/tutorials/makefile?at=master&fileviewer=file-view-default#makefile-411 Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Mar 1 08:29:09 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Mar 2016 14:29:09 +0000 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: <3798056C-7C20-459A-8AE3-A6768CC0301C@mcs.anl.gov> References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3798056C-7C20-459A-8AE3-A6768CC0301C@mcs.anl.gov> Message-ID: <87r3fu6zm2.fsf@jedbrown.org> Barry Smith writes: >> 4-2) Another observation is that the true residual stagnates way >> earlier which I assume is an indication that the RHS is in fact >> not in the range of A. > > You can hope this is the case. > > Of course one cannot really know if the residual is stagnating due > to an inconsistent right hand side or for some other reason. But if > it stagnates at 10^-4 it is probably due to inconsistent right hand > side if it stagnates at 10^-12 the right hand side is probably > consistent. If it stagnates at 10^-8 then it could be due to > inconsistent rhs and or some other reason. If the true residual stagnates while the preconditioned residual continues to converge, it may be that the preconditioner is singular or nearly so. This often happens if you use a black-box method for a saddle point problem, for example. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From mfadams at lbl.gov Tue Mar 1 08:59:34 2016 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 1 Mar 2016 09:59:34 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> Message-ID: On Mon, Feb 29, 2016 at 5:42 PM, Boyce Griffith wrote: > > On Feb 29, 2016, at 5:36 PM, Mark Adams wrote: > > >>> GAMG is use for AMR problems like this a lot in BISICLES. >>> >> >> Thanks for the reference. However, a quick look at their paper suggests >> they are using a finite volume discretization which should be symmetric and >> avoid all the shenanigans I'm going through! >> > > No, they are not symmetric. FV is even worse than vertex centered > methods. The BCs and the C-F interfaces add non-symmetry. > > > If you use a different discretization, it is possible to make the c-f > interface discretization symmetric --- but symmetry appears to come at a > cost of the reduction in the formal order of accuracy in the flux along the > c-f interface. I can probably dig up some code that would make it easy to > compare. > I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with refluxing, which I do not linearize. PETSc sums fluxes at faces directly, perhaps this IS symmetric? Toby might know. 
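On the right-hand-side consistency point raised earlier in this thread, a minimal sketch (illustrative only, not code from this discussion) of declaring the null space of a singular Neumann operator and removing the inconsistent part of the right-hand side before the solve:

    MatNullSpace nsp;
    ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, NULL, &nsp);CHKERRQ(ierr);
    ierr = MatSetNullSpace(A, nsp);CHKERRQ(ierr);     /* lets KSP handle the singular operator   */
    ierr = MatNullSpaceRemove(nsp, b);CHKERRQ(ierr);  /* drops the constant component from b     */
    ierr = MatNullSpaceDestroy(&nsp);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

Note the caveat that is the subject of this thread: for a non-symmetric A, consistency of b means orthogonality to the null space of A^T (the left null space), not of A, so removing the right null space from b is only a heuristic here.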
> > -- Boyce > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Mar 1 09:08:42 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Mar 2016 15:08:42 +0000 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> Message-ID: <87io166xs5.fsf@jedbrown.org> Mark Adams writes: > I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with > refluxing, which I do not linearize. PETSc sums fluxes at faces directly, > perhaps this IS symmetric? Depends on scaling. Refluxing versus evaluating fluxes once on the fine faces should give the same result. Note that you can make a symmetric FV discretization (for elliptic problems) by writing it as a mixed FE method and choosing a quadrature so that fluxes can be eliminated. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From griffith at cims.nyu.edu Tue Mar 1 09:16:27 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Tue, 1 Mar 2016 10:16:27 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> Message-ID: > On Mar 1, 2016, at 9:59 AM, Mark Adams wrote: > > > > On Mon, Feb 29, 2016 at 5:42 PM, Boyce Griffith > wrote: > >> On Feb 29, 2016, at 5:36 PM, Mark Adams > wrote: >> >> >> GAMG is use for AMR problems like this a lot in BISICLES. >> >> Thanks for the reference. However, a quick look at their paper suggests they are using a finite volume discretization which should be symmetric and avoid all the shenanigans I'm going through! >> >> No, they are not symmetric. FV is even worse than vertex centered methods. The BCs and the C-F interfaces add non-symmetry. > > > If you use a different discretization, it is possible to make the c-f interface discretization symmetric --- but symmetry appears to come at a cost of the reduction in the formal order of accuracy in the flux along the c-f interface. I can probably dig up some code that would make it easy to compare. > > I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with refluxing, which I do not linearize. PETSc sums fluxes at faces directly, perhaps this IS symmetric? Toby might know. If you are talking about solving Poisson on a composite grid, then refluxing and summing up fluxes are probably the same procedure. Users of these kinds of discretizations usually want to use the conservative divergence at coarse-fine interfaces, and so the main question is how to set up the viscous/diffusive flux stencil at coarse-fine interfaces (or, equivalently, the stencil for evaluating ghost cell values at coarse-fine interfaces). It is possible to make the overall discretization symmetric if you use a particular stencil for the flux computation. I think this paper (http://www.ams.org/journals/mcom/1991-56-194/S0025-5718-1991-1066831-5/S0025-5718-1991-1066831-5.pdf ) is one place to look. (This stuff is related to "mimetic finite difference" discretizations of Poisson.) This coarse-fine interface discretization winds up being symmetric (although possibly only w.r.t. a weighted inner product --- I can't remember the details), but the fluxes are only first-order accurate at coarse-fine interfaces. 
-- Boyce -------------- next part -------------- An HTML attachment was scrubbed... URL: From cyrill.von.planta at usi.ch Tue Mar 1 09:19:47 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Tue, 1 Mar 2016 15:19:47 +0000 Subject: [petsc-users] Example for FAS In-Reply-To: References: <0223DD4E-A7A7-4A00-B98D-6F1AD043E7E7@usi.ch> <87wppm70la.fsf@jedbrown.org> Message-ID: <0C0E6786-2F9A-4577-AB7D-8E7E00E8BCBD@usi.ch> Thanks a lot. Checking out with example 5 works fine, problem solved (for now). Cyrill From griffith at cims.nyu.edu Tue Mar 1 09:32:33 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Tue, 1 Mar 2016 10:32:33 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: <87io166xs5.fsf@jedbrown.org> References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> <87io166xs5.fsf@jedbrown.org> Message-ID: <9400A2BC-73CD-46E9-A627-F8A43826FE24@cims.nyu.edu> > On Mar 1, 2016, at 10:08 AM, Jed Brown wrote: > > Mark Adams writes: > >> I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with >> refluxing, which I do not linearize. PETSc sums fluxes at faces directly, >> perhaps this IS symmetric? > > Depends on scaling. Refluxing versus evaluating fluxes once on the fine > faces should give the same result. Note that you can make a symmetric > FV discretization (for elliptic problems) by writing it as a mixed FE > method and choosing a quadrature so that fluxes can be eliminated. Jed, can you also do this for Stokes? It seems like something like RT0 is the right place to start. Thanks, -- Boyce From jed at jedbrown.org Tue Mar 1 09:56:07 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Mar 2016 15:56:07 +0000 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: <9400A2BC-73CD-46E9-A627-F8A43826FE24@cims.nyu.edu> References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> <87io166xs5.fsf@jedbrown.org> <9400A2BC-73CD-46E9-A627-F8A43826FE24@cims.nyu.edu> Message-ID: <87d1re6vl4.fsf@jedbrown.org> Boyce Griffith writes: > Jed, can you also do this for Stokes? It seems like something like > RT0 is the right place to start. See, for example, Arnold, Falk, and Winther's 2007 paper on mixed FEM for elasticity with weakly imposed symmetry. It's the usual H(div) methodology and should apply equally well to Stokes. I'm not aware of any analysis or results of choosing quadrature to eliminate flux terms in these discretizations. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From max.rietmann at gmail.com Tue Mar 1 09:58:06 2016 From: max.rietmann at gmail.com (Max Rietmann) Date: Tue, 1 Mar 2016 16:58:06 +0100 Subject: [petsc-users] Read and use boundaries from exodus in finite element context Message-ID: <56D5BC0E.2090907@gmail.com> Dear all, We are building out a new finite-element code for wave propagation and I am currently implementing the boundary conditions. My first pass will simply have dirichlet boundaries, but eventually we will have more sophisticated options available. I am creating an exodus mesh in Trelis/Cubit, in which I can create one (or more) "side sets", which are properly read by the exodus reader. From reading the petsc source (plexexodusii.c), it seems that these basically create a list of faces, which belong to the side set. 
A call to DMPlexGetLabelValue(dm,"Face Set",face_id,&value), allows me to see if "face_id" is on the boundary by checking value (>=1 for boundary, or -1 for not in set). Additional side sets get ids = {2,3,etc}. This allows us to have multiple types of boundary (absorbing, reflecting, etc). However, I am unsure how to determine if an element has a particular face_id in order to determine if one face of the element is on a boundary (and which value it has {-1,1,2,3, etc}). The routine is listed here: PetscErrorCode DMPlexGetLabelValue(DM dm, const char name[], PetscInt point, PetscInt *value) How do I determine the "point" of a face, if I'm iterating through my elements. thanks! Max Rietmann PS I've also seen the DMPlexAddBoundary method, but I wasn't sure how to use it in our setting. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 1 10:09:03 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 1 Mar 2016 10:09:03 -0600 Subject: [petsc-users] Read and use boundaries from exodus in finite element context In-Reply-To: <56D5BC0E.2090907@gmail.com> References: <56D5BC0E.2090907@gmail.com> Message-ID: On Tue, Mar 1, 2016 at 9:58 AM, Max Rietmann wrote: > Dear all, > > We are building out a new finite-element code for wave propagation and I > am currently implementing the boundary conditions. My first pass will > simply have dirichlet boundaries, but eventually we will have more > sophisticated options available. > Is this going to be a purely explicit code? > I am creating an exodus mesh in Trelis/Cubit, in which I can create one > (or more) "side sets", which are properly read by the exodus reader. From > reading the petsc source (plexexodusii.c), it seems that these basically > create a list of faces, which belong to the side set. > Yes. > A call to DMPlexGetLabelValue(dm,"Face Set",face_id,&value), allows me to > see if "face_id" is on the boundary by checking value (>=1 for boundary, or > -1 for not in set). Additional side sets get ids = {2,3,etc}. This allows > us to have multiple types of boundary (absorbing, reflecting, etc). > Yes. Note that you can also extract them all at once into a sorted array. > However, I am unsure how to determine if an element has a particular > face_id in order to determine if one face of the element is on a boundary > (and which value it has {-1,1,2,3, etc}). > Here you can see me doing the thing you are asking for: https://bitbucket.org/petsc/petsc/src/2d7a10a4145949cccd1c2cb7dc0d518dc12666a9/src/dm/impls/plex/plexsubmesh.c?at=master&fileviewer=file-view-default#plexsubmesh.c-106 The code is slightly overkill for what you want since it also works for vertices. You could call DMPlexGetSupport() on the face rather than DMPlexGetTransitiveClosure() and the code would be simpler. > The routine is listed here: > > PetscErrorCode DMPlexGetLabelValue(DM dm, const char name[], PetscInt > point, PetscInt *value) > > How do I determine the "point" of a face, if I'm iterating through my > elements. > > thanks! > > Max Rietmann > > PS I've also seen the DMPlexAddBoundary method, but I wasn't sure how to > use it in our setting. > This is used to hook into the DMPlexSetBoundaryValues() interface so that BC values are properly loaded into local vectors before assembly operations. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
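Putting that suggestion together, a minimal sketch of the face loop (assuming dm is the DMPlex read from the exodus file, and the petsc-3.6 names for these calls):

   PetscInt fStart, fEnd, f;
   ierr = DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd);CHKERRQ(ierr); /* height 1 = faces */
   for (f = fStart; f < fEnd; ++f) {
     PetscInt value, supportSize;
     const PetscInt *support;
     ierr = DMPlexGetLabelValue(dm, "Face Set", f, &value);CHKERRQ(ierr);
     if (value < 0) continue;                        /* this face is not in any side set */
     ierr = DMPlexGetSupportSize(dm, f, &supportSize);CHKERRQ(ierr);
     ierr = DMPlexGetSupport(dm, f, &support);CHKERRQ(ierr);
     /* support[0] (and support[1] for interior faces) are the cells touching this face;
        value tells you which side set (1, 2, 3, ...) it belongs to */
   }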
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 1 10:32:27 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 1 Mar 2016 10:32:27 -0600 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: References: Message-ID: <2795BBE1-C04B-45CA-B99C-2F36DD9BB7B4@mcs.anl.gov> > On Mar 1, 2016, at 12:39 AM, Ed Bueler wrote: > > Barry -- > > Will try it. > > > ... since, presumably, other more powerful IO tools exist that would be used for "real" problems? > > I know there are tools for snapshotting from PETSc, e.g. VecView to .vtk. In fact petsc binary seems fairly convenient for that. On the other hand, I am not sure I've ever done anything "real". ;-) > > Anyone out there: Are there a good *convenient* tools for saving space/time-series (= movies) from PETSc TS? I want to add frames and movies from PETSc into slides, etc. I can think of NetCDF but it seems not-very-convenient, and I am worried not well-supported from PETSc. Is setting up TS with events (=TSSetEventMonitor()) and writing separate snapshot files the preferred scalable usage, despite the extra effort compared to "-ts_monitor_solution binary:foo.dat"? Ed, As I said in my previous email since we don't have a way of indicating plain double precision numbers in our binary files it is not possible to put both the vectors and the time steps in the same file without augmenting our file format. Barry > > Ed > > > On Mon, Feb 29, 2016 at 8:53 PM, Barry Smith wrote: > > Ed, > > I have added a branch barry/feature-ts-monitor-binary that supports -ts_monitor binary:timesteps that will store in simple binary format each of the time steps associated with each solution. This in conjugation with -ts_monitor_solution binary:solutions will give you two files you can read in. But note that timesteps is a simple binary file of double precision numbers you should read in directly in python, you cannot use PetscBinaryIO.py which is what you will use to read in the solutions file. > > Barry > > Currently PETSc has a binary file format where we can save Vec, Mat, IS, each is marked with a type id for PetscBinaryIO.py to detect, we do not have type ids for simple double precision numbers or arrays of numbers. This is why I have no way of saving the time steps in a way that PetscBinaryIO.py could read them in currently. I don't know how far we want to go in "spiffing up" the PETSc binary format to do more elaborate things since, presumably, other more power IO tools exist that would be used for "real" problems? > > > > On Feb 29, 2016, at 3:24 PM, Ed Bueler wrote: > > > > Dear PETSc -- > > > > I have a short C ode code that uses TS to solve y' = g(t,y) where y(t) is a 2-dim'l vector. My code defaults to -ts_type rk so it does adaptive time-stepping; thus using -ts_monitor shows times at stdout: > > > > $ ./ode -ts_monitor > > solving from t0 = 0.000 with initial time step dt = 0.10000 ... > > 0 TS dt 0.1 time 0. > > 1 TS dt 0.170141 time 0.1 > > 2 TS dt 0.169917 time 0.270141 > > 3 TS dt 0.171145 time 0.440058 > > 4 TS dt 0.173931 time 0.611203 > > 5 TS dt 0.178719 time 0.785134 > > 6 TS dt 0.0361473 time 0.963853 > > 7 TS dt 0.188252 time 1. > > error at tf = 1.000 : |y-y_exact|_inf = 0.000144484 > > > > I want to output the trajectory in PETSc binary and plot it in python using bin/PetscBinaryIO.py. Clearly I need the times shown above to do that. 
> > > > Note "-ts_monitor_solution binary:XX" gives me a binary file with only y values in it, but not the corresponding times. > > > > My question is, how to get those times in either the same binary file (preferred) or separate binary files? I have tried > > > > $ ./ode -ts_monitor binary:foo.dat # invalid > > $ ./ode -ts_monitor_solution binary:bar.dat # no t in file > > $ ./ode -ts_monitor_solution binary:baz.dat -ts_save_trajectory # no t in file > > > > without success. (I am not sure what the boolean option -ts_save_trajectory does, by the way.) > > > > Thanks! > > > > Ed > > > > PS Sorry if this is a "RTFM" question, but so far I can't find the documentation. > > > > > > -- > > Ed Bueler > > Dept of Math and Stat and Geophysical Institute > > University of Alaska Fairbanks > > Fairbanks, AK 99775-6660 > > 301C Chapman and 410D Elvey > > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > > > > > -- > Ed Bueler > Dept of Math and Stat and Geophysical Institute > University of Alaska Fairbanks > Fairbanks, AK 99775-6660 > 301C Chapman and 410D Elvey > 907 474-7693 and 907 474-7199 (fax 907 474-5394) From bhatiamanav at gmail.com Tue Mar 1 10:43:55 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Tue, 1 Mar 2016 10:43:55 -0600 Subject: [petsc-users] MatGetSize for SeqBAIJ Message-ID: <91C7779E-4BEE-41F1-A9C1-3D978CCB1C6F@gmail.com> Hi Is MatGetSize for a SeqBAIJ matrix expected to return the number of block rows and columns, or the total number of rows and columns (blocks rows times block size)? Thanks, Manav From hzhang at mcs.anl.gov Tue Mar 1 10:44:30 2016 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 1 Mar 2016 10:44:30 -0600 Subject: [petsc-users] MatGetSize for SeqBAIJ In-Reply-To: <91C7779E-4BEE-41F1-A9C1-3D978CCB1C6F@gmail.com> References: <91C7779E-4BEE-41F1-A9C1-3D978CCB1C6F@gmail.com> Message-ID: Manav: > > Is MatGetSize for a SeqBAIJ matrix expected to return the number of > block rows and columns, or the total number of rows and columns (blocks > rows times block size)? > + m - the number of global rows - n - the number of global columns the total number of rows and columns (blocks rows times block size). Hong -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 1 10:50:40 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 1 Mar 2016 10:50:40 -0600 Subject: [petsc-users] PETSc release soon, request for input on needed fixes or enhancements In-Reply-To: References: Message-ID: <24906039-9B30-4397-B0C1-EAAC34DB624F@mcs.anl.gov> Jed? > On Mar 1, 2016, at 1:44 AM, Lisandro Dalcin wrote: > > On 27 February 2016 at 23:36, Barry Smith wrote: >> We are planning the PETSc release 3.7 shortly. If you know of any bugs that need to be fixed or enhancements added before the release please let us know. > > Barry, long time ago I reported a regression in PetIGA related to the > new Vec assembly routines. I guess Jed didn't have a chance to look at > it. Until that happens, I propose to not use the new routines as the > default ones. 
> > > -- > Lisandro Dalcin > ============ > Research Scientist > Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) > Extreme Computing Research Center (ECRC) > King Abdullah University of Science and Technology (KAUST) > http://ecrc.kaust.edu.sa/ > > 4700 King Abdullah University of Science and Technology > al-Khawarizmi Bldg (Bldg 1), Office # 4332 > Thuwal 23955-6900, Kingdom of Saudi Arabia > http://www.kaust.edu.sa > > Office Phone: +966 12 808-0459 From bhatiamanav at gmail.com Tue Mar 1 10:56:24 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Tue, 1 Mar 2016 10:56:24 -0600 Subject: [petsc-users] MatGetSize for SeqBAIJ In-Reply-To: References: <91C7779E-4BEE-41F1-A9C1-3D978CCB1C6F@gmail.com> Message-ID: Thanks. That means I am doing something goofy in setting up my matrix. I am trying to create a matrix with block size 2, and 3000 as the number of block rows/columns. So, I would expect an output of 6000x6000 from the following code, but I get 3000x3000. Is it the sequence of my function calls? Thanks, Manav PetscErrorCode ierr; Mat mat; PetscInt m,n; ierr = MatCreate(mpi_comm, &mat); ierr = MatSetSizes(mat, 3000, 3000, 3000, 3000); ierr = MatSetType(mat, MATBAIJ); ierr = MatSetBlockSize(mat, 2); ierr = MatGetSize(mat, &m, &n); std::cout << m << " " << n << std::endl; > On Mar 1, 2016, at 10:44 AM, Hong wrote: > > Manav: > Is MatGetSize for a SeqBAIJ matrix expected to return the number of block rows and columns, or the total number of rows and columns (blocks rows times block size)? > > + m - the number of global rows > - n - the number of global columns > the total number of rows and columns (blocks rows times block size). > > Hong > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 1 10:56:17 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 1 Mar 2016 10:56:17 -0600 Subject: [petsc-users] Investigate parallel code to improve parallelism In-Reply-To: <56D54CC3.80709@gmail.com> References: <56D06F16.9000200@gmail.com> <141087E6-A7C1-41FF-B3E8-74FF587E3535@mcs.anl.gov> <56D07D0C.7050109@gmail.com> <5FBB830C-4A5E-40B2-9845-E0DC68B02BD3@mcs.anl.gov> <56D3207C.6090808@gmail.com> <91716C74-7716-4C7D-98B5-1606A8A93766@mcs.anl.gov> <56D399F5.9030709@gmail.com> <65862BCD-10AB-4821-9816-981D1AFFE7DE@mcs.anl.gov> <56D3AC5C.7040208@gmail.com> <56D54CC3.80709@gmail.com> Message-ID: If you have access to a different cluster you can try there and see if the communication is any better. Likely you would get better speedup on an IBM BlueGene since it has a good network relative to the processing power. So best to run on IBM. Barry > On Mar 1, 2016, at 2:03 AM, TAY wee-beng wrote: > > > On 29/2/2016 11:21 AM, Barry Smith wrote: >>> On Feb 28, 2016, at 8:26 PM, TAY wee-beng wrote: >>> >>> >>> On 29/2/2016 9:41 AM, Barry Smith wrote: >>>>> On Feb 28, 2016, at 7:08 PM, TAY Wee Beng wrote: >>>>> >>>>> Hi, >>>>> >>>>> I've attached the files for x cells running y procs. hypre is called natively I'm not sure if PETSc catches it. >>>> So you are directly creating hypre matrices and calling the hypre solver in another piece of your code? >>> Yes because I'm using the simple structure (struct) layout for Cartesian grids. It's about twice as fast compared to BoomerAMG >> Understood >> >>> . I can't create PETSc matrix and use the hypre struct layout, right? >>>> In the PETSc part of the code if you compare the 2x_y to the x_y you see that doubling the problem size resulted in 2.2 as much time for the KSPSolve. 
Most of this large increase is due to the increased time in the scatter which went up to 150/54. = 2.7777777777777777 but the amount of data transferred only increased by 1e5/6.4e4 = 1.5625 Normally I would not expect to see this behavior and would not expect such a large increase in the communication time. >>>> >>>> Barry >>>> >>>> >>>> >>> So ideally it should be 2 instead of 2.2, is that so? >> Ideally >> >>> May I know where are you looking at? Because I can't find the nos. >> The column labeled Avg len tells the average length of messages which increases from 6.4e4 to 1e5 while the time max increase by 2.77 (I took the sum of the VecScatterBegin and VecScatter End rows. >> >>> So where do you think the error comes from? >> It is not really an error it is just that it is taking more time then one would hope it would take. >>> Or how can I troubleshoot further? >> >> If you run the same problem several times how much different are the numerical timings for each run? > Hi, > > I have re-done x_y and 2x_y again. I have attached the files with _2 for the 2nd run. They're exactly the same. > > Should I try running on another cluster? > > I also tried running the same problem with more cells and more time steps (to reduce start up effects) on another cluster. But I forgot to run it with -log_summary. Anyway, the results show: > > 1. Using 1.5 million cells with 48 procs and 3M with 96p took 65min and 69min. Using the weak scaling formula I attached earlier, it gives about 88% efficiency > > 2. Using 3 million cells with 48 procs and 6M with 96p took 114min and 121min. Using the weak scaling formula I attached earlier, it gives about 88% efficiency > > 3. Using 3.75 million cells with 48 procs and 7.5M with 96p took 134min and 143min. Using the weak scaling formula I attached earlier, it gives about 87% efficiency > > 4. Using 4.5 million cells with 48 procs and 9M with 96p took 160min and 176min (extrapolated). Using the weak scaling formula I attached earlier, it gives about 80% efficiency > > So it seems that I should run with 3.75 million cells with 48 procs and scale along this ratio. Beyond that, my efficiency decreases. Is that so? Maybe I should also run with -log_summary to get better estimate... > > Thanks. >> >> >>> Thanks >>>>> Thanks >>>>> >>>>> On 29/2/2016 1:11 AM, Barry Smith wrote: >>>>>> As I said before, send the -log_summary output for the two processor sizes and we'll look at where it is spending its time and how it could possibly be improved. >>>>>> >>>>>> Barry >>>>>> >>>>>>> On Feb 28, 2016, at 10:29 AM, TAY wee-beng wrote: >>>>>>> >>>>>>> >>>>>>> On 27/2/2016 12:53 AM, Barry Smith wrote: >>>>>>>>> On Feb 26, 2016, at 10:27 AM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 26/2/2016 11:32 PM, Barry Smith wrote: >>>>>>>>>>> On Feb 26, 2016, at 9:28 AM, TAY wee-beng wrote: >>>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I have got a 3D code. When I ran with 48 procs and 11 million cells, it runs for 83 min. When I ran with 96 procs and 22 million cells, it ran for 99 min. >>>>>>>>>> This is actually pretty good! >>>>>>>>> But if I'm not wrong, if I increase the no. of cells, the parallelism will keep on decreasing. I hope it scales up to maybe 300 - 400 procs. >>>>>>> Hi, >>>>>>> >>>>>>> I think I may have mentioned this before, that is, I need to submit a proposal to request for computing nodes. In the proposal, I'm supposed to run some simulations to estimate the time it takes to run my code. 
Then an excel file will use my input to estimate the efficiency when I run my code with more cells. They use 2 mtds to estimate: >>>>>>> >>>>>>> 1. strong scaling, whereby I run 2 cases - 1st with n cells and x procs, then with n cells and 2x procs. From there, they can estimate my expected efficiency when I have y procs. The formula is attached in the pdf. >>>>>>> >>>>>>> 2. weak scaling, whereby I run 2 cases - 1st with n cells and x procs, then with 2n cells and 2x procs. From there, they can estimate my expected efficiency when I have y procs. The formula is attached in the pdf. >>>>>>> >>>>>>> So if I use 48 and 96 procs and get maybe 80% efficiency, by the time I hit 800 procs, I get 32% efficiency for strong scaling. They expect at least 50% efficiency for my code. To reach that, I need to achieve 89% efficiency when I use 48 and 96 procs. >>>>>>> >>>>>>> So now my qn is how accurate is this type of calculation, especially wrt to PETSc? >>>>>>> >>>>>>> Similarly, for weak scaling, is it accurate? >>>>>>> >>>>>>> Can I argue that this estimation is not suitable for PETSc or hypre? >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> >>>>>>>>>>> So it's not that parallel. I want to find out which part of the code I need to improve. Also if PETsc and hypre is working well in parallel. What's the best way to do it? >>>>>>>>>> Run both with -log_summary and send the output for each case. This will show where the time is being spent and which parts are scaling less well. >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>> That's only for the PETSc part, right? So for other parts of the code, including hypre part, I will not be able to find out. If so, what can I use to check these parts? >>>>>>>> You will still be able to see what percentage of the time is spent in hypre and if it increases with the problem size and how much. So the information will still be useful. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>>>>> I thought of doing profiling but if the code is optimized, I wonder if it still works well. >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Thank you. >>>>>>>>>>> >>>>>>>>>>> Yours sincerely, >>>>>>>>>>> >>>>>>>>>>> TAY wee-beng >>>>>>>>>>> >>>>>>> >>>>> -- >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> <2x_2y.txt><2x_y.txt><4x_2y.txt> > > <2x_y_2.txt><2x_y.txt> From mirzadeh at gmail.com Tue Mar 1 11:06:00 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Tue, 1 Mar 2016 12:06:00 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> Message-ID: Nice discussion. On Tue, Mar 1, 2016 at 10:16 AM, Boyce Griffith wrote: > > On Mar 1, 2016, at 9:59 AM, Mark Adams wrote: > > > > On Mon, Feb 29, 2016 at 5:42 PM, Boyce Griffith > wrote: > >> >> On Feb 29, 2016, at 5:36 PM, Mark Adams wrote: >> >> >>>> GAMG is use for AMR problems like this a lot in BISICLES. >>>> >>> >>> Thanks for the reference. However, a quick look at their paper suggests >>> they are using a finite volume discretization which should be symmetric and >>> avoid all the shenanigans I'm going through! >>> >> >> No, they are not symmetric. FV is even worse than vertex centered >> methods. The BCs and the C-F interfaces add non-symmetry. 
>> >> >> If you use a different discretization, it is possible to make the c-f >> interface discretization symmetric --- but symmetry appears to come at a >> cost of the reduction in the formal order of accuracy in the flux along the >> c-f interface. I can probably dig up some code that would make it easy to >> compare. >> > > I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with > refluxing, which I do not linearize. PETSc sums fluxes at faces directly, > perhaps this IS symmetric? Toby might know. > > > If you are talking about solving Poisson on a composite grid, then > refluxing and summing up fluxes are probably the same procedure. > I am not familiar with the terminology used here. What does the refluxing mean? > > Users of these kinds of discretizations usually want to use the > conservative divergence at coarse-fine interfaces, and so the main question > is how to set up the viscous/diffusive flux stencil at coarse-fine > interfaces (or, equivalently, the stencil for evaluating ghost cell values > at coarse-fine interfaces). It is possible to make the overall > discretization symmetric if you use a particular stencil for the flux > computation. I think this paper ( > http://www.ams.org/journals/mcom/1991-56-194/S0025-5718-1991-1066831-5/S0025-5718-1991-1066831-5.pdf) > is one place to look. (This stuff is related to "mimetic finite difference" > discretizations of Poisson.) This coarse-fine interface discretization > winds up being symmetric (although possibly only w.r.t. a weighted inner > product --- I can't remember the details), but the fluxes are only > first-order accurate at coarse-fine interfaces. > > Right. I think if the discretization is conservative, i.e. discretizing div of grad, and is compact, i.e. only involves neighboring cells sharing a common face, then it is possible to construct symmetric discretization. An example, that I have used before in other contexts, is described here: http://physbam.stanford.edu/~fedkiw/papers/stanford2004-02.pdf An interesting observation is although the fluxes are only first order accurate, the final solution to the linear system exhibits super convergence, i.e. second-order accurate, even in L_inf. Similar behavior is observed with non-conservative, node-based finite difference discretizations. -- Boyce > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 1 11:06:43 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 1 Mar 2016 11:06:43 -0600 Subject: [petsc-users] MatGetSize for SeqBAIJ In-Reply-To: References: <91C7779E-4BEE-41F1-A9C1-3D978CCB1C6F@gmail.com> Message-ID: On Tue, Mar 1, 2016 at 10:56 AM, Manav Bhatia wrote: > Thanks. That means I am doing something goofy in setting up my matrix. > > I am trying to create a matrix with block size 2, and 3000 as the number > of block rows/columns. So, I would expect an output of 6000x6000 from the > following code, but I get 3000x3000. Is it the sequence of my function > calls? > > Thanks, > Manav > > > > PetscErrorCode ierr; > Mat mat; > PetscInt m,n; > > ierr = MatCreate(mpi_comm, &mat); > ierr = MatSetSizes(mat, 3000, 3000, 3000, 3000); > MatSetSizes also takes the number of rows, not blocks. 
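So for a 6000 x 6000 system stored as 3000 x 3000 blocks of size 2, a sketch along these lines (sizes copied from your example) should report 6000 6000:

   ierr = MatCreate(mpi_comm, &mat);
   ierr = MatSetSizes(mat, PETSC_DECIDE, PETSC_DECIDE, 6000, 6000); /* total rows/cols, not block rows/cols */
   ierr = MatSetType(mat, MATBAIJ);
   ierr = MatSetBlockSize(mat, 2);
   ierr = MatGetSize(mat, &m, &n); /* m = n = 6000 */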
Matt > ierr = MatSetType(mat, MATBAIJ); > ierr = MatSetBlockSize(mat, 2); > ierr = MatGetSize(mat, &m, &n); > > std::cout << m << " " << n << std::endl; > > > > On Mar 1, 2016, at 10:44 AM, Hong wrote: > > Manav: >> >> Is MatGetSize for a SeqBAIJ matrix expected to return the number of >> block rows and columns, or the total number of rows and columns (blocks >> rows times block size)? >> > > + m - the number of global rows > - n - the number of global columns > the total number of rows and columns (blocks rows times block size). > > Hong > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Tue Mar 1 11:18:06 2016 From: hongzhang at anl.gov (Hong Zhang) Date: Tue, 1 Mar 2016 11:18:06 -0600 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: <2795BBE1-C04B-45CA-B99C-2F36DD9BB7B4@mcs.anl.gov> References: <2795BBE1-C04B-45CA-B99C-2F36DD9BB7B4@mcs.anl.gov> Message-ID: <6FBEBBF0-C694-41D5-98E5-324DE8CC73DD@anl.gov> Hi Barry, Actually the TSTrajectory object can save both the solution and the corresponding time information into binary files. Although it is designed specifically for adjoint checkpointing, it does have the side benefit to assist in visualization of simulation trajectories. For adjoint checkpointing, not only solution is saved, but also stage values are saved. So I created a new TSTrajectory type particularly for visualization purpose, which can be access at my branch hongzh/tstrajectory_visualization. One can enable this feature using the command line options -ts_save_trajectory 1 -tstrajectory_type visualization The full trajectory will be saved into a folder with multiple files, one file corresponding to one time step. Then we can use a MATLAB script, such as PetscBinaryRead.m, to read these binary files. But the default script coming with PETSc needs to be modified a little bit. Because the trajectory is always saved as a solution vector, followed by a PetscReal variable, I suggest to use the following code in MATLAB: if header == 1211214 % Petsc Vec Object %% Read state vector m = double(read(fd,1,indices)); x = zeros(m,1); v = read(fd,m,precision); x(:) = v; %% Read time t = read(fd,1,precision); end Shri (copied in this email) has successfully used this approach to do visualization. Perhaps we can include this feature in the new release if it is useful to some users. Hong > On Mar 1, 2016, at 10:32 AM, Barry Smith wrote: > >> >> On Mar 1, 2016, at 12:39 AM, Ed Bueler wrote: >> >> Barry -- >> >> Will try it. >> >>> ... since, presumably, other more powerful IO tools exist that would be used for "real" problems? >> >> I know there are tools for snapshotting from PETSc, e.g. VecView to .vtk. In fact petsc binary seems fairly convenient for that. On the other hand, I am not sure I've ever done anything "real". ;-) >> >> Anyone out there: Are there a good *convenient* tools for saving space/time-series (= movies) from PETSc TS? I want to add frames and movies from PETSc into slides, etc. I can think of NetCDF but it seems not-very-convenient, and I am worried not well-supported from PETSc. Is setting up TS with events (=TSSetEventMonitor()) and writing separate snapshot files the preferred scalable usage, despite the extra effort compared to "-ts_monitor_solution binary:foo.dat"? 
> > Ed, > > As I said in my previous email since we don't have a way of indicating plain double precision numbers in our binary files it is not possible to put both the vectors and the time steps in the same file without augmenting our file format. > > Barry > > > >> >> Ed >> >> >> On Mon, Feb 29, 2016 at 8:53 PM, Barry Smith wrote: >> >> Ed, >> >> I have added a branch barry/feature-ts-monitor-binary that supports -ts_monitor binary:timesteps that will store in simple binary format each of the time steps associated with each solution. This in conjugation with -ts_monitor_solution binary:solutions will give you two files you can read in. But note that timesteps is a simple binary file of double precision numbers you should read in directly in python, you cannot use PetscBinaryIO.py which is what you will use to read in the solutions file. >> >> Barry >> >> Currently PETSc has a binary file format where we can save Vec, Mat, IS, each is marked with a type id for PetscBinaryIO.py to detect, we do not have type ids for simple double precision numbers or arrays of numbers. This is why I have no way of saving the time steps in a way that PetscBinaryIO.py could read them in currently. I don't know how far we want to go in "spiffing up" the PETSc binary format to do more elaborate things since, presumably, other more power IO tools exist that would be used for "real" problems? >> >> >>> On Feb 29, 2016, at 3:24 PM, Ed Bueler wrote: >>> >>> Dear PETSc -- >>> >>> I have a short C ode code that uses TS to solve y' = g(t,y) where y(t) is a 2-dim'l vector. My code defaults to -ts_type rk so it does adaptive time-stepping; thus using -ts_monitor shows times at stdout: >>> >>> $ ./ode -ts_monitor >>> solving from t0 = 0.000 with initial time step dt = 0.10000 ... >>> 0 TS dt 0.1 time 0. >>> 1 TS dt 0.170141 time 0.1 >>> 2 TS dt 0.169917 time 0.270141 >>> 3 TS dt 0.171145 time 0.440058 >>> 4 TS dt 0.173931 time 0.611203 >>> 5 TS dt 0.178719 time 0.785134 >>> 6 TS dt 0.0361473 time 0.963853 >>> 7 TS dt 0.188252 time 1. >>> error at tf = 1.000 : |y-y_exact|_inf = 0.000144484 >>> >>> I want to output the trajectory in PETSc binary and plot it in python using bin/PetscBinaryIO.py. Clearly I need the times shown above to do that. >>> >>> Note "-ts_monitor_solution binary:XX" gives me a binary file with only y values in it, but not the corresponding times. >>> >>> My question is, how to get those times in either the same binary file (preferred) or separate binary files? I have tried >>> >>> $ ./ode -ts_monitor binary:foo.dat # invalid >>> $ ./ode -ts_monitor_solution binary:bar.dat # no t in file >>> $ ./ode -ts_monitor_solution binary:baz.dat -ts_save_trajectory # no t in file >>> >>> without success. (I am not sure what the boolean option -ts_save_trajectory does, by the way.) >>> >>> Thanks! >>> >>> Ed >>> >>> PS Sorry if this is a "RTFM" question, but so far I can't find the documentation. >>> >>> >>> -- >>> Ed Bueler >>> Dept of Math and Stat and Geophysical Institute >>> University of Alaska Fairbanks >>> Fairbanks, AK 99775-6660 >>> 301C Chapman and 410D Elvey >>> 907 474-7693 and 907 474-7199 (fax 907 474-5394) >> >> >> >> >> -- >> Ed Bueler >> Dept of Math and Stat and Geophysical Institute >> University of Alaska Fairbanks >> Fairbanks, AK 99775-6660 >> 301C Chapman and 410D Elvey >> 907 474-7693 and 907 474-7199 (fax 907 474-5394) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Mar 1 11:52:15 2016 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Tue, 1 Mar 2016 17:52:15 +0000 Subject: [petsc-users] [SLEPc] record against which PETSc installation it was compiled In-Reply-To: <65F5A6CC-5E3A-43CD-B0C3-1315317692E2@gmail.com> References: <2C3C0E47-ED3A-4E5C-8F28-081D299E7BC1@gmail.com> <65F5A6CC-5E3A-43CD-B0C3-1315317692E2@gmail.com> Message-ID: <02FE9A14-B84B-4B07-A5D9-64FB9CD3E242@dsic.upv.es> > El 29 feb 2016, a las 17:58, Denis Davydov escribi?: > > > >> On 29 Feb 2016, at 18:50, Jose E. Roman wrote: >> >> >>> El 29 feb 2016, a las 7:45, Denis Davydov escribi?: >>> >>> Dear all, >>> >>> It would be good if SLEPc would store a location of PETSc used during the build in some >>> config file, e.g. `slepcconf.h`, so that this information could be retrieved by external libraries (e.g. deal.ii) >>> to prevent configuring with PETSc and SLEPc while SLEPc was linking to a different PETSc installation. >>> See the discussion here https://github.com/dealii/dealii/issues/2167 >>> >>> Kind regards, >>> Denis >>> >> >> I have added this: >> https://bitbucket.org/slepc/slepc/branch/jose/configure >> >> However, I am not totally convinced of this change, because PETSC_DIR is then defined both in petscconf.h and slepcconf.h, so behaviour could change depending on whether user code includes one or the other in the first place. > > Thanks a lot, Jose. > As an alternative one could call it PETSC_DIR_IN_SLEPC or alike, so that the two are different. Ok. I changed it to SLEPC_PETSC_DIR. Jose > > Regards, > Denis. > >> >> I will test this further before merging into master. >> >> Jose From jed at jedbrown.org Tue Mar 1 12:15:51 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Mar 2016 18:15:51 +0000 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> Message-ID: <87si0a5ajs.fsf@jedbrown.org> Mohammad Mirzadeh writes: > I am not familiar with the terminology used here. What does the refluxing > mean? The Chombo/BoxLib family of methods evaluate fluxes between coarse grid cells overlaying refined grids, then later visit the fine grids and reevaluate those fluxes. The correction needs to be propagated back to the adjoining coarse grid cell to maintain conservation. It's an implementation detail that they call refluxing. > Right. I think if the discretization is conservative, i.e. discretizing div > of grad, and is compact, i.e. only involves neighboring cells sharing a > common face, then it is possible to construct symmetric discretization. An > example, that I have used before in other contexts, is described here: > http://physbam.stanford.edu/~fedkiw/papers/stanford2004-02.pdf It's unfortunate that this paper repeats some unfounded multigrid slander and then basically claims to have uniform convergence using incomplete Cholesky with CG. In reality, incomplete Cholesky is asymptotically no better than Jacobi. > An interesting observation is although the fluxes are only first order > accurate, the final solution to the linear system exhibits super > convergence, i.e. second-order accurate, even in L_inf. Perhaps for aligned coefficients; definitely not for unaligned coefficients. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From david.knezevic at akselos.com Tue Mar 1 12:52:51 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Tue, 1 Mar 2016 13:52:51 -0500 Subject: [petsc-users] PCFactorSetUpMatSolverPackage with SNES Message-ID: Based on KSP ex52, I use PCFactorSetUpMatSolverPackage in the process of setting various MUMPS ictnl options. This works fine for me when I'm solving linear problems. I then wanted to use PCFactorSetUpMatSolverPackage with the PC from a SNES object. I tried to do this with the following code (after calling SNESCreate, SNESSetFunction, and SNESSetJacobian): KSP snes_ksp; SNESGetKSP(snes, &snes_ksp); PC snes_pc; KSPGetPC(snes_ksp, &snes_pc); PCFactorSetMatSolverPackage(snes_pc, MATSOLVERMUMPS); PCFactorSetUpMatSolverPackage(snes_pc); However, I get a segfault on the call to PCFactorSetUpMatSolverPackage in this case. I was wondering what I need to do to make this work? Note that I want to set the MUMPS ictnl parameters via code rather than via the commandline since sometimes MUMPS fails (e.g. with error -9 due to a workspace size that is too small) and I need to automatically re-run the solve with different ictnl values when this happens. Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 1 12:56:33 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 1 Mar 2016 12:56:33 -0600 Subject: [petsc-users] PCFactorSetUpMatSolverPackage with SNES In-Reply-To: References: Message-ID: On Tue, Mar 1, 2016 at 12:52 PM, David Knezevic wrote: > Based on KSP ex52, I use PCFactorSetUpMatSolverPackage in the process of > setting various MUMPS ictnl options. This works fine for me when I'm > solving linear problems. > > I then wanted to use PCFactorSetUpMatSolverPackage with the PC from a SNES > object. I tried to do this with the following code (after calling > SNESCreate, SNESSetFunction, and SNESSetJacobian): > > KSP snes_ksp; > SNESGetKSP(snes, &snes_ksp); > PC snes_pc; > KSPGetPC(snes_ksp, &snes_pc); > PCFactorSetMatSolverPackage(snes_pc, MATSOLVERMUMPS); > PCFactorSetUpMatSolverPackage(snes_pc); > > However, I get a segfault on the call to PCFactorSetUpMatSolverPackage in > this case. I was wondering what I need to do to make this work? > > Note that I want to set the MUMPS ictnl parameters via code rather than > via the commandline since sometimes MUMPS fails (e.g. with error -9 due to > a workspace size that is too small) and I need to automatically re-run the > solve with different ictnl values when this happens. > That is a good reason. However, I would organize this differently. I would still set the type from the command line. Then later in your code, after SNES is setup correctly, you get out the PC and reset the icntl values if you have a failure. Its very difficult to get the setup logic correct, and its one of the most error prone parts of PETSc. RIght now, the most reliable way to do things is to have all the information available up front in options. Someday, I want to write a small DFA piece in PETSc that can encode all this logic so that simple errors people make can be diagnosed early with nice error messages. Thanks, Matt > Thanks, > David > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
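A minimal sketch of that recovery path, assuming MUMPS was selected from the options database (e.g. -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps, the 3.6-era option name) and the first SNESSolve has just failed with the MUMPS -9 error, so the factored matrix already exists; snes and x stand in for your own objects:

   KSP ksp;  PC pc;  Mat F;
   ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
   ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
   ierr = PCFactorGetMatrix(pc, &F);CHKERRQ(ierr);   /* the MUMPS factor matrix */
   ierr = MatMumpsSetIcntl(F, 14, 50);CHKERRQ(ierr); /* e.g. enlarge the workspace estimate */
   ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);    /* retry the solve */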
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From griffith at cims.nyu.edu Tue Mar 1 13:07:38 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Tue, 1 Mar 2016 14:07:38 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> Message-ID: <436D1754-1D89-4240-B5FA-31A9A182034D@cims.nyu.edu> > On Mar 1, 2016, at 12:06 PM, Mohammad Mirzadeh wrote: > > Nice discussion. > > > On Tue, Mar 1, 2016 at 10:16 AM, Boyce Griffith > wrote: > >> On Mar 1, 2016, at 9:59 AM, Mark Adams > wrote: >> >> >> >> On Mon, Feb 29, 2016 at 5:42 PM, Boyce Griffith > wrote: >> >>> On Feb 29, 2016, at 5:36 PM, Mark Adams > wrote: >>> >>> >>> GAMG is use for AMR problems like this a lot in BISICLES. >>> >>> Thanks for the reference. However, a quick look at their paper suggests they are using a finite volume discretization which should be symmetric and avoid all the shenanigans I'm going through! >>> >>> No, they are not symmetric. FV is even worse than vertex centered methods. The BCs and the C-F interfaces add non-symmetry. >> >> >> If you use a different discretization, it is possible to make the c-f interface discretization symmetric --- but symmetry appears to come at a cost of the reduction in the formal order of accuracy in the flux along the c-f interface. I can probably dig up some code that would make it easy to compare. >> >> I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with refluxing, which I do not linearize. PETSc sums fluxes at faces directly, perhaps this IS symmetric? Toby might know. > > If you are talking about solving Poisson on a composite grid, then refluxing and summing up fluxes are probably the same procedure. > > I am not familiar with the terminology used here. What does the refluxing mean? > > > Users of these kinds of discretizations usually want to use the conservative divergence at coarse-fine interfaces, and so the main question is how to set up the viscous/diffusive flux stencil at coarse-fine interfaces (or, equivalently, the stencil for evaluating ghost cell values at coarse-fine interfaces). It is possible to make the overall discretization symmetric if you use a particular stencil for the flux computation. I think this paper (http://www.ams.org/journals/mcom/1991-56-194/S0025-5718-1991-1066831-5/S0025-5718-1991-1066831-5.pdf ) is one place to look. (This stuff is related to "mimetic finite difference" discretizations of Poisson.) This coarse-fine interface discretization winds up being symmetric (although possibly only w.r.t. a weighted inner product --- I can't remember the details), but the fluxes are only first-order accurate at coarse-fine interfaces. > > > Right. I think if the discretization is conservative, i.e. discretizing div of grad, and is compact, i.e. only involves neighboring cells sharing a common face, then it is possible to construct symmetric discretization. An example, that I have used before in other contexts, is described here: http://physbam.stanford.edu/~fedkiw/papers/stanford2004-02.pdf > > An interesting observation is although the fluxes are only first order accurate, the final solution to the linear system exhibits super convergence, i.e. second-order accurate, even in L_inf. Similar behavior is observed with non-conservative, node-based finite difference discretizations. 
I don't know about that --- check out Table 1 in the paper you cite, which seems to indicate first-order convergence in all norms. The symmetric discretization in the Ewing paper is only slightly more complicated, but will give full 2nd-order accuracy in L-1 (and maybe also L-2 and L-infinity). One way to think about it is that you are using simple linear interpolation at coarse-fine interfaces (3-point interpolation in 2D, 4-point interpolation in 3D) using a stencil that is symmetric with respect to the center of the coarse grid cell. A (discrete) Green's functions argument explains why one gets higher-order convergence despite localized reductions in accuracy along the coarse-fine interface --- it has to do with the fact that errors from individual grid locations do not have that large of an effect on the solution, and these c-f interface errors are concentrated along on a lower dimensional surface in the domain. -- Boyce -------------- next part -------------- An HTML attachment was scrubbed... URL: From boyceg at email.unc.edu Tue Mar 1 13:16:49 2016 From: boyceg at email.unc.edu (Griffith, Boyce Eugene) Date: Tue, 1 Mar 2016 19:16:49 +0000 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: <87d1re6vl4.fsf@jedbrown.org> References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> <87io166xs5.fsf@jedbrown.org> <9400A2BC-73CD-46E9-A627-F8A43826FE24@cims.nyu.edu> <87d1re6vl4.fsf@jedbrown.org> Message-ID: <7F84C4B3-B978-479D-9F2D-DEF7B1476865@email.unc.edu> On Mar 1, 2016, at 10:56 AM, Jed Brown > wrote: Boyce Griffith > writes: Jed, can you also do this for Stokes? It seems like something like RT0 is the right place to start. See, for example, Arnold, Falk, and Winther's 2007 paper on mixed FEM for elasticity with weakly imposed symmetry. It's the usual H(div) methodology and should apply equally well to Stokes. I'm not aware of any analysis or results of choosing quadrature to eliminate flux terms in these discretizations. Two papers that are along the direction that I have in mind are: http://onlinelibrary.wiley.com/doi/10.1002/fld.1566/abstract http://onlinelibrary.wiley.com/doi/10.1002/fld.1723/abstract I would love to know how to do this kind of thing on a SAMR or octree grid. -- Boyce -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Tue Mar 1 13:41:43 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Tue, 1 Mar 2016 14:41:43 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: <436D1754-1D89-4240-B5FA-31A9A182034D@cims.nyu.edu> References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> <436D1754-1D89-4240-B5FA-31A9A182034D@cims.nyu.edu> Message-ID: On Tue, Mar 1, 2016 at 2:07 PM, Boyce Griffith wrote: > > On Mar 1, 2016, at 12:06 PM, Mohammad Mirzadeh wrote: > > Nice discussion. > > > On Tue, Mar 1, 2016 at 10:16 AM, Boyce Griffith > wrote: > >> >> On Mar 1, 2016, at 9:59 AM, Mark Adams wrote: >> >> >> >> On Mon, Feb 29, 2016 at 5:42 PM, Boyce Griffith >> wrote: >> >>> >>> On Feb 29, 2016, at 5:36 PM, Mark Adams wrote: >>> >>> >>>>> GAMG is use for AMR problems like this a lot in BISICLES. >>>>> >>>> >>>> Thanks for the reference. However, a quick look at their paper suggests >>>> they are using a finite volume discretization which should be symmetric and >>>> avoid all the shenanigans I'm going through! >>>> >>> >>> No, they are not symmetric. 
FV is even worse than vertex centered >>> methods. The BCs and the C-F interfaces add non-symmetry. >>> >>> >>> If you use a different discretization, it is possible to make the c-f >>> interface discretization symmetric --- but symmetry appears to come at a >>> cost of the reduction in the formal order of accuracy in the flux along the >>> c-f interface. I can probably dig up some code that would make it easy to >>> compare. >>> >> >> I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with >> refluxing, which I do not linearize. PETSc sums fluxes at faces directly, >> perhaps this IS symmetric? Toby might know. >> >> >> If you are talking about solving Poisson on a composite grid, then >> refluxing and summing up fluxes are probably the same procedure. >> > > I am not familiar with the terminology used here. What does the refluxing > mean? > > >> >> Users of these kinds of discretizations usually want to use the >> conservative divergence at coarse-fine interfaces, and so the main question >> is how to set up the viscous/diffusive flux stencil at coarse-fine >> interfaces (or, equivalently, the stencil for evaluating ghost cell values >> at coarse-fine interfaces). It is possible to make the overall >> discretization symmetric if you use a particular stencil for the flux >> computation. I think this paper ( >> http://www.ams.org/journals/mcom/1991-56-194/S0025-5718-1991-1066831-5/S0025-5718-1991-1066831-5.pdf) >> is one place to look. (This stuff is related to "mimetic finite difference" >> discretizations of Poisson.) This coarse-fine interface discretization >> winds up being symmetric (although possibly only w.r.t. a weighted inner >> product --- I can't remember the details), but the fluxes are only >> first-order accurate at coarse-fine interfaces. >> >> > Right. I think if the discretization is conservative, i.e. discretizing > div of grad, and is compact, i.e. only involves neighboring cells sharing a > common face, then it is possible to construct symmetric discretization. An > example, that I have used before in other contexts, is described here: > http://physbam.stanford.edu/~fedkiw/papers/stanford2004-02.pdf > > An interesting observation is although the fluxes are only first order > accurate, the final solution to the linear system exhibits super > convergence, i.e. second-order accurate, even in L_inf. Similar behavior is > observed with non-conservative, node-based finite difference > discretizations. > > > I don't know about that --- check out Table 1 in the paper you cite, which > seems to indicate first-order convergence in all norms. > Sorry my bad. That was the original work which was later extended in doi:10.1016/j.compfluid.2005.01.006 to second order (c.f. section 3.3) by using flux weighting in the traverse direction. > > The symmetric discretization in the Ewing paper is only slightly more > complicated, but will give full 2nd-order accuracy in L-1 (and maybe also > L-2 and L-infinity). One way to think about it is that you are using simple > linear interpolation at coarse-fine interfaces (3-point interpolation in > 2D, 4-point interpolation in 3D) using a stencil that is symmetric with > respect to the center of the coarse grid cell. > > I'll look into that paper. One can never get enough of ideas for C-F treatments in AMR applications :). 
> A (discrete) Green's functions argument explains why one gets higher-order > convergence despite localized reductions in accuracy along the coarse-fine > interface --- it has to do with the fact that errors from individual grid > locations do not have that large of an effect on the solution, and these > c-f interface errors are concentrated along on a lower dimensional surface > in the domain. > This intuitively makes sense and in fact when you plot the error, you do see spikes at the C-F interfaces. Do you know of a resource that does a rigorous analysis of the C-F treatment on the solution error? > > -- Boyce > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Tue Mar 1 13:48:06 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Tue, 1 Mar 2016 14:48:06 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: <87si0a5ajs.fsf@jedbrown.org> References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> <87si0a5ajs.fsf@jedbrown.org> Message-ID: On Tue, Mar 1, 2016 at 1:15 PM, Jed Brown wrote: > Mohammad Mirzadeh writes: > > > I am not familiar with the terminology used here. What does the refluxing > > mean? > > The Chombo/BoxLib family of methods evaluate fluxes between coarse grid > cells overlaying refined grids, then later visit the fine grids and > reevaluate those fluxes. The correction needs to be propagated back to > the adjoining coarse grid cell to maintain conservation. It's an > implementation detail that they call refluxing. > Thanks for clarification. > > > Right. I think if the discretization is conservative, i.e. discretizing > div > > of grad, and is compact, i.e. only involves neighboring cells sharing a > > common face, then it is possible to construct symmetric discretization. > An > > example, that I have used before in other contexts, is described here: > > http://physbam.stanford.edu/~fedkiw/papers/stanford2004-02.pdf > > It's unfortunate that this paper repeats some unfounded multigrid > slander and then basically claims to have uniform convergence using > incomplete Cholesky with CG. In reality, incomplete Cholesky is > asymptotically no better than Jacobi. > > > An interesting observation is although the fluxes are only first order > > accurate, the final solution to the linear system exhibits super > > convergence, i.e. second-order accurate, even in L_inf. > > Perhaps for aligned coefficients; definitely not for unaligned > coefficients. > Could you elaborate what you mean by aligned/unaligned coefficients? Do you mean anisotropic diffusion coefficient? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 1 13:50:02 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 1 Mar 2016 13:50:02 -0600 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> <87si0a5ajs.fsf@jedbrown.org> Message-ID: On Tue, Mar 1, 2016 at 1:48 PM, Mohammad Mirzadeh wrote: > On Tue, Mar 1, 2016 at 1:15 PM, Jed Brown wrote: > >> Mohammad Mirzadeh writes: >> >> > I am not familiar with the terminology used here. What does the >> refluxing >> > mean? >> >> The Chombo/BoxLib family of methods evaluate fluxes between coarse grid >> cells overlaying refined grids, then later visit the fine grids and >> reevaluate those fluxes. 
The correction needs to be propagated back to >> the adjoining coarse grid cell to maintain conservation. It's an >> implementation detail that they call refluxing. >> > > Thanks for clarification. > > >> >> > Right. I think if the discretization is conservative, i.e. discretizing >> div >> > of grad, and is compact, i.e. only involves neighboring cells sharing a >> > common face, then it is possible to construct symmetric discretization. >> An >> > example, that I have used before in other contexts, is described here: >> > http://physbam.stanford.edu/~fedkiw/papers/stanford2004-02.pdf >> >> It's unfortunate that this paper repeats some unfounded multigrid >> slander and then basically claims to have uniform convergence using >> incomplete Cholesky with CG. In reality, incomplete Cholesky is >> asymptotically no better than Jacobi. >> >> > An interesting observation is although the fluxes are only first order >> > accurate, the final solution to the linear system exhibits super >> > convergence, i.e. second-order accurate, even in L_inf. >> >> Perhaps for aligned coefficients; definitely not for unaligned >> coefficients. >> > > Could you elaborate what you mean by aligned/unaligned coefficients? Do > you mean anisotropic diffusion coefficient? > Jed (I think) means coefficients where the variation is aligned to the grid. For example, where coefficient jumps or large variation happens on cell boundaries. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From griffith at cims.nyu.edu Tue Mar 1 14:03:31 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Tue, 1 Mar 2016 15:03:31 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> <436D1754-1D89-4240-B5FA-31A9A182034D@cims.nyu.edu> Message-ID: <3A3188D1-6EEC-4485-BD87-6C1CF0486ED1@cims.nyu.edu> > On Mar 1, 2016, at 2:41 PM, Mohammad Mirzadeh wrote: > > > > On Tue, Mar 1, 2016 at 2:07 PM, Boyce Griffith > wrote: > >> On Mar 1, 2016, at 12:06 PM, Mohammad Mirzadeh > wrote: >> >> Nice discussion. >> >> >> On Tue, Mar 1, 2016 at 10:16 AM, Boyce Griffith > wrote: >> >>> On Mar 1, 2016, at 9:59 AM, Mark Adams > wrote: >>> >>> >>> >>> On Mon, Feb 29, 2016 at 5:42 PM, Boyce Griffith > wrote: >>> >>>> On Feb 29, 2016, at 5:36 PM, Mark Adams > wrote: >>>> >>>> >>>> GAMG is use for AMR problems like this a lot in BISICLES. >>>> >>>> Thanks for the reference. However, a quick look at their paper suggests they are using a finite volume discretization which should be symmetric and avoid all the shenanigans I'm going through! >>>> >>>> No, they are not symmetric. FV is even worse than vertex centered methods. The BCs and the C-F interfaces add non-symmetry. >>> >>> >>> If you use a different discretization, it is possible to make the c-f interface discretization symmetric --- but symmetry appears to come at a cost of the reduction in the formal order of accuracy in the flux along the c-f interface. I can probably dig up some code that would make it easy to compare. >>> >>> I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with refluxing, which I do not linearize. PETSc sums fluxes at faces directly, perhaps this IS symmetric? Toby might know. 
>> >> If you are talking about solving Poisson on a composite grid, then refluxing and summing up fluxes are probably the same procedure. >> >> I am not familiar with the terminology used here. What does the refluxing mean? >> >> >> Users of these kinds of discretizations usually want to use the conservative divergence at coarse-fine interfaces, and so the main question is how to set up the viscous/diffusive flux stencil at coarse-fine interfaces (or, equivalently, the stencil for evaluating ghost cell values at coarse-fine interfaces). It is possible to make the overall discretization symmetric if you use a particular stencil for the flux computation. I think this paper (http://www.ams.org/journals/mcom/1991-56-194/S0025-5718-1991-1066831-5/S0025-5718-1991-1066831-5.pdf ) is one place to look. (This stuff is related to "mimetic finite difference" discretizations of Poisson.) This coarse-fine interface discretization winds up being symmetric (although possibly only w.r.t. a weighted inner product --- I can't remember the details), but the fluxes are only first-order accurate at coarse-fine interfaces. >> >> >> Right. I think if the discretization is conservative, i.e. discretizing div of grad, and is compact, i.e. only involves neighboring cells sharing a common face, then it is possible to construct symmetric discretization. An example, that I have used before in other contexts, is described here: http://physbam.stanford.edu/~fedkiw/papers/stanford2004-02.pdf >> >> An interesting observation is although the fluxes are only first order accurate, the final solution to the linear system exhibits super convergence, i.e. second-order accurate, even in L_inf. Similar behavior is observed with non-conservative, node-based finite difference discretizations. > > I don't know about that --- check out Table 1 in the paper you cite, which seems to indicate first-order convergence in all norms. > > Sorry my bad. That was the original work which was later extended in doi:10.1016/j.compfluid.2005.01.006 to second order (c.f. section 3.3) by using flux weighting in the traverse direction. I don't follow the argument about why it is a bad thing for the fine fluxes to have different values than the overlying coarse flux, but this probably works out to be the same as the Ewing discretization in certain cases (although possibly only in 2D with a refinement ratio of 2). > A (discrete) Green's functions argument explains why one gets higher-order convergence despite localized reductions in accuracy along the coarse-fine interface --- it has to do with the fact that errors from individual grid locations do not have that large of an effect on the solution, and these c-f interface errors are concentrated along on a lower dimensional surface in the domain. > > This intuitively makes sense and in fact when you plot the error, you do see spikes at the C-F interfaces. Do you know of a resource that does a rigorous analysis of the C-F treatment on the solution error? I don't know if I have seen anything that works out all the details for locally refined grids, but LeVeque's extremely readable book, Finite Difference Methods for Ordinary and Partial Differential Equations, works this out for the uniform grid case. If I were doing this on an AMR grid, I would split the domain into a coarse half and a fine half (just one c-f interface) and work out the discrete Green's functions. -- Boyce -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david.knezevic at akselos.com Tue Mar 1 15:06:00 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Tue, 1 Mar 2016 16:06:00 -0500 Subject: [petsc-users] PCFactorSetUpMatSolverPackage with SNES In-Reply-To: References: Message-ID: On Tue, Mar 1, 2016 at 1:56 PM, Matthew Knepley wrote: > On Tue, Mar 1, 2016 at 12:52 PM, David Knezevic < > david.knezevic at akselos.com> wrote: > >> Based on KSP ex52, I use PCFactorSetUpMatSolverPackage in the process of >> setting various MUMPS ictnl options. This works fine for me when I'm >> solving linear problems. >> >> I then wanted to use PCFactorSetUpMatSolverPackage with the PC from a >> SNES object. I tried to do this with the following code (after calling >> SNESCreate, SNESSetFunction, and SNESSetJacobian): >> >> KSP snes_ksp; >> SNESGetKSP(snes, &snes_ksp); >> PC snes_pc; >> KSPGetPC(snes_ksp, &snes_pc); >> PCFactorSetMatSolverPackage(snes_pc, MATSOLVERMUMPS); >> PCFactorSetUpMatSolverPackage(snes_pc); >> >> However, I get a segfault on the call to PCFactorSetUpMatSolverPackage in >> this case. I was wondering what I need to do to make this work? >> >> Note that I want to set the MUMPS ictnl parameters via code rather than >> via the commandline since sometimes MUMPS fails (e.g. with error -9 due to >> a workspace size that is too small) and I need to automatically re-run the >> solve with different ictnl values when this happens. >> > > That is a good reason. However, I would organize this differently. I would > still set the type from the command line. > Then later in your code, after SNES is setup correctly, you get out the PC > and reset the icntl values if you have a > failure. Its very difficult to get the setup logic correct, and its one of > the most error prone parts of PETSc. RIght now, > the most reliable way to do things is to have all the information > available up front in options. > > Someday, I want to write a small DFA piece in PETSc that can encode all > this logic so that simple errors people > make can be diagnosed early with nice error messages. > > Thanks, > > Matt > OK, thanks for the info, I'll go with that approach. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 1 18:14:11 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 1 Mar 2016 18:14:11 -0600 Subject: [petsc-users] PCFactorSetUpMatSolverPackage with SNES In-Reply-To: References: Message-ID: <245AA0F6-2E6B-4069-9086-284E6648D7DE@mcs.anl.gov> David, I have added an error checker so the code will produce a useful error message instead of just crashing here in the code. You can only call this routine AFTER the matrix has been provided to the KSP object. Normally with SNES this won't happen until after the solve has started (hence is not easy for you to put in your parameters) but adding the code I suggest below might work > On Mar 1, 2016, at 12:52 PM, David Knezevic wrote: > > Based on KSP ex52, I use PCFactorSetUpMatSolverPackage in the process of setting various MUMPS ictnl options. This works fine for me when I'm solving linear problems. > > I then wanted to use PCFactorSetUpMatSolverPackage with the PC from a SNES object. 
I tried to do this with the following code (after calling SNESCreate, SNESSetFunction, and SNESSetJacobian): > > KSP snes_ksp; > SNESGetKSP(snes, &snes_ksp); KSPSetOperators(ksp,mat,pmat); /* mat and pmat (which may be the same) are the Jacobian arguments to SNESSetJacobian(), the matrices must exist and types set but they don't have to have the correct Jacobian entries at this point (since the nonlinear solver has not started you cannot know the Jacobian entries yet.) > PC snes_pc; > KSPGetPC(snes_ksp, &snes_pc); PCSetType(snes_pc,PCLU); > PCFactorSetMatSolverPackage(snes_pc, MATSOLVERMUMPS); > PCFactorSetUpMatSolverPackage(snes_pc); Let me know if this does not work and where it goes wrong. > > However, I get a segfault on the call to PCFactorSetUpMatSolverPackage in this case. I was wondering what I need to do to make this work? > > Note that I want to set the MUMPS ictnl parameters via code rather than via the commandline since sometimes MUMPS fails (e.g. with error -9 due to a workspace size that is too small) and I need to automatically re-run the solve with different ictnl values when this happens. > > Thanks, > David > > From david.knezevic at akselos.com Tue Mar 1 18:43:19 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Tue, 1 Mar 2016 19:43:19 -0500 Subject: [petsc-users] PCFactorSetUpMatSolverPackage with SNES In-Reply-To: <245AA0F6-2E6B-4069-9086-284E6648D7DE@mcs.anl.gov> References: <245AA0F6-2E6B-4069-9086-284E6648D7DE@mcs.anl.gov> Message-ID: Hi Barry, I have added an error checker so the code will produce a useful error > message instead of just crashing here in the code. > This sounds helpful. I assume this is in the dev branch? I'm using 3.6.1, but I gather from this list that 3.7 will be out soon, so I'll switch to that once it's available. > You can only call this routine AFTER the matrix has been provided to > the KSP object. Normally with SNES this won't happen until after the solve > has started (hence is not easy for you to put in your parameters) but > adding the code I suggest below might work > OK, makes sense. Unfortunately the change below didn't help, I still get the same segfault. But it's not a big deal, since I went with Matt's suggestion which seems to be working well. Thanks, David > > On Mar 1, 2016, at 12:52 PM, David Knezevic > wrote: > > > > Based on KSP ex52, I use PCFactorSetUpMatSolverPackage in the process of > setting various MUMPS ictnl options. This works fine for me when I'm > solving linear problems. > > > > I then wanted to use PCFactorSetUpMatSolverPackage with the PC from a > SNES object. I tried to do this with the following code (after calling > SNESCreate, SNESSetFunction, and SNESSetJacobian): > > > > KSP snes_ksp; > > SNESGetKSP(snes, &snes_ksp); > KSPSetOperators(ksp,mat,pmat); > /* mat and pmat (which may be the same) are the Jacobian arguments to > SNESSetJacobian(), the matrices must exist and types set but they don't > have to have the correct Jacobian entries at this point (since the > nonlinear solver has not started you cannot know the Jacobian entries yet.) > > PC snes_pc; > > KSPGetPC(snes_ksp, &snes_pc); > PCSetType(snes_pc,PCLU); > > PCFactorSetMatSolverPackage(snes_pc, MATSOLVERMUMPS); > > PCFactorSetUpMatSolverPackage(snes_pc); > > Let me know if this does not work and where it goes wrong. > > > > However, I get a segfault on the call to PCFactorSetUpMatSolverPackage > in this case. I was wondering what I need to do to make this work? 
> > > > Note that I want to set the MUMPS ictnl parameters via code rather than > via the commandline since sometimes MUMPS fails (e.g. with error -9 due to > a workspace size that is too small) and I need to automatically re-run the > solve with different ictnl values when this happens. > > > > Thanks, > > David > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 1 19:52:43 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 1 Mar 2016 19:52:43 -0600 Subject: [petsc-users] PCFactorSetUpMatSolverPackage with SNES In-Reply-To: References: <245AA0F6-2E6B-4069-9086-284E6648D7DE@mcs.anl.gov> Message-ID: > On Mar 1, 2016, at 6:43 PM, David Knezevic wrote: > > Hi Barry, > > I have added an error checker so the code will produce a useful error message instead of just crashing here in the code. > > This sounds helpful. I assume this is in the dev branch? No it should work in 3.6.1 > I'm using 3.6.1, but I gather from this list that 3.7 will be out soon, so I'll switch to that once it's available. > > > You can only call this routine AFTER the matrix has been provided to the KSP object. Normally with SNES this won't happen until after the solve has started (hence is not easy for you to put in your parameters) but adding the code I suggest below might work > > > OK, makes sense. Unfortunately the change below didn't help, I still get the same segfault. But it's not a big deal, since I went with Matt's suggestion which seems to be working well. > > Thanks, > David > > > > > On Mar 1, 2016, at 12:52 PM, David Knezevic wrote: > > > > Based on KSP ex52, I use PCFactorSetUpMatSolverPackage in the process of setting various MUMPS ictnl options. This works fine for me when I'm solving linear problems. > > > > I then wanted to use PCFactorSetUpMatSolverPackage with the PC from a SNES object. I tried to do this with the following code (after calling SNESCreate, SNESSetFunction, and SNESSetJacobian): > > > > KSP snes_ksp; > > SNESGetKSP(snes, &snes_ksp); > KSPSetOperators(ksp,mat,pmat); > /* mat and pmat (which may be the same) are the Jacobian arguments to SNESSetJacobian(), the matrices must exist and types set but they don't have to have the correct Jacobian entries at this point (since the nonlinear solver has not started you cannot know the Jacobian entries yet.) > > PC snes_pc; > > KSPGetPC(snes_ksp, &snes_pc); > PCSetType(snes_pc,PCLU); > > PCFactorSetMatSolverPackage(snes_pc, MATSOLVERMUMPS); > > PCFactorSetUpMatSolverPackage(snes_pc); > > Let me know if this does not work and where it goes wrong. > > > > However, I get a segfault on the call to PCFactorSetUpMatSolverPackage in this case. I was wondering what I need to do to make this work? > > > > Note that I want to set the MUMPS ictnl parameters via code rather than via the commandline since sometimes MUMPS fails (e.g. with error -9 due to a workspace size that is too small) and I need to automatically re-run the solve with different ictnl values when this happens. 
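In outline, the approach Matt suggested (select the solver package up front via the options database, then adjust ICNTL values and retry only after a concrete failure) comes out to something like the sketch below. This is not verbatim from the thread: it assumes an already-configured SNES object snes and solution vector x, the ICNTL(14) value is only an example, and error checking is omitted. MUMPS is chosen beforehand with -pc_type lu -pc_factor_mat_solver_package mumps.

    SNESSolve(snes, NULL, x);
    SNESConvergedReason reason;
    SNESGetConvergedReason(snes, &reason);
    if (reason == SNES_DIVERGED_LINEAR_SOLVE) {   /* e.g. MUMPS reports INFOG(1) = -9 */
      KSP ksp; PC pc; Mat F;
      SNESGetKSP(snes, &ksp);
      KSPGetPC(ksp, &pc);
      PCFactorGetMatrix(pc, &F);                  /* the factored (MUMPS) matrix */
      MatMumpsSetIcntl(F, 14, 50);                /* example: raise the workspace percentage */
      SNESSolve(snes, NULL, x);                   /* retry; the Jacobian is refactored */
    }

The point of doing it this way is that all the setup logic stays in the options database, and the code only touches the PC after an actual failure.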
> > > > Thanks, > > David > > > > > > From mirzadeh at gmail.com Tue Mar 1 20:39:44 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Tue, 1 Mar 2016 21:39:44 -0500 Subject: [petsc-users] Neumann BC with non-symmetric matrix In-Reply-To: <3A3188D1-6EEC-4485-BD87-6C1CF0486ED1@cims.nyu.edu> References: <715E1F2D-F735-4419-811A-590E2218516F@mcs.anl.gov> <3C5B4DBF-17B5-434A-8BAD-C3D8DEE30279@cims.nyu.edu> <436D1754-1D89-4240-B5FA-31A9A182034D@cims.nyu.edu> <3A3188D1-6EEC-4485-BD87-6C1CF0486ED1@cims.nyu.edu> Message-ID: On Tue, Mar 1, 2016 at 3:03 PM, Boyce Griffith wrote: > > On Mar 1, 2016, at 2:41 PM, Mohammad Mirzadeh wrote: > > > > On Tue, Mar 1, 2016 at 2:07 PM, Boyce Griffith > wrote: > >> >> On Mar 1, 2016, at 12:06 PM, Mohammad Mirzadeh >> wrote: >> >> Nice discussion. >> >> >> On Tue, Mar 1, 2016 at 10:16 AM, Boyce Griffith >> wrote: >> >>> >>> On Mar 1, 2016, at 9:59 AM, Mark Adams wrote: >>> >>> >>> >>> On Mon, Feb 29, 2016 at 5:42 PM, Boyce Griffith >>> wrote: >>> >>>> >>>> On Feb 29, 2016, at 5:36 PM, Mark Adams wrote: >>>> >>>> >>>>>> GAMG is use for AMR problems like this a lot in BISICLES. >>>>>> >>>>> >>>>> Thanks for the reference. However, a quick look at their paper >>>>> suggests they are using a finite volume discretization which should be >>>>> symmetric and avoid all the shenanigans I'm going through! >>>>> >>>> >>>> No, they are not symmetric. FV is even worse than vertex centered >>>> methods. The BCs and the C-F interfaces add non-symmetry. >>>> >>>> >>>> If you use a different discretization, it is possible to make the c-f >>>> interface discretization symmetric --- but symmetry appears to come at a >>>> cost of the reduction in the formal order of accuracy in the flux along the >>>> c-f interface. I can probably dig up some code that would make it easy to >>>> compare. >>>> >>> >>> I don't know. Chombo/Boxlib have a stencil for C-F and do F-C with >>> refluxing, which I do not linearize. PETSc sums fluxes at faces directly, >>> perhaps this IS symmetric? Toby might know. >>> >>> >>> If you are talking about solving Poisson on a composite grid, then >>> refluxing and summing up fluxes are probably the same procedure. >>> >> >> I am not familiar with the terminology used here. What does the refluxing >> mean? >> >> >>> >>> Users of these kinds of discretizations usually want to use the >>> conservative divergence at coarse-fine interfaces, and so the main question >>> is how to set up the viscous/diffusive flux stencil at coarse-fine >>> interfaces (or, equivalently, the stencil for evaluating ghost cell values >>> at coarse-fine interfaces). It is possible to make the overall >>> discretization symmetric if you use a particular stencil for the flux >>> computation. I think this paper ( >>> http://www.ams.org/journals/mcom/1991-56-194/S0025-5718-1991-1066831-5/S0025-5718-1991-1066831-5.pdf) >>> is one place to look. (This stuff is related to "mimetic finite difference" >>> discretizations of Poisson.) This coarse-fine interface discretization >>> winds up being symmetric (although possibly only w.r.t. a weighted inner >>> product --- I can't remember the details), but the fluxes are only >>> first-order accurate at coarse-fine interfaces. >>> >>> >> Right. I think if the discretization is conservative, i.e. discretizing >> div of grad, and is compact, i.e. only involves neighboring cells sharing a >> common face, then it is possible to construct symmetric discretization. 
An >> example, that I have used before in other contexts, is described here: >> http://physbam.stanford.edu/~fedkiw/papers/stanford2004-02.pdf >> >> An interesting observation is although the fluxes are only first order >> accurate, the final solution to the linear system exhibits super >> convergence, i.e. second-order accurate, even in L_inf. Similar behavior is >> observed with non-conservative, node-based finite difference >> discretizations. >> >> >> I don't know about that --- check out Table 1 in the paper you cite, >> which seems to indicate first-order convergence in all norms. >> > > Sorry my bad. That was the original work which was later extended in > doi:10.1016/j.compfluid.2005.01.006 > to second order (c.f. > section 3.3) by using flux weighting in the traverse direction. > > > I don't follow the argument about why it is a bad thing for the fine > fluxes to have different values than the overlying coarse flux, but this > probably works out to be the same as the Ewing discretization in certain > cases (although possibly only in 2D with a refinement ratio of 2). > > In general, you can show that by weighting fluxes in the traverse direction (a.k.a using the coarse-grid flux), you can eliminates the leading-order truncation error (which turns out to be O(1)) in that direction. It seems, to me, this is the reason you go from 1st to 2nd second-order accuracy. Unfortunately this is not pointed out in the paper and while I cannot speak for the authors, I think what motivated them to use the coarse flux was to keep the normal velocity continuous across C-F interface after the projection step. It looks like that throughout the AMR literature people have encountered the super-convergence over and over again and while most of the "justifications" makes sense in an intuitive way, I don't think I have seen any rigorously analysis to explain the phenomena. Of course it may just be that I have not searched thoroughly :) A (discrete) Green's functions argument explains why one gets higher-order >> convergence despite localized reductions in accuracy along the coarse-fine >> interface --- it has to do with the fact that errors from individual grid >> locations do not have that large of an effect on the solution, and these >> c-f interface errors are concentrated along on a lower dimensional surface >> in the domain. >> > > This intuitively makes sense and in fact when you plot the error, you do > see spikes at the C-F interfaces. Do you know of a resource that does a > rigorous analysis of the C-F treatment on the solution error? > > > I don't know if I have seen anything that works out all the details for > locally refined grids, but LeVeque's extremely readable book, *Finite > Difference Methods for Ordinary and Partial Differential Equations*, > works this out for the uniform grid case. If I were doing this on an AMR > grid, I would split the domain into a coarse half and a fine half (just one > c-f interface) and work out the discrete Green's functions. > Thanks for the reference. I have always wanted an excuse to read this book! :) > > -- Boyce > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Tue Mar 1 22:19:42 2016 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 2 Mar 2016 12:19:42 +0800 Subject: [petsc-users] Error - Out of memory. This could be due to allocating too large an object or bleeding by not properly ... 
In-Reply-To: <256DA688-D052-4F80-AD4F-06B8295FED32@mcs.anl.gov> References: <56CD0C7C.5060101@gmail.com> <56CD61C7.3040200@gmail.com> <56CDC961.40004@gmail.com> <56CDCC13.4070807@gmail.com> <56CEA03A.2000407@gmail.com> <43AC24CD-4727-4C4C-B016-6927CD206425@mcs.anl.gov> <56CFFB60.7020702@gmail.com> <256DA688-D052-4F80-AD4F-06B8295FED32@mcs.anl.gov> Message-ID: <56D669DE.80303@gmail.com> On 26/2/2016 9:21 PM, Barry Smith wrote: >> On Feb 26, 2016, at 1:14 AM, TAY wee-beng wrote: >> >> >> On 26/2/2016 1:56 AM, Barry Smith wrote: >>> Run a much smaller problem for a few time steps, making sure you free all the objects at the end, with the option -malloc_dump this will print all the memory that was not freed and hopefully help you track down which objects you forgot to free. >>> >>> Barry >> Hi, >> >> I run a smaller problem and lots of things are shown in the log. How can I know which exactly are not freed from the memory? > Everything in in the log represents unfreed memory. You need to hunt through all the objects you create and make sure you destroy all of them. > > Barry Hi, I have some questions. [0]Total space allocated 2274656 bytes [ 0]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c [ 0]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes/petsc-3.6.3/src/vec/is/utils/isltog.c [ 0]16 bytes VecScatterCreateCommon_PtoS() line 2655 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c [ 0]16 bytes VecScatterCreateCommon_PtoS() line 2654 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c [ 0]1440 bytes VecScatterCreate_PtoS() line 2463 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c [ 0]1440 bytes VecScatterCreate_PtoS() line 2462 in /home/wtay/Codes/ 1. What does the [0] means? I get from [0] to [23]? 2. I defined a variable globally: DM da_cu_types Then I use at each time step: /call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,(IIB_I_end_domain(1) - IIB_I_sta_domain(1) + 1),(IIB_I_end_domain(2) - IIB_I_sta_domain(2) + 1),&// // //(IIB_I_end_domain(3) - IIB_I_sta_domain(3) + 1),PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width_IIB,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_cu_types,ierr)// // //call DMDAGetInfo(da_cu_types,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,num_procs_xyz_IIB(1),num_procs_xyz_IIB(2),num_procs_xyz_IIB(3),&// // //PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,ierr)// // //call DMDAGetCorners(da_cu_types,start_ijk_IIB(1),start_ijk_IIB(2),start_ijk_IIB(3),width_ijk_IIB(1),width_ijk_IIB(2),width_ijk_IIB(3),ierr)// // //call DMDAGetGhostCorners(da_cu_types,start_ijk_ghost_IIB(1),start_ijk_ghost_IIB(2),start_ijk_ghost_IIB(3),width_ijk_ghost_IIB(1),width_ijk_ghost_IIB(2),width_ijk_ghost_IIB(3),ierr)/ The purpose is just to get the starting and ending inidices for each cpu partition. This is done every time step for a moving body case since the /IIB_I_sta_domain/ and /IIB_I_end_domain/ changes After getting all the info, must I call DMDestroy(da_cu_types,ierr)? Is it possible to update and use the new /IIB_I_sta_domain/ and /IIB_I_end_domain /without the need to create and destroy the DM? I thought that may save some time since it's done at every time step. Thanks > >> Is this info helpful? Or should I run in a single core? 
>> >> Thanks >>>> On Feb 25, 2016, at 12:33 AM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I ran the code and it hangs again. However, adding -malloc_test doesn't seem to do any thing. The output (attached) is the same w/o it. >>>> >>>> Wonder if there's anything else I can do. >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 24/2/2016 11:33 PM, Matthew Knepley wrote: >>>>> On Wed, Feb 24, 2016 at 9:28 AM, TAY wee-beng wrote: >>>>> >>>>> On 24/2/2016 11:18 PM, Matthew Knepley wrote: >>>>>> On Wed, Feb 24, 2016 at 9:16 AM, TAY wee-beng wrote: >>>>>> >>>>>> On 24/2/2016 9:12 PM, Matthew Knepley wrote: >>>>>>> On Wed, Feb 24, 2016 at 1:54 AM, TAY wee-beng wrote: >>>>>>> >>>>>>> On 24/2/2016 10:28 AM, Matthew Knepley wrote: >>>>>>>> On Tue, Feb 23, 2016 at 7:50 PM, TAY wee-beng wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I got this error (also attached, full) when running my code. It happens after a few thousand time steps. >>>>>>>> >>>>>>>> The strange thing is that for 2 different clusters, it stops at 2 different time steps. >>>>>>>> >>>>>>>> I wonder if it's related to DM since this happens after I added DM into my code. >>>>>>>> >>>>>>>> In this case, how can I find out the error? I'm thinking valgrind may take very long and gives too many false errors. >>>>>>>> >>>>>>>> It is very easy to find leaks. You just run a few steps with -malloc_dump and see what is left over. >>>>>>>> >>>>>>>> Matt >>>>>>> Hi Matt, >>>>>>> >>>>>>> Do you mean running my a.out with the -malloc_dump and stop after a few time steps? >>>>>>> >>>>>>> What and how should I "see" then? >>>>>>> >>>>>>> -malloc_dump outputs all unfreed memory to the screen after PetscFinalize(), so you should see the leak. >>>>>>> I guess it might be possible to keep creating things that you freed all at once at the end, but that is less likely. >>>>>>> >>>>>>> Matt >>>>>>> >>>>>> Hi, >>>>>> >>>>>> I got the output. I have zipped it since it's rather big. So it seems to be from DM routines but can you help me where the error is from? >>>>>> >>>>>> Its really hard to tell by looking at it. What I do is remove things until there is no leak, then progressively >>>>>> put thing back in until I have the culprit. Then you can think about what is not destroyed. >>>>>> >>>>>> Matt >>>>> Ok so let me get this clear. When it shows: >>>>> >>>>> [21]Total space allocated 1728961264 bytes >>>>> [21]1861664 bytes MatCheckCompressedRow() line 60 in /home/wtay/Codes/petsc-3.6.3/src/mat/utils/compressedrow.c >>>>> [21]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c >>>>> [21]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes >>>>> >>>>> .... >>>>> >>>>> Does it mean that it's simply allocating space ie normal? Or does it show that there's memory leak ie error? >>>>> >>>>> I gave the wrong option. That dumps everything. Lets just look at the leaks with -malloc_test. >>>>> >>>>> Sorry about that, >>>>> >>>>> Matt >>>>> If it's error, should I zoom in and debug around this time at this region? >>>>> >>>>> Thanks >>>>>> Thanks. >>>>>>> >>>>>>>> -- >>>>>>>> Thank you >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 1 23:00:56 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 1 Mar 2016 23:00:56 -0600 Subject: [petsc-users] Error - Out of memory. This could be due to allocating too large an object or bleeding by not properly ... In-Reply-To: <56D669DE.80303@gmail.com> References: <56CD0C7C.5060101@gmail.com> <56CD61C7.3040200@gmail.com> <56CDC961.40004@gmail.com> <56CDCC13.4070807@gmail.com> <56CEA03A.2000407@gmail.com> <43AC24CD-4727-4C4C-B016-6927CD206425@mcs.anl.gov> <56CFFB60.7020702@gmail.com> <256DA688-D052-4F80-AD4F-06B8295FED32@mcs.anl.gov> <56D669DE.80303@gmail.com> Message-ID: <7871E5CD-DD16-4467-9C1B-88B6CB8BADD2@mcs.anl.gov> > On Mar 1, 2016, at 10:19 PM, TAY wee-beng wrote: > > > On 26/2/2016 9:21 PM, Barry Smith wrote: >>> On Feb 26, 2016, at 1:14 AM, TAY wee-beng >>> wrote: >>> >>> >>> On 26/2/2016 1:56 AM, Barry Smith wrote: >>> >>>> Run a much smaller problem for a few time steps, making sure you free all the objects at the end, with the option -malloc_dump this will print all the memory that was not freed and hopefully help you track down which objects you forgot to free. >>>> >>>> Barry >>>> >>> Hi, >>> >>> I run a smaller problem and lots of things are shown in the log. How can I know which exactly are not freed from the memory? >>> >> Everything in in the log represents unfreed memory. You need to hunt through all the objects you create and make sure you destroy all of them. >> >> Barry >> > Hi, > > I have some questions. > > [0]Total space allocated 2274656 bytes > [ 0]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c > [ 0]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes/petsc-3.6.3/src/vec/is/utils/isltog.c > [ 0]16 bytes VecScatterCreateCommon_PtoS() line 2655 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c > [ 0]16 bytes VecScatterCreateCommon_PtoS() line 2654 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c > [ 0]1440 bytes VecScatterCreate_PtoS() line 2463 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c > [ 0]1440 bytes VecScatterCreate_PtoS() line 2462 in /home/wtay/Codes/ > > 1. What does the [0] means? I get from [0] to [23]? It is the MPI process reporting the memory usage > > 2. 
I defined a variable globally: > > DM da_cu_types > > Then I use at each time step: > > call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,(IIB_I_end_domain(1) - IIB_I_sta_domain(1) + 1),(IIB_I_end_domain(2) - IIB_I_sta_domain(2) + 1),& > > (IIB_I_end_domain(3) - IIB_I_sta_domain(3) + 1),PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width_IIB,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_cu_types,ierr) > > call DMDAGetInfo(da_cu_types,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,num_procs_xyz_IIB(1),num_procs_xyz_IIB(2),num_procs_xyz_IIB(3),& > > PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,ierr) > > call DMDAGetCorners(da_cu_types,start_ijk_IIB(1),start_ijk_IIB(2),start_ijk_IIB(3),width_ijk_IIB(1),width_ijk_IIB(2),width_ijk_IIB(3),ierr) > > call DMDAGetGhostCorners(da_cu_types,start_ijk_ghost_IIB(1),start_ijk_ghost_IIB(2),start_ijk_ghost_IIB(3),width_ijk_ghost_IIB(1),width_ijk_ghost_IIB(2),width_ijk_ghost_IIB(3),ierr) > > The purpose is just to get the starting and ending inidices for each cpu partition. This is done every time step for a moving body case since the IIB_I_sta_domain and IIB_I_end_domain changes > > After getting all the info, must I call DMDestroy(da_cu_types,ierr)? Yes, otherwise you will get more and more DM taking up memory > > Is it possible to update and use the new IIB_I_sta_domain and IIB_I_end_domain without the need to create and destroy the DM? I thought that may save some time since it's done at every time step. If the information you pass to the DMCreate changes then you need to create it again. Barry > > Thanks > >> >>> Is this info helpful? Or should I run in a single core? >>> >>> Thanks >>> >>>>> On Feb 25, 2016, at 12:33 AM, TAY wee-beng >>>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I ran the code and it hangs again. However, adding -malloc_test doesn't seem to do any thing. The output (attached) is the same w/o it. >>>>> >>>>> Wonder if there's anything else I can do. >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 24/2/2016 11:33 PM, Matthew Knepley wrote: >>>>> >>>>>> On Wed, Feb 24, 2016 at 9:28 AM, TAY wee-beng >>>>>> wrote: >>>>>> >>>>>> On 24/2/2016 11:18 PM, Matthew Knepley wrote: >>>>>> >>>>>>> On Wed, Feb 24, 2016 at 9:16 AM, TAY wee-beng >>>>>>> wrote: >>>>>>> >>>>>>> On 24/2/2016 9:12 PM, Matthew Knepley wrote: >>>>>>> >>>>>>>> On Wed, Feb 24, 2016 at 1:54 AM, TAY wee-beng >>>>>>>> wrote: >>>>>>>> >>>>>>>> On 24/2/2016 10:28 AM, Matthew Knepley wrote: >>>>>>>> >>>>>>>>> On Tue, Feb 23, 2016 at 7:50 PM, TAY wee-beng >>>>>>>>> wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I got this error (also attached, full) when running my code. It happens after a few thousand time steps. >>>>>>>>> >>>>>>>>> The strange thing is that for 2 different clusters, it stops at 2 different time steps. >>>>>>>>> >>>>>>>>> I wonder if it's related to DM since this happens after I added DM into my code. >>>>>>>>> >>>>>>>>> In this case, how can I find out the error? I'm thinking valgrind may take very long and gives too many false errors. >>>>>>>>> >>>>>>>>> It is very easy to find leaks. You just run a few steps with -malloc_dump and see what is left over. >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>> Hi Matt, >>>>>>>> >>>>>>>> Do you mean running my a.out with the -malloc_dump and stop after a few time steps? >>>>>>>> >>>>>>>> What and how should I "see" then? 
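To summarize the DM lifecycle discussed in this thread, a minimal sketch of the per-time-step pattern (C shown for brevity; nsteps, the grid sizes, and the stencil width are placeholders):

    PetscInt step, nsteps = 100, xs, ys, zs, xm, ym, zm;
    PetscInt mx = 64, my = 64, mz = 64, sw = 1;   /* placeholder sizes */
    for (step = 0; step < nsteps; step++) {
      DM da;
      /* the sizes may change from one step to the next in the moving-body case */
      DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                   DMDA_STENCIL_STAR, mx, my, mz, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                   1, sw, NULL, NULL, NULL, &da);
      DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm);
      /* ... use the corner/ghost-corner information for this step ... */
      DMDestroy(&da);   /* without this a DM is leaked every step and memory keeps growing */
    }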
>>>>>>>> >>>>>>>> -malloc_dump outputs all unfreed memory to the screen after PetscFinalize(), so you should see the leak. >>>>>>>> I guess it might be possible to keep creating things that you freed all at once at the end, but that is less likely. >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I got the output. I have zipped it since it's rather big. So it seems to be from DM routines but can you help me where the error is from? >>>>>>> >>>>>>> Its really hard to tell by looking at it. What I do is remove things until there is no leak, then progressively >>>>>>> put thing back in until I have the culprit. Then you can think about what is not destroyed. >>>>>>> >>>>>>> Matt >>>>>>> >>>>>> Ok so let me get this clear. When it shows: >>>>>> >>>>>> [21]Total space allocated 1728961264 bytes >>>>>> [21]1861664 bytes MatCheckCompressedRow() line 60 in /home/wtay/Codes/petsc-3.6.3/src/mat/utils/compressedrow.c >>>>>> [21]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c >>>>>> [21]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes >>>>>> >>>>>> .... >>>>>> >>>>>> Does it mean that it's simply allocating space ie normal? Or does it show that there's memory leak ie error? >>>>>> >>>>>> I gave the wrong option. That dumps everything. Lets just look at the leaks with -malloc_test. >>>>>> >>>>>> Sorry about that, >>>>>> >>>>>> Matt >>>>>> If it's error, should I zoom in and debug around this time at this region? >>>>>> >>>>>> Thanks >>>>>> >>>>>>> Thanks. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> -- >>>>>>>>> Thank you >>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>> >>> > From Sander.Arens at UGent.be Wed Mar 2 05:13:30 2016 From: Sander.Arens at UGent.be (Sander Arens) Date: Wed, 2 Mar 2016 12:13:30 +0100 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem Message-ID: Hi, I'm trying to set a mass matrix preconditioner for the Schur complement of an incompressible finite elasticity problem. I tried using the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in the function evaluation after Newton iteration 1. (Btw, I'm using the next branch). Is this because I didn't use PetscDSSetJacobianPreconditioner for the other blocks (which uses the Jacobian itself for preconditioning)? If so, how can I tell Petsc to use the Jacobian for those blocks? 
I guess when using PetscDSSetJacobianPreconditioner the preconditioner is recomputed at every Newton step, so for a constant mass matrix this might not be ideal. How can I avoid recomputing this at every Newton iteration? Thanks, Sander -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 2 05:25:41 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 2 Mar 2016 05:25:41 -0600 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens wrote: > Hi, > > I'm trying to set a mass matrix preconditioner for the Schur complement of > an incompressible finite elasticity problem. I tried using the command > PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, NULL, NULL, > NULL) (field 1 is the Lagrange multiplier field). > However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in the > function evaluation after Newton iteration 1. (Btw, I'm using the next > branch). > > Is this because I didn't use PetscDSSetJacobianPreconditioner for the > other blocks (which uses the Jacobian itself for preconditioning)? If so, > how can I tell Petsc to use the Jacobian for those blocks? > 1) I put that code in very recently, and do not even have sufficient test, so it may be buggy 2) If you are using FieldSplit, you can control which blocks come from A and which come from the preconditioner P http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat > I guess when using PetscDSSetJacobianPreconditioner the preconditioner is > recomputed at every Newton step, so for a constant mass matrix this might > not be ideal. How can I avoid recomputing this at every Newton iteration? > Maybe we need another flag like http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html or we need to expand http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html to separately cover the preconditioner matrix. However, both matrices are computed by one call so this would involve interface changes to user code, which we do not like to do. Right now it seems like a small optimization. I would want to wait and see whether it would really be maningful. Thanks, Matt > Thanks, > Sander > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From praveenpetsc at gmail.com Wed Mar 2 09:01:23 2016 From: praveenpetsc at gmail.com (praveen kumar) Date: Wed, 2 Mar 2016 20:31:23 +0530 Subject: [petsc-users] user defined grids Message-ID: Dear all, I am employing PETSc for DD in a serial fortran code and I would be using the solver from serial code itself. In my programme there is subroutine which reads an input file for the grids. I have two questions: 1. The input file contains : No. of segments in each direction, Length of each segment, Grid expansion ratio for each segment, dx(min) and dx(max) (min and max size of sub-division for each segment), No. of uniform sub-divisions in each segment. Will I be able to include all these details in DMDAcreate3D? Is there any example? 
If no, then is there any way to retain the input file section and still use PETSc? 2. Moreover application requires that, I call the Grid subroutine after some fixed number of iterations. Would you suggest how to fix the above two? Thanks, Praveen -------------- next part -------------- An HTML attachment was scrubbed... URL: From praveenpetsc at gmail.com Wed Mar 2 09:06:56 2016 From: praveenpetsc at gmail.com (praveen kumar) Date: Wed, 2 Mar 2016 20:36:56 +0530 Subject: [petsc-users] user defined grids In-Reply-To: References: Message-ID: Forgot to mention, it is a structured cartesian non-uniform mesh. On Wed, Mar 2, 2016 at 8:31 PM, praveen kumar wrote: > Dear all, > > I am employing PETSc for DD in a serial fortran code and I would be using > the solver from serial code itself. In my programme there is subroutine > which reads an input file for the grids. I have two questions: > > > 1. The input file contains : No. of segments in each direction, Length of > each segment, Grid expansion ratio for each segment, dx(min) and dx(max) > (min and max size of sub-division for each segment), No. of uniform > sub-divisions in each segment. Will I be able to include all these details > in DMDAcreate3D? Is there any example? If no, then is there any way to > retain the input file section and still use PETSc? > > > 2. Moreover application requires that, I call the Grid subroutine after > some fixed number of iterations. > > > Would you suggest how to fix the above two? > > > Thanks, > > Praveen > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Mar 2 12:51:40 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 2 Mar 2016 12:51:40 -0600 Subject: [petsc-users] user defined grids In-Reply-To: References: Message-ID: <13BD5997-57EE-4095-9D76-56BB1B308C22@mcs.anl.gov> > On Mar 2, 2016, at 9:01 AM, praveen kumar wrote: > > Dear all, > > I am employing PETSc for DD in a serial fortran code and I would be using the solver from serial code itself. In my programme there is subroutine which reads an input file for the grids. I have two questions: > > > > 1. The input file contains : No. of segments in each direction, Length of each segment, Grid expansion ratio for each segment, dx(min) and dx(max) (min and max size of sub-division for each segment), I don't understand what this grid expansion ratio means. > No. of uniform sub-divisions in each segment. Will I be able to include all these details in DMDAcreate3D? The DMDACreate3d() only defines the topology of the mesh (number of mesh points in each direction etc), not the geometry. If you want to provide geometry information (i.e. length of grid segments) then you use DMDAGetCoordinateArray() to set the local values. See DMDAVecGetArray for the form of the "coordinates" array. > Is there any example? If no, then is there any way to retain the input file section and still use PETSc? Should be possible but you will need to do a little programming and poking around to use all the information. Barry > > > > 2. Moreover application requires that, I call the Grid subroutine after some fixed number of iterations. > > > > Would you suggest how to fix the above two? 
> > > > Thanks, > > Praveen > From bsmith at mcs.anl.gov Wed Mar 2 14:00:05 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 2 Mar 2016 14:00:05 -0600 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: References: Message-ID: <60D9FAF4-181C-4E71-BCE4-12556FCEDF60@mcs.anl.gov> Sorry for the delay. Using master you can now do ./ex2 -ts_monitor binary:filename -ts_monitor_solution binary:filename and it will save each time step followed by each vector solution in the same file called filename. The following python script (without error handling and end of file handling) will read in the time steps and solutions import PetscBinaryIO import numpy io = PetscBinaryIO.PetscBinaryIO() fh = open('binaryoutput') while True: ts = numpy.fromfile(fh, dtype='>d', count=1) print ts objecttype = io.readObjectType(fh) if objecttype == 'Vec': v = io.readVec(fh) print v I think this what you need/want. As Hong noted you could also use -ts_save_trajectory 1 -tstrajectory_type visualization and it will store the timestep and the vector (each timestep/vector is in the same file, one file per timestep.) you could then use a variation of the above python code to read from those files (I think the vector comes first in the trajectory file). Barry > On Mar 1, 2016, at 1:26 AM, Ed Bueler wrote: > > Barry -- > > I am reading the resulting file successfully using > > import struct > import sys > f = open('timesteps','r') > while True: > try: > bytes = f.read(8) > except: > print "f.read() failed" > sys.exit(1) > if len(bytes) > 0: > print struct.unpack('>d',bytes)[0] > else: > break > f.close() > > However, was there a more elegant intended method? I am disturbed by the apparent need to specify big-endian-ness (= '>d') in the struct.unpack() call. > > Ed > > > On Mon, Feb 29, 2016 at 9:39 PM, Ed Bueler wrote: > Barry -- > > Will try it. > > > ... since, presumably, other more powerful IO tools exist that would be used for "real" problems? > > I know there are tools for snapshotting from PETSc, e.g. VecView to .vtk. In fact petsc binary seems fairly convenient for that. On the other hand, I am not sure I've ever done anything "real". ;-) > > Anyone out there: Are there a good *convenient* tools for saving space/time-series (= movies) from PETSc TS? I want to add frames and movies from PETSc into slides, etc. I can think of NetCDF but it seems not-very-convenient, and I am worried not well-supported from PETSc. Is setting up TS with events (=TSSetEventMonitor()) and writing separate snapshot files the preferred scalable usage, despite the extra effort compared to "-ts_monitor_solution binary:foo.dat"? > > Ed > > > On Mon, Feb 29, 2016 at 8:53 PM, Barry Smith wrote: > > Ed, > > I have added a branch barry/feature-ts-monitor-binary that supports -ts_monitor binary:timesteps that will store in simple binary format each of the time steps associated with each solution. This in conjugation with -ts_monitor_solution binary:solutions will give you two files you can read in. But note that timesteps is a simple binary file of double precision numbers you should read in directly in python, you cannot use PetscBinaryIO.py which is what you will use to read in the solutions file. > > Barry > > Currently PETSc has a binary file format where we can save Vec, Mat, IS, each is marked with a type id for PetscBinaryIO.py to detect, we do not have type ids for simple double precision numbers or arrays of numbers. 
This is why I have no way of saving the time steps in a way that PetscBinaryIO.py could read them in currently. I don't know how far we want to go in "spiffing up" the PETSc binary format to do more elaborate things since, presumably, other more power IO tools exist that would be used for "real" problems? > > > > On Feb 29, 2016, at 3:24 PM, Ed Bueler wrote: > > > > Dear PETSc -- > > > > I have a short C ode code that uses TS to solve y' = g(t,y) where y(t) is a 2-dim'l vector. My code defaults to -ts_type rk so it does adaptive time-stepping; thus using -ts_monitor shows times at stdout: > > > > $ ./ode -ts_monitor > > solving from t0 = 0.000 with initial time step dt = 0.10000 ... > > 0 TS dt 0.1 time 0. > > 1 TS dt 0.170141 time 0.1 > > 2 TS dt 0.169917 time 0.270141 > > 3 TS dt 0.171145 time 0.440058 > > 4 TS dt 0.173931 time 0.611203 > > 5 TS dt 0.178719 time 0.785134 > > 6 TS dt 0.0361473 time 0.963853 > > 7 TS dt 0.188252 time 1. > > error at tf = 1.000 : |y-y_exact|_inf = 0.000144484 > > > > I want to output the trajectory in PETSc binary and plot it in python using bin/PetscBinaryIO.py. Clearly I need the times shown above to do that. > > > > Note "-ts_monitor_solution binary:XX" gives me a binary file with only y values in it, but not the corresponding times. > > > > My question is, how to get those times in either the same binary file (preferred) or separate binary files? I have tried > > > > $ ./ode -ts_monitor binary:foo.dat # invalid > > $ ./ode -ts_monitor_solution binary:bar.dat # no t in file > > $ ./ode -ts_monitor_solution binary:baz.dat -ts_save_trajectory # no t in file > > > > without success. (I am not sure what the boolean option -ts_save_trajectory does, by the way.) > > > > Thanks! > > > > Ed > > > > PS Sorry if this is a "RTFM" question, but so far I can't find the documentation. > > > > > > -- > > Ed Bueler > > Dept of Math and Stat and Geophysical Institute > > University of Alaska Fairbanks > > Fairbanks, AK 99775-6660 > > 301C Chapman and 410D Elvey > > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > > > > > -- > Ed Bueler > Dept of Math and Stat and Geophysical Institute > University of Alaska Fairbanks > Fairbanks, AK 99775-6660 > 301C Chapman and 410D Elvey > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > > > > -- > Ed Bueler > Dept of Math and Stat and Geophysical Institute > University of Alaska Fairbanks > Fairbanks, AK 99775-6660 > 301C Chapman and 410D Elvey > 907 474-7693 and 907 474-7199 (fax 907 474-5394) From bsmith at mcs.anl.gov Wed Mar 2 14:00:19 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 2 Mar 2016 14:00:19 -0600 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: <6FBEBBF0-C694-41D5-98E5-324DE8CC73DD@anl.gov> References: <2795BBE1-C04B-45CA-B99C-2F36DD9BB7B4@mcs.anl.gov> <6FBEBBF0-C694-41D5-98E5-324DE8CC73DD@anl.gov> Message-ID: Merged branch into next for testing > On Mar 1, 2016, at 11:18 AM, Hong Zhang wrote: > > Hi Barry, > > Actually the TSTrajectory object can save both the solution and the corresponding time information into binary files. Although it is designed specifically for adjoint checkpointing, it does have the side benefit to assist in visualization of simulation trajectories. For adjoint checkpointing, not only solution is saved, but also stage values are saved. So I created a new TSTrajectory type particularly for visualization purpose, which can be access at my branch hongzh/tstrajectory_visualization. 
> > One can enable this feature using the command line options > > -ts_save_trajectory 1 -tstrajectory_type visualization > > The full trajectory will be saved into a folder with multiple files, one file corresponding to one time step. Then we can use a MATLAB script, such as PetscBinaryRead.m, to read these binary files. But the default script coming with PETSc needs to be modified a little bit. Because the trajectory is always saved as a solution vector, followed by a PetscReal variable, I suggest to use the following code in MATLAB: > > if header == 1211214 % Petsc Vec Object > %% Read state vector > m = double(read(fd,1,indices)); > x = zeros(m,1); > v = read(fd,m,precision); > x(:) = v; > %% Read time > t = read(fd,1,precision); > end > > Shri (copied in this email) has successfully used this approach to do visualization. Perhaps we can include this feature in the new release if it is useful to some users. > > Hong > > >> On Mar 1, 2016, at 10:32 AM, Barry Smith wrote: >> >>> >>> On Mar 1, 2016, at 12:39 AM, Ed Bueler wrote: >>> >>> Barry -- >>> >>> Will try it. >>> >>>> ... since, presumably, other more powerful IO tools exist that would be used for "real" problems? >>> >>> I know there are tools for snapshotting from PETSc, e.g. VecView to .vtk. In fact petsc binary seems fairly convenient for that. On the other hand, I am not sure I've ever done anything "real". ;-) >>> >>> Anyone out there: Are there a good *convenient* tools for saving space/time-series (= movies) from PETSc TS? I want to add frames and movies from PETSc into slides, etc. I can think of NetCDF but it seems not-very-convenient, and I am worried not well-supported from PETSc. Is setting up TS with events (=TSSetEventMonitor()) and writing separate snapshot files the preferred scalable usage, despite the extra effort compared to "-ts_monitor_solution binary:foo.dat"? >> >> Ed, >> >> As I said in my previous email since we don't have a way of indicating plain double precision numbers in our binary files it is not possible to put both the vectors and the time steps in the same file without augmenting our file format. >> >> Barry >> >> >> >>> >>> Ed >>> >>> >>> On Mon, Feb 29, 2016 at 8:53 PM, Barry Smith wrote: >>> >>> Ed, >>> >>> I have added a branch barry/feature-ts-monitor-binary that supports -ts_monitor binary:timesteps that will store in simple binary format each of the time steps associated with each solution. This in conjugation with -ts_monitor_solution binary:solutions will give you two files you can read in. But note that timesteps is a simple binary file of double precision numbers you should read in directly in python, you cannot use PetscBinaryIO.py which is what you will use to read in the solutions file. >>> >>> Barry >>> >>> Currently PETSc has a binary file format where we can save Vec, Mat, IS, each is marked with a type id for PetscBinaryIO.py to detect, we do not have type ids for simple double precision numbers or arrays of numbers. This is why I have no way of saving the time steps in a way that PetscBinaryIO.py could read them in currently. I don't know how far we want to go in "spiffing up" the PETSc binary format to do more elaborate things since, presumably, other more power IO tools exist that would be used for "real" problems? >>> >>> >>>> On Feb 29, 2016, at 3:24 PM, Ed Bueler wrote: >>>> >>>> Dear PETSc -- >>>> >>>> I have a short C ode code that uses TS to solve y' = g(t,y) where y(t) is a 2-dim'l vector. 
My code defaults to -ts_type rk so it does adaptive time-stepping; thus using -ts_monitor shows times at stdout: >>>> >>>> $ ./ode -ts_monitor >>>> solving from t0 = 0.000 with initial time step dt = 0.10000 ... >>>> 0 TS dt 0.1 time 0. >>>> 1 TS dt 0.170141 time 0.1 >>>> 2 TS dt 0.169917 time 0.270141 >>>> 3 TS dt 0.171145 time 0.440058 >>>> 4 TS dt 0.173931 time 0.611203 >>>> 5 TS dt 0.178719 time 0.785134 >>>> 6 TS dt 0.0361473 time 0.963853 >>>> 7 TS dt 0.188252 time 1. >>>> error at tf = 1.000 : |y-y_exact|_inf = 0.000144484 >>>> >>>> I want to output the trajectory in PETSc binary and plot it in python using bin/PetscBinaryIO.py. Clearly I need the times shown above to do that. >>>> >>>> Note "-ts_monitor_solution binary:XX" gives me a binary file with only y values in it, but not the corresponding times. >>>> >>>> My question is, how to get those times in either the same binary file (preferred) or separate binary files? I have tried >>>> >>>> $ ./ode -ts_monitor binary:foo.dat # invalid >>>> $ ./ode -ts_monitor_solution binary:bar.dat # no t in file >>>> $ ./ode -ts_monitor_solution binary:baz.dat -ts_save_trajectory # no t in file >>>> >>>> without success. (I am not sure what the boolean option -ts_save_trajectory does, by the way.) >>>> >>>> Thanks! >>>> >>>> Ed >>>> >>>> PS Sorry if this is a "RTFM" question, but so far I can't find the documentation. >>>> >>>> >>>> -- >>>> Ed Bueler >>>> Dept of Math and Stat and Geophysical Institute >>>> University of Alaska Fairbanks >>>> Fairbanks, AK 99775-6660 >>>> 301C Chapman and 410D Elvey >>>> 907 474-7693 and 907 474-7199 (fax 907 474-5394) >>> >>> >>> >>> >>> -- >>> Ed Bueler >>> Dept of Math and Stat and Geophysical Institute >>> University of Alaska Fairbanks >>> Fairbanks, AK 99775-6660 >>> 301C Chapman and 410D Elvey >>> 907 474-7693 and 907 474-7199 (fax 907 474-5394) > From elbueler at alaska.edu Wed Mar 2 14:33:32 2016 From: elbueler at alaska.edu (Ed Bueler) Date: Wed, 2 Mar 2016 11:33:32 -0900 Subject: [petsc-users] how to get full trajectory from TS into petsc binary In-Reply-To: <60D9FAF4-181C-4E71-BCE4-12556FCEDF60@mcs.anl.gov> References: <60D9FAF4-181C-4E71-BCE4-12556FCEDF60@mcs.anl.gov> Message-ID: Barry -- Works for me! Note that import numpy import PetscBinaryIO fh = open('t.dat','r') t = numpy.fromfile(fh, dtype='>d') fh.close() io = PetscBinaryIO.PetscBinaryIO() y = numpy.array(io.readBinaryFile('y.dat')).transpose() reads the entire trajectory generated by -ts_monitor binary:t.dat -ts_monitor_solution binary:y.dat Then import matplotlib.pyplot as plt for k in range(np.shape(y)[0]): plt.plot(t,y[k],label='y[%d]' % k) plt.xlabel('t') plt.legend() plots it with labels. This is only a reasonable for small-dimension ODEs; other vis. methods make more sense for PDEs. It is hard to beat the convenience of this way of storing trajectories and doing quick visualizations from python, so I'll probably stick to it. Much appreciated! Ed On Wed, Mar 2, 2016 at 11:00 AM, Barry Smith wrote: > > Sorry for the delay. Using master you can now do > > ./ex2 -ts_monitor binary:filename -ts_monitor_solution binary:filename > > and it will save each time step followed by each vector solution in the > same file called filename. 
> > The following python script (without error handling and end of file > handling) will read in the time steps and solutions > > import PetscBinaryIO > import numpy > io = PetscBinaryIO.PetscBinaryIO() > fh = open('binaryoutput') > while True: > ts = numpy.fromfile(fh, dtype='>d', count=1) > print ts > objecttype = io.readObjectType(fh) > if objecttype == 'Vec': > v = io.readVec(fh) > print v > > I think this what you need/want. > > As Hong noted you could also use -ts_save_trajectory 1 > -tstrajectory_type visualization and it will store the timestep and the > vector (each timestep/vector is in the same file, one file per timestep.) > you could then use a variation of the above python code to read from those > files (I think the vector comes first in the trajectory file). > > Barry > > > On Mar 1, 2016, at 1:26 AM, Ed Bueler wrote: > > > > Barry -- > > > > I am reading the resulting file successfully using > > > > import struct > > import sys > > f = open('timesteps','r') > > while True: > > try: > > bytes = f.read(8) > > except: > > print "f.read() failed" > > sys.exit(1) > > if len(bytes) > 0: > > print struct.unpack('>d',bytes)[0] > > else: > > break > > f.close() > > > > However, was there a more elegant intended method? I am disturbed by > the apparent need to specify big-endian-ness (= '>d') in the > struct.unpack() call. > > > > Ed > > > > > > On Mon, Feb 29, 2016 at 9:39 PM, Ed Bueler wrote: > > Barry -- > > > > Will try it. > > > > > ... since, presumably, other more powerful IO tools exist that would > be used for "real" problems? > > > > I know there are tools for snapshotting from PETSc, e.g. VecView to > .vtk. In fact petsc binary seems fairly convenient for that. On the other > hand, I am not sure I've ever done anything "real". ;-) > > > > Anyone out there: Are there a good *convenient* tools for saving > space/time-series (= movies) from PETSc TS? I want to add frames and > movies from PETSc into slides, etc. I can think of NetCDF but it seems > not-very-convenient, and I am worried not well-supported from PETSc. Is > setting up TS with events (=TSSetEventMonitor()) and writing separate > snapshot files the preferred scalable usage, despite the extra effort > compared to "-ts_monitor_solution binary:foo.dat"? > > > > Ed > > > > > > On Mon, Feb 29, 2016 at 8:53 PM, Barry Smith wrote: > > > > Ed, > > > > I have added a branch barry/feature-ts-monitor-binary that supports > -ts_monitor binary:timesteps that will store in simple binary format each > of the time steps associated with each solution. This in conjugation with > -ts_monitor_solution binary:solutions will give you two files you can read > in. But note that timesteps is a simple binary file of double precision > numbers you should read in directly in python, you cannot use > PetscBinaryIO.py which is what you will use to read in the solutions file. > > > > Barry > > > > Currently PETSc has a binary file format where we can save Vec, Mat, IS, > each is marked with a type id for PetscBinaryIO.py to detect, we do not > have type ids for simple double precision numbers or arrays of numbers. > This is why I have no way of saving the time steps in a way that > PetscBinaryIO.py could read them in currently. I don't know how far we want > to go in "spiffing up" the PETSc binary format to do more elaborate things > since, presumably, other more power IO tools exist that would be used for > "real" problems? 
> > > > > > > On Feb 29, 2016, at 3:24 PM, Ed Bueler wrote: > > > > > > Dear PETSc -- > > > > > > I have a short C ode code that uses TS to solve y' = g(t,y) where > y(t) is a 2-dim'l vector. My code defaults to -ts_type rk so it does > adaptive time-stepping; thus using -ts_monitor shows times at stdout: > > > > > > $ ./ode -ts_monitor > > > solving from t0 = 0.000 with initial time step dt = 0.10000 ... > > > 0 TS dt 0.1 time 0. > > > 1 TS dt 0.170141 time 0.1 > > > 2 TS dt 0.169917 time 0.270141 > > > 3 TS dt 0.171145 time 0.440058 > > > 4 TS dt 0.173931 time 0.611203 > > > 5 TS dt 0.178719 time 0.785134 > > > 6 TS dt 0.0361473 time 0.963853 > > > 7 TS dt 0.188252 time 1. > > > error at tf = 1.000 : |y-y_exact|_inf = 0.000144484 > > > > > > I want to output the trajectory in PETSc binary and plot it in python > using bin/PetscBinaryIO.py. Clearly I need the times shown above to do > that. > > > > > > Note "-ts_monitor_solution binary:XX" gives me a binary file with only > y values in it, but not the corresponding times. > > > > > > My question is, how to get those times in either the same binary file > (preferred) or separate binary files? I have tried > > > > > > $ ./ode -ts_monitor binary:foo.dat # invalid > > > $ ./ode -ts_monitor_solution binary:bar.dat # no t in file > > > $ ./ode -ts_monitor_solution binary:baz.dat -ts_save_trajectory # no > t in file > > > > > > without success. (I am not sure what the boolean option > -ts_save_trajectory does, by the way.) > > > > > > Thanks! > > > > > > Ed > > > > > > PS Sorry if this is a "RTFM" question, but so far I can't find the > documentation. > > > > > > > > > -- > > > Ed Bueler > > > Dept of Math and Stat and Geophysical Institute > > > University of Alaska Fairbanks > > > Fairbanks, AK 99775-6660 > > > 301C Chapman and 410D Elvey > > > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > > > > > > > > > > -- > > Ed Bueler > > Dept of Math and Stat and Geophysical Institute > > University of Alaska Fairbanks > > Fairbanks, AK 99775-6660 > > 301C Chapman and 410D Elvey > > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > > > > > > > > -- > > Ed Bueler > > Dept of Math and Stat and Geophysical Institute > > University of Alaska Fairbanks > > Fairbanks, AK 99775-6660 > > 301C Chapman and 410D Elvey > > 907 474-7693 and 907 474-7199 (fax 907 474-5394) > > -- Ed Bueler Dept of Math and Stat and Geophysical Institute University of Alaska Fairbanks Fairbanks, AK 99775-6660 301C Chapman and 410D Elvey 907 474-7693 and 907 474-7199 (fax 907 474-5394) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kalan019 at umn.edu Wed Mar 2 16:09:50 2016 From: kalan019 at umn.edu (Vasileios Kalantzis) Date: Wed, 2 Mar 2016 16:09:50 -0600 Subject: [petsc-users] Pardiso in PETSc Message-ID: Hi everyone, I have read some previous posts on combining PARDISO with PETSc but still I am not sure on how to combine the two libraries. So I thought of asking what is the best approach. I tried to separately install PARDISO and use it in my PETSc program but the compiling could not complete because of the openmp flags that I pass (at least that is my understanding). Should I try to find a workaround for the above problem or there is a simpler way? I have MKL installed in my system and re-configured with --with-mkl-pardiso-dir="$MKLROOT" where MKLROOT is MKL's installation directory. Any possible hint is much appreciated! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Mar 2 16:17:08 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 2 Mar 2016 16:17:08 -0600 Subject: [petsc-users] Pardiso in PETSc In-Reply-To: References: Message-ID: On Wed, Mar 2, 2016 at 4:09 PM, Vasileios Kalantzis wrote: > Hi everyone, > > I have read some previous posts on combining PARDISO > with PETSc but still I am not sure on how to combine the > two libraries. So I thought of asking what is the best approach. > I tried to separately install PARDISO and use it in my PETSc > program but the compiling could not complete because of > the openmp flags that I pass (at least that is my understanding). > If there is a problem, you have to mail or we do not know what happened. > Should I try to find a workaround for the above problem or > there is a simpler way? I have MKL installed in my system > and re-configured with --with-mkl-pardiso-dir="$MKLROOT" > where MKLROOT is MKL's installation directory. > > Any possible hint is much appreciated! > If you configured with it, then you can just use -pc_factor_mat_solver_package mkl_pardiso -pc_type lu Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Mar 2 16:21:21 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 2 Mar 2016 16:21:21 -0600 Subject: [petsc-users] Pardiso in PETSc In-Reply-To: References: Message-ID: You can check config/examples/arch-pardiso.py on how we build petsc with pardiso. And run targerts runex2_mkl_pardiso_lu runex2_mkl_pardiso_cholesky in ksp/ksp/examples/tutorials/makefile Satish On Wed, 2 Mar 2016, Matthew Knepley wrote: > On Wed, Mar 2, 2016 at 4:09 PM, Vasileios Kalantzis > wrote: > > > Hi everyone, > > > > I have read some previous posts on combining PARDISO > > with PETSc but still I am not sure on how to combine the > > two libraries. So I thought of asking what is the best approach. > > I tried to separately install PARDISO and use it in my PETSc > > program but the compiling could not complete because of > > the openmp flags that I pass (at least that is my understanding). > > > > If there is a problem, you have to mail or we do not know what happened. > > > > Should I try to find a workaround for the above problem or > > there is a simpler way? I have MKL installed in my system > > and re-configured with --with-mkl-pardiso-dir="$MKLROOT" > > where MKLROOT is MKL's installation directory. > > > > Any possible hint is much appreciated! > > > > If you configured with it, then you can just use > > -pc_factor_mat_solver_package mkl_pardiso -pc_type lu > > Thanks, > > Matt > > From kalan019 at umn.edu Wed Mar 2 16:25:48 2016 From: kalan019 at umn.edu (Vasileios Kalantzis) Date: Wed, 2 Mar 2016 16:25:48 -0600 Subject: [petsc-users] Pardiso in PETSc In-Reply-To: References: Message-ID: Matt, Satish, Thanks for your feedback -- this is most helpful, I will follow your advice and let you know if I still can not make it, On Wed, Mar 2, 2016 at 4:21 PM, Satish Balay wrote: > You can check config/examples/arch-pardiso.py on how we build petsc with > pardiso. 
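If one prefers to make the choice in code rather than on the command line, a petsc4py sketch along the following lines sets the same two options suggested above and lets KSPSetFromOptions() pick them up. It assumes a PETSc/petsc4py build configured with MKL PARDISO as discussed in this thread; the small tridiagonal system is hypothetical and only there to make the example self-contained.

    # sketch: select MKL PARDISO as the LU factorization package for a direct solve,
    # equivalent to  -pc_type lu -pc_factor_mat_solver_package mkl_pardiso
    from petsc4py import PETSc

    opts = PETSc.Options()
    opts['pc_type'] = 'lu'
    opts['pc_factor_mat_solver_package'] = 'mkl_pardiso'

    n = 10                                   # toy 1-D Laplacian, just for illustration
    A = PETSc.Mat().createAIJ([n, n], nnz=3)
    for i in range(n):
        A.setValue(i, i, 2.0)
        if i > 0:
            A.setValue(i, i - 1, -1.0)
        if i < n - 1:
            A.setValue(i, i + 1, -1.0)
    A.assemble()
    b = A.createVecRight(); b.set(1.0)
    x = A.createVecLeft()

    ksp = PETSc.KSP().create()
    ksp.setOperators(A)
    ksp.setType('preonly')                   # pure direct solve
    ksp.setFromOptions()                     # picks up the two options set above
    ksp.solve(b, x)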
> > And run targerts runex2_mkl_pardiso_lu runex2_mkl_pardiso_cholesky in > ksp/ksp/examples/tutorials/makefile > > Satish > > On Wed, 2 Mar 2016, Matthew Knepley wrote: > > > On Wed, Mar 2, 2016 at 4:09 PM, Vasileios Kalantzis > > wrote: > > > > > Hi everyone, > > > > > > I have read some previous posts on combining PARDISO > > > with PETSc but still I am not sure on how to combine the > > > two libraries. So I thought of asking what is the best approach. > > > I tried to separately install PARDISO and use it in my PETSc > > > program but the compiling could not complete because of > > > the openmp flags that I pass (at least that is my understanding). > > > > > > > If there is a problem, you have to mail or we do not know what happened. > > > > > > > Should I try to find a workaround for the above problem or > > > there is a simpler way? I have MKL installed in my system > > > and re-configured with --with-mkl-pardiso-dir="$MKLROOT" > > > where MKLROOT is MKL's installation directory. > > > > > > Any possible hint is much appreciated! > > > > > > > If you configured with it, then you can just use > > > > -pc_factor_mat_solver_package mkl_pardiso -pc_type lu > > > > Thanks, > > > > Matt > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Mar 2 16:28:49 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 2 Mar 2016 16:28:49 -0600 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation Message-ID: Dear all, Using the firedrake project, I am solving this simple mixed poisson problem: mesh = UnitCubeMesh(40,40,40) V = FunctionSpace(mesh,"RT",1) Q = FunctionSpace(mesh,"DG",0) W = V*Q v, p = TrialFunctions(W) w, q = TestFunctions(W) f = Function(Q) f.interpolate(Expression("12*pi*pi*sin(pi*x[0]*2)*sin(pi*x[1]*2)*sin(2*pi*x[2])")) a = dot(v,w)*dx - p*div(w)*dx + div(v)*q*dx L = f*q*dx u = Function(W) solve(a==L,u,solver_parameters={...}) This problem has 1161600 degrees of freedom. The solver_parameters are: -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type: upper -pc_fieldsplit_schur_precondition selfp -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type bjacobi -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type hypre/ml/gamg for the last option, I compared the wall-clock timings for hypre, ml,and gamg. 
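For reference, one way to spell that option list as a firedrake solver_parameters dictionary is sketched below; the last entry is where hypre, ml, or gamg is swapped in for the comparison that follows.

    # sketch: the option list above as a firedrake solver_parameters dict;
    # pass it as  solve(a == L, u, solver_parameters=params)
    params = {'ksp_type': 'gmres',
              'pc_type': 'fieldsplit',
              'pc_fieldsplit_type': 'schur',
              'pc_fieldsplit_schur_fact_type': 'upper',
              'pc_fieldsplit_schur_precondition': 'selfp',
              'fieldsplit_0_ksp_type': 'preonly',
              'fieldsplit_0_pc_type': 'bjacobi',
              'fieldsplit_1_ksp_type': 'preonly',
              'fieldsplit_1_pc_type': 'hypre'}   # or 'ml' or 'gamg'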
Here are the strong-scaling results (across 64 cores, 8 cores per Intel Xeon E5-2670 node) for hypre, ml, and gamg: hypre: 1 core: 47.5 s, 12 solver iters 2 cores: 34.1 s, 15 solver iters 4 cores: 21.5 s, 15 solver iters 8 cores: 16.6 s, 15 solver iters 16 cores: 10.2 s, 15 solver iters 24 cores: 7.66 s, 15 solver iters 32 cores: 6.31 s, 15 solver iters 40 cores: 5.68 s, 15 solver iters 48 cores: 5.36 s, 16 solver iters 56 cores: 5.12 s, 16 solver iters 64 cores: 4.99 s, 16 solver iters ml: 1 core: 4.44 s, 14 solver iters 2 cores: 2.85 s, 16 solver iters 4 cores: 1.6 s, 17 solver iters 8 cores: 0.966 s, 17 solver iters 16 cores: 0.585 s, 18 solver iters 24 cores: 0.440 s, 18 solver iters 32 cores: 0.375 s, 18 solver iters 40 cores: 0.332 s, 18 solver iters 48 cores: 0.307 s, 17 solver iters 56 cores: 0.290 s, 18 solver iters 64 cores: 0.281 s, 18 solver items gamg: 1 core: 613 s, 12 solver iters 2 cores: 204 s, 15 solver iters 4 cores: 77.1 s, 15 solver iters 8 cores: 38.1 s, 15 solver iters 16 cores: 15.9 s, 16 solver iters 24 cores: 9.24 s, 16 solver iters 32 cores: 5.92 s, 16 solver iters 40 cores: 4.72 s, 16 solver iters 48 cores: 3.89 s, 16 solver iters 56 cores: 3.65 s, 16 solver iters 64 cores: 3.46 s, 16 solver iters The performance difference between ML and HYPRE makes sense to me, but what I am really confused about is GAMG. It seems GAMG is really slow on a single core but something internally is causing it to speed up super-linearly as I increase the number of MPI processes. Shouldn't ML and GAMG have the same performance? I am not sure what log outputs to give you guys, but for starters, below is -ksp_view for the single core case with GAMG KSP Object:(solver_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object:(solver_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization UPPER Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (solver_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_0_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (solver_fieldsplit_0_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_0_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=777600, cols=777600 package used to perform factorization: petsc total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (solver_fieldsplit_0_) 1 MPI processes type: seqaij rows=777600, cols=777600 total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (solver_fieldsplit_0_) 1 MPI processes type: seqaij rows=777600, cols=777600 total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (solver_fieldsplit_1_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_1_) 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=5 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices GAMG specific options Threshold for dropping small values from graph 0. AGG specific options Symmetric graph false Coarse grid solver -- level ------------------------------- KSP Object: (solver_fieldsplit_1_mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_1_mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=9, cols=9 package used to perform factorization: petsc total: nonzeros=81, allocated nonzeros=81 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=9, cols=9 total: nonzeros=81, allocated nonzeros=81 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=9, cols=9 total: nonzeros=81, allocated nonzeros=81 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (solver_fieldsplit_1_mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0999525, max = 1.09948 Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 
1.1] KSP Object: (solver_fieldsplit_1_mg_levels_1_esteig_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (solver_fieldsplit_1_mg_levels_1_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=207, cols=207 total: nonzeros=42849, allocated nonzeros=42849 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 42 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (solver_fieldsplit_1_mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0996628, max = 1.09629 Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (solver_fieldsplit_1_mg_levels_2_esteig_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (solver_fieldsplit_1_mg_levels_2_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=5373, cols=5373 total: nonzeros=28852043, allocated nonzeros=28852043 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 1481 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (solver_fieldsplit_1_mg_levels_3_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0994294, max = 1.09372 Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (solver_fieldsplit_1_mg_levels_3_esteig_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (solver_fieldsplit_1_mg_levels_3_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=52147, cols=52147 total: nonzeros=38604909, allocated nonzeros=38604909 total number of mallocs used during MatSetValues calls =2 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (solver_fieldsplit_1_mg_levels_4_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.158979, max = 1.74876 Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (solver_fieldsplit_1_mg_levels_4_esteig_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (solver_fieldsplit_1_mg_levels_4_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix followed by preconditioner matrix: Mat Object: (solver_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=384000, cols=384000 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (solver_fieldsplit_1_) 1 MPI processes type: seqaij rows=384000, cols=384000 total: nonzeros=384000, allocated nonzeros=384000 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=384000, cols=777600 total: nonzeros=1919999, allocated nonzeros=1919999 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (solver_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_0_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (solver_fieldsplit_0_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_0_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=777600, cols=777600 package used to perform factorization: petsc total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (solver_fieldsplit_0_) 1 MPI processes type: seqaij rows=777600, cols=777600 total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (solver_fieldsplit_0_) 1 MPI processes type: seqaij rows=777600, cols=777600 total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: 1 MPI processes type: seqaij rows=777600, cols=384000 total: nonzeros=1919999, allocated nonzeros=1919999 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: 1 MPI processes type: seqaij rows=384000, cols=384000 total: nonzeros=3416452, allocated nonzeros=3416452 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix followed by preconditioner matrix: Mat Object: (solver_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=384000, cols=384000 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (solver_fieldsplit_1_) 1 MPI processes type: seqaij rows=384000, cols=384000 total: nonzeros=384000, allocated nonzeros=384000 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=384000, cols=777600 total: nonzeros=1919999, allocated nonzeros=1919999 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (solver_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_0_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (solver_fieldsplit_0_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (solver_fieldsplit_0_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=777600, cols=777600 package used to perform factorization: petsc total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (solver_fieldsplit_0_) 1 MPI processes type: seqaij rows=777600, cols=777600 total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (solver_fieldsplit_0_) 1 MPI processes type: seqaij rows=777600, cols=777600 total: nonzeros=5385600, allocated nonzeros=5385600 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: 1 MPI processes type: seqaij rows=777600, cols=384000 total: nonzeros=1919999, allocated nonzeros=1919999 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: 1 MPI processes type: seqaij rows=384000, cols=384000 total: nonzeros=3416452, allocated nonzeros=3416452 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=1161600, cols=1161600 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="solver_fieldsplit_0_", type=seqaij, rows=777600, cols=777600 (0,1) : type=seqaij, rows=777600, cols=384000 (1,0) : type=seqaij, rows=384000, cols=777600 (1,1) : prefix="solver_fieldsplit_1_", type=seqaij, rows=384000, cols=384000 Any insight as to what's happening? Btw this firedrake/petsc-mapdes is from way back in october 2015 (yes much has changed since but reinstalling/updating firedrake and petsc on LANL's firewall HPC machines is a big pain in the ass). Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Mar 2 17:11:35 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 2 Mar 2016 17:11:35 -0600 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: Message-ID: <84C7405D-8471-4A13-8D9F-B841EECC243F@mcs.anl.gov> Justin, Do you have the -log_summary output for these runs? Barry > On Mar 2, 2016, at 4:28 PM, Justin Chang wrote: > > Dear all, > > Using the firedrake project, I am solving this simple mixed poisson problem: > > mesh = UnitCubeMesh(40,40,40) > V = FunctionSpace(mesh,"RT",1) > Q = FunctionSpace(mesh,"DG",0) > W = V*Q > > v, p = TrialFunctions(W) > w, q = TestFunctions(W) > > f = Function(Q) > f.interpolate(Expression("12*pi*pi*sin(pi*x[0]*2)*sin(pi*x[1]*2)*sin(2*pi*x[2])")) > > a = dot(v,w)*dx - p*div(w)*dx + div(v)*q*dx > L = f*q*dx > > u = Function(W) > solve(a==L,u,solver_parameters={...}) > > This problem has 1161600 degrees of freedom. The solver_parameters are: > > -ksp_type gmres > -pc_type fieldsplit > -pc_fieldsplit_type schur > -pc_fieldsplit_schur_fact_type: upper > -pc_fieldsplit_schur_precondition selfp > -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type bjacobi > -fieldsplit_1_ksp_type preonly > -fieldsplit_1_pc_type hypre/ml/gamg > > for the last option, I compared the wall-clock timings for hypre, ml,and gamg. 
Here are the strong-scaling results (across 64 cores, 8 cores per Intel Xeon E5-2670 node) for hypre, ml, and gamg: > > hypre: > 1 core: 47.5 s, 12 solver iters > 2 cores: 34.1 s, 15 solver iters > 4 cores: 21.5 s, 15 solver iters > 8 cores: 16.6 s, 15 solver iters > 16 cores: 10.2 s, 15 solver iters > 24 cores: 7.66 s, 15 solver iters > 32 cores: 6.31 s, 15 solver iters > 40 cores: 5.68 s, 15 solver iters > 48 cores: 5.36 s, 16 solver iters > 56 cores: 5.12 s, 16 solver iters > 64 cores: 4.99 s, 16 solver iters > > ml: > 1 core: 4.44 s, 14 solver iters > 2 cores: 2.85 s, 16 solver iters > 4 cores: 1.6 s, 17 solver iters > 8 cores: 0.966 s, 17 solver iters > 16 cores: 0.585 s, 18 solver iters > 24 cores: 0.440 s, 18 solver iters > 32 cores: 0.375 s, 18 solver iters > 40 cores: 0.332 s, 18 solver iters > 48 cores: 0.307 s, 17 solver iters > 56 cores: 0.290 s, 18 solver iters > 64 cores: 0.281 s, 18 solver items > > gamg: > 1 core: 613 s, 12 solver iters > 2 cores: 204 s, 15 solver iters > 4 cores: 77.1 s, 15 solver iters > 8 cores: 38.1 s, 15 solver iters > 16 cores: 15.9 s, 16 solver iters > 24 cores: 9.24 s, 16 solver iters > 32 cores: 5.92 s, 16 solver iters > 40 cores: 4.72 s, 16 solver iters > 48 cores: 3.89 s, 16 solver iters > 56 cores: 3.65 s, 16 solver iters > 64 cores: 3.46 s, 16 solver iters > > The performance difference between ML and HYPRE makes sense to me, but what I am really confused about is GAMG. It seems GAMG is really slow on a single core but something internally is causing it to speed up super-linearly as I increase the number of MPI processes. Shouldn't ML and GAMG have the same performance? I am not sure what log outputs to give you guys, but for starters, below is -ksp_view for the single core case with GAMG > > KSP Object:(solver_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-07, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object:(solver_) 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization UPPER > Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (solver_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_0_) 1 MPI processes > type: bjacobi > block Jacobi: number of blocks = 1 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (solver_fieldsplit_0_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_0_sub_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=777600, cols=777600 > package used to perform factorization: petsc > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (solver_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=777600, cols=777600 > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (solver_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=777600, cols=777600 > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (solver_fieldsplit_1_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_1_) 1 MPI processes > type: gamg > MG: type is MULTIPLICATIVE, levels=5 cycles=v > Cycles per PCApply=1 > Using Galerkin computed coarse grid matrices > GAMG specific options > Threshold for dropping small values from graph 0. > AGG specific options > Symmetric graph false > Coarse grid solver -- level ------------------------------- > KSP Object: (solver_fieldsplit_1_mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_1_mg_coarse_) 1 MPI processes > type: bjacobi > block Jacobi: number of blocks = 1 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: nd > factor fill ratio given 5., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=9, cols=9 > package used to perform factorization: petsc > total: nonzeros=81, allocated nonzeros=81 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=9, cols=9 > total: nonzeros=81, allocated nonzeros=81 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=9, cols=9 > total: nonzeros=81, allocated nonzeros=81 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2 nodes, limit used is 5 > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (solver_fieldsplit_1_mg_levels_1_) 1 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.0999525, max = 1.09948 > Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] > KSP Object: (solver_fieldsplit_1_mg_levels_1_esteig_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_1_mg_levels_1_) 1 MPI processes > type: sor > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=207, cols=207 > total: nonzeros=42849, allocated nonzeros=42849 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 42 nodes, limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (solver_fieldsplit_1_mg_levels_2_) 1 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.0996628, max = 1.09629 > Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] > KSP Object: (solver_fieldsplit_1_mg_levels_2_esteig_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_1_mg_levels_2_) 1 MPI processes > type: sor > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
> linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=5373, cols=5373 > total: nonzeros=28852043, allocated nonzeros=28852043 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 1481 nodes, limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 3 ------------------------------- > KSP Object: (solver_fieldsplit_1_mg_levels_3_) 1 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.0994294, max = 1.09372 > Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] > KSP Object: (solver_fieldsplit_1_mg_levels_3_esteig_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_1_mg_levels_3_) 1 MPI processes > type: sor > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=52147, cols=52147 > total: nonzeros=38604909, allocated nonzeros=38604909 > total number of mallocs used during MatSetValues calls =2 > not using I-node routines > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 4 ------------------------------- > KSP Object: (solver_fieldsplit_1_mg_levels_4_) 1 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.158979, max = 1.74876 > Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] > KSP Object: (solver_fieldsplit_1_mg_levels_4_esteig_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_1_mg_levels_4_) 1 MPI processes > type: sor > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
> linear system matrix followed by preconditioner matrix: > Mat Object: (solver_fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=384000, cols=384000 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (solver_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=384000, cols=384000 > total: nonzeros=384000, allocated nonzeros=384000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=384000, cols=777600 > total: nonzeros=1919999, allocated nonzeros=1919999 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP of A00 > KSP Object: (solver_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_0_) 1 MPI processes > type: bjacobi > block Jacobi: number of blocks = 1 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (solver_fieldsplit_0_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_0_sub_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=777600, cols=777600 > package used to perform factorization: petsc > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (solver_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=777600, cols=777600 > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (solver_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=777600, cols=777600 > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=777600, cols=384000 > total: nonzeros=1919999, allocated nonzeros=1919999 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Mat Object: 1 MPI processes > type: seqaij > rows=384000, cols=384000 > total: nonzeros=3416452, allocated nonzeros=3416452 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix followed by preconditioner matrix: > Mat Object: (solver_fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=384000, cols=384000 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (solver_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=384000, cols=384000 > total: nonzeros=384000, allocated nonzeros=384000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=384000, cols=777600 > total: nonzeros=1919999, allocated nonzeros=1919999 > total number of mallocs used during 
MatSetValues calls =0 > not using I-node routines > KSP of A00 > KSP Object: (solver_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_0_) 1 MPI processes > type: bjacobi > block Jacobi: number of blocks = 1 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (solver_fieldsplit_0_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (solver_fieldsplit_0_sub_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=777600, cols=777600 > package used to perform factorization: petsc > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (solver_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=777600, cols=777600 > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (solver_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=777600, cols=777600 > total: nonzeros=5385600, allocated nonzeros=5385600 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=777600, cols=384000 > total: nonzeros=1919999, allocated nonzeros=1919999 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Mat Object: 1 MPI processes > type: seqaij > rows=384000, cols=384000 > total: nonzeros=3416452, allocated nonzeros=3416452 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: nest > rows=1161600, cols=1161600 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="solver_fieldsplit_0_", type=seqaij, rows=777600, cols=777600 > (0,1) : type=seqaij, rows=777600, cols=384000 > (1,0) : type=seqaij, rows=384000, cols=777600 > (1,1) : prefix="solver_fieldsplit_1_", type=seqaij, rows=384000, cols=384000 > > Any insight as to what's happening? Btw this firedrake/petsc-mapdes is from way back in october 2015 (yes much has changed since but reinstalling/updating firedrake and petsc on LANL's firewall HPC machines is a big pain in the ass). > > Thanks, > Justin From jychang48 at gmail.com Wed Mar 2 19:15:48 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 2 Mar 2016 19:15:48 -0600 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: <84C7405D-8471-4A13-8D9F-B841EECC243F@mcs.anl.gov> References: <84C7405D-8471-4A13-8D9F-B841EECC243F@mcs.anl.gov> Message-ID: Barry, Attached are the log_summary output for each preconditioner. Thanks, Justin On Wednesday, March 2, 2016, Barry Smith wrote: > > Justin, > > Do you have the -log_summary output for these runs? 
> > Barry > > > On Mar 2, 2016, at 4:28 PM, Justin Chang wrote: > > > > Dear all, > > > > Using the firedrake project, I am solving this simple mixed poisson > problem: > > > > mesh = UnitCubeMesh(40,40,40) > > V = FunctionSpace(mesh,"RT",1) > > Q = FunctionSpace(mesh,"DG",0) > > W = V*Q > > > > v, p = TrialFunctions(W) > > w, q = TestFunctions(W) > > > > f = Function(Q) > > > f.interpolate(Expression("12*pi*pi*sin(pi*x[0]*2)*sin(pi*x[1]*2)*sin(2*pi*x[2])")) > > > > a = dot(v,w)*dx - p*div(w)*dx + div(v)*q*dx > > L = f*q*dx > > > > u = Function(W) > > solve(a==L,u,solver_parameters={...}) > > > > This problem has 1161600 degrees of freedom. The solver_parameters are: > > > > -ksp_type gmres > > -pc_type fieldsplit > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type: upper > > -pc_fieldsplit_schur_precondition selfp > > -fieldsplit_0_ksp_type preonly > > -fieldsplit_0_pc_type bjacobi > > -fieldsplit_1_ksp_type preonly > > -fieldsplit_1_pc_type hypre/ml/gamg > > > > for the last option, I compared the wall-clock timings for hypre, ml,and > gamg. Here are the strong-scaling results (across 64 cores, 8 cores per > Intel Xeon E5-2670 node) for hypre, ml, and gamg: > > > > hypre: > > 1 core: 47.5 s, 12 solver iters > > 2 cores: 34.1 s, 15 solver iters > > 4 cores: 21.5 s, 15 solver iters > > 8 cores: 16.6 s, 15 solver iters > > 16 cores: 10.2 s, 15 solver iters > > 24 cores: 7.66 s, 15 solver iters > > 32 cores: 6.31 s, 15 solver iters > > 40 cores: 5.68 s, 15 solver iters > > 48 cores: 5.36 s, 16 solver iters > > 56 cores: 5.12 s, 16 solver iters > > 64 cores: 4.99 s, 16 solver iters > > > > ml: > > 1 core: 4.44 s, 14 solver iters > > 2 cores: 2.85 s, 16 solver iters > > 4 cores: 1.6 s, 17 solver iters > > 8 cores: 0.966 s, 17 solver iters > > 16 cores: 0.585 s, 18 solver iters > > 24 cores: 0.440 s, 18 solver iters > > 32 cores: 0.375 s, 18 solver iters > > 40 cores: 0.332 s, 18 solver iters > > 48 cores: 0.307 s, 17 solver iters > > 56 cores: 0.290 s, 18 solver iters > > 64 cores: 0.281 s, 18 solver items > > > > gamg: > > 1 core: 613 s, 12 solver iters > > 2 cores: 204 s, 15 solver iters > > 4 cores: 77.1 s, 15 solver iters > > 8 cores: 38.1 s, 15 solver iters > > 16 cores: 15.9 s, 16 solver iters > > 24 cores: 9.24 s, 16 solver iters > > 32 cores: 5.92 s, 16 solver iters > > 40 cores: 4.72 s, 16 solver iters > > 48 cores: 3.89 s, 16 solver iters > > 56 cores: 3.65 s, 16 solver iters > > 64 cores: 3.46 s, 16 solver iters > > > > The performance difference between ML and HYPRE makes sense to me, but > what I am really confused about is GAMG. It seems GAMG is really slow on a > single core but something internally is causing it to speed up > super-linearly as I increase the number of MPI processes. Shouldn't ML and > GAMG have the same performance? I am not sure what log outputs to give you > guys, but for starters, below is -ksp_view for the single core case with > GAMG > > > > KSP Object:(solver_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-07, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object:(solver_) 1 MPI processes > > type: fieldsplit > > FieldSplit with Schur preconditioner, factorization UPPER > > Preconditioner for the Schur complement formed from Sp, an assembled > approximation to S, which uses (lumped, if requested) A00's diagonal's > inverse > > Split info: > > Split number 0 Defined by IS > > Split number 1 Defined by IS > > KSP solver for A00 block > > KSP Object: (solver_fieldsplit_0_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_) 1 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the following KSP and > PC objects: > > KSP Object: (solver_fieldsplit_0_sub_) 1 > MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_sub_) 1 MPI > processes > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 1., needed 1. > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > package used to perform factorization: petsc > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: (solver_fieldsplit_0_) 1 > MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: (solver_fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP solver for S = A11 - A10 inv(A00) A01 > > KSP Object: (solver_fieldsplit_1_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_) 1 MPI processes > > type: gamg > > MG: type is MULTIPLICATIVE, levels=5 cycles=v > > Cycles per PCApply=1 > > Using Galerkin computed coarse grid matrices > > GAMG specific options > > Threshold for dropping small values from graph 0. > > AGG specific options > > Symmetric graph false > > Coarse grid solver -- level ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_coarse_) > 1 MPI processes > > type: preonly > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_coarse_) > 1 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the following KSP > and PC objects: > > KSP Object: > (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: > (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > > matrix ordering: nd > > factor fill ratio given 5., needed 1. > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=9, cols=9 > > package used to perform factorization: petsc > > total: nonzeros=81, allocated nonzeros=81 > > total number of mallocs used during MatSetValues > calls =0 > > using I-node routines: found 2 nodes, limit > used is 5 > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=9, cols=9 > > total: nonzeros=81, allocated nonzeros=81 > > total number of mallocs used during MatSetValues calls > =0 > > using I-node routines: found 2 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=9, cols=9 > > total: nonzeros=81, allocated nonzeros=81 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 2 nodes, limit used is 5 > > Down solver (pre-smoother) on level 1 > ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_levels_1_) > 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.0999525, max = > 1.09948 > > Chebyshev: eigenvalues estimated using gmres with > translations [0. 0.1; 0. 1.1] > > KSP Object: > (solver_fieldsplit_1_mg_levels_1_esteig_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_levels_1_) > 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=207, cols=207 > > total: nonzeros=42849, allocated nonzeros=42849 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 42 nodes, limit used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 2 > ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_levels_2_) > 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.0996628, max = > 1.09629 > > Chebyshev: eigenvalues estimated using gmres with > translations [0. 0.1; 0. 
1.1] > > KSP Object: > (solver_fieldsplit_1_mg_levels_2_esteig_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_levels_2_) > 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=5373, cols=5373 > > total: nonzeros=28852043, allocated nonzeros=28852043 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 1481 nodes, limit used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 3 > ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_levels_3_) > 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.0994294, max = > 1.09372 > > Chebyshev: eigenvalues estimated using gmres with > translations [0. 0.1; 0. 1.1] > > KSP Object: > (solver_fieldsplit_1_mg_levels_3_esteig_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_levels_3_) > 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=52147, cols=52147 > > total: nonzeros=38604909, allocated nonzeros=38604909 > > total number of mallocs used during MatSetValues calls =2 > > not using I-node routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 4 > ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_levels_4_) > 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.158979, max = > 1.74876 > > Chebyshev: eigenvalues estimated using gmres with > translations [0. 0.1; 0. 1.1] > > KSP Object: > (solver_fieldsplit_1_mg_levels_4_esteig_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. 
> > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_levels_4_) > 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. > > linear system matrix followed by preconditioner matrix: > > Mat Object: (solver_fieldsplit_1_) 1 > MPI processes > > type: schurcomplement > > rows=384000, cols=384000 > > Schur complement A11 - A10 inv(A00) A01 > > A11 > > Mat Object: (solver_fieldsplit_1_) > 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=384000, allocated nonzeros=384000 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > A10 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=777600 > > total: nonzeros=1919999, allocated nonzeros=1919999 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > KSP of A00 > > KSP Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the > following KSP and PC objects: > > KSP Object: > (solver_fieldsplit_0_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: > (solver_fieldsplit_0_sub_) 1 MPI processes > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 1., needed 1. 
> > Factored matrix follows: > > Mat Object: > 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > package used to perform factorization: > petsc > > total: nonzeros=5385600, allocated > nonzeros=5385600 > > total number of mallocs used during > MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: > (solver_fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated > nonzeros=5385600 > > total number of mallocs used during > MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: > (solver_fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > A01 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=777600, cols=384000 > > total: nonzeros=1919999, allocated nonzeros=1919999 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=3416452, allocated nonzeros=3416452 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > linear system matrix followed by preconditioner matrix: > > Mat Object: (solver_fieldsplit_1_) 1 MPI processes > > type: schurcomplement > > rows=384000, cols=384000 > > Schur complement A11 - A10 inv(A00) A01 > > A11 > > Mat Object: (solver_fieldsplit_1_) > 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=384000, allocated nonzeros=384000 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > A10 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=777600 > > total: nonzeros=1919999, allocated nonzeros=1919999 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP of A00 > > KSP Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the following > KSP and PC objects: > > KSP Object: > (solver_fieldsplit_0_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: > (solver_fieldsplit_0_sub_) 1 MPI processes > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 1., needed 1. 
> > Factored matrix follows: > > Mat Object: 1 MPI > processes > > type: seqaij > > rows=777600, cols=777600 > > package used to perform factorization: petsc > > total: nonzeros=5385600, allocated > nonzeros=5385600 > > total number of mallocs used during > MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: > (solver_fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > A01 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=777600, cols=384000 > > total: nonzeros=1919999, allocated nonzeros=1919999 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=3416452, allocated nonzeros=3416452 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: nest > > rows=1161600, cols=1161600 > > Matrix object: > > type=nest, rows=2, cols=2 > > MatNest structure: > > (0,0) : prefix="solver_fieldsplit_0_", type=seqaij, rows=777600, > cols=777600 > > (0,1) : type=seqaij, rows=777600, cols=384000 > > (1,0) : type=seqaij, rows=384000, cols=777600 > > (1,1) : prefix="solver_fieldsplit_1_", type=seqaij, rows=384000, > cols=384000 > > > > Any insight as to what's happening? Btw this firedrake/petsc-mapdes is > from way back in october 2015 (yes much has changed since but > reinstalling/updating firedrake and petsc on LANL's firewall HPC machines > is a big pain in the ass). > > > > Thanks, > > Justin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 1: solving... ((1161600, 1161600), (1161600, 1161600)) Solver time: 6.176223e+02 Solver iterations: 12 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 1 processor, by jychang48 Wed Mar 2 17:54:07 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 6.314e+02 1.00000 6.314e+02 Objects: 4.870e+02 1.00000 4.870e+02 Flops: 2.621e+11 1.00000 2.621e+11 2.621e+11 Flops/sec: 4.151e+08 1.00000 4.151e+08 4.151e+08 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3802e+01 2.2% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: FEM: 6.1762e+02 97.8% 2.6212e+11 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 8 1.0 1.5719e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 2 1.0 2.5649e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexInterp 1 1.0 2.0957e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 15 0 0 0 0 0 DMPlexStratify 4 1.0 5.1086e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 4 0 0 0 0 0 SFSetGraph 7 1.0 2.6373e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM VecMDot 90 1.0 9.1943e-02 1.0 2.78e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3028 VecNorm 99 1.0 2.5040e-02 1.0 4.96e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1982 VecScale 187 1.0 4.3765e-02 1.0 6.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1456 VecCopy 61 1.0 1.4966e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 585 1.0 3.6946e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 2.4149e-03 1.0 4.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1694 VecAYPX 416 1.0 5.2025e-02 1.0 5.74e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1104 VecAXPBYCZ 208 1.0 3.7029e-02 1.0 1.15e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3102 VecMAXPY 99 1.0 1.0790e-01 1.0 3.24e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3002 VecPointwiseMult 44 1.0 7.9148e-03 1.0 4.86e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 614 VecScatterBegin 58 1.0 6.4158e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSetRandom 4 1.0 1.7824e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 99 1.0 4.0050e-02 1.0 7.45e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1859 MatMult 415 1.0 1.0716e+01 1.0 1.50e+10 1.0 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 1398 MatMultAdd 175 1.0 6.3934e-01 1.0 8.98e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1404 MatMultTranspose 52 1.0 4.7943e-01 1.0 6.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1271 MatSolve 101 1.0 9.7955e-01 1.0 8.79e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 898 MatSOR 354 1.0 8.2436e+00 1.0 1.26e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1531 MatLUFactorSym 1 1.0 1.5020e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 6.5916e-02 1.0 1.81e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 275 MatILUFactorSym 1 1.0 5.1070e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 5.2860e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 2.0680e-01 1.0 1.94e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 938 MatResidual 52 1.0 1.2122e+00 1.0 1.84e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1521 MatAssemblyBegin 41 1.0 3.8147e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 41 
1.0 2.9745e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRow 2093181 1.0 1.3037e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 2 1.0 1.0967e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 4.8364e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 2 1.0 5.2271e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 2.7666e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 1.6762e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMult 5 1.0 2.0927e+00 1.0 1.79e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 85 MatMatMultSym 5 1.0 1.5816e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultNum 5 1.0 5.1102e-01 1.0 1.79e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 349 MatPtAP 4 1.0 5.8895e+02 1.0 2.31e+11 1.0 0.0e+00 0.0e+00 0.0e+00 93 88 0 0 0 95 88 0 0 0 392 MatPtAPSymbolic 4 1.0 3.6005e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 57 0 0 0 0 58 0 0 0 0 0 MatPtAPNumeric 4 1.0 2.2890e+02 1.0 2.31e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 88 0 0 0 37 88 0 0 0 1010 MatTrnMatMult 1 1.0 1.8922e-01 1.0 1.89e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 100 MatTrnMatMultSym 1 1.0 1.3319e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatTrnMatMultNum 1 1.0 5.6026e-02 1.0 1.89e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 337 MatGetSymTrans 5 1.0 2.6924e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCGAMGGraph_AGG 4 1.0 2.3435e+00 1.0 1.42e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 61 PCGAMGCoarse_AGG 4 1.0 2.7229e-01 1.0 1.89e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 69 PCGAMGProl_AGG 4 1.0 7.6182e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCGAMGPOpt_AGG 4 1.0 3.7438e+00 1.0 1.75e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 468 GAMG: createProl 4 1.0 6.4585e+00 1.0 1.91e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 296 Graph 8 1.0 2.3078e+00 1.0 1.42e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 61 MIS/Agg 4 1.0 2.7763e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SA: col data 4 1.0 9.4485e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SA: frmProl0 4 1.0 7.2291e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SA: smooth 4 1.0 3.7438e+00 1.0 1.75e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 468 GAMG: partLevel 4 1.0 5.8895e+02 1.0 2.31e+11 1.0 0.0e+00 0.0e+00 0.0e+00 93 88 0 0 0 95 88 0 0 0 392 PCSetUp 5 1.0 5.9682e+02 1.0 2.33e+11 1.0 0.0e+00 0.0e+00 0.0e+00 95 89 0 0 0 97 89 0 0 0 390 PCSetUpOnBlocks 101 1.0 1.2241e-01 1.0 1.81e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 148 PCApply 13 1.0 6.1450e+02 1.0 2.61e+11 1.0 0.0e+00 0.0e+00 0.0e+00 97 99 0 0 0 99 99 0 0 0 424 KSPGMRESOrthog 90 1.0 1.8531e-01 1.0 5.57e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3004 KSPSetUp 18 1.0 3.1746e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 6.1623e+02 1.0 2.61e+11 1.0 0.0e+00 0.0e+00 0.0e+00 98100 0 0 0 100100 0 0 0 424 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. 
PetscRandom 0 1 646 0. Index Set 22 25 38642096 0. Section 26 8 5376 0. Vector 13 119 258907968 0. Vector Scatter 2 6 3984 0. Matrix 0 12 136713716 0. Preconditioner 0 11 11092 0. Krylov Solver 0 15 151752 0. Distributed Mesh 10 4 19256 0. GraphPartitioner 4 3 1836 0. Star Forest Bipartite Graph 23 12 9696 0. Discrete System 10 4 3456 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 28 18 14128 0. IS L to G Mapping 4 0 0 0. Vector 254 139 93471128 0. Vector Scatter 6 0 0 0. Matrix 33 16 1768633272 0. Matrix Coarsen 4 4 2544 0. Preconditioner 20 9 8536 0. Krylov Solver 20 5 122312 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 2: solving... ((579051, 1161600), (579051, 1161600)) Solver time: 2.062817e+02 Solver iterations: 15 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 2 processors, by jychang48 Wed Mar 2 17:57:49 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 2.205e+02 1.00001 2.205e+02 Objects: 9.470e+02 1.01719 9.390e+02 Flops: 1.113e+11 1.14917 1.041e+11 2.082e+11 Flops/sec: 5.049e+08 1.14915 4.722e+08 9.444e+08 MPI Messages: 1.065e+03 1.06928 1.030e+03 2.061e+03 MPI Message Lengths: 6.096e+08 1.34148 5.163e+05 1.064e+09 MPI Reductions: 1.005e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4211e+01 6.4% 0.0000e+00 0.0% 5.020e+02 24.4% 3.027e+05 58.6% 1.250e+02 12.4% 1: FEM: 2.0628e+02 93.6% 2.0822e+11 100.0% 1.559e+03 75.6% 2.135e+05 41.4% 8.790e+02 87.5% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 8.1745e-01 9.9 0.00e+00 0.0 1.2e+02 4.0e+00 4.4e+01 0 0 6 0 4 3 0 24 0 35 0 VecScatterBegin 2 1.0 1.5371e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 3.0994e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.6335e+00 1.1 0.00e+00 0.0 9.2e+01 5.5e+05 2.1e+01 1 0 4 5 2 11 0 18 8 17 0 Mesh Migration 2 1.0 1.6901e+00 1.0 0.00e+00 0.0 3.7e+02 1.4e+06 5.4e+01 1 0 18 49 5 12 0 75 83 43 0 DMPlexInterp 1 1.0 2.0401e+0041740.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 7 0 0 0 0 0 DMPlexDistribute 1 1.0 2.1610e+00 1.1 0.00e+00 0.0 1.7e+02 1.9e+06 2.5e+01 1 0 8 30 2 15 0 33 50 20 0 DMPlexDistCones 2 1.0 3.6283e-01 1.0 0.00e+00 0.0 5.4e+01 3.2e+06 4.0e+00 0 0 3 16 0 3 0 11 28 3 0 DMPlexDistLabels 2 1.0 8.6425e-01 1.0 0.00e+00 0.0 2.4e+02 1.2e+06 2.2e+01 0 0 12 28 2 6 0 48 47 18 0 DMPlexDistribOL 1 1.0 1.1811e+00 1.0 0.00e+00 0.0 3.1e+02 9.6e+05 5.0e+01 1 0 15 28 5 8 0 61 48 40 0 DMPlexDistField 3 1.0 4.4417e-02 1.1 0.00e+00 0.0 6.2e+01 3.5e+05 1.2e+01 0 0 3 2 1 0 0 12 3 10 0 DMPlexDistData 2 1.0 8.4055e-0126.5 0.00e+00 0.0 5.4e+01 4.0e+05 6.0e+00 0 0 3 2 1 3 0 11 3 5 0 DMPlexStratify 6 1.5 7.7436e-01 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 4 0 0 0 0 0 SFSetGraph 51 1.0 4.1998e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 3 0 0 0 0 0 SFBcastBegin 95 1.0 9.1913e-01 3.3 0.00e+00 0.0 4.8e+02 1.2e+06 4.1e+01 0 0 23 56 4 4 0 96 96 33 0 SFBcastEnd 95 1.0 3.4255e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 3.0928e-03 1.2 0.00e+00 0.0 1.1e+01 1.3e+06 3.0e+00 0 0 1 1 0 0 0 2 2 2 0 SFReduceEnd 4 1.0 5.1863e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.0994e-05 6.2 0.00e+00 0.0 1.0e+00 4.2e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 5.9104e-04 3.8 0.00e+00 0.0 1.0e+00 4.2e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 3.1600e-03 6.4 0.00e+00 0.0 1.0e+01 4.0e+00 1.7e+01 0 0 0 0 2 0 0 1 0 2 0 BuildTwoSidedF 12 1.0 1.7691e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 95 1.0 7.7544e-02 1.2 1.88e+08 1.0 0.0e+00 0.0e+00 9.5e+01 0 0 0 0 9 0 0 0 0 11 4849 VecNorm 104 1.0 1.8694e-02 1.1 2.84e+07 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 12 3028 VecScale 210 1.0 2.2894e-02 1.0 3.77e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3289 VecCopy 73 1.0 7.2665e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 563 1.0 1.2248e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 1.0741e-03 1.0 2.05e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3808 VecAYPX 512 1.0 3.0463e-02 1.0 3.54e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2321 VecAXPBYCZ 256 1.0 2.1654e-02 1.0 7.08e+07 1.0 0.0e+00 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 6529 VecMAXPY 104 1.0 6.4986e-02 1.0 2.15e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6594 VecAssemblyBegin 13 1.0 2.4629e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 13 1.0 2.4557e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 3.5381e-03 1.0 2.43e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1374 VecScatterBegin 936 1.0 4.1796e-02 1.0 0.00e+00 0.0 1.3e+03 2.4e+04 0.0e+00 0 0 61 3 0 0 0 81 7 0 0 VecScatterEnd 936 1.0 4.6248e-01 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSetRandom 4 1.0 8.8727e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 104 1.0 2.7498e-02 1.0 4.26e+07 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 12 3088 MatMult 495 1.0 6.5382e+00 1.0 9.15e+09 1.0 1.0e+03 2.8e+04 1.2e+02 3 9 49 3 12 3 9 64 6 14 2794 MatMultAdd 214 1.0 3.9983e-01 1.1 5.66e+08 1.1 1.7e+02 1.6e+04 0.0e+00 0 1 8 0 0 0 1 11 1 0 2751 MatMultTranspose 64 1.0 2.8789e-01 1.1 3.88e+08 1.1 1.1e+02 5.9e+03 0.0e+00 0 0 5 0 0 0 0 7 0 0 2580 MatSolve 122 1.2 6.5441e-01 1.0 5.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1612 MatSOR 428 1.0 4.2707e+00 1.0 6.26e+09 1.0 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 2925 MatLUFactorSym 1 1.0 1.5974e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 3.4417e-02 1.0 9.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 525 MatILUFactorSym 1 1.0 2.5373e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 2.6755e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 1.0400e-01 1.0 9.99e+07 1.0 8.0e+00 2.8e+04 0.0e+00 0 0 0 0 0 0 0 1 0 0 1903 MatResidual 64 1.0 7.6395e-01 1.0 1.17e+09 1.0 1.3e+02 2.8e+04 0.0e+00 0 1 6 0 0 0 1 8 1 0 3061 MatAssemblyBegin 89 1.0 1.1247e+0138.9 0.00e+00 0.0 3.9e+01 5.9e+06 5.4e+01 3 0 2 22 5 3 0 3 52 6 0 MatAssemblyEnd 89 1.0 1.0761e+00 1.0 0.00e+00 0.0 8.0e+01 4.3e+03 2.0e+02 0 0 4 0 20 1 0 5 0 23 0 MatGetRow 1047432 1.0 1.0699e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 2 2.0 8.1062e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 6 1.0 2.5043e-01 1.0 0.00e+00 0.0 7.0e+00 1.0e+02 4.0e+01 0 0 0 0 4 0 0 0 0 5 0 MatGetOrdering 2 2.0 2.4710e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 1.5470e-02 1.1 0.00e+00 0.0 4.8e+01 8.1e+03 1.2e+01 0 0 2 0 1 0 0 3 0 1 0 MatZeroEntries 4 1.0 1.3583e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 1.2150e+00 1.0 0.00e+00 0.0 4.0e+00 6.7e+03 2.0e+01 1 0 0 0 2 1 0 0 0 2 0 MatMatMult 5 1.0 7.4322e+00 1.0 7.81e+07 1.0 5.6e+01 1.5e+04 8.0e+01 3 0 3 0 8 4 0 4 0 9 21 MatMatMultSym 5 1.0 7.1974e+00 1.0 0.00e+00 0.0 4.7e+01 1.3e+04 7.0e+01 3 0 2 0 7 3 0 3 0 8 0 MatMatMultNum 5 1.0 2.3470e-01 1.0 7.81e+07 1.0 9.0e+00 3.1e+04 1.0e+01 0 0 0 0 1 0 0 1 0 1 665 MatPtAP 4 1.0 1.8275e+02 1.0 9.39e+10 1.2 8.8e+01 4.3e+06 6.8e+01 83 83 4 36 7 89 83 6 86 8 949 MatPtAPSymbolic 4 1.0 1.1582e+02 1.0 0.00e+00 0.0 4.8e+01 3.1e+06 2.8e+01 53 0 2 14 3 56 0 3 34 3 0 MatPtAPNumeric 4 1.0 6.6931e+01 1.0 9.39e+10 1.2 4.0e+01 5.8e+06 4.0e+01 30 83 2 22 4 32 83 3 52 5 2592 MatTrnMatMult 1 1.0 2.4068e-01 1.0 9.46e+06 1.0 1.2e+01 4.3e+04 1.9e+01 0 0 1 0 2 0 0 1 0 2 79 MatTrnMatMultSym 1 1.0 1.5422e-01 1.0 0.00e+00 0.0 1.0e+01 2.4e+04 1.7e+01 0 0 0 0 2 0 0 1 0 2 0 MatTrnMatMultNum 1 1.0 8.6440e-02 1.0 9.46e+06 1.0 2.0e+00 1.4e+05 
2.0e+00 0 0 0 0 0 0 0 0 0 0 219 MatGetLocalMat 16 1.0 1.1441e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 7.4965e-02 1.0 0.00e+00 0.0 6.0e+01 1.5e+06 0.0e+00 0 0 3 9 0 0 0 4 21 0 0 PCGAMGGraph_AGG 4 1.0 1.5076e+00 1.0 7.32e+07 1.0 2.4e+01 1.1e+04 4.8e+01 1 0 1 0 5 1 0 2 0 5 97 PCGAMGCoarse_AGG 4 1.0 2.9894e-01 1.0 9.46e+06 1.0 7.0e+01 2.0e+04 3.9e+01 0 0 3 0 4 0 0 4 0 4 63 PCGAMGProl_AGG 4 1.0 5.3907e-02 1.0 0.00e+00 0.0 4.1e+01 8.8e+03 8.0e+01 0 0 2 0 8 0 0 3 0 9 0 PCGAMGPOpt_AGG 4 1.0 8.3099e+00 1.0 8.88e+08 1.0 1.3e+02 2.3e+04 1.9e+02 4 1 6 0 19 4 1 8 1 21 214 GAMG: createProl 4 1.0 1.0182e+01 1.0 9.71e+08 1.0 2.6e+02 1.9e+04 3.6e+02 5 1 13 0 35 5 1 17 1 40 191 Graph 8 1.0 1.4890e+00 1.0 7.32e+07 1.0 2.4e+01 1.1e+04 4.8e+01 1 0 1 0 5 1 0 2 0 5 98 MIS/Agg 4 1.0 1.5540e-02 1.0 0.00e+00 0.0 4.8e+01 8.1e+03 1.2e+01 0 0 2 0 1 0 0 3 0 1 0 SA: col data 4 1.0 1.1244e-02 1.0 0.00e+00 0.0 1.6e+01 1.9e+04 2.4e+01 0 0 1 0 2 0 0 1 0 3 0 SA: frmProl0 4 1.0 3.9084e-02 1.0 0.00e+00 0.0 2.5e+01 2.0e+03 4.0e+01 0 0 1 0 4 0 0 2 0 5 0 SA: smooth 4 1.0 8.3098e+00 1.0 8.88e+08 1.0 1.3e+02 2.3e+04 1.9e+02 4 1 6 0 19 4 1 8 1 21 214 GAMG: partLevel 4 1.0 1.8298e+02 1.0 9.39e+10 1.2 9.8e+01 3.9e+06 1.2e+02 83 83 5 36 12 89 83 6 86 14 948 repartition 1 1.0 2.5988e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 1 1.0 1.8120e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 Move A 1 1.0 2.2859e-01 1.0 0.00e+00 0.0 5.0e+00 1.3e+02 1.8e+01 0 0 0 0 2 0 0 0 0 2 0 Move P 1 1.0 1.9908e-04 1.0 0.00e+00 0.0 2.0e+00 2.2e+01 1.8e+01 0 0 0 0 2 0 0 0 0 2 0 PCSetUp 5 1.0 1.9420e+02 1.0 9.49e+10 1.2 3.8e+02 1.0e+06 5.6e+02 88 84 18 37 56 94 84 24 89 64 903 PCSetUpOnBlocks 122 1.0 6.2164e-02 1.0 9.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 291 PCApply 16 1.0 2.0400e+02 1.0 1.10e+11 1.2 1.4e+03 2.8e+05 5.9e+02 93 99 70 38 58 99 99 93 93 67 1012 KSPGMRESOrthog 95 1.0 1.3511e-01 1.1 3.77e+08 1.0 0.0e+00 0.0e+00 9.5e+01 0 0 0 0 9 0 0 0 0 11 5566 KSPSetUp 18 1.0 1.5130e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 2.0527e+02 1.0 1.11e+11 1.1 1.5e+03 2.8e+05 8.0e+02 93100 74 40 80 100100 98 96 91 1010 SFSetGraph 4 1.0 1.7595e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 17 1.0 3.2637e-03 5.2 0.00e+00 0.0 5.4e+01 1.2e+04 5.0e+00 0 0 3 0 0 0 0 3 0 1 0 SFBcastEnd 17 1.0 8.5759e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 79 84 49128340 0. IS L to G Mapping 3 3 23945692 0. Section 70 53 35616 0. Vector 15 141 180695472 0. Vector Scatter 2 15 13913664 0. Matrix 0 59 790177488 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 88 76 217792 0. IS L to G Mapping 4 0 0 0. Vector 346 208 53912024 0. Vector Scatter 37 18 19768 0. Matrix 137 65 1266465532 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
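A quick way to see where the solver time goes in the summaries above: the MatPtAP event (the Galerkin coarse-grid product that GAMG builds during PCSetUp, per the "Using Galerkin computed coarse grid matrices" line in the -ksp_view output) accounts for most of the reported KSPSolve time. A minimal sketch in Python; the times are copied from the "Max" time column of the 1- and 2-process -log_summary tables earlier in this attachment.

# Times (seconds) copied from the -log_summary output above; MatPtAP is the
# Galerkin coarse-grid product formed during PCSetUp of the GAMG preconditioner.
runs = {
    1: {"MatPtAP": 5.8895e+02, "KSPSolve": 6.1623e+02},
    2: {"MatPtAP": 1.8275e+02, "KSPSolve": 2.0527e+02},
}
for nproc in sorted(runs):
    t = runs[nproc]
    share = 100.0 * t["MatPtAP"] / t["KSPSolve"]
    print("%d process(es): MatPtAP %.1f s of KSPSolve %.1f s (%.0f%%)"
          % (nproc, t["MatPtAP"], t["KSPSolve"], share))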
======================================================================================================================== Average time to get PetscTime(): 8.10623e-07 Average time for MPI_Barrier(): 8.10623e-07 Average time for zero size MPI_Send(): 2.98023e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 4: solving... ((288348, 1161600), (288348, 1161600)) Solver time: 7.881676e+01 Solver iterations: 15 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 4 processors, by jychang48 Wed Mar 2 17:59:23 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 9.180e+01 1.00003 9.180e+01 Objects: 9.550e+02 1.02358 9.390e+02 Flops: 3.967e+10 1.06829 3.820e+10 1.528e+11 Flops/sec: 4.321e+08 1.06831 4.161e+08 1.664e+09 MPI Messages: 2.896e+03 1.22894 2.581e+03 1.032e+04 MPI Message Lengths: 6.146e+08 1.57029 1.801e+05 1.860e+09 MPI Reductions: 1.010e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.2979e+01 14.1% 0.0000e+00 0.0% 1.654e+03 16.0% 6.288e+04 34.9% 1.250e+02 12.4% 1: FEM: 7.8817e+01 85.9% 1.5278e+11 100.0% 8.671e+03 84.0% 1.173e+05 65.1% 8.840e+02 87.5% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 9.2481e-0116.9 0.00e+00 0.0 3.9e+02 4.0e+00 4.4e+01 1 0 4 0 4 5 0 24 0 35 0 VecScatterBegin 2 1.0 6.7115e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 5.2452e-06 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.7374e+00 1.1 0.00e+00 0.0 3.8e+02 1.4e+05 2.1e+01 2 0 4 3 2 13 0 23 8 17 0 Mesh Migration 2 1.0 1.0188e+00 1.0 0.00e+00 0.0 1.1e+03 4.7e+05 5.4e+01 1 0 11 29 5 8 0 69 83 43 0 DMPlexInterp 1 1.0 2.0553e+0052887.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 DMPlexDistribute 1 1.0 2.0098e+00 1.1 0.00e+00 0.0 3.9e+02 8.1e+05 2.5e+01 2 0 4 17 2 15 0 23 49 20 0 DMPlexDistCones 2 1.0 2.3204e-01 1.0 0.00e+00 0.0 1.6e+02 1.1e+06 4.0e+00 0 0 2 10 0 2 0 10 28 3 0 DMPlexDistLabels 2 1.0 5.4382e-01 1.0 0.00e+00 0.0 7.2e+02 4.2e+05 2.2e+01 1 0 7 16 2 4 0 44 47 18 0 DMPlexDistribOL 1 1.0 7.6249e-01 1.0 0.00e+00 0.0 1.2e+03 2.8e+05 5.0e+01 1 0 11 17 5 6 0 70 49 40 0 DMPlexDistField 3 1.0 2.7687e-02 1.2 0.00e+00 0.0 2.0e+02 1.1e+05 1.2e+01 0 0 2 1 1 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.2141e-0140.2 0.00e+00 0.0 2.2e+02 1.0e+05 6.0e+00 1 0 2 1 1 5 0 14 4 5 0 DMPlexStratify 6 1.5 6.4549e-01 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 SFSetGraph 51 1.0 2.4213e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 SFBcastBegin 95 1.0 9.8293e-01 4.6 0.00e+00 0.0 1.6e+03 4.0e+05 4.1e+01 1 0 15 34 4 6 0 95 96 33 0 SFBcastEnd 95 1.0 3.0583e-01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 2.8150e-03 2.0 0.00e+00 0.0 4.9e+01 2.9e+05 3.0e+00 0 0 0 1 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 6.0437e-03 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.9101e-05 7.8 0.00e+00 0.0 5.0e+00 1.7e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 2.9993e-04 2.9 0.00e+00 0.0 5.0e+00 1.7e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 3.0973e-03 3.2 0.00e+00 0.0 5.4e+01 4.0e+00 1.7e+01 0 0 1 0 2 0 0 1 0 2 0 BuildTwoSidedF 12 1.0 3.7432e-04 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 95 1.0 6.2907e-02 1.6 9.45e+07 1.0 0.0e+00 0.0e+00 9.5e+01 0 0 0 0 9 0 0 0 0 11 5977 VecNorm 104 1.0 1.1690e-02 1.3 1.42e+07 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 12 4843 VecScale 210 1.0 1.1790e-02 1.0 1.89e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6387 VecCopy 73 1.0 2.9068e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 563 1.0 6.0357e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 6.0582e-04 1.0 1.03e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6752 VecAYPX 512 1.0 1.6990e-02 1.0 1.77e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4161 VecAXPBYCZ 256 1.0 1.2354e-02 1.0 3.54e+07 1.0 0.0e+00 0.0e+00 0.0e+00 
0 0 0 0 0 0 0 0 0 0 11446 VecMAXPY 104 1.0 3.6181e-02 1.0 1.08e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11844 VecAssemblyBegin 13 1.0 4.4584e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 13 1.0 2.8133e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 1.8675e-03 1.0 1.22e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2603 VecScatterBegin 936 1.0 2.9592e-02 1.1 0.00e+00 0.0 7.1e+03 1.2e+04 0.0e+00 0 0 69 5 0 0 0 82 7 0 0 VecScatterEnd 936 1.0 2.7927e-01 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSetRandom 4 1.0 4.5090e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 104 1.0 1.6539e-02 1.2 2.13e+07 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 12 5135 MatMult 495 1.0 3.3400e+00 1.0 4.53e+09 1.0 5.7e+03 1.4e+04 1.2e+02 4 12 55 4 12 4 12 66 7 14 5284 MatMultAdd 214 1.0 1.9689e-01 1.0 2.56e+08 1.0 8.9e+02 6.8e+03 0.0e+00 0 1 9 0 0 0 1 10 1 0 5084 MatMultTranspose 64 1.0 1.5643e-01 1.3 1.67e+08 1.1 5.9e+02 3.0e+03 0.0e+00 0 0 6 0 0 0 0 7 0 0 4115 MatSolve 122 1.2 3.5537e-01 1.1 2.65e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2957 MatSOR 428 1.0 2.0880e+00 1.1 2.71e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 3 7 0 0 0 4989 MatLUFactorSym 1 1.0 3.5048e-05 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.8154e-02 1.0 4.57e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 995 MatILUFactorSym 1 1.0 1.2837e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 1.3118e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 5.5241e-02 1.0 4.79e+07 1.0 4.6e+01 1.4e+04 0.0e+00 0 0 0 0 0 0 0 1 0 0 3375 MatResidual 64 1.0 3.8316e-01 1.0 5.79e+08 1.0 7.4e+02 1.4e+04 0.0e+00 0 1 7 1 0 0 1 8 1 0 5882 MatAssemblyBegin 89 1.0 2.4723e+00 7.3 0.00e+00 0.0 2.1e+02 3.2e+06 5.4e+01 2 0 2 36 5 2 0 2 56 6 0 MatAssemblyEnd 89 1.0 1.3361e+00 1.0 0.00e+00 0.0 4.3e+02 1.8e+03 2.0e+02 1 0 4 0 20 2 0 5 0 23 0 MatGetRow 524034 1.0 1.0427e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 2 2.0 8.1062e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 6 1.0 1.0785e-02 1.0 0.00e+00 0.0 2.1e+01 4.8e+01 4.0e+01 0 0 0 0 4 0 0 0 0 5 0 MatGetOrdering 2 2.0 1.2300e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 8.7881e-03 1.1 0.00e+00 0.0 3.2e+02 3.0e+03 1.7e+01 0 0 3 0 2 0 0 4 0 2 0 MatZeroEntries 4 1.0 7.0884e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 1.2048e+00 1.0 0.00e+00 0.0 2.0e+01 2.7e+03 2.0e+01 1 0 0 0 2 2 0 0 0 2 0 MatMatMult 5 1.0 2.5813e+00 1.0 3.86e+07 1.0 3.2e+02 7.3e+03 8.0e+01 3 0 3 0 8 3 0 4 0 9 58 MatMatMultSym 5 1.0 2.4731e+00 1.0 0.00e+00 0.0 2.6e+02 5.9e+03 7.0e+01 3 0 3 0 7 3 0 3 0 8 0 MatMatMultNum 5 1.0 1.0818e-01 1.0 3.86e+07 1.0 5.1e+01 1.5e+04 1.0e+01 0 0 0 0 1 0 0 1 0 1 1394 MatPtAP 4 1.0 6.7203e+01 1.0 3.18e+10 1.1 5.1e+02 2.1e+06 6.8e+01 73 79 5 59 7 85 79 6 90 8 1799 MatPtAPSymbolic 4 1.0 4.1790e+01 1.0 0.00e+00 0.0 2.8e+02 1.5e+06 2.8e+01 46 0 3 22 3 53 0 3 34 3 0 MatPtAPNumeric 4 1.0 2.5414e+01 1.0 3.18e+10 1.1 2.3e+02 2.9e+06 4.0e+01 28 79 2 36 4 32 79 3 56 5 4756 MatTrnMatMult 1 1.0 1.3047e-01 1.0 4.76e+06 1.0 6.0e+01 1.7e+04 1.9e+01 0 0 1 0 2 0 0 1 0 2 146 MatTrnMatMultSym 1 1.0 8.3536e-02 1.0 0.00e+00 0.0 5.0e+01 9.7e+03 1.7e+01 0 0 0 0 2 0 0 1 0 2 0 MatTrnMatMultNum 1 1.0 4.6928e-02 1.0 4.76e+06 1.0 1.0e+01 5.4e+04 
2.0e+00 0 0 0 0 0 0 0 0 0 0 405 MatGetLocalMat 16 1.0 5.5166e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 1.3093e-01 1.2 0.00e+00 0.0 3.4e+02 7.3e+05 0.0e+00 0 0 3 13 0 0 0 4 21 0 0 PCGAMGGraph_AGG 4 1.0 1.1715e+00 1.0 3.62e+07 1.0 1.3e+02 5.5e+03 4.8e+01 1 0 1 0 5 1 0 2 0 5 120 PCGAMGCoarse_AGG 4 1.0 1.6066e-01 1.0 4.76e+06 1.0 4.3e+02 6.9e+03 4.4e+01 0 0 4 0 4 0 0 5 0 5 118 PCGAMGProl_AGG 4 1.0 2.7573e-02 1.0 0.00e+00 0.0 2.1e+02 3.5e+03 8.0e+01 0 0 2 0 8 0 0 2 0 9 0 PCGAMGPOpt_AGG 4 1.0 3.2287e+00 1.0 4.39e+08 1.0 7.4e+02 1.1e+04 1.9e+02 4 1 7 0 19 4 1 8 1 21 530 GAMG: createProl 4 1.0 4.5962e+00 1.0 4.80e+08 1.0 1.5e+03 8.4e+03 3.6e+02 5 1 15 1 36 6 1 17 1 41 407 Graph 8 1.0 1.1613e+00 1.0 3.62e+07 1.0 1.3e+02 5.5e+03 4.8e+01 1 0 1 0 5 1 0 2 0 5 121 MIS/Agg 4 1.0 8.8501e-03 1.1 0.00e+00 0.0 3.2e+02 3.0e+03 1.7e+01 0 0 3 0 2 0 0 4 0 2 0 SA: col data 4 1.0 5.9159e-03 1.0 0.00e+00 0.0 8.8e+01 7.3e+03 2.4e+01 0 0 1 0 2 0 0 1 0 3 0 SA: frmProl0 4 1.0 2.0285e-02 1.0 0.00e+00 0.0 1.2e+02 8.0e+02 4.0e+01 0 0 1 0 4 0 0 1 0 5 0 SA: smooth 4 1.0 3.2287e+00 1.0 4.39e+08 1.0 7.4e+02 1.1e+04 1.9e+02 4 1 7 0 19 4 1 8 1 21 530 GAMG: partLevel 4 1.0 6.7203e+01 1.0 3.18e+10 1.1 5.4e+02 2.0e+06 1.2e+02 73 79 5 59 12 85 79 6 90 14 1799 repartition 1 1.0 5.2929e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 1 1.0 4.6968e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 Move A 1 1.0 2.0814e-04 1.0 0.00e+00 0.0 1.5e+01 6.0e+01 1.8e+01 0 0 0 0 2 0 0 0 0 2 0 Move P 1 1.0 1.8096e-04 1.0 0.00e+00 0.0 6.0e+00 2.0e+01 1.8e+01 0 0 0 0 2 0 0 0 0 2 0 PCSetUp 5 1.0 7.2710e+01 1.0 3.22e+10 1.1 2.1e+03 5.2e+05 5.7e+02 79 80 21 60 56 92 80 25 92 64 1689 PCSetUpOnBlocks 122 1.0 3.2053e-02 1.1 4.57e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 563 PCApply 16 1.0 7.7192e+01 1.0 3.92e+10 1.1 8.2e+03 1.4e+05 5.9e+02 84 99 79 63 59 98 99 94 97 67 1955 KSPGMRESOrthog 95 1.0 9.5074e-02 1.3 1.89e+08 1.0 0.0e+00 0.0e+00 9.5e+01 0 0 0 0 9 0 0 0 0 11 7910 KSPSetUp 18 1.0 6.6414e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 7.8229e+01 1.0 3.94e+10 1.1 8.6e+03 1.4e+05 8.0e+02 85 99 83 64 80 99 99 99 98 91 1941 SFSetGraph 4 1.0 2.3985e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 22 1.0 3.1741e-03 2.8 0.00e+00 0.0 3.5e+02 4.2e+03 5.0e+00 0 0 3 0 0 0 0 4 0 1 0 SFBcastEnd 22 1.0 3.9601e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 87 92 35927140 0. IS L to G Mapping 3 3 18881016 0. Section 70 53 35616 0. Vector 15 141 92771016 0. Vector Scatter 2 15 6936792 0. Matrix 0 59 361266968 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 88 76 260344 0. IS L to G Mapping 4 0 0 0. Vector 346 208 27367352 0. Vector Scatter 37 18 19744 0. Matrix 137 65 608134404 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
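For comparing the runs at 1, 2, 4, and 8 MPI processes side by side, the "Solver time:" and "Solver iterations:" lines printed at the top of each run are enough. A minimal sketch, assuming each run's output has been saved to its own text file (the file names below are hypothetical):

import re

# Hypothetical log file names, one per MPI process count.
logs = {1: "gamg_np1.log", 2: "gamg_np2.log", 4: "gamg_np4.log", 8: "gamg_np8.log"}

# Match the "Solver time:" and "Solver iterations:" header lines of each run.
time_re = re.compile(r"Solver time:\s*([0-9.eE+-]+)")
iter_re = re.compile(r"Solver iterations:\s*(\d+)")

for nproc in sorted(logs):
    text = open(logs[nproc]).read()
    t = float(time_re.search(text).group(1))
    its = int(iter_re.search(text).group(1))
    print("np=%d  solver time=%9.3f s  iterations=%d" % (nproc, t, its))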
======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 1.57356e-06 Average time for zero size MPI_Send(): 1.96695e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 8: solving... ((143102, 1161600), (143102, 1161600)) Solver time: 4.012547e+01 Solver iterations: 15 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 8 processors, by jychang48 Wed Mar 2 18:00:17 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 5.307e+01 1.00010 5.307e+01 Objects: 1.002e+03 1.03512 9.730e+02 Flops: 1.814e+10 1.32210 1.527e+10 1.221e+11 Flops/sec: 3.418e+08 1.32197 2.877e+08 2.301e+09 MPI Messages: 5.818e+03 1.51879 4.441e+03 3.553e+04 MPI Message Lengths: 5.914e+08 2.04577 8.271e+04 2.938e+09 MPI Reductions: 1.063e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.2942e+01 24.4% 0.0000e+00 0.0% 5.296e+03 14.9% 1.912e+04 23.1% 1.250e+02 11.8% 1: FEM: 4.0126e+01 75.6% 1.2213e+11 100.0% 3.023e+04 85.1% 6.359e+04 76.9% 9.370e+02 88.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 9.8084e-0133.7 0.00e+00 0.0 1.2e+03 4.0e+00 4.4e+01 2 0 3 0 4 6 0 23 0 35 0 VecScatterBegin 2 1.0 3.1185e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 3.8147e-06 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.8528e+00 1.1 0.00e+00 0.0 1.4e+03 4.1e+04 2.1e+01 3 0 4 2 2 14 0 26 8 17 0 Mesh Migration 2 1.0 6.9385e-01 1.0 0.00e+00 0.0 3.4e+03 1.6e+05 5.4e+01 1 0 10 19 5 5 0 65 82 43 0 DMPlexInterp 1 1.0 2.1209e+0058911.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 DMPlexDistribute 1 1.0 2.0432e+00 1.1 0.00e+00 0.0 1.0e+03 3.2e+05 2.5e+01 4 0 3 11 2 16 0 19 47 20 0 DMPlexDistCones 2 1.0 1.6185e-01 1.0 0.00e+00 0.0 4.9e+02 3.8e+05 4.0e+00 0 0 1 6 0 1 0 9 27 3 0 DMPlexDistLabels 2 1.0 3.9578e-01 1.0 0.00e+00 0.0 2.1e+03 1.5e+05 2.2e+01 1 0 6 11 2 3 0 40 47 18 0 DMPlexDistribOL 1 1.0 5.2214e-01 1.0 0.00e+00 0.0 3.9e+03 8.8e+04 5.0e+01 1 0 11 12 5 4 0 73 50 40 0 DMPlexDistField 3 1.0 2.2155e-02 1.2 0.00e+00 0.0 6.4e+02 3.8e+04 1.2e+01 0 0 2 1 1 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.7258e-0152.8 0.00e+00 0.0 8.5e+02 3.0e+04 6.0e+00 2 0 2 1 1 6 0 16 4 5 0 DMPlexStratify 6 1.5 6.0035e-01 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 SFSetGraph 51 1.0 1.4183e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 SFBcastBegin 95 1.0 1.0219e+00 6.0 0.00e+00 0.0 5.0e+03 1.3e+05 4.1e+01 2 0 14 22 4 7 0 95 96 33 0 SFBcastEnd 95 1.0 3.0369e-01 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 3.5670e-03 3.6 0.00e+00 0.0 1.8e+02 8.2e+04 3.0e+00 0 0 1 0 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 6.7761e-03 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 6.9857e-0511.3 0.00e+00 0.0 1.9e+01 7.0e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 2.3389e-04 2.6 0.00e+00 0.0 1.9e+01 7.0e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 2.7728e-03 6.0 0.00e+00 0.0 1.7e+02 4.0e+00 1.7e+01 0 0 0 0 2 0 0 1 0 2 0 BuildTwoSidedF 12 1.0 4.5395e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 95 1.0 4.5413e-02 1.8 4.73e+07 1.0 0.0e+00 0.0e+00 9.5e+01 0 0 0 0 9 0 0 0 0 10 8279 VecNorm 104 1.0 7.2548e-03 1.2 7.12e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 7803 VecScale 210 1.0 6.3441e-03 1.0 9.46e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11871 VecCopy 73 1.0 1.6530e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 566 1.0 1.8410e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 3.9196e-04 1.2 5.14e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 10436 VecAYPX 512 1.0 1.1320e-02 1.2 8.85e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6245 VecAXPBYCZ 256 1.0 7.8399e-03 1.1 1.77e+07 1.0 0.0e+00 0.0e+00 0.0e+00 
0 0 0 0 0 0 0 0 0 0 18034 VecMAXPY 104 1.0 2.3948e-02 1.1 5.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 17894 VecAssemblyBegin 14 1.0 5.6481e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 14 1.0 5.9128e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 1.1752e-03 1.1 6.09e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4136 VecScatterBegin 937 1.0 2.5381e-02 1.1 0.00e+00 0.0 2.5e+04 6.9e+03 0.0e+00 0 0 70 6 0 0 0 82 8 0 0 VecScatterEnd 937 1.0 1.5833e-01 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSetRandom 4 1.0 2.2795e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 104 1.0 1.0659e-02 1.1 1.07e+07 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 7967 MatMult 495 1.0 1.7763e+00 1.0 2.19e+09 1.1 2.0e+04 7.9e+03 1.2e+02 3 14 57 5 12 4 14 67 7 13 9571 MatMultAdd 214 1.0 1.1394e-01 1.1 1.18e+08 1.1 3.1e+03 3.3e+03 0.0e+00 0 1 9 0 0 0 1 10 0 0 8065 MatMultTranspose 64 1.0 7.4735e-02 1.3 7.34e+07 1.1 2.0e+03 1.8e+03 0.0e+00 0 0 6 0 0 0 0 7 0 0 7514 MatSolve 122 1.2 1.9528e-01 1.1 1.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 5355 MatSOR 428 1.0 1.1337e+00 1.0 1.18e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 8 0 0 0 3 8 0 0 0 8134 MatLUFactorSym 1 1.0 2.3842e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 9.0170e-03 1.0 2.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2003 MatILUFactorSym 1 1.0 3.6991e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 7.0534e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 3.9479e-02 1.0 2.27e+07 1.1 1.6e+02 8.1e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 4452 MatResidual 64 1.0 2.0590e-01 1.0 2.80e+08 1.1 2.6e+03 8.1e+03 0.0e+00 0 2 7 1 0 1 2 9 1 0 10516 MatAssemblyBegin 93 1.0 3.5671e+0011.2 0.00e+00 0.0 7.3e+02 1.7e+06 5.8e+01 4 0 2 43 5 6 0 2 56 6 0 MatAssemblyEnd 93 1.0 1.8181e+00 1.0 0.00e+00 0.0 1.6e+03 9.3e+02 2.2e+02 3 0 4 0 20 4 0 5 0 23 0 MatGetRow 262026 1.0 1.0316e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 MatGetRowIJ 2 2.0 1.0014e-05 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 8 1.0 1.3851e-02 1.0 0.00e+00 0.0 1.4e+02 3.6e+03 7.4e+01 0 0 0 0 7 0 0 0 0 8 0 MatGetOrdering 2 2.0 3.9101e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 6.0310e-03 1.1 0.00e+00 0.0 9.5e+02 1.8e+03 1.7e+01 0 0 3 0 2 0 0 3 0 2 0 MatZeroEntries 4 1.0 3.4907e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 1.2078e+00 1.0 0.00e+00 0.0 7.6e+01 1.1e+03 2.0e+01 2 0 0 0 2 3 0 0 0 2 0 MatMatMult 5 1.0 1.0170e+00 1.0 1.87e+07 1.1 1.1e+03 4.1e+03 8.0e+01 2 0 3 0 8 3 0 4 0 9 143 MatMatMultSym 5 1.0 9.6113e-01 1.0 0.00e+00 0.0 9.4e+02 3.3e+03 7.0e+01 2 0 3 0 7 2 0 3 0 7 0 MatMatMultNum 5 1.0 5.5820e-02 1.0 1.87e+07 1.1 1.8e+02 8.2e+03 1.0e+01 0 0 1 0 1 0 0 1 0 1 2602 MatPtAP 4 1.0 3.3189e+01 1.0 1.43e+10 1.4 1.8e+03 1.1e+06 6.8e+01 63 75 5 70 6 83 75 6 91 7 2778 MatPtAPSymbolic 4 1.0 1.9334e+01 1.0 0.00e+00 0.0 9.7e+02 8.1e+05 2.8e+01 36 0 3 27 3 48 0 3 35 3 0 MatPtAPNumeric 4 1.0 1.3856e+01 1.0 1.43e+10 1.4 8.5e+02 1.5e+06 4.0e+01 26 75 2 43 4 35 75 3 56 4 6653 MatTrnMatMult 1 1.0 7.2624e-02 1.0 2.41e+06 1.0 2.3e+02 7.1e+03 1.9e+01 0 0 1 0 2 0 0 1 0 2 263 MatTrnMatMultSym 1 1.0 4.6745e-02 1.0 0.00e+00 0.0 1.9e+02 4.1e+03 1.7e+01 0 0 1 0 2 0 0 1 0 2 0 MatTrnMatMultNum 1 1.0 2.5871e-02 1.0 2.41e+06 1.0 3.8e+01 2.3e+04 
2.0e+00 0 0 0 0 0 0 0 0 0 0 737 MatGetLocalMat 16 1.0 2.7739e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 1.5730e-01 1.1 0.00e+00 0.0 1.2e+03 4.0e+05 0.0e+00 0 0 3 16 0 0 0 4 21 0 0 PCGAMGGraph_AGG 4 1.0 1.0277e+00 1.0 1.75e+07 1.1 4.2e+02 3.5e+03 4.8e+01 2 0 1 0 5 3 0 1 0 5 132 PCGAMGCoarse_AGG 4 1.0 9.0071e-02 1.0 2.41e+06 1.0 1.4e+03 3.6e+03 4.4e+01 0 0 4 0 4 0 0 5 0 5 212 PCGAMGProl_AGG 4 1.0 1.5908e-02 1.0 0.00e+00 0.0 6.6e+02 1.8e+03 8.0e+01 0 0 2 0 8 0 0 2 0 9 0 PCGAMGPOpt_AGG 4 1.0 1.5742e+00 1.0 2.12e+08 1.1 2.6e+03 6.6e+03 1.9e+02 3 1 7 1 18 4 1 9 1 20 1045 GAMG: createProl 4 1.0 2.7134e+00 1.0 2.32e+08 1.1 5.0e+03 4.9e+03 3.6e+02 5 1 14 1 34 7 1 17 1 38 663 Graph 8 1.0 1.0201e+00 1.0 1.75e+07 1.1 4.2e+02 3.5e+03 4.8e+01 2 0 1 0 5 3 0 1 0 5 133 MIS/Agg 4 1.0 6.1002e-03 1.1 0.00e+00 0.0 9.5e+02 1.8e+03 1.7e+01 0 0 3 0 2 0 0 3 0 2 0 SA: col data 4 1.0 3.9160e-03 1.0 0.00e+00 0.0 2.6e+02 3.9e+03 2.4e+01 0 0 1 0 2 0 0 1 0 3 0 SA: frmProl0 4 1.0 1.1169e-02 1.0 0.00e+00 0.0 4.0e+02 3.9e+02 4.0e+01 0 0 1 0 4 0 0 1 0 4 0 SA: smooth 4 1.0 1.5742e+00 1.0 2.12e+08 1.1 2.6e+03 6.6e+03 1.9e+02 3 1 7 1 18 4 1 9 1 20 1045 GAMG: partLevel 4 1.0 3.3198e+01 1.0 1.43e+10 1.4 2.0e+03 1.0e+06 1.7e+02 63 75 6 70 16 83 75 7 91 19 2777 repartition 2 1.0 1.0920e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 2 1.0 1.3280e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 1.1239e-03 1.0 0.00e+00 0.0 7.4e+01 6.6e+03 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 Move P 2 1.0 7.6461e-03 1.0 0.00e+00 0.0 6.2e+01 9.9e+01 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 PCSetUp 5 1.0 3.6754e+01 1.0 1.45e+10 1.4 7.3e+03 2.9e+05 6.2e+02 69 77 21 71 58 92 77 24 92 66 2558 PCSetUpOnBlocks 122 1.0 1.3308e-02 1.0 2.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1357 PCApply 16 1.0 3.8816e+01 1.0 1.79e+10 1.3 2.8e+04 7.8e+04 6.4e+02 73 98 80 76 61 97 98 94 98 69 3097 KSPGMRESOrthog 95 1.0 6.7224e-02 1.5 9.47e+07 1.0 0.0e+00 0.0e+00 9.5e+01 0 1 0 0 9 0 1 0 0 10 11186 KSPSetUp 18 1.0 3.2098e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 3.9741e+01 1.0 1.80e+10 1.3 3.0e+04 7.5e+04 8.6e+02 75 99 84 76 81 99 99 99 99 92 3049 SFSetGraph 4 1.0 2.6512e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 22 1.0 2.9793e-03 3.8 0.00e+00 0.0 1.1e+03 2.4e+03 5.0e+00 0 0 3 0 0 0 0 4 0 1 0 SFBcastEnd 22 1.0 7.0238e-04 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 103 110 29169724 0. IS L to G Mapping 3 3 16320748 0. Section 70 53 35616 0. Vector 15 141 48823152 0. Vector Scatter 2 15 3450888 0. Matrix 0 52 170292328 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 102 88 277596 0. IS L to G Mapping 4 0 0 0. Vector 352 214 14068176 0. Vector Scatter 40 21 23168 0. Matrix 145 80 299175352 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 3.24249e-06 Average time for zero size MPI_Send(): 1.49012e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 16: solving... ((70996, 1161600), (70996, 1161600)) Solver time: 1.734145e+01 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 16 processors, by jychang48 Wed Mar 2 18:00:52 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 3.188e+01 1.00047 3.187e+01 Objects: 1.028e+03 1.05761 9.805e+02 Flops: 6.348e+09 1.50259 5.466e+09 8.746e+10 Flops/sec: 1.991e+08 1.50208 1.715e+08 2.744e+09 MPI Messages: 9.623e+03 1.72687 7.338e+03 1.174e+05 MPI Message Lengths: 3.833e+08 2.99434 2.872e+04 3.371e+09 MPI Reductions: 1.076e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4530e+01 45.6% 0.0000e+00 0.0% 1.470e+04 12.5% 6.199e+03 21.6% 1.250e+02 11.6% 1: FEM: 1.7342e+01 54.4% 8.7456e+10 100.0% 1.027e+05 87.5% 2.252e+04 78.4% 9.500e+02 88.3% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1578e+00 8.9 0.00e+00 0.0 3.5e+03 4.0e+00 4.4e+01 3 0 3 0 4 7 0 24 0 35 0 VecScatterBegin 2 1.0 8.3208e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 5.0068e-06 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.9160e+00 1.1 0.00e+00 0.0 4.4e+03 1.4e+04 2.1e+01 6 0 4 2 2 13 0 30 9 17 0 Mesh Migration 2 1.0 5.1724e-01 1.0 0.00e+00 0.0 8.9e+03 6.6e+04 5.4e+01 2 0 8 18 5 4 0 61 81 43 0 DMPlexInterp 1 1.0 2.1073e+0058535.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 DMPlexDistribute 1 1.0 2.1091e+00 1.1 0.00e+00 0.0 2.9e+03 1.1e+05 2.5e+01 7 0 2 10 2 14 0 20 44 20 0 DMPlexDistCones 2 1.0 1.2136e-01 1.1 0.00e+00 0.0 1.3e+03 1.5e+05 4.0e+00 0 0 1 6 0 1 0 9 27 3 0 DMPlexDistLabels 2 1.0 3.1182e-01 1.0 0.00e+00 0.0 5.5e+03 6.1e+04 2.2e+01 1 0 5 10 2 2 0 38 46 18 0 DMPlexDistribOL 1 1.0 3.4499e-01 1.0 0.00e+00 0.0 1.1e+04 3.5e+04 5.0e+01 1 0 9 11 5 2 0 72 52 40 0 DMPlexDistField 3 1.0 3.0422e-02 1.9 0.00e+00 0.0 1.7e+03 1.5e+04 1.2e+01 0 0 1 1 1 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.8079e-0161.7 0.00e+00 0.0 2.9e+03 9.6e+03 6.0e+00 3 0 3 1 1 6 0 20 4 5 0 DMPlexStratify 6 1.5 5.6268e-0115.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 8.5973e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 SFBcastBegin 95 1.0 1.1731e+00 3.9 0.00e+00 0.0 1.4e+04 5.0e+04 4.1e+01 3 0 12 21 4 7 0 95 97 33 0 SFBcastEnd 95 1.0 2.9754e-01 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 7.7155e-0312.1 0.00e+00 0.0 5.0e+02 3.0e+04 3.0e+00 0 0 0 0 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.9478e-03 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.9114e-0517.2 0.00e+00 0.0 5.4e+01 3.9e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.9598e-04 3.1 0.00e+00 0.0 5.4e+01 3.9e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 3.8493e-03 8.3 0.00e+00 0.0 4.7e+02 4.0e+00 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 BuildTwoSidedF 12 1.0 3.8695e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 96 1.0 4.4748e-02 3.3 2.61e+07 1.0 0.0e+00 0.0e+00 9.6e+01 0 0 0 0 9 0 0 0 0 10 9234 VecNorm 105 1.0 6.0410e-03 1.4 3.73e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 9757 VecScale 217 1.0 3.4852e-03 1.1 4.99e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 22716 VecCopy 77 1.0 9.4819e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 596 1.0 9.9552e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 1.9503e-04 1.1 2.58e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 20979 VecAYPX 544 1.0 5.8689e-03 1.2 4.72e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12804 VecAXPBYCZ 272 1.0 3.8891e-03 1.2 9.44e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 
0 0 0 0 0 0 0 0 0 38645 VecMAXPY 105 1.0 1.0443e-02 1.1 2.96e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 44820 VecAssemblyBegin 14 1.0 4.9472e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 14 1.0 6.8665e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 5.2500e-04 1.2 3.05e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9262 VecScatterBegin 988 1.0 2.9767e-02 1.3 0.00e+00 0.0 8.6e+04 3.5e+03 0.0e+00 0 0 73 9 0 0 0 84 11 0 0 VecScatterEnd 988 1.0 2.4255e-01 8.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSetRandom 4 1.0 1.1868e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 105 1.0 8.3659e-03 1.3 5.59e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 10569 MatMult 521 1.0 9.2801e-01 1.1 1.15e+09 1.3 7.1e+04 3.9e+03 1.3e+02 3 19 61 8 12 5 19 70 11 14 17559 MatMultAdd 227 1.0 8.0478e-02 1.5 5.58e+07 1.1 9.7e+03 1.8e+03 0.0e+00 0 1 8 1 0 0 1 9 1 0 10703 MatMultTranspose 68 1.0 4.1057e-02 1.5 3.17e+07 1.2 6.3e+03 9.3e+02 0.0e+00 0 1 5 0 0 0 1 6 0 0 11713 MatSolve 129 1.2 1.0344e-01 1.1 7.01e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 10602 MatSOR 452 1.0 5.2970e-01 1.1 5.58e+08 1.1 0.0e+00 0.0e+00 0.0e+00 2 10 0 0 0 3 10 0 0 0 15905 MatLUFactorSym 1 1.0 3.1948e-05 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 4.5488e-03 1.1 1.17e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3964 MatILUFactorSym 1 1.0 1.7991e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 3.1865e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 1.8671e-02 1.2 1.09e+07 1.3 5.5e+02 4.0e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 8353 MatResidual 68 1.0 1.1264e-01 1.1 1.48e+08 1.3 9.4e+03 4.0e+03 0.0e+00 0 2 8 1 0 1 2 9 1 0 18463 MatAssemblyBegin 93 1.0 1.7310e+00 6.5 0.00e+00 0.0 2.3e+03 5.8e+05 5.8e+01 3 0 2 40 5 5 0 2 51 6 0 MatAssemblyEnd 93 1.0 1.6349e+00 1.3 0.00e+00 0.0 4.9e+03 4.9e+02 2.2e+02 5 0 4 0 20 9 0 5 0 23 0 MatGetRow 131316 1.0 5.1258e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 MatGetRowIJ 2 2.0 9.0599e-06 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 8 1.0 8.2927e-03 1.0 0.00e+00 0.0 2.4e+02 2.6e+03 7.4e+01 0 0 0 0 7 0 0 0 0 8 0 MatGetOrdering 2 2.0 2.1505e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 5.2981e-03 1.6 0.00e+00 0.0 2.9e+03 9.7e+02 2.0e+01 0 0 2 0 2 0 0 3 0 2 0 MatZeroEntries 4 1.0 1.5755e-02 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 6.0479e-01 1.0 0.00e+00 0.0 2.2e+02 6.2e+02 2.0e+01 2 0 0 0 2 3 0 0 0 2 0 MatMatMult 5 1.0 3.3290e-01 1.0 9.34e+06 1.3 3.7e+03 2.0e+03 8.0e+01 1 0 3 0 7 2 0 4 0 8 397 MatMatMultSym 5 1.0 3.0637e-01 1.0 0.00e+00 0.0 3.1e+03 1.6e+03 7.0e+01 1 0 3 0 7 2 0 3 0 7 0 MatMatMultNum 5 1.0 2.6513e-02 1.0 9.34e+06 1.3 6.0e+02 4.1e+03 1.0e+01 0 0 1 0 1 0 0 1 0 1 4987 MatPtAP 4 1.0 1.3960e+01 1.0 4.49e+09 1.7 6.4e+03 3.6e+05 6.8e+01 44 67 5 68 6 80 67 6 87 7 4228 MatPtAPSymbolic 4 1.0 8.5832e+00 1.0 0.00e+00 0.0 3.3e+03 2.9e+05 2.8e+01 27 0 3 28 3 49 0 3 36 3 0 MatPtAPNumeric 4 1.0 5.3769e+00 1.0 4.49e+09 1.7 3.0e+03 4.4e+05 4.0e+01 17 67 3 40 4 31 67 3 51 4 10978 MatTrnMatMult 1 1.0 3.8761e-02 1.0 1.22e+06 1.0 6.5e+02 4.0e+03 1.9e+01 0 0 1 0 2 0 0 1 0 2 495 MatTrnMatMultSym 1 1.0 2.4785e-02 1.0 0.00e+00 0.0 5.4e+02 2.3e+03 1.7e+01 0 0 0 0 2 0 0 1 0 2 0 MatTrnMatMultNum 1 1.0 1.3965e-02 1.0 1.22e+06 1.0 1.1e+02 
1.3e+04 2.0e+00 0 0 0 0 0 0 0 0 0 0 1374 MatGetLocalMat 16 1.0 1.1930e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 9.5529e-02 1.3 0.00e+00 0.0 4.1e+03 1.5e+05 0.0e+00 0 0 3 19 0 1 0 4 24 0 0 PCGAMGGraph_AGG 4 1.0 5.0889e-01 1.0 8.72e+06 1.3 1.3e+03 1.9e+03 4.8e+01 2 0 1 0 4 3 0 1 0 5 240 PCGAMGCoarse_AGG 4 1.0 5.0001e-02 1.0 1.22e+06 1.0 4.1e+03 1.9e+03 4.7e+01 0 0 3 0 4 0 0 4 0 5 384 PCGAMGProl_AGG 4 1.0 1.1742e-02 1.0 0.00e+00 0.0 1.8e+03 1.0e+03 8.0e+01 0 0 2 0 7 0 0 2 0 8 0 PCGAMGPOpt_AGG 4 1.0 6.1055e-01 1.0 1.05e+08 1.3 8.8e+03 3.2e+03 1.9e+02 2 2 8 1 17 4 2 9 1 20 2450 GAMG: createProl 4 1.0 1.1821e+00 1.0 1.15e+08 1.3 1.6e+04 2.5e+03 3.6e+02 4 2 14 1 34 7 2 16 2 38 1385 Graph 8 1.0 5.0667e-01 1.0 8.72e+06 1.3 1.3e+03 1.9e+03 4.8e+01 2 0 1 0 4 3 0 1 0 5 241 MIS/Agg 4 1.0 5.3558e-03 1.6 0.00e+00 0.0 2.9e+03 9.7e+02 2.0e+01 0 0 2 0 2 0 0 3 0 2 0 SA: col data 4 1.0 2.2843e-03 1.0 0.00e+00 0.0 7.2e+02 2.3e+03 2.4e+01 0 0 1 0 2 0 0 1 0 3 0 SA: frmProl0 4 1.0 8.7664e-03 1.0 0.00e+00 0.0 1.1e+03 2.3e+02 4.0e+01 0 0 1 0 4 0 0 1 0 4 0 SA: smooth 4 1.0 6.1054e-01 1.0 1.05e+08 1.3 8.8e+03 3.2e+03 1.9e+02 2 2 8 1 17 4 2 9 1 20 2450 GAMG: partLevel 4 1.0 1.3967e+01 1.0 4.49e+09 1.7 6.6e+03 3.5e+05 1.7e+02 44 67 6 68 16 81 67 6 87 18 4226 repartition 2 1.0 2.9707e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 2 1.0 1.8883e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 1.3578e-03 1.0 0.00e+00 0.0 1.1e+02 5.3e+03 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 Move P 2 1.0 4.5681e-03 1.0 0.00e+00 0.0 1.3e+02 1.1e+02 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 PCSetUp 5 1.0 1.5573e+01 1.0 4.60e+09 1.7 2.3e+04 1.0e+05 6.2e+02 49 69 20 70 58 90 69 23 89 65 3898 PCSetUpOnBlocks 129 1.0 6.7956e-03 1.1 1.17e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2654 PCApply 17 1.0 1.6612e+01 1.0 6.21e+09 1.5 9.8e+04 2.7e+04 6.5e+02 52 98 83 77 60 96 98 95 98 68 5138 KSPGMRESOrthog 96 1.0 5.4426e-02 2.4 5.23e+07 1.0 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 15184 KSPSetUp 18 1.0 1.6465e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 1.7078e+01 1.0 6.28e+09 1.5 1.0e+05 2.6e+04 8.7e+02 54 99 87 78 81 98 99 99 99 92 5059 SFSetGraph 4 1.0 2.7418e-04 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 25 1.0 4.1673e-03 4.9 0.00e+00 0.0 3.2e+03 1.3e+03 5.0e+00 0 0 3 0 0 0 0 3 0 1 0 SFBcastEnd 25 1.0 6.3801e-04 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 125 132 25606724 0. IS L to G Mapping 3 3 15014432 0. Section 70 53 35616 0. Vector 15 141 26937184 0. Vector Scatter 2 15 1720344 0. Matrix 0 52 71903496 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 102 88 239952 0. IS L to G Mapping 4 0 0 0. Vector 356 218 7355832 0. Vector Scatter 40 21 23096 0. Matrix 145 80 132079708 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 5.38826e-06 Average time for zero size MPI_Send(): 1.68383e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 24: solving... ((47407, 1161600), (47407, 1161600)) Solver time: 1.005248e+01 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 24 processors, by jychang48 Wed Mar 2 18:01:20 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 2.430e+01 1.00050 2.430e+01 Objects: 1.042e+03 1.07202 9.828e+02 Flops: 3.581e+09 1.56180 2.922e+09 7.012e+10 Flops/sec: 1.474e+08 1.56148 1.202e+08 2.886e+09 MPI Messages: 1.260e+04 1.77245 9.542e+03 2.290e+05 MPI Message Lengths: 3.024e+08 3.27966 1.567e+04 3.589e+09 MPI Reductions: 1.076e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4244e+01 58.6% 0.0000e+00 0.0% 2.618e+04 11.4% 3.318e+03 21.2% 1.250e+02 11.6% 1: FEM: 1.0052e+01 41.4% 7.0118e+10 100.0% 2.028e+05 88.6% 1.235e+04 78.8% 9.500e+02 88.3% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1165e+00 9.7 0.00e+00 0.0 6.2e+03 4.0e+00 4.4e+01 4 0 3 0 4 7 0 24 0 35 0 VecScatterBegin 2 1.0 6.1035e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 1.0014e-0510.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.9826e+00 1.1 0.00e+00 0.0 8.7e+03 7.7e+03 2.1e+01 8 0 4 2 2 14 0 33 9 17 0 Mesh Migration 2 1.0 4.5927e-01 1.0 0.00e+00 0.0 1.5e+04 4.0e+04 5.4e+01 2 0 7 17 5 3 0 58 81 43 0 DMPlexInterp 1 1.0 2.1183e+0057320.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 DMPlexDistribute 1 1.0 2.1988e+00 1.1 0.00e+00 0.0 5.7e+03 5.7e+04 2.5e+01 9 0 2 9 2 15 0 22 43 20 0 DMPlexDistCones 2 1.0 1.0621e-01 1.1 0.00e+00 0.0 2.2e+03 9.3e+04 4.0e+00 0 0 1 6 0 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.8685e-01 1.0 0.00e+00 0.0 9.3e+03 3.7e+04 2.2e+01 1 0 4 10 2 2 0 35 46 18 0 DMPlexDistribOL 1 1.0 2.6254e-01 1.0 0.00e+00 0.0 1.8e+04 2.2e+04 5.0e+01 1 0 8 11 5 2 0 70 53 40 0 DMPlexDistField 3 1.0 2.6364e-02 1.8 0.00e+00 0.0 3.0e+03 9.4e+03 1.2e+01 0 0 1 1 1 0 0 11 4 10 0 DMPlexDistData 2 1.0 9.9430e-0127.9 0.00e+00 0.0 6.0e+03 5.0e+03 6.0e+00 4 0 3 1 1 6 0 23 4 5 0 DMPlexStratify 6 1.5 5.5264e-0122.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 6.1109e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1261e+00 4.0 0.00e+00 0.0 2.5e+04 2.9e+04 4.1e+01 4 0 11 20 4 7 0 96 97 33 0 SFBcastEnd 95 1.0 2.9690e-01 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 7.7369e-0312.0 0.00e+00 0.0 8.7e+02 1.8e+04 3.0e+00 0 0 0 0 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.3060e-03 4.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.6968e-0521.9 0.00e+00 0.0 9.4e+01 2.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.3208e-04 3.2 0.00e+00 0.0 9.4e+01 2.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 3.1033e-03 5.2 0.00e+00 0.0 8.3e+02 4.0e+00 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 BuildTwoSidedF 12 1.0 4.2391e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 96 1.0 3.3720e-02 3.3 1.75e+07 1.0 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 12253 VecNorm 105 1.0 4.9245e-03 1.4 2.49e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 11969 VecScale 217 1.0 2.3873e-03 1.1 3.33e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33163 VecCopy 77 1.0 7.2002e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 596 1.0 6.5861e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 1.4710e-04 1.2 1.73e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 27810 VecAYPX 544 1.0 3.7701e-03 1.1 3.16e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 19928 VecAXPBYCZ 272 1.0 2.6333e-03 1.2 6.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00 
0 0 0 0 0 0 0 0 0 0 57063 VecMAXPY 105 1.0 5.9943e-03 1.2 1.98e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 78080 VecAssemblyBegin 14 1.0 5.3096e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 14 1.0 6.6996e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 3.2210e-04 1.2 2.04e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 15093 VecScatterBegin 988 1.0 3.2264e-02 1.4 0.00e+00 0.0 1.7e+05 2.3e+03 0.0e+00 0 0 74 11 0 0 0 84 14 0 0 VecScatterEnd 988 1.0 1.3531e-01 5.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSetRandom 4 1.0 8.0204e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 105 1.0 6.9528e-03 1.2 3.74e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 12716 MatMult 521 1.0 6.0050e-01 1.1 7.20e+08 1.3 1.4e+05 2.6e+03 1.3e+02 2 22 63 10 12 6 22 71 13 14 25122 MatMultAdd 227 1.0 6.2568e-02 1.7 3.48e+07 1.1 1.7e+04 1.2e+03 0.0e+00 0 1 8 1 0 1 1 9 1 0 12729 MatMultTranspose 68 1.0 3.3125e-02 2.0 1.88e+07 1.2 1.1e+04 6.3e+02 0.0e+00 0 1 5 0 0 0 1 6 0 0 12559 MatSolve 129 1.2 7.0106e-02 1.1 4.68e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 15570 MatSOR 452 1.0 3.2208e-01 1.1 3.43e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 3 11 0 0 0 23760 MatLUFactorSym 1 1.0 2.0027e-05 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 2.9938e-03 1.1 7.77e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6005 MatILUFactorSym 1 1.0 1.2600e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 2.0496e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 1.1414e-02 1.3 6.76e+06 1.2 1.1e+03 2.6e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 12464 MatResidual 68 1.0 7.3167e-02 1.1 9.23e+07 1.3 1.9e+04 2.6e+03 0.0e+00 0 3 8 1 0 1 3 9 2 0 26129 MatAssemblyBegin 93 1.0 1.0436e+00 7.0 0.00e+00 0.0 4.5e+03 3.0e+05 5.8e+01 3 0 2 38 5 6 0 2 48 6 0 MatAssemblyEnd 93 1.0 1.0572e+00 1.1 0.00e+00 0.0 9.3e+03 3.2e+02 2.2e+02 4 0 4 0 20 10 0 5 0 23 0 MatGetRow 87698 1.0 3.4188e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 MatGetRowIJ 2 2.0 7.8678e-06 8.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 8 1.0 6.2807e-03 1.0 0.00e+00 0.0 3.4e+02 1.7e+03 7.4e+01 0 0 0 0 7 0 0 0 0 8 0 MatGetOrdering 2 2.0 1.7095e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 3.9628e-03 1.5 0.00e+00 0.0 5.1e+03 7.0e+02 2.0e+01 0 0 2 0 2 0 0 2 0 2 0 MatZeroEntries 4 1.0 1.0336e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 4.0474e-01 1.0 0.00e+00 0.0 3.7e+02 4.4e+02 2.0e+01 2 0 0 0 2 4 0 0 0 2 0 MatMatMult 5 1.0 1.8609e-01 1.0 5.84e+06 1.3 7.5e+03 1.3e+03 8.0e+01 1 0 3 0 7 2 0 4 0 8 657 MatMatMultSym 5 1.0 1.6933e-01 1.0 0.00e+00 0.0 6.3e+03 1.1e+03 7.0e+01 1 0 3 0 7 2 0 3 0 7 0 MatMatMultNum 5 1.0 1.6730e-02 1.0 5.84e+06 1.3 1.2e+03 2.6e+03 1.0e+01 0 0 1 0 1 0 0 1 0 1 7314 MatPtAP 4 1.0 7.8254e+00 1.0 2.42e+09 1.9 1.3e+04 1.8e+05 6.8e+01 32 62 6 67 6 78 62 6 84 7 5593 MatPtAPSymbolic 4 1.0 4.7722e+00 1.0 0.00e+00 0.0 6.8e+03 1.5e+05 2.8e+01 20 0 3 29 3 47 0 3 37 3 0 MatPtAPNumeric 4 1.0 3.0532e+00 1.0 2.42e+09 1.9 6.4e+03 2.1e+05 4.0e+01 13 62 3 38 4 30 62 3 48 4 14334 MatTrnMatMult 1 1.0 2.6716e-02 1.0 8.18e+05 1.0 1.1e+03 2.8e+03 1.9e+01 0 0 0 0 2 0 0 1 0 2 721 MatTrnMatMultSym 1 1.0 1.7252e-02 1.0 0.00e+00 0.0 9.4e+02 1.6e+03 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 MatTrnMatMultNum 1 1.0 9.4512e-03 1.0 8.18e+05 1.0 
1.9e+02 8.9e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 2039 MatGetLocalMat 16 1.0 7.0903e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 8.4786e-02 1.2 0.00e+00 0.0 8.2e+03 8.7e+04 0.0e+00 0 0 4 20 0 1 0 4 25 0 0 PCGAMGGraph_AGG 4 1.0 3.3958e-01 1.0 5.43e+06 1.3 2.4e+03 1.4e+03 4.8e+01 1 0 1 0 4 3 0 1 0 5 331 PCGAMGCoarse_AGG 4 1.0 3.4880e-02 1.0 8.18e+05 1.0 7.1e+03 1.4e+03 4.7e+01 0 0 3 0 4 0 0 4 0 5 552 PCGAMGProl_AGG 4 1.0 1.0176e-02 1.0 0.00e+00 0.0 3.0e+03 7.7e+02 8.0e+01 0 0 1 0 7 0 0 2 0 8 0 PCGAMGPOpt_AGG 4 1.0 3.7025e-01 1.0 6.59e+07 1.3 1.8e+04 2.1e+03 1.9e+02 2 2 8 1 17 4 2 9 1 20 3737 GAMG: createProl 4 1.0 7.5580e-01 1.0 7.22e+07 1.3 3.1e+04 1.7e+03 3.6e+02 3 2 13 1 34 8 2 15 2 38 2005 Graph 8 1.0 3.3810e-01 1.0 5.43e+06 1.3 2.4e+03 1.4e+03 4.8e+01 1 0 1 0 4 3 0 1 0 5 333 MIS/Agg 4 1.0 4.0288e-03 1.5 0.00e+00 0.0 5.1e+03 7.0e+02 2.0e+01 0 0 2 0 2 0 0 2 0 2 0 SA: col data 4 1.0 1.8752e-03 1.0 0.00e+00 0.0 1.3e+03 1.6e+03 2.4e+01 0 0 1 0 2 0 0 1 0 3 0 SA: frmProl0 4 1.0 7.7009e-03 1.0 0.00e+00 0.0 1.8e+03 1.7e+02 4.0e+01 0 0 1 0 4 0 0 1 0 4 0 SA: smooth 4 1.0 3.7022e-01 1.0 6.59e+07 1.3 1.8e+04 2.1e+03 1.9e+02 2 2 8 1 17 4 2 9 1 20 3737 GAMG: partLevel 4 1.0 7.8308e+00 1.0 2.42e+09 1.9 1.4e+04 1.8e+05 1.7e+02 32 62 6 67 16 78 62 7 85 18 5589 repartition 2 1.0 2.1386e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 2 1.0 2.1815e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 1.4272e-03 1.0 0.00e+00 0.0 1.5e+02 3.7e+03 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 Move P 2 1.0 3.1939e-03 1.0 0.00e+00 0.0 1.9e+02 1.1e+02 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 PCSetUp 5 1.0 8.8722e+00 1.0 2.49e+09 1.9 4.5e+04 5.4e+04 6.2e+02 37 65 20 68 58 88 65 22 87 65 5107 PCSetUpOnBlocks 129 1.0 4.6492e-03 1.1 7.77e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3867 PCApply 17 1.0 9.5174e+00 1.0 3.49e+09 1.6 1.9e+05 1.4e+04 6.5e+02 39 97 85 78 60 95 97 96 98 68 7139 KSPGMRESOrthog 96 1.0 3.9721e-02 2.6 3.50e+07 1.0 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 20804 KSPSetUp 18 1.0 1.1928e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 9.8317e+00 1.0 3.53e+09 1.6 2.0e+05 1.4e+04 8.7e+02 40 98 88 78 81 98 98 99 99 92 7018 SFSetGraph 4 1.0 2.1601e-04 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 25 1.0 3.2976e-03 3.6 0.00e+00 0.0 5.6e+03 9.2e+02 5.0e+00 0 0 2 0 0 0 0 3 0 1 0 SFBcastEnd 25 1.0 1.0622e-03 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 139 146 24070156 0. IS L to G Mapping 3 3 14448024 0. Section 70 53 35616 0. Vector 15 141 19734088 0. Vector Scatter 2 15 1154208 0. Matrix 0 52 42870916 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 102 88 195904 0. IS L to G Mapping 4 0 0 0. Vector 356 218 5052384 0. Vector Scatter 40 21 23080 0. Matrix 145 80 79765836 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 1.71661e-05 Average time for zero size MPI_Send(): 1.41064e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 32: solving... ((35155, 1161600), (35155, 1161600)) Solver time: 6.467208e+00 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 32 processors, by jychang48 Wed Mar 2 18:01:44 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           2.074e+01      1.00048   2.074e+01
Objects:              1.068e+03      1.09877   9.844e+02
Flops:                2.198e+09      1.49553   1.837e+09  5.879e+10
Flops/sec:            1.060e+08      1.49497   8.860e+07  2.835e+09
MPI Messages:         1.553e+04      1.84528   1.145e+04  3.664e+05
MPI Message Lengths:  2.863e+08      4.24531   9.989e+03  3.660e+09
MPI Reductions:       1.078e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.4269e+01  68.8%  0.0000e+00   0.0%  3.925e+04  10.7%  2.153e+03       21.5%  1.250e+02  11.6%
 1:             FEM: 6.4673e+00  31.2%  5.8792e+10 100.0%  3.271e+05  89.3%  7.837e+03       78.5%  9.520e+02  88.3%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1607e+0012.7 0.00e+00 0.0 9.3e+03 4.0e+00 4.4e+01 5 0 3 0 4 7 0 24 0 35 0 VecScatterBegin 2 1.0 4.7922e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 6.9141e-06 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.0361e+00 1.1 0.00e+00 0.0 1.4e+04 5.1e+03 2.1e+01 10 0 4 2 2 14 0 36 9 17 0 Mesh Migration 2 1.0 4.3835e-01 1.0 0.00e+00 0.0 2.2e+04 2.9e+04 5.4e+01 2 0 6 17 5 3 0 56 80 43 0 DMPlexInterp 1 1.0 2.1146e+0064271.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.2676e+00 1.1 0.00e+00 0.0 9.4e+03 3.5e+04 2.5e+01 11 0 3 9 2 16 0 24 41 20 0 DMPlexDistCones 2 1.0 1.0199e-01 1.1 0.00e+00 0.0 3.2e+03 6.6e+04 4.0e+00 0 0 1 6 0 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.7904e-01 1.0 0.00e+00 0.0 1.3e+04 2.7e+04 2.2e+01 1 0 4 10 2 2 0 34 45 18 0 DMPlexDistribOL 1 1.0 2.2615e-01 1.0 0.00e+00 0.0 2.7e+04 1.6e+04 5.0e+01 1 0 7 12 5 2 0 68 54 40 0 DMPlexDistField 3 1.0 3.1294e-02 2.0 0.00e+00 0.0 4.3e+03 6.8e+03 1.2e+01 0 0 1 1 1 0 0 11 4 10 0 DMPlexDistData 2 1.0 9.9918e-0173.6 0.00e+00 0.0 1.0e+04 3.2e+03 6.0e+00 5 0 3 1 1 7 0 26 4 5 0 DMPlexStratify 6 1.5 5.4843e-0129.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 5.1525e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1663e+00 4.4 0.00e+00 0.0 3.8e+04 2.0e+04 4.1e+01 5 0 10 21 4 7 0 96 97 33 0 SFBcastEnd 95 1.0 3.0130e-01 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 8.7683e-0318.0 0.00e+00 0.0 1.3e+03 1.3e+04 3.0e+00 0 0 0 0 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.8024e-03 5.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.8862e-0512.5 0.00e+00 0.0 1.4e+02 2.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.3995e-04 3.2 0.00e+00 0.0 1.4e+02 2.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 1.9822e-03 2.9 0.00e+00 0.0 1.2e+03 4.0e+00 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 BuildTwoSidedF 12 1.0 4.4608e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 96 1.0 3.1041e-02 3.4 1.32e+07 1.0 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 13312 VecNorm 105 1.0 5.3015e-03 1.5 1.87e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 11119 VecScale 217 1.0 1.9186e-03 1.1 2.50e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 41267 VecCopy 77 1.0 5.4383e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 596 1.0 4.6587e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 1.0824e-04 1.2 1.30e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 37802 VecAYPX 544 1.0 3.0854e-03 1.2 2.37e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 24362 VecAXPBYCZ 272 1.0 1.8950e-03 1.2 4.74e+06 1.0 0.0e+00 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 79333 VecMAXPY 105 1.0 4.1659e-03 1.2 1.49e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 112362 VecAssemblyBegin 14 1.0 5.6028e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 14 1.0 6.9141e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 2.4009e-04 1.2 1.53e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 20258 VecScatterBegin 988 1.0 3.2709e-02 1.6 0.00e+00 0.0 2.8e+05 1.7e+03 0.0e+00 0 0 75 13 0 0 0 84 16 0 0 VecScatterEnd 988 1.0 1.0493e-01 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSetRandom 4 1.0 5.8675e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 105 1.0 7.1294e-03 1.3 2.81e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 12403 MatMult 521 1.0 4.4204e-01 1.1 5.11e+08 1.3 2.4e+05 1.9e+03 1.3e+02 2 24 65 12 12 6 24 72 15 14 32017 MatMultAdd 227 1.0 5.3586e-02 1.9 2.44e+07 1.1 2.6e+04 9.6e+02 0.0e+00 0 1 7 1 0 1 1 8 1 0 14062 MatMultTranspose 68 1.0 2.3165e-02 1.9 1.23e+07 1.1 1.7e+04 4.9e+02 0.0e+00 0 1 5 0 0 0 1 5 0 0 16104 MatSolve 129 1.2 5.2656e-02 1.2 3.51e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 20642 MatSOR 452 1.0 2.2063e-01 1.1 2.33e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 12 0 0 0 3 12 0 0 0 32021 MatLUFactorSym 1 1.0 3.3855e-05 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 2.1760e-03 1.1 5.87e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8254 MatILUFactorSym 1 1.0 9.6488e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 1.2136e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 9.3570e-03 1.7 4.72e+06 1.3 1.8e+03 1.9e+03 0.0e+00 0 0 1 0 0 0 0 1 0 0 14120 MatResidual 68 1.0 5.2496e-02 1.2 6.51e+07 1.3 3.1e+04 1.9e+03 0.0e+00 0 3 9 2 0 1 3 10 2 0 33952 MatAssemblyBegin 93 1.0 5.5953e-01 4.5 0.00e+00 0.0 7.1e+03 1.8e+05 5.8e+01 2 0 2 36 5 5 0 2 45 6 0 MatAssemblyEnd 93 1.0 8.5411e-01 1.1 0.00e+00 0.0 1.5e+04 2.4e+02 2.2e+02 4 0 4 0 20 13 0 5 0 23 0 MatGetRow 65844 1.0 2.5644e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 MatGetRowIJ 2 2.0 8.1062e-06 8.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 8 1.0 5.0519e-03 1.0 0.00e+00 0.0 4.5e+02 1.2e+03 7.4e+01 0 0 0 0 7 0 0 0 0 8 0 MatGetOrdering 2 2.0 1.4186e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 3.4940e-03 1.5 0.00e+00 0.0 8.3e+03 5.3e+02 2.2e+01 0 0 2 0 2 0 0 3 0 2 0 MatZeroEntries 4 1.0 7.4160e-03 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 3.0466e-01 1.0 0.00e+00 0.0 5.4e+02 3.6e+02 2.0e+01 1 0 0 0 2 5 0 0 0 2 0 MatMatMult 5 1.0 1.1690e-01 1.0 4.14e+06 1.3 1.2e+04 9.7e+02 8.0e+01 1 0 3 0 7 2 0 4 0 8 982 MatMatMultSym 5 1.0 1.0482e-01 1.0 0.00e+00 0.0 1.0e+04 7.8e+02 7.0e+01 1 0 3 0 6 2 0 3 0 7 0 MatMatMultNum 5 1.0 1.2022e-02 1.0 4.14e+06 1.3 2.0e+03 1.9e+03 1.0e+01 0 0 1 0 1 0 0 1 0 1 9545 MatPtAP 4 1.0 4.8059e+00 1.0 1.35e+09 1.8 2.2e+04 1.1e+05 6.8e+01 23 58 6 64 6 74 58 7 82 7 7077 MatPtAPSymbolic 4 1.0 2.8555e+00 1.0 0.00e+00 0.0 1.1e+04 9.4e+04 2.8e+01 14 0 3 29 3 44 0 3 37 3 0 MatPtAPNumeric 4 1.0 1.9504e+00 1.0 1.35e+09 1.8 1.1e+04 1.2e+05 4.0e+01 9 58 3 36 4 30 58 3 45 4 17438 MatTrnMatMult 1 1.0 2.0930e-02 1.0 6.19e+05 1.1 1.6e+03 2.3e+03 1.9e+01 0 0 0 0 2 0 0 0 0 2 924 MatTrnMatMultSym 1 1.0 1.3424e-02 1.0 0.00e+00 0.0 1.4e+03 1.3e+03 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 MatTrnMatMultNum 1 1.0 7.4990e-03 1.0 
6.19e+05 1.1 2.7e+02 7.2e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 2579 MatGetLocalMat 16 1.0 4.7717e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 6.7384e-02 1.3 0.00e+00 0.0 1.3e+04 5.4e+04 0.0e+00 0 0 4 20 0 1 0 4 26 0 0 PCGAMGGraph_AGG 4 1.0 2.5093e-01 1.0 3.83e+06 1.3 3.7e+03 1.0e+03 4.8e+01 1 0 1 0 4 4 0 1 0 5 418 PCGAMGCoarse_AGG 4 1.0 2.7763e-02 1.0 6.19e+05 1.1 1.1e+04 1.0e+03 4.9e+01 0 0 3 0 5 0 0 3 0 5 697 PCGAMGProl_AGG 4 1.0 8.3046e-03 1.0 0.00e+00 0.0 4.4e+03 6.2e+02 8.0e+01 0 0 1 0 7 0 0 1 0 8 0 PCGAMGPOpt_AGG 4 1.0 2.5502e-01 1.0 4.67e+07 1.3 3.0e+04 1.5e+03 1.9e+02 1 2 8 1 17 4 2 9 2 20 5087 GAMG: createProl 4 1.0 5.4260e-01 1.0 5.11e+07 1.3 4.9e+04 1.3e+03 3.6e+02 3 2 13 2 34 8 2 15 2 38 2620 Graph 8 1.0 2.4971e-01 1.0 3.83e+06 1.3 3.7e+03 1.0e+03 4.8e+01 1 0 1 0 4 4 0 1 0 5 420 MIS/Agg 4 1.0 3.5570e-03 1.4 0.00e+00 0.0 8.3e+03 5.3e+02 2.2e+01 0 0 2 0 2 0 0 3 0 2 0 SA: col data 4 1.0 1.6532e-03 1.1 0.00e+00 0.0 1.9e+03 1.3e+03 2.4e+01 0 0 1 0 2 0 0 1 0 3 0 SA: frmProl0 4 1.0 6.1510e-03 1.0 0.00e+00 0.0 2.5e+03 1.4e+02 4.0e+01 0 0 1 0 4 0 0 1 0 4 0 SA: smooth 4 1.0 2.5500e-01 1.0 4.67e+07 1.3 3.0e+04 1.5e+03 1.9e+02 1 2 8 1 17 4 2 9 2 20 5087 GAMG: partLevel 4 1.0 4.8109e+00 1.0 1.35e+09 1.8 2.2e+04 1.1e+05 1.7e+02 23 58 6 64 16 74 58 7 82 18 7070 repartition 2 1.0 2.5296e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 2 1.0 3.9124e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 1.4398e-03 1.0 0.00e+00 0.0 1.9e+02 2.6e+03 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 Move P 2 1.0 2.5690e-03 1.0 0.00e+00 0.0 2.5e+02 9.6e+01 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 PCSetUp 5 1.0 5.5706e+00 1.0 1.41e+09 1.8 7.3e+04 3.3e+04 6.2e+02 27 60 20 66 58 86 60 22 85 66 6367 PCSetUpOnBlocks 129 1.0 3.5369e-03 1.1 5.87e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5078 PCApply 17 1.0 6.0198e+00 1.0 2.13e+09 1.5 3.1e+05 9.0e+03 6.5e+02 29 96 86 77 60 93 96 96 98 68 9397 KSPGMRESOrthog 96 1.0 3.5287e-02 2.7 2.63e+07 1.0 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 23420 KSPSetUp 18 1.0 1.0297e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 6.2599e+00 1.0 2.16e+09 1.5 3.2e+05 8.8e+03 8.7e+02 30 98 89 78 81 97 98 99 99 92 9205 SFSetGraph 4 1.0 2.0409e-04 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 27 1.0 2.2466e-03 2.6 0.00e+00 0.0 9.1e+03 6.9e+02 5.0e+00 0 0 2 0 0 0 0 3 0 1 0 SFBcastEnd 27 1.0 5.7459e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 165 172 23657268 0. IS L to G Mapping 3 3 14326164 0. Section 70 53 35616 0. Vector 15 141 16031216 0. Vector Scatter 2 15 860160 0. Matrix 0 52 30817568 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 102 88 197036 0. IS L to G Mapping 4 0 0 0. Vector 356 218 3951520 0. Vector Scatter 40 21 23048 0. Matrix 145 80 57676072 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
======================================================================================================================== Average time to get PetscTime(): 6.19888e-07 Average time for MPI_Barrier(): 6.96182e-06 Average time for zero size MPI_Send(): 1.65403e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 40: solving... ((27890, 1161600), (27890, 1161600)) Solver time: 5.067154e+00 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 40 processors, by jychang48 Wed Mar 2 18:02:07 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           1.954e+01      1.00033   1.953e+01
Objects:              1.090e+03      1.12371   9.856e+02
Flops:                1.585e+09      1.51148   1.325e+09  5.299e+10
Flops/sec:            8.113e+07      1.51121   6.782e+07  2.713e+09
MPI Messages:         1.975e+04      2.40864   1.332e+04  5.330e+05
MPI Message Lengths:  2.951e+08      6.39097   7.450e+03  3.971e+09
MPI Reductions:       1.077e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.4467e+01  74.1%  0.0000e+00   0.0%  5.371e+04  10.1%  1.521e+03       20.4%  1.250e+02  11.6%
 1:             FEM: 5.0672e+00  25.9%  5.2989e+10 100.0%  4.793e+05  89.9%  5.929e+03       79.6%  9.510e+02  88.3%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1854e+0014.3 0.00e+00 0.0 1.3e+04 4.0e+00 4.4e+01 6 0 2 0 4 7 0 24 0 35 0 VecScatterBegin 2 1.0 2.8849e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 8.1062e-06 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1183e+00 1.1 0.00e+00 0.0 2.0e+04 3.6e+03 2.1e+01 11 0 4 2 2 15 0 38 9 17 0 Mesh Migration 2 1.0 4.1828e-01 1.0 0.00e+00 0.0 2.9e+04 2.2e+04 5.4e+01 2 0 5 16 5 3 0 54 80 43 0 DMPlexInterp 1 1.0 2.1106e+0061906.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.3542e+00 1.1 0.00e+00 0.0 1.4e+04 2.3e+04 2.5e+01 12 0 3 8 2 16 0 26 40 20 0 DMPlexDistCones 2 1.0 9.7619e-02 1.2 0.00e+00 0.0 4.3e+03 5.1e+04 4.0e+00 0 0 1 5 0 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.6993e-01 1.0 0.00e+00 0.0 1.7e+04 2.1e+04 2.2e+01 1 0 3 9 2 2 0 32 45 18 0 DMPlexDistribOL 1 1.0 2.0237e-01 1.0 0.00e+00 0.0 3.6e+04 1.2e+04 5.0e+01 1 0 7 11 5 1 0 66 55 40 0 DMPlexDistField 3 1.0 2.9826e-02 2.2 0.00e+00 0.0 5.7e+03 5.2e+03 1.2e+01 0 0 1 1 1 0 0 11 4 10 0 DMPlexDistData 2 1.0 1.0467e+0079.2 0.00e+00 0.0 1.5e+04 2.2e+03 6.0e+00 5 0 3 1 1 7 0 28 4 5 0 DMPlexStratify 6 1.5 5.4628e-0136.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 4.2505e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1899e+00 4.6 0.00e+00 0.0 5.1e+04 1.5e+04 4.1e+01 6 0 10 20 4 8 0 96 97 33 0 SFBcastEnd 95 1.0 3.0756e-01 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 8.8451e-0319.5 0.00e+00 0.0 1.7e+03 9.4e+03 3.0e+00 0 0 0 0 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.1884e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.7207e-0524.8 0.00e+00 0.0 1.9e+02 1.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.4687e-04 3.1 0.00e+00 0.0 1.9e+02 1.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 2.3334e-03 3.5 0.00e+00 0.0 1.7e+03 4.0e+00 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 BuildTwoSidedF 12 1.0 4.2319e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 96 1.0 2.6845e-02 3.8 1.05e+07 1.1 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 15392 VecNorm 105 1.0 4.4451e-03 1.5 1.50e+06 1.1 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 13260 VecScale 217 1.0 1.6758e-03 1.2 2.01e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 47243 VecCopy 77 1.0 5.0402e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 596 1.0 3.8776e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 9.5606e-05 1.2 1.04e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 42794 VecAYPX 544 1.0 2.5859e-03 1.2 1.90e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 29060 VecAXPBYCZ 272 1.0 1.5566e-03 1.2 3.80e+06 1.0 0.0e+00 0.0e+00 0.0e+00 
0 0 0 0 0 0 0 0 0 0 96549 VecMAXPY 105 1.0 3.2706e-03 1.2 1.19e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 143109 VecAssemblyBegin 14 1.0 5.3811e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 14 1.0 5.7459e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 2.1434e-04 1.3 1.23e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 22685 VecScatterBegin 988 1.0 3.5461e-02 2.1 0.00e+00 0.0 4.0e+05 1.3e+03 0.0e+00 0 0 76 14 0 1 0 84 17 0 0 VecScatterEnd 988 1.0 1.2415e-01 5.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 VecSetRandom 4 1.0 5.0902e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 105 1.0 6.2478e-03 1.3 2.25e+06 1.1 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 14152 MatMult 521 1.0 3.6698e-01 1.2 4.16e+08 1.4 3.5e+05 1.5e+03 1.3e+02 2 26 66 13 12 7 26 73 16 14 36848 MatMultAdd 227 1.0 5.2911e-02 2.1 1.89e+07 1.1 3.5e+04 7.9e+02 0.0e+00 0 1 7 1 0 1 1 7 1 0 13698 MatMultTranspose 68 1.0 2.4873e-02 2.5 9.31e+06 1.1 2.4e+04 4.0e+02 0.0e+00 0 1 4 0 0 0 1 5 0 0 13844 MatSolve 129 1.2 4.1506e-02 1.2 2.81e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 26104 MatSOR 452 1.0 1.6256e-01 1.2 1.78e+08 1.3 0.0e+00 0.0e+00 0.0e+00 1 12 0 0 0 3 12 0 0 0 40040 MatLUFactorSym 1 1.0 4.1962e-05 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.7710e-03 1.1 4.73e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 10116 MatILUFactorSym 1 1.0 8.9598e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 1.1864e-02 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 6.0964e-03 1.6 3.80e+06 1.4 2.8e+03 1.5e+03 0.0e+00 0 0 1 0 0 0 0 1 0 0 20551 MatResidual 68 1.0 4.3995e-02 1.2 5.31e+07 1.5 4.7e+04 1.5e+03 0.0e+00 0 3 9 2 0 1 3 10 2 0 38524 MatAssemblyBegin 93 1.0 4.5627e-01 4.6 0.00e+00 0.0 1.0e+04 1.4e+05 5.8e+01 1 0 2 35 5 5 0 2 44 6 0 MatAssemblyEnd 93 1.0 7.9526e-01 1.1 0.00e+00 0.0 2.1e+04 1.9e+02 2.2e+02 4 0 4 0 20 15 0 4 0 23 0 MatGetRow 52698 1.0 2.0620e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 MatGetRowIJ 2 2.0 1.0014e-05 8.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 8 1.0 5.0244e-03 1.0 0.00e+00 0.0 5.5e+02 1.1e+03 7.4e+01 0 0 0 0 7 0 0 0 0 8 0 MatGetOrdering 2 2.0 1.1802e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 3.3238e-03 1.3 0.00e+00 0.0 1.1e+04 4.3e+02 2.1e+01 0 0 2 0 2 0 0 2 0 2 0 MatZeroEntries 4 1.0 5.5079e-03 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 2.4477e-01 1.0 0.00e+00 0.0 7.2e+02 3.0e+02 2.0e+01 1 0 0 0 2 5 0 0 0 2 0 MatMatMult 5 1.0 8.6163e-02 1.0 3.37e+06 1.4 1.8e+04 7.6e+02 8.0e+01 0 0 3 0 7 2 0 4 0 8 1272 MatMatMultSym 5 1.0 7.6432e-02 1.0 0.00e+00 0.0 1.5e+04 6.2e+02 7.0e+01 0 0 3 0 6 2 0 3 0 7 0 MatMatMultNum 5 1.0 9.7311e-03 1.0 3.37e+06 1.4 2.9e+03 1.5e+03 1.0e+01 0 0 1 0 1 0 0 1 0 1 11263 MatPtAP 4 1.0 3.7207e+00 1.0 9.62e+08 1.9 3.3e+04 7.8e+04 6.8e+01 19 56 6 65 6 73 56 7 81 7 7908 MatPtAPSymbolic 4 1.0 2.1426e+00 1.0 0.00e+00 0.0 1.7e+04 7.0e+04 2.8e+01 11 0 3 29 3 42 0 3 37 3 0 MatPtAPNumeric 4 1.0 1.5781e+00 1.0 9.62e+08 1.9 1.6e+04 8.6e+04 4.0e+01 8 56 3 35 4 31 56 3 44 4 18646 MatTrnMatMult 1 1.0 1.6947e-02 1.0 4.99e+05 1.1 2.2e+03 1.9e+03 1.9e+01 0 0 0 0 2 0 0 0 0 2 1144 MatTrnMatMultSym 1 1.0 1.1003e-02 1.0 0.00e+00 0.0 1.8e+03 1.1e+03 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 MatTrnMatMultNum 1 1.0 5.9371e-03 1.0 4.99e+05 
1.1 3.6e+02 6.0e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 3266 MatGetLocalMat 16 1.0 3.9840e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 6.5035e-02 1.3 0.00e+00 0.0 2.0e+04 4.1e+04 0.0e+00 0 0 4 21 0 1 0 4 26 0 0 PCGAMGGraph_AGG 4 1.0 2.0390e-01 1.0 3.12e+06 1.5 5.3e+03 8.4e+02 4.8e+01 1 0 1 0 4 4 0 1 0 5 489 PCGAMGCoarse_AGG 4 1.0 2.3103e-02 1.0 4.99e+05 1.1 1.5e+04 8.4e+02 4.8e+01 0 0 3 0 4 0 0 3 0 5 839 PCGAMGProl_AGG 4 1.0 8.3230e-03 1.0 0.00e+00 0.0 5.9e+03 5.2e+02 8.0e+01 0 0 1 0 7 0 0 1 0 8 0 PCGAMGPOpt_AGG 4 1.0 1.9880e-01 1.0 3.80e+07 1.4 4.4e+04 1.2e+03 1.9e+02 1 2 8 1 17 4 2 9 2 20 6232 GAMG: createProl 4 1.0 4.3470e-01 1.0 4.16e+07 1.4 7.0e+04 1.0e+03 3.6e+02 2 3 13 2 34 9 3 15 2 38 3124 Graph 8 1.0 2.0300e-01 1.0 3.12e+06 1.5 5.3e+03 8.4e+02 4.8e+01 1 0 1 0 4 4 0 1 0 5 491 MIS/Agg 4 1.0 3.3939e-03 1.3 0.00e+00 0.0 1.1e+04 4.3e+02 2.1e+01 0 0 2 0 2 0 0 2 0 2 0 SA: col data 4 1.0 1.5466e-03 1.1 0.00e+00 0.0 2.5e+03 1.0e+03 2.4e+01 0 0 0 0 2 0 0 1 0 3 0 SA: frmProl0 4 1.0 6.2592e-03 1.0 0.00e+00 0.0 3.4e+03 1.2e+02 4.0e+01 0 0 1 0 4 0 0 1 0 4 0 SA: smooth 4 1.0 1.9878e-01 1.0 3.80e+07 1.4 4.4e+04 1.2e+03 1.9e+02 1 2 8 1 17 4 2 9 2 20 6233 GAMG: partLevel 4 1.0 3.7256e+00 1.0 9.62e+08 1.9 3.4e+04 7.7e+04 1.7e+02 19 56 6 65 16 74 56 7 81 18 7898 repartition 2 1.0 2.9397e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 2 1.0 3.0017e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 1.5130e-03 1.0 0.00e+00 0.0 2.3e+02 2.4e+03 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 Move P 2 1.0 2.5032e-03 1.0 0.00e+00 0.0 3.2e+02 1.0e+02 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 PCSetUp 5 1.0 4.3353e+00 1.0 1.00e+09 1.9 1.1e+05 2.5e+04 6.2e+02 22 58 20 67 58 86 58 22 84 66 7108 PCSetUpOnBlocks 129 1.0 3.0313e-03 1.2 4.73e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5910 PCApply 17 1.0 4.6916e+00 1.0 1.52e+09 1.5 4.6e+05 6.7e+03 6.5e+02 24 96 87 78 60 93 96 97 98 68 10812 KSPGMRESOrthog 96 1.0 3.0260e-02 2.9 2.11e+07 1.1 0.0e+00 0.0e+00 9.6e+01 0 2 0 0 9 0 2 0 0 10 27310 KSPSetUp 18 1.0 8.6641e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 4.8847e+00 1.0 1.55e+09 1.5 4.8e+05 6.6e+03 8.7e+02 25 98 89 79 81 96 98 99 99 92 10601 SFSetGraph 4 1.0 2.2912e-04 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 26 1.0 2.7394e-03 2.6 0.00e+00 0.0 1.2e+04 5.6e+02 5.0e+00 0 0 2 0 0 0 0 3 0 1 0 SFBcastEnd 26 1.0 1.0862e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 187 194 23336772 0. IS L to G Mapping 3 3 14094724 0. Section 70 53 35616 0. Vector 15 141 13835680 0. Vector Scatter 2 15 685800 0. Matrix 0 52 22699316 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 102 88 198216 0. IS L to G Mapping 4 0 0 0. Vector 356 218 3282272 0. Vector Scatter 40 21 23032 0. Matrix 145 80 42315724 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 9.58443e-06 Average time for zero size MPI_Send(): 1.40071e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 48: solving... ((23365, 1161600), (23365, 1161600)) Solver time: 4.216640e+00 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 48 processors, by jychang48 Wed Mar 2 18:02:30 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           1.886e+01      1.00032   1.886e+01
Objects:              1.098e+03      1.13196   9.852e+02
Flops:                1.310e+09      1.70226   1.017e+09  4.881e+10
Flops/sec:            6.945e+07      1.70193   5.392e+07  2.588e+09
MPI Messages:         2.029e+04      2.24362   1.474e+04  7.073e+05
MPI Message Lengths:  2.606e+08      6.46661   5.856e+03  4.142e+09
MPI Reductions:       1.078e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.4641e+01  77.6%  0.0000e+00   0.0%  6.669e+04   9.4%  1.171e+03       20.0%  1.250e+02  11.6%
 1:             FEM: 4.2166e+00  22.4%  4.8807e+10 100.0%  6.406e+05  90.6%  4.686e+03       80.0%  9.520e+02  88.3%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1856e+0013.6 0.00e+00 0.0 1.6e+04 4.0e+00 4.4e+01 6 0 2 0 4 7 0 24 0 35 0 VecScatterBegin 2 1.0 2.2173e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 8.1062e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1368e+00 1.1 0.00e+00 0.0 2.7e+04 2.8e+03 2.1e+01 11 0 4 2 2 15 0 40 9 17 0 Mesh Migration 2 1.0 4.0564e-01 1.0 0.00e+00 0.0 3.4e+04 1.9e+04 5.4e+01 2 0 5 16 5 3 0 51 80 43 0 DMPlexInterp 1 1.0 2.1086e+0061847.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.3904e+00 1.1 0.00e+00 0.0 1.9e+04 1.7e+04 2.5e+01 13 0 3 8 2 16 0 29 40 20 0 DMPlexDistCones 2 1.0 9.4622e-02 1.2 0.00e+00 0.0 5.1e+03 4.3e+04 4.0e+00 0 0 1 5 0 1 0 8 26 3 0 DMPlexDistLabels 2 1.0 2.6377e-01 1.0 0.00e+00 0.0 2.1e+04 1.8e+04 2.2e+01 1 0 3 9 2 2 0 31 45 18 0 DMPlexDistribOL 1 1.0 1.7173e-01 1.1 0.00e+00 0.0 4.2e+04 1.1e+04 5.0e+01 1 0 6 11 5 1 0 64 55 40 0 DMPlexDistField 3 1.0 2.9037e-02 2.1 0.00e+00 0.0 6.8e+03 4.5e+03 1.2e+01 0 0 1 1 1 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0405e+0065.1 0.00e+00 0.0 2.1e+04 1.7e+03 6.0e+00 5 0 3 1 1 7 0 31 4 5 0 DMPlexStratify 6 1.5 5.4211e-0141.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 3.6311e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1934e+00 4.4 0.00e+00 0.0 6.4e+04 1.3e+04 4.1e+01 6 0 9 19 4 8 0 96 97 33 0 SFBcastEnd 95 1.0 2.9515e-01 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.3102e-0322.9 0.00e+00 0.0 2.0e+03 8.2e+03 3.0e+00 0 0 0 0 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.3510e-03 7.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.2915e-0520.0 0.00e+00 0.0 2.2e+02 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.1206e-04 2.2 0.00e+00 0.0 2.2e+02 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 2.1787e-03 3.6 0.00e+00 0.0 2.0e+03 4.0e+00 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 BuildTwoSidedF 12 1.0 5.0044e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 96 1.0 2.5555e-02 3.9 8.80e+06 1.1 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 16169 VecNorm 105 1.0 4.3268e-03 1.5 1.25e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 13624 VecScale 217 1.0 1.4749e-03 1.2 1.67e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 53683 VecCopy 77 1.0 5.0020e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 596 1.0 3.2456e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 9.8467e-05 1.5 8.69e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 41556 VecAYPX 544 1.0 2.2662e-03 1.3 1.58e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33170 VecAXPBYCZ 272 1.0 1.3177e-03 1.3 3.16e+06 1.0 0.0e+00 0.0e+00 0.0e+00 
0 0 0 0 0 0 0 0 0 0 114089 VecMAXPY 105 1.0 2.5764e-03 1.1 9.97e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 181688 VecAssemblyBegin 14 1.0 6.0821e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 14 1.0 6.1989e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 2.0766e-04 1.5 1.02e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 23422 VecScatterBegin 988 1.0 3.0665e-02 1.9 0.00e+00 0.0 5.4e+05 1.1e+03 0.0e+00 0 0 77 15 0 1 0 85 18 0 0 VecScatterEnd 988 1.0 1.0807e-01 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 VecSetRandom 4 1.0 4.1986e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 105 1.0 5.9941e-03 1.4 1.88e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 14752 MatMult 521 1.0 2.9442e-01 1.2 3.40e+08 1.5 4.7e+05 1.2e+03 1.3e+02 1 27 67 14 12 7 27 74 17 14 44430 MatMultAdd 227 1.0 4.8976e-02 2.3 1.57e+07 1.1 4.4e+04 6.9e+02 0.0e+00 0 1 6 1 0 1 1 7 1 0 14413 MatMultTranspose 68 1.0 2.0373e-02 2.3 7.53e+06 1.2 3.0e+04 3.5e+02 0.0e+00 0 1 4 0 0 0 1 5 0 0 15974 MatSolve 129 1.2 3.4987e-02 1.2 2.34e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 30892 MatSOR 452 1.0 1.1530e-01 1.2 1.39e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 13 0 0 0 3 13 0 0 0 53183 MatLUFactorSym 1 1.0 3.8862e-05 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.4689e-03 1.1 3.97e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12180 MatILUFactorSym 1 1.0 6.6495e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 9.6900e-03 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 5.3563e-03 1.9 3.09e+06 1.5 3.7e+03 1.2e+03 0.0e+00 0 0 1 0 0 0 0 1 0 0 22511 MatResidual 68 1.0 3.6604e-02 1.3 4.32e+07 1.6 6.4e+04 1.2e+03 0.0e+00 0 3 9 2 0 1 3 10 2 0 44632 MatAssemblyBegin 93 1.0 4.2010e-01 4.6 0.00e+00 0.0 1.4e+04 1.1e+05 5.8e+01 1 0 2 35 5 6 0 2 43 6 0 MatAssemblyEnd 93 1.0 5.8031e-01 1.2 0.00e+00 0.0 2.8e+04 1.6e+02 2.2e+02 3 0 4 0 20 13 0 4 0 23 0 MatGetRow 43909 1.0 1.7181e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 MatGetRowIJ 2 2.0 1.0252e-0510.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 8 1.0 4.6279e-03 1.0 0.00e+00 0.0 6.6e+02 8.8e+02 7.4e+01 0 0 0 0 7 0 0 0 0 8 0 MatGetOrdering 2 2.0 1.0300e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 2.2120e-03 1.1 0.00e+00 0.0 1.4e+04 3.8e+02 2.2e+01 0 0 2 0 2 0 0 2 0 2 0 MatZeroEntries 4 1.0 4.2212e-03 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 2.0452e-01 1.0 0.00e+00 0.0 8.6e+02 2.7e+02 2.0e+01 1 0 0 0 2 5 0 0 0 2 0 MatMatMult 5 1.0 7.1930e-02 1.0 2.75e+06 1.5 2.4e+04 6.3e+02 8.0e+01 0 0 3 0 7 2 0 4 0 8 1474 MatMatMultSym 5 1.0 6.3674e-02 1.0 0.00e+00 0.0 2.0e+04 5.1e+02 7.0e+01 0 0 3 0 6 2 0 3 0 7 0 MatMatMultNum 5 1.0 8.2068e-03 1.0 2.75e+06 1.5 4.0e+03 1.2e+03 1.0e+01 0 0 1 0 1 0 0 1 0 1 12916 MatPtAP 4 1.0 3.0910e+00 1.0 7.76e+08 2.2 4.5e+04 5.9e+04 6.8e+01 16 53 6 64 6 73 53 7 80 7 8436 MatPtAPSymbolic 4 1.0 1.8711e+00 1.0 0.00e+00 0.0 2.2e+04 5.4e+04 2.8e+01 10 0 3 29 3 44 0 4 37 3 0 MatPtAPNumeric 4 1.0 1.2206e+00 1.0 7.76e+08 2.2 2.2e+04 6.4e+04 4.0e+01 6 53 3 35 4 29 53 3 43 4 21364 MatTrnMatMult 1 1.0 1.4337e-02 1.0 4.16e+05 1.1 2.6e+03 1.7e+03 1.9e+01 0 0 0 0 2 0 0 0 0 2 1355 MatTrnMatMultSym 1 1.0 9.2320e-03 1.0 0.00e+00 0.0 2.2e+03 9.6e+02 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 MatTrnMatMultNum 1 1.0 5.0960e-03 1.0 4.16e+05 
1.1 4.3e+02 5.3e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 3813 MatGetLocalMat 16 1.0 3.2232e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 5.9111e-02 1.3 0.00e+00 0.0 2.7e+04 3.2e+04 0.0e+00 0 0 4 21 0 1 0 4 26 0 0 PCGAMGGraph_AGG 4 1.0 1.6926e-01 1.0 2.54e+06 1.6 6.8e+03 7.3e+02 4.8e+01 1 0 1 0 4 4 0 1 0 5 568 PCGAMGCoarse_AGG 4 1.0 1.9103e-02 1.0 4.16e+05 1.1 1.9e+04 7.3e+02 4.9e+01 0 0 3 0 5 0 0 3 0 5 1017 PCGAMGProl_AGG 4 1.0 6.7439e-03 1.0 0.00e+00 0.0 7.1e+03 4.7e+02 8.0e+01 0 0 1 0 7 0 0 1 0 8 0 PCGAMGPOpt_AGG 4 1.0 1.6605e-01 1.0 3.10e+07 1.5 6.0e+04 9.7e+02 1.9e+02 1 2 8 1 17 4 2 9 2 20 7216 GAMG: createProl 4 1.0 3.6235e-01 1.0 3.39e+07 1.5 9.3e+04 8.7e+02 3.6e+02 2 3 13 2 34 9 3 14 2 38 3626 Graph 8 1.0 1.6913e-01 1.0 2.54e+06 1.6 6.8e+03 7.3e+02 4.8e+01 1 0 1 0 4 4 0 1 0 5 568 MIS/Agg 4 1.0 2.2740e-03 1.0 0.00e+00 0.0 1.4e+04 3.8e+02 2.2e+01 0 0 2 0 2 0 0 2 0 2 0 SA: col data 4 1.0 1.4884e-03 1.1 0.00e+00 0.0 3.1e+03 9.4e+02 2.4e+01 0 0 0 0 2 0 0 0 0 3 0 SA: frmProl0 4 1.0 4.6790e-03 1.0 0.00e+00 0.0 4.0e+03 1.1e+02 4.0e+01 0 0 1 0 4 0 0 1 0 4 0 SA: smooth 4 1.0 1.6603e-01 1.0 3.10e+07 1.5 6.0e+04 9.7e+02 1.9e+02 1 2 8 1 17 4 2 9 2 20 7217 GAMG: partLevel 4 1.0 3.0959e+00 1.0 7.76e+08 2.2 4.6e+04 5.8e+04 1.7e+02 16 53 6 64 16 73 53 7 80 18 8423 repartition 2 1.0 3.0708e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 2 1.0 3.5977e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 1.6382e-03 1.0 0.00e+00 0.0 2.7e+02 2.0e+03 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 Move P 2 1.0 2.2459e-03 1.0 0.00e+00 0.0 3.8e+02 9.8e+01 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 PCSetUp 5 1.0 3.6055e+00 1.0 8.07e+08 2.2 1.4e+05 1.9e+04 6.2e+02 19 56 20 66 58 86 56 22 83 66 7606 PCSetUpOnBlocks 129 1.0 2.4600e-03 1.1 3.97e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 7273 PCApply 17 1.0 3.8817e+00 1.0 1.26e+09 1.7 6.2e+05 5.3e+03 6.5e+02 21 95 88 79 60 92 95 97 98 68 11983 KSPGMRESOrthog 96 1.0 2.8517e-02 3.0 1.76e+07 1.1 0.0e+00 0.0e+00 9.6e+01 0 2 0 0 9 0 2 0 0 10 28981 KSPSetUp 18 1.0 7.9155e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 4.0445e+00 1.0 1.28e+09 1.7 6.4e+05 5.2e+03 8.7e+02 21 97 90 79 81 96 97 99 99 92 11761 SFSetGraph 4 1.0 1.7500e-04 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 27 1.0 2.1336e-03 2.8 0.00e+00 0.0 1.6e+04 4.9e+02 5.0e+00 0 0 2 0 0 0 0 2 0 1 0 SFBcastEnd 27 1.0 8.4925e-04 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 195 202 22897792 0. IS L to G Mapping 3 3 14076928 0. Section 70 53 35616 0. Vector 15 141 12429688 0. Vector Scatter 2 15 577200 0. Matrix 0 52 18299000 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 102 88 174544 0. IS L to G Mapping 4 0 0 0. Vector 356 218 2791528 0. Vector Scatter 40 21 23056 0. Matrix 145 80 34632808 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
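
Since the headline numbers are buried in these -log_summary dumps, a quick way to tabulate them is to scrape the "MPI processes N: ... Solver time ... Solver iterations" banners that precede each run. Below is a minimal, untested Python sketch that does this from a saved copy of the output; the filename gamg_runs.log and the script name summarize_log.py are just placeholders, not files from these runs.

import re
import sys

# Pull "MPI processes N: ... Solver time: T Solver iterations: K" triples
# out of a plain-text file containing -log_summary output like the runs above.
banner = re.compile(r"MPI processes\s+(\d+):.*?"
                    r"Solver time:\s*([0-9.eE+-]+)\s*"
                    r"Solver iterations:\s*(\d+)", re.S)

def summarize(path):
    with open(path) as f:
        text = f.read()
    # one (procs, solver_time, iterations) tuple per run found in the file
    return [(int(p), float(t), int(k)) for p, t, k in banner.findall(text)]

if __name__ == "__main__":
    # e.g.  python summarize_log.py gamg_runs.log
    for procs, time, its in summarize(sys.argv[1]):
        print("%3d procs  solver time %10.4f s  %d iterations" % (procs, time, its))
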
======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 1.12057e-05 Average time for zero size MPI_Send(): 1.39574e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 56: solving... ((20104, 1161600), (20104, 1161600)) Solver time: 3.855059e+00 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 56 processors, by jychang48 Wed Mar 2 18:02:52 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.866e+01 1.00021 1.866e+01 Objects: 1.116e+03 1.14815 9.859e+02 Flops: 1.055e+09 1.76655 8.275e+08 4.634e+10 Flops/sec: 5.653e+07 1.76628 4.435e+07 2.484e+09 MPI Messages: 2.255e+04 2.17874 1.655e+04 9.265e+05 MPI Message Lengths: 2.460e+08 7.31784 4.913e+03 4.552e+09 MPI Reductions: 1.078e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4803e+01 79.3% 0.0000e+00 0.0% 8.318e+04 9.0% 9.156e+02 18.6% 1.250e+02 11.6% 1: FEM: 3.8551e+00 20.7% 4.6342e+10 100.0% 8.434e+05 91.0% 3.997e+03 81.4% 9.520e+02 88.3% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.2211e+0012.6 0.00e+00 0.0 2.0e+04 4.0e+00 4.4e+01 6 0 2 0 4 8 0 24 0 35 0 VecScatterBegin 2 1.0 2.0027e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 8.1062e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1955e+00 1.1 0.00e+00 0.0 3.5e+04 2.2e+03 2.1e+01 12 0 4 2 2 15 0 42 9 17 0 Mesh Migration 2 1.0 4.0020e-01 1.0 0.00e+00 0.0 4.1e+04 1.6e+04 5.4e+01 2 0 4 15 5 3 0 50 79 43 0 DMPlexInterp 1 1.0 2.1409e+0065070.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.4610e+00 1.1 0.00e+00 0.0 2.6e+04 1.3e+04 2.5e+01 13 0 3 7 2 17 0 31 39 20 0 DMPlexDistCones 2 1.0 9.4115e-02 1.2 0.00e+00 0.0 6.2e+03 3.6e+04 4.0e+00 0 0 1 5 0 1 0 7 26 3 0 DMPlexDistLabels 2 1.0 2.6018e-01 1.0 0.00e+00 0.0 2.5e+04 1.5e+04 2.2e+01 1 0 3 8 2 2 0 30 45 18 0 DMPlexDistribOL 1 1.0 1.5694e-01 1.1 0.00e+00 0.0 5.1e+04 9.2e+03 5.0e+01 1 0 6 10 5 1 0 62 56 40 0 DMPlexDistField 3 1.0 3.1035e-02 2.2 0.00e+00 0.0 8.3e+03 3.8e+03 1.2e+01 0 0 1 1 1 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0645e+0052.3 0.00e+00 0.0 2.8e+04 1.3e+03 6.0e+00 5 0 3 1 1 7 0 33 4 5 0 DMPlexStratify 6 1.5 5.4196e-0150.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 3.2694e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.2282e+00 4.1 0.00e+00 0.0 8.0e+04 1.0e+04 4.1e+01 6 0 9 18 4 8 0 96 97 33 0 SFBcastEnd 95 1.0 2.9801e-01 9.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.2599e-0323.0 0.00e+00 0.0 2.5e+03 6.8e+03 3.0e+00 0 0 0 0 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 1.0490e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.5048e-0518.4 0.00e+00 0.0 2.7e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.1206e-04 2.5 0.00e+00 0.0 2.7e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 2.0077e-03 3.6 0.00e+00 0.0 2.5e+03 4.0e+00 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 BuildTwoSidedF 12 1.0 4.8900e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 96 1.0 2.3549e-02 3.5 7.55e+06 1.0 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 17546 VecNorm 105 1.0 4.9648e-03 1.4 1.08e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 11873 VecScale 217 1.0 1.3340e-03 1.2 1.43e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 59353 VecCopy 77 1.0 3.8218e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 596 1.0 2.8689e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 8.3923e-05 1.4 7.45e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 48756 VecAYPX 544 1.0 1.7948e-03 1.2 1.36e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 41879 VecAXPBYCZ 272 1.0 1.0500e-03 1.2 2.71e+06 1.0 0.0e+00 0.0e+00 0.0e+00 
0 0 0 0 0 0 0 0 0 0 143173 VecMAXPY 105 1.0 2.2776e-03 1.1 8.55e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 205516 VecAssemblyBegin 14 1.0 5.9843e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 14 1.0 6.9380e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 1.8048e-04 1.4 8.77e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 26948 VecScatterBegin 988 1.0 3.1637e-02 2.2 0.00e+00 0.0 7.2e+05 9.4e+02 0.0e+00 0 0 77 15 0 1 0 85 18 0 0 VecScatterEnd 988 1.0 8.6891e-02 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSetRandom 4 1.0 3.6597e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 105 1.0 6.5765e-03 1.3 1.61e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 13445 MatMult 521 1.0 2.4558e-01 1.2 2.74e+08 1.5 6.3e+05 1.0e+03 1.3e+02 1 27 68 14 12 6 27 75 17 14 51699 MatMultAdd 227 1.0 3.8159e-02 2.0 1.29e+07 1.1 5.5e+04 5.9e+02 0.0e+00 0 1 6 1 0 1 1 7 1 0 18085 MatMultTranspose 68 1.0 1.6905e-02 2.4 6.06e+06 1.2 3.8e+04 3.0e+02 0.0e+00 0 1 4 0 0 0 1 5 0 0 18319 MatSolve 129 1.2 2.8386e-02 1.1 2.01e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 37974 MatSOR 452 1.0 8.4327e-02 1.2 1.13e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 12 0 0 0 2 12 0 0 0 67671 MatLUFactorSym 1 1.0 4.3869e-05 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.2627e-03 1.1 3.40e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14138 MatILUFactorSym 1 1.0 7.9203e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 6.6226e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 4.1234e-03 1.9 2.47e+06 1.4 5.0e+03 1.0e+03 0.0e+00 0 0 1 0 0 0 0 1 0 0 28256 MatResidual 68 1.0 2.7525e-02 1.1 3.46e+07 1.5 8.5e+04 1.0e+03 0.0e+00 0 3 9 2 0 1 3 10 2 0 57417 MatAssemblyBegin 93 1.0 4.3088e-01 3.8 0.00e+00 0.0 1.8e+04 9.0e+04 5.8e+01 2 0 2 35 5 7 0 2 43 6 0 MatAssemblyEnd 93 1.0 6.9626e-01 1.4 0.00e+00 0.0 3.6e+04 1.4e+02 2.2e+02 4 0 4 0 20 17 0 4 0 23 0 MatGetRow 37645 1.0 1.4686e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 MatGetRowIJ 2 2.0 5.9605e-06 6.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 8 1.0 4.5202e-03 1.0 0.00e+00 0.0 7.6e+02 8.8e+02 7.4e+01 0 0 0 0 7 0 0 0 0 8 0 MatGetOrdering 2 2.0 9.2030e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 2.1741e-03 1.1 0.00e+00 0.0 1.7e+04 3.4e+02 2.2e+01 0 0 2 0 2 0 0 2 0 2 0 MatZeroEntries 4 1.0 3.0334e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 1.7570e-01 1.0 0.00e+00 0.0 1.0e+03 2.4e+02 2.0e+01 1 0 0 0 2 5 0 0 0 2 0 MatMatMult 5 1.0 6.3858e-02 1.0 2.21e+06 1.5 3.2e+04 5.3e+02 8.0e+01 0 0 3 0 7 2 0 4 0 8 1611 MatMatMultSym 5 1.0 5.6910e-02 1.0 0.00e+00 0.0 2.7e+04 4.3e+02 7.0e+01 0 0 3 0 6 1 0 3 0 7 0 MatMatMultNum 5 1.0 6.9041e-03 1.0 2.21e+06 1.5 5.2e+03 1.1e+03 1.0e+01 0 0 1 0 1 0 0 1 0 1 14898 MatPtAP 4 1.0 2.8888e+00 1.0 6.50e+08 2.6 6.0e+04 5.0e+04 6.8e+01 15 53 6 65 6 75 53 7 80 7 8455 MatPtAPSymbolic 4 1.0 1.6264e+00 1.0 0.00e+00 0.0 3.0e+04 4.6e+04 2.8e+01 9 0 3 30 3 42 0 4 37 3 0 MatPtAPNumeric 4 1.0 1.2635e+00 1.0 6.50e+08 2.6 3.0e+04 5.4e+04 4.0e+01 7 53 3 35 4 33 53 4 43 4 19331 MatTrnMatMult 1 1.0 1.2778e-02 1.0 3.58e+05 1.1 3.2e+03 1.5e+03 1.9e+01 0 0 0 0 2 0 0 0 0 2 1524 MatTrnMatMultSym 1 1.0 8.3380e-03 1.0 0.00e+00 0.0 2.6e+03 8.5e+02 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 MatTrnMatMultNum 1 1.0 4.4389e-03 1.0 3.58e+05 
1.1 5.2e+02 4.8e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 4388 MatGetLocalMat 16 1.0 2.8000e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 5.8958e-02 1.3 0.00e+00 0.0 3.6e+04 2.7e+04 0.0e+00 0 0 4 21 0 1 0 4 26 0 0 PCGAMGGraph_AGG 4 1.0 1.4344e-01 1.0 2.04e+06 1.5 8.7e+03 6.3e+02 4.8e+01 1 0 1 0 4 4 0 1 0 5 648 PCGAMGCoarse_AGG 4 1.0 1.7099e-02 1.0 3.58e+05 1.1 2.3e+04 6.5e+02 4.9e+01 0 0 2 0 5 0 0 3 0 5 1139 PCGAMGProl_AGG 4 1.0 6.5010e-03 1.0 0.00e+00 0.0 8.7e+03 4.1e+02 8.0e+01 0 0 1 0 7 0 0 1 0 8 0 PCGAMGPOpt_AGG 4 1.0 1.4499e-01 1.0 2.49e+07 1.5 8.0e+04 8.2e+02 1.9e+02 1 3 9 1 17 4 3 9 2 20 8020 GAMG: createProl 4 1.0 3.1266e-01 1.0 2.73e+07 1.5 1.2e+05 7.5e+02 3.6e+02 2 3 13 2 34 8 3 14 2 38 4079 Graph 8 1.0 1.4329e-01 1.0 2.04e+06 1.5 8.7e+03 6.3e+02 4.8e+01 1 0 1 0 4 4 0 1 0 5 649 MIS/Agg 4 1.0 2.2221e-03 1.0 0.00e+00 0.0 1.7e+04 3.4e+02 2.2e+01 0 0 2 0 2 0 0 2 0 2 0 SA: col data 4 1.0 1.4677e-03 1.1 0.00e+00 0.0 3.8e+03 8.2e+02 2.4e+01 0 0 0 0 2 0 0 0 0 3 0 SA: frmProl0 4 1.0 4.5345e-03 1.0 0.00e+00 0.0 4.9e+03 9.5e+01 4.0e+01 0 0 1 0 4 0 0 1 0 4 0 SA: smooth 4 1.0 1.4497e-01 1.0 2.49e+07 1.5 8.0e+04 8.2e+02 1.9e+02 1 3 9 1 17 4 3 9 2 20 8021 GAMG: partLevel 4 1.0 2.8946e+00 1.0 6.50e+08 2.6 6.1e+04 4.9e+04 1.7e+02 16 53 7 65 16 75 53 7 80 18 8438 repartition 2 1.0 3.2711e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 2 1.0 4.0507e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 2.5859e-03 1.5 0.00e+00 0.0 3.1e+02 2.0e+03 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 Move P 2 1.0 2.9900e-03 1.4 0.00e+00 0.0 4.5e+02 1.0e+02 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 PCSetUp 5 1.0 3.3360e+00 1.0 6.75e+08 2.5 1.8e+05 1.7e+04 6.2e+02 18 56 20 68 58 87 56 22 83 66 7714 PCSetUpOnBlocks 129 1.0 2.3839e-03 1.2 3.40e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 7488 PCApply 17 1.0 3.5413e+00 1.0 1.01e+09 1.8 8.2e+05 4.5e+03 6.5e+02 19 95 88 80 60 92 95 97 99 68 12428 KSPGMRESOrthog 96 1.0 2.6282e-02 2.8 1.51e+07 1.0 0.0e+00 0.0e+00 9.6e+01 0 2 0 0 9 0 2 0 0 10 31445 KSPSetUp 18 1.0 7.0262e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 3.6885e+00 1.0 1.03e+09 1.8 8.4e+05 4.4e+03 8.7e+02 20 97 91 81 81 96 97 99 99 92 12218 SFSetGraph 4 1.0 1.6308e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 27 1.0 2.0990e-03 2.8 0.00e+00 0.0 1.9e+04 4.4e+02 5.0e+00 0 0 2 0 0 0 0 2 0 1 0 SFBcastEnd 27 1.0 7.8893e-04 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 213 220 22661528 0. IS L to G Mapping 3 3 12801292 0. Section 70 53 35616 0. Vector 15 141 11422112 0. Vector Scatter 2 15 498936 0. Matrix 0 52 15514332 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 102 88 158676 0. IS L to G Mapping 4 0 0 0. Vector 356 218 2450960 0. Vector Scatter 40 21 23040 0. Matrix 145 80 29369344 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 9.44138e-06 Average time for zero size MPI_Send(): 1.28576e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= gamg 40 1 ================= Discretization: RT MPI processes 64: solving... ((17544, 1161600), (17544, 1161600)) Solver time: 3.773817e+00 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 64 processors, by jychang48 Wed Mar 2 18:03:15 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.874e+01 1.00026 1.873e+01 Objects: 1.130e+03 1.16495 9.868e+02 Flops: 9.997e+08 2.24666 6.960e+08 4.454e+10 Flops/sec: 5.336e+07 2.24622 3.715e+07 2.378e+09 MPI Messages: 2.500e+04 2.38812 1.814e+04 1.161e+06 MPI Message Lengths: 2.334e+08 8.33714 4.205e+03 4.883e+09 MPI Reductions: 1.081e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4959e+01 79.9% 0.0000e+00 0.0% 1.007e+05 8.7% 7.457e+02 17.7% 1.250e+02 11.6% 1: FEM: 3.7738e+00 20.1% 4.4543e+10 100.0% 1.061e+06 91.3% 3.459e+03 82.3% 9.550e+02 88.3% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.2383e+0012.0 0.00e+00 0.0 2.4e+04 4.0e+00 4.4e+01 6 0 2 0 4 8 0 24 0 35 0 VecScatterBegin 2 1.0 1.9073e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 7.1526e-06 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.2405e+00 1.1 0.00e+00 0.0 4.4e+04 1.8e+03 2.1e+01 12 0 4 2 2 15 0 44 9 17 0 Mesh Migration 2 1.0 3.9701e-01 1.0 0.00e+00 0.0 4.9e+04 1.4e+04 5.4e+01 2 0 4 14 5 3 0 48 79 43 0 DMPlexInterp 1 1.0 2.1140e+0062441.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.5115e+00 1.1 0.00e+00 0.0 3.3e+04 1.0e+04 2.5e+01 13 0 3 7 2 17 0 33 38 20 0 DMPlexDistCones 2 1.0 9.2092e-02 1.2 0.00e+00 0.0 7.2e+03 3.1e+04 4.0e+00 0 0 1 5 0 1 0 7 26 3 0 DMPlexDistLabels 2 1.0 2.6102e-01 1.0 0.00e+00 0.0 2.9e+04 1.3e+04 2.2e+01 1 0 3 8 2 2 0 29 45 18 0 DMPlexDistribOL 1 1.0 1.4294e-01 1.1 0.00e+00 0.0 6.1e+04 8.0e+03 5.0e+01 1 0 5 10 5 1 0 60 56 40 0 DMPlexDistField 3 1.0 3.2045e-02 2.3 0.00e+00 0.0 9.7e+03 3.4e+03 1.2e+01 0 0 1 1 1 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0870e+0055.1 0.00e+00 0.0 3.5e+04 1.0e+03 6.0e+00 6 0 3 1 1 7 0 35 4 5 0 DMPlexStratify 6 1.5 5.4254e-0157.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 2.8019e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.2455e+00 4.0 0.00e+00 0.0 9.7e+04 8.7e+03 4.1e+01 6 0 8 17 4 8 0 96 97 33 0 SFBcastEnd 95 1.0 3.0168e-0110.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.6369e-0322.3 0.00e+00 0.0 2.9e+03 5.8e+03 3.0e+00 0 0 0 0 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.7442e-03 8.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.6001e-0516.8 0.00e+00 0.0 3.2e+02 1.3e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.6093e-04 4.6 0.00e+00 0.0 3.2e+02 1.3e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 17 1.0 2.0092e-03 4.2 0.00e+00 0.0 2.9e+03 4.0e+00 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 BuildTwoSidedF 12 1.0 4.2796e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecMDot 96 1.0 2.3543e-02 5.1 6.61e+06 1.1 0.0e+00 0.0e+00 9.6e+01 0 1 0 0 9 0 1 0 0 10 17550 VecNorm 105 1.0 1.2387e-02 5.0 9.42e+05 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 4759 VecScale 217 1.0 1.4968e-03 1.4 1.26e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 52895 VecCopy 77 1.0 3.4785e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 596 1.0 2.5966e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 7.5102e-05 1.4 6.53e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 54480 VecAYPX 544 1.0 1.8840e-03 1.5 1.19e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 39891 VecAXPBYCZ 272 1.0 1.1594e-03 1.5 2.37e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 
0 0 0 0 0 0 0 0 0 129640 VecMAXPY 105 1.0 1.9822e-03 1.1 7.49e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 236134 VecAssemblyBegin 14 1.0 5.4741e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 14 1.0 7.1526e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 44 1.0 1.6665e-04 1.5 7.68e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 29180 VecScatterBegin 988 1.0 3.0876e-02 2.5 0.00e+00 0.0 9.0e+05 8.2e+02 0.0e+00 0 0 78 15 0 1 0 85 18 0 0 VecScatterEnd 988 1.0 9.7675e-02 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 VecSetRandom 4 1.0 3.1018e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 105 1.0 1.3980e-02 3.5 1.41e+06 1.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 10 0 0 0 0 11 6325 MatMult 521 1.0 2.2436e-01 1.2 2.57e+08 1.8 7.9e+05 8.8e+02 1.3e+02 1 28 68 14 12 5 28 75 17 14 55789 MatMultAdd 227 1.0 3.2643e-02 1.8 1.14e+07 1.2 6.5e+04 5.4e+02 0.0e+00 0 2 6 1 0 1 2 6 1 0 20795 MatMultTranspose 68 1.0 1.7001e-02 2.5 5.40e+06 1.4 4.6e+04 2.7e+02 0.0e+00 0 1 4 0 0 0 1 4 0 0 17550 MatSolve 129 1.2 2.5924e-02 1.2 1.76e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 41475 MatSOR 452 1.0 7.1886e-02 1.3 9.34e+07 1.3 0.0e+00 0.0e+00 0.0e+00 0 12 0 0 0 2 12 0 0 0 74824 MatLUFactorSym 1 1.0 3.0994e-05 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.1089e-03 1.1 2.96e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 16070 MatILUFactorSym 1 1.0 7.7200e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 5 1.0 6.6314e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 14 1.0 4.2133e-03 2.7 2.33e+06 1.7 6.3e+03 8.7e+02 0.0e+00 0 0 1 0 0 0 0 1 0 0 27151 MatResidual 68 1.0 2.6713e-02 1.2 3.27e+07 1.9 1.1e+05 8.7e+02 0.0e+00 0 3 9 2 0 1 3 10 2 0 58240 MatAssemblyBegin 93 1.0 4.7054e-01 4.1 0.00e+00 0.0 2.2e+04 7.8e+04 5.8e+01 2 0 2 36 5 9 0 2 44 6 0 MatAssemblyEnd 93 1.0 8.0991e-01 1.2 0.00e+00 0.0 4.6e+04 1.2e+02 2.2e+02 4 0 4 0 20 21 0 4 0 23 0 MatGetRow 32943 1.0 1.2840e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 MatGetRowIJ 2 2.0 6.9141e-06 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 8 1.0 4.1435e-03 1.0 0.00e+00 0.0 8.6e+02 6.5e+02 7.4e+01 0 0 0 0 7 0 0 0 0 8 0 MatGetOrdering 2 2.0 8.7023e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 2.1451e-03 1.0 0.00e+00 0.0 2.2e+04 2.9e+02 2.5e+01 0 0 2 0 2 0 0 2 0 3 0 MatZeroEntries 4 1.0 2.5702e-03 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 1.5406e-01 1.0 0.00e+00 0.0 1.2e+03 2.2e+02 2.0e+01 1 0 0 0 2 4 0 0 0 2 0 MatMatMult 5 1.0 5.2665e-02 1.0 2.08e+06 1.8 4.0e+04 4.6e+02 8.0e+01 0 0 3 0 7 1 0 4 0 8 1926 MatMatMultSym 5 1.0 4.6258e-02 1.0 0.00e+00 0.0 3.4e+04 3.7e+02 7.0e+01 0 0 3 0 6 1 0 3 0 7 0 MatMatMultNum 5 1.0 6.3677e-03 1.0 2.08e+06 1.8 6.6e+03 9.2e+02 1.0e+01 0 0 1 0 1 0 0 1 0 1 15926 MatPtAP 4 1.0 2.8912e+00 1.0 6.30e+08 3.8 7.6e+04 4.2e+04 6.8e+01 15 52 7 66 6 77 52 7 80 7 7999 MatPtAPSymbolic 4 1.0 1.5281e+00 1.0 0.00e+00 0.0 3.8e+04 3.9e+04 2.8e+01 8 0 3 30 3 40 0 4 36 3 0 MatPtAPNumeric 4 1.0 1.3641e+00 1.0 6.30e+08 3.8 3.8e+04 4.6e+04 4.0e+01 7 52 3 36 4 36 52 4 44 4 16954 MatTrnMatMult 1 1.0 1.1292e-02 1.0 3.14e+05 1.1 3.7e+03 1.4e+03 1.9e+01 0 0 0 0 2 0 0 0 0 2 1728 MatTrnMatMultSym 1 1.0 7.5750e-03 1.0 0.00e+00 0.0 3.1e+03 7.8e+02 1.7e+01 0 0 0 0 2 0 0 0 0 2 0 MatTrnMatMultNum 1 1.0 3.7148e-03 1.0 3.14e+05 1.1 
6.1e+02 4.3e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 5254 MatGetLocalMat 16 1.0 2.3561e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 14 1.0 5.9310e-02 1.7 0.00e+00 0.0 4.5e+04 2.3e+04 0.0e+00 0 0 4 21 0 1 0 4 26 0 0 PCGAMGGraph_AGG 4 1.0 1.2765e-01 1.0 1.93e+06 1.9 1.1e+04 5.6e+02 4.8e+01 1 0 1 0 4 3 0 1 0 5 717 PCGAMGCoarse_AGG 4 1.0 1.5398e-02 1.0 3.14e+05 1.1 2.9e+04 5.6e+02 5.2e+01 0 0 3 0 5 0 0 3 0 5 1268 PCGAMGProl_AGG 4 1.0 6.5541e-03 1.0 0.00e+00 0.0 1.0e+04 3.8e+02 8.0e+01 0 0 1 0 7 0 0 1 0 8 0 PCGAMGPOpt_AGG 4 1.0 1.2338e-01 1.0 2.34e+07 1.8 1.0e+05 7.1e+02 1.9e+02 1 3 9 1 17 3 3 10 2 20 9290 GAMG: createProl 4 1.0 2.7354e-01 1.0 2.57e+07 1.8 1.5e+05 6.5e+02 3.7e+02 1 3 13 2 34 7 3 14 2 39 4596 Graph 8 1.0 1.2751e-01 1.0 1.93e+06 1.9 1.1e+04 5.6e+02 4.8e+01 1 0 1 0 4 3 0 1 0 5 718 MIS/Agg 4 1.0 2.1992e-03 1.0 0.00e+00 0.0 2.2e+04 2.9e+02 2.5e+01 0 0 2 0 2 0 0 2 0 3 0 SA: col data 4 1.0 1.3566e-03 1.1 0.00e+00 0.0 4.5e+03 7.5e+02 2.4e+01 0 0 0 0 2 0 0 0 0 3 0 SA: frmProl0 4 1.0 4.7398e-03 1.0 0.00e+00 0.0 5.6e+03 8.8e+01 4.0e+01 0 0 0 0 4 0 0 1 0 4 0 SA: smooth 4 1.0 1.2335e-01 1.0 2.34e+07 1.8 1.0e+05 7.1e+02 1.9e+02 1 3 9 1 17 3 3 10 2 20 9292 GAMG: partLevel 4 1.0 2.8968e+00 1.0 6.30e+08 3.8 7.8e+04 4.1e+04 1.7e+02 15 52 7 66 16 77 52 7 80 18 7984 repartition 2 1.0 3.7479e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 2 1.0 4.2200e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 2.3668e-03 1.5 0.00e+00 0.0 3.5e+02 1.4e+03 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 Move P 2 1.0 2.7559e-03 1.4 0.00e+00 0.0 5.1e+02 9.7e+01 3.6e+01 0 0 0 0 3 0 0 0 0 4 0 PCSetUp 5 1.0 3.2839e+00 1.0 6.53e+08 3.6 2.3e+05 1.4e+04 6.3e+02 18 55 20 68 58 87 55 22 83 66 7435 PCSetUpOnBlocks 129 1.0 2.1794e-03 1.3 2.96e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8177 PCApply 17 1.0 3.4818e+00 1.0 9.60e+08 2.3 1.0e+06 3.8e+03 6.5e+02 19 95 89 81 60 92 95 97 99 68 12115 KSPGMRESOrthog 96 1.0 2.5954e-02 3.6 1.32e+07 1.1 0.0e+00 0.0e+00 9.6e+01 0 2 0 0 9 0 2 0 0 10 31841 KSPSetUp 18 1.0 7.0524e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 1 1.0 3.6077e+00 1.0 9.76e+08 2.3 1.1e+06 3.8e+03 8.8e+02 19 97 91 82 81 96 97100 99 92 11984 SFSetGraph 4 1.0 1.5688e-04 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 30 1.0 2.1074e-03 2.9 0.00e+00 0.0 2.4e+04 3.8e+02 5.0e+00 0 0 2 0 0 0 0 2 0 1 0 SFBcastEnd 30 1.0 1.0002e-03 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. PetscRandom 0 1 646 0. Index Set 227 234 22508856 0. IS L to G Mapping 3 3 12541056 0. Section 70 53 35616 0. Vector 15 141 10638768 0. Vector Scatter 2 15 437496 0. Matrix 0 52 12578516 0. Preconditioner 0 11 11020 0. Krylov Solver 0 15 151752 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM PetscRandom 1 0 0 0. Index Set 102 88 152948 0. IS L to G Mapping 4 0 0 0. Vector 356 218 2199448 0. Vector Scatter 40 21 23056 0. Matrix 145 80 23656560 0. Matrix Coarsen 4 4 2544 0. Preconditioner 21 10 8944 0. Krylov Solver 21 6 123480 0. Star Forest Bipartite Graph 4 4 3456 0. 
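[Editor's sketch, not part of the original attachments.] Each -log_summary dump above and below repeats the same few per-run summary lines ("================= gamg 40 1 =================", "MPI processes N: solving...", "Solver time: ...", "Solver iterations: ..."), so the GAMG versus HYPRE strong-scaling numbers can be tabulated with a short script instead of being read off by hand. A minimal sketch in Python, assuming the scrubbed attachments have been saved as ordinary text files; the file names passed on the command line are whatever you saved them as, and nothing here is a PETSc tool:

import re
import sys

# Matches the per-run summary lines that appear in each -log_summary dump:
#   "================= gamg 40 1 ================="
#   "MPI processes 56: solving..."
#   "Solver time: 3.855059e+00"
#   "Solver iterations: 16"
RUN = re.compile(r"=+\s*(\w+)\s+\d+\s+\d+\s*=+.*?"      # banner: solver name (gamg / hypre)
                 r"MPI processes\s+(\d+):.*?"            # process count for this run
                 r"Solver time:\s*([-+.\deE]+).*?"       # wall-clock solve time
                 r"Solver iterations:\s*(\d+)", re.S)    # Krylov iteration count

for path in sys.argv[1:]:
    with open(path) as f:
        text = f.read()
    for solver, nproc, time, iters in RUN.findall(text):
        print("%-6s np=%-3d time=%10.4f s iterations=%s"
              % (solver, int(nproc), float(time), iters))

Run as, e.g., "python scaling.py gamg.log hypre.log" (hypothetical file names) to get one line per solve, which is enough to plot solver time against process count for the two preconditioners.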
======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 8.63075e-06 Average time for zero size MPI_Send(): 1.65775e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type gamg -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack 
-lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- -------------- next part -------------- ================= hypre 40 1 ================= Discretization: RT MPI processes 1: solving... ((1161600, 1161600), (1161600, 1161600)) Solver time: 4.842733e+01 Solver iterations: 12 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 1 processor, by jychang48 Wed Mar 2 17:34:27 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 6.507e+01 1.00000 6.507e+01 Objects: 2.470e+02 1.00000 2.470e+02 Flops: 1.711e+09 1.00000 1.711e+09 1.711e+09 Flops/sec: 2.630e+07 1.00000 2.630e+07 2.630e+07 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.6646e+01 25.6% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: FEM: 4.8427e+01 74.4% 1.7111e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 8 1.0 1.5751e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 2 1.0 3.6759e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexInterp 1 1.0 2.1008e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 13 0 0 0 0 0 DMPlexStratify 4 1.0 5.1098e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 SFSetGraph 7 1.0 2.6133e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM VecMDot 12 1.0 6.1646e-02 1.0 1.81e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 2939 VecNorm 13 1.0 1.5800e-02 1.0 3.02e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1911 VecScale 26 1.0 1.6351e-02 1.0 2.52e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1542 VecCopy 1 1.0 2.6531e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 104 1.0 1.5827e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 1.4691e-03 1.0 2.32e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1581 VecMAXPY 13 1.0 7.4229e-02 1.0 2.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 12 0 0 0 0 12 0 0 0 2817 VecScatterBegin 58 1.0 6.3398e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 13 1.0 2.4897e-02 1.0 4.53e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 1820 MatMult 25 1.0 2.4851e-01 1.0 2.70e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 16 0 0 0 1 16 0 0 0 1088 MatMultAdd 48 1.0 1.8471e-01 1.0 2.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 13 0 0 0 0 13 0 0 0 1249 MatSolve 13 1.0 1.4378e-01 1.0 1.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 904 MatLUFactorNum 1 1.0 6.5479e-02 1.0 1.81e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 276 MatILUFactorSym 1 1.0 4.7690e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 4.4951e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 8.9600e-03 1.0 5.34e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 596 MatAssemblyBegin 10 1.0 4.7684e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 10 1.0 9.9671e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRow 768000 1.0 4.7885e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 2 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 4.8168e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 2.9769e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.0544e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatMatMult 1 1.0 1.2049e-01 1.0 1.33e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 111 MatMatMultSym 1 1.0 8.2528e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultNum 1 1.0 3.7939e-02 
1.0 1.33e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 351 PCSetUp 4 1.0 2.7671e+01 1.0 3.68e+07 1.0 0.0e+00 0.0e+00 0.0e+00 43 2 0 0 0 57 2 0 0 0 1 PCSetUpOnBlocks 13 1.0 1.1621e-01 1.0 1.81e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 156 PCApply 13 1.0 4.5220e+01 1.0 1.98e+08 1.0 0.0e+00 0.0e+00 0.0e+00 69 12 0 0 0 93 12 0 0 0 4 KSPGMRESOrthog 12 1.0 1.2641e-01 1.0 3.62e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 21 0 0 0 0 21 0 0 0 2867 KSPSetUp 4 1.0 2.3469e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 4.6958e+01 1.0 8.85e+08 1.0 0.0e+00 0.0e+00 0.0e+00 72 52 0 0 0 97 52 0 0 0 19 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 22 22 38639672 0. Section 26 8 5376 0. Vector 13 31 178264920 0. Vector Scatter 2 6 3984 0. Matrix 0 3 124219284 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 10 4 19256 0. GraphPartitioner 4 3 1836 0. Star Forest Bipartite Graph 23 12 9696 0. Discrete System 10 4 3456 0. --- Event Stage 1: FEM Index Set 19 12 9408 0. IS L to G Mapping 4 0 0 0. Vector 79 52 21737472 0. Vector Scatter 6 0 0 0. Matrix 10 2 37023836 0. Preconditioner 6 1 1016 0. Krylov Solver 6 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type hypre -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on 
wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= hypre 40 1 ================= Discretization: RT MPI processes 2: solving... 
((579051, 1161600), (579051, 1161600)) Solver time: 3.476467e+01 Solver iterations: 15 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 2 processors, by jychang48 Wed Mar 2 17:35:18 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 4.903e+01 1.00022 4.903e+01 Objects: 4.840e+02 1.02979 4.770e+02 Flops: 1.033e+09 1.00377 1.031e+09 2.063e+09 Flops/sec: 2.108e+07 1.00400 2.104e+07 4.207e+07 MPI Messages: 3.485e+02 1.24687 3.140e+02 6.280e+02 MPI Message Lengths: 4.050e+08 1.62105 1.043e+06 6.549e+08 MPI Reductions: 4.220e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4264e+01 29.1% 0.0000e+00 0.0% 5.020e+02 79.9% 9.935e+05 95.3% 1.250e+02 29.6% 1: FEM: 3.4765e+01 70.9% 2.0627e+09 100.0% 1.260e+02 20.1% 4.932e+04 4.7% 2.960e+02 70.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 8.3311e-01 9.6 0.00e+00 0.0 1.2e+02 4.0e+00 4.4e+01 1 0 19 0 10 3 0 24 0 35 0 VecScatterBegin 2 1.0 1.6110e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.6245e+00 1.1 0.00e+00 0.0 9.2e+01 5.5e+05 2.1e+01 3 0 15 8 5 11 0 18 8 17 0 Mesh Migration 2 1.0 1.7902e+00 1.0 0.00e+00 0.0 3.7e+02 1.4e+06 5.4e+01 4 0 60 79 13 13 0 75 83 43 0 DMPlexInterp 1 1.0 2.0429e+0045337.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 7 0 0 0 0 0 DMPlexDistribute 1 1.0 2.2455e+00 1.1 0.00e+00 0.0 1.7e+02 1.9e+06 2.5e+01 4 0 26 48 6 15 0 33 50 20 0 DMPlexDistCones 2 1.0 3.6353e-01 1.0 0.00e+00 0.0 5.4e+01 3.2e+06 4.0e+00 1 0 9 27 1 3 0 11 28 3 0 DMPlexDistLabels 2 1.0 9.6565e-01 1.0 0.00e+00 0.0 2.4e+02 1.2e+06 2.2e+01 2 0 38 45 5 7 0 48 47 18 0 DMPlexDistribOL 1 1.0 1.1889e+00 1.0 0.00e+00 0.0 3.1e+02 9.6e+05 5.0e+01 2 0 49 45 12 8 0 61 48 40 0 DMPlexDistField 3 1.0 4.3184e-02 1.1 0.00e+00 0.0 6.2e+01 3.5e+05 1.2e+01 0 0 10 3 3 0 0 12 3 10 0 DMPlexDistData 2 1.0 8.3491e-0126.1 0.00e+00 0.0 5.4e+01 4.0e+05 6.0e+00 1 0 9 3 1 3 0 11 3 5 0 DMPlexStratify 6 1.5 7.6915e-01 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 SFSetGraph 51 1.0 4.2324e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 SFBcastBegin 95 1.0 9.3474e-01 3.1 0.00e+00 0.0 4.8e+02 1.2e+06 4.1e+01 1 0 77 92 10 4 0 96 96 33 0 SFBcastEnd 95 1.0 4.0109e-01 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 3.5613e-03 1.4 0.00e+00 0.0 1.1e+01 1.3e+06 3.0e+00 0 0 2 2 1 0 0 2 2 2 0 SFReduceEnd 4 1.0 5.1992e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.0994e-05 6.2 0.00e+00 0.0 1.0e+00 4.2e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 9.9111e-04 6.3 0.00e+00 0.0 1.0e+00 4.2e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.8351e-03108.4 0.00e+00 0.0 2.0e+00 4.0e+00 1.0e+00 0 0 0 0 0 0 0 2 0 0 0 VecMDot 15 1.0 4.5149e-02 1.0 1.40e+08 1.0 0.0e+00 0.0e+00 1.5e+01 0 14 0 0 4 0 14 0 0 5 6175 VecNorm 16 1.0 1.0752e-02 1.2 1.86e+07 1.0 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 3457 VecScale 32 1.0 9.4597e-03 1.0 1.56e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 3280 VecCopy 1 1.0 1.1580e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 104 1.0 3.4921e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 6.0606e-04 1.0 1.17e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3833 VecMAXPY 16 1.0 5.1952e-02 1.0 1.57e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 6037 VecScatterBegin 146 1.0 3.5844e-02 1.0 0.00e+00 0.0 7.6e+01 3.4e+04 0.0e+00 0 0 12 0 0 0 0 60 8 0 0 VecScatterEnd 146 1.0 2.1336e-03 1.2 0.00e+00 0.0 0.0e+00 
0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 16 1.0 1.6374e-02 1.1 2.80e+07 1.0 0.0e+00 0.0e+00 1.6e+01 0 3 0 0 4 0 3 0 0 5 3405 MatMult 31 1.0 1.5530e-01 1.0 1.69e+08 1.0 7.6e+01 3.4e+04 1.2e+02 0 16 12 0 28 0 16 60 8 41 2172 MatMultAdd 60 1.0 1.1559e-01 1.0 1.45e+08 1.0 6.0e+01 3.6e+04 0.0e+00 0 14 10 0 0 0 14 48 7 0 2494 MatSolve 16 1.0 9.8875e-02 1.0 8.00e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 1611 MatLUFactorNum 1 1.0 3.3598e-02 1.0 9.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 538 MatILUFactorSym 1 1.0 2.3966e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 2.4728e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 3.5920e-03 1.0 2.67e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1485 MatAssemblyBegin 12 1.0 5.8002e-0312.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 9.0917e-02 1.0 0.00e+00 0.0 1.6e+01 7.8e+03 4.8e+01 0 0 3 0 11 0 0 13 0 16 0 MatGetRow 384000 1.0 3.7148e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 3 1.0 3.0994e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 2.1708e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 1.4248e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 7.6908e-01 1.0 0.00e+00 0.0 4.0e+00 6.7e+03 1.2e+01 2 0 1 0 3 2 0 3 0 4 0 MatMatMult 1 1.0 1.4067e-01 1.0 4.95e+06 1.0 8.0e+00 2.2e+04 1.6e+01 0 0 1 0 4 0 0 6 1 5 70 MatMatMultSym 1 1.0 1.2321e-01 1.0 0.00e+00 0.0 7.0e+00 1.8e+04 1.4e+01 0 0 1 0 3 0 0 6 0 5 0 MatMatMultNum 1 1.0 1.7438e-02 1.0 4.95e+06 1.0 1.0e+00 5.5e+04 2.0e+00 0 0 0 0 0 0 0 1 0 1 568 MatGetLocalMat 2 1.0 2.2079e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 1.1523e-03 1.5 0.00e+00 0.0 4.0e+00 3.8e+04 0.0e+00 0 0 1 0 0 0 0 3 0 0 0 PCSetUp 4 1.0 2.2106e+01 1.0 1.67e+07 1.0 2.0e+01 4.7e+05 6.6e+01 45 2 3 1 16 64 2 16 31 22 2 PCSetUpOnBlocks 16 1.0 5.9039e-02 1.0 9.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 306 PCApply 16 1.0 3.2480e+01 1.0 1.20e+08 1.0 1.6e+01 2.7e+04 4.0e+00 66 12 3 0 1 93 12 13 1 1 7 KSPGMRESOrthog 15 1.0 9.1311e-02 1.0 2.80e+08 1.0 0.0e+00 0.0e+00 1.5e+01 0 27 0 0 4 0 27 0 0 5 6106 KSPSetUp 4 1.0 9.7001e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 3.3756e+01 1.0 5.98e+08 1.0 9.6e+01 1.3e+05 2.2e+02 69 58 15 2 51 97 58 76 39 73 35 SFBcastBegin 1 1.0 1.8959e-0324.3 0.00e+00 0.0 6.0e+00 4.1e+04 1.0e+00 0 0 1 0 0 0 0 5 1 0 0 SFBcastEnd 1 1.0 3.1090e-04 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 79 79 49124340 0. IS L to G Mapping 3 3 23945692 0. Section 70 53 35616 0. Vector 15 45 140251432 0. Vector Scatter 2 7 13904896 0. Matrix 0 5 64871960 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 73104 0. IS L to G Mapping 4 0 0 0. Vector 114 72 10946608 0. Vector Scatter 13 2 2192 0. Matrix 26 8 52067772 0. Preconditioner 6 1 896 0. 
Krylov Solver 6 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 6.19888e-07 Average time for MPI_Barrier(): 1.62125e-06 Average time for zero size MPI_Send(): 2.5034e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type hypre -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps 
-lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= hypre 40 1 ================= Discretization: RT MPI processes 4: solving... ((288348, 1161600), (288348, 1161600)) Solver time: 2.221880e+01 Solver iterations: 15 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document                                                                  ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 4 processors, by jychang48 Wed Mar 2 17:35:54 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           3.523e+01      1.00003   3.523e+01
Objects:              4.920e+02      1.04237   4.775e+02
Flops:                5.295e+08      1.00702   5.270e+08  2.108e+09
Flops/sec:            1.503e+07      1.00704   1.496e+07  5.983e+07
MPI Messages:         7.315e+02      1.64938   5.530e+02  2.212e+03
MPI Message Lengths:  2.891e+08      2.21291   3.089e+05  6.833e+08
MPI Reductions:       4.220e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.3011e+01  36.9%  0.0000e+00   0.0%  1.654e+03  74.8%  2.935e+05       95.0%  1.250e+02  29.6%
 1:             FEM: 2.2219e+01  63.1%  2.1080e+09 100.0%  5.580e+02  25.2%  1.541e+04        5.0%  2.960e+02  70.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 9.3237e-0117.3 0.00e+00 0.0 3.9e+02 4.0e+00 4.4e+01 2 0 18 0 10 5 0 24 0 35 0 VecScatterBegin 2 1.0 7.1883e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 5.0068e-06 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.7474e+00 1.1 0.00e+00 0.0 3.8e+02 1.4e+05 2.1e+01 5 0 17 8 5 13 0 23 8 17 0 Mesh Migration 2 1.0 1.0405e+00 1.0 0.00e+00 0.0 1.1e+03 4.7e+05 5.4e+01 3 0 51 79 13 8 0 69 83 43 0 DMPlexInterp 1 1.0 2.0642e+0053115.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 DMPlexDistribute 1 1.0 2.0381e+00 1.1 0.00e+00 0.0 3.9e+02 8.1e+05 2.5e+01 6 0 18 46 6 15 0 23 49 20 0 DMPlexDistCones 2 1.0 2.3054e-01 1.0 0.00e+00 0.0 1.6e+02 1.1e+06 4.0e+00 1 0 7 26 1 2 0 10 28 3 0 DMPlexDistLabels 2 1.0 5.6307e-01 1.0 0.00e+00 0.0 7.2e+02 4.2e+05 2.2e+01 2 0 33 45 5 4 0 44 47 18 0 DMPlexDistribOL 1 1.0 7.6747e-01 1.0 0.00e+00 0.0 1.2e+03 2.8e+05 5.0e+01 2 0 52 46 12 6 0 70 49 40 0 DMPlexDistField 3 1.0 2.9449e-02 1.0 0.00e+00 0.0 2.0e+02 1.1e+05 1.2e+01 0 0 9 3 3 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.2793e-0140.7 0.00e+00 0.0 2.2e+02 1.0e+05 6.0e+00 2 0 10 3 1 5 0 14 4 5 0 DMPlexStratify 6 1.5 6.5141e-01 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFSetGraph 51 1.0 2.4321e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFBcastBegin 95 1.0 9.8964e-01 4.5 0.00e+00 0.0 1.6e+03 4.0e+05 4.1e+01 2 0 71 92 10 6 0 95 96 33 0 SFBcastEnd 95 1.0 3.2814e-01 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 2.9824e-03 2.1 0.00e+00 0.0 4.9e+01 2.9e+05 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 6.0050e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.6001e-05 9.4 0.00e+00 0.0 5.0e+00 1.7e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 4.2892e-04 2.8 0.00e+00 0.0 5.0e+00 1.7e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.8361e-0391.7 0.00e+00 0.0 1.0e+01 4.0e+00 1.0e+00 0 0 0 0 0 0 0 2 0 0 0 VecMDot 15 1.0 3.0009e-02 1.2 7.01e+07 1.0 0.0e+00 0.0e+00 1.5e+01 0 13 0 0 4 0 13 0 0 5 9290 VecNorm 16 1.0 6.5994e-03 1.3 9.35e+06 1.0 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 5632 VecScale 32 1.0 4.8001e-03 1.0 7.81e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 6464 VecCopy 1 1.0 5.7697e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 104 1.0 1.5269e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 3.2783e-04 1.0 5.84e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 7087 VecMAXPY 16 1.0 2.9337e-02 1.0 7.89e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 10691 VecScatterBegin 146 1.0 1.7044e-02 1.1 0.00e+00 0.0 3.8e+02 1.4e+04 0.0e+00 0 0 17 1 0 0 0 68 15 0 0 VecScatterEnd 146 1.0 1.6215e-03 1.4 0.00e+00 0.0 0.0e+00 
0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 16 1.0 9.4516e-03 1.2 1.40e+07 1.0 0.0e+00 0.0e+00 1.6e+01 0 3 0 0 4 0 3 0 0 5 5899 MatMult 31 1.0 8.5318e-02 1.0 8.48e+07 1.0 3.8e+02 1.4e+04 1.2e+02 0 16 17 1 28 0 16 68 15 41 3953 MatMultAdd 60 1.0 6.3984e-02 1.0 7.25e+07 1.0 3.0e+02 1.4e+04 0.0e+00 0 14 14 1 0 0 14 54 13 0 4506 MatSolve 16 1.0 5.3797e-02 1.1 4.00e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 2948 MatLUFactorNum 1 1.0 1.7410e-02 1.0 4.57e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1037 MatILUFactorSym 1 1.0 6.5880e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 1.2823e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.9329e-03 1.1 1.34e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2758 MatAssemblyBegin 12 1.0 1.0853e-0230.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 4.9526e-02 1.0 0.00e+00 0.0 8.0e+01 3.1e+03 4.8e+01 0 0 4 0 11 0 0 14 1 16 0 MatGetRow 192000 1.0 3.7644e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 MatGetRowIJ 3 1.0 5.0068e-06 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 1.0573e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 6.9499e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 7.6509e-01 1.0 0.00e+00 0.0 2.0e+01 2.7e+03 1.2e+01 2 0 1 0 3 3 0 4 0 4 0 MatMatMult 1 1.0 7.6802e-02 1.0 2.48e+06 1.0 4.0e+01 8.9e+03 1.6e+01 0 0 2 0 4 0 0 7 1 5 129 MatMatMultSym 1 1.0 6.7404e-02 1.1 0.00e+00 0.0 3.5e+01 7.0e+03 1.4e+01 0 0 2 0 3 0 0 6 1 5 0 MatMatMultNum 1 1.0 9.4151e-03 1.0 2.48e+06 1.0 5.0e+00 2.2e+04 2.0e+00 0 0 0 0 0 0 0 1 0 1 1051 MatGetLocalMat 2 1.0 1.1554e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 4.0152e-03 5.9 0.00e+00 0.0 2.0e+01 1.5e+04 0.0e+00 0 0 1 0 0 0 0 4 1 0 0 PCSetUp 4 1.0 1.6053e+01 1.0 8.38e+06 1.0 7.6e+01 1.3e+05 6.6e+01 46 2 3 1 16 72 2 14 28 22 2 PCSetUpOnBlocks 16 1.0 2.4721e-02 1.0 4.57e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 730 PCApply 16 1.0 2.0598e+01 1.0 6.00e+07 1.0 8.0e+01 1.1e+04 4.0e+00 58 11 4 0 1 93 11 14 3 1 12 KSPGMRESOrthog 15 1.0 5.6051e-02 1.1 1.40e+08 1.0 0.0e+00 0.0e+00 1.5e+01 0 26 0 0 4 0 26 0 0 5 9947 KSPSetUp 4 1.0 5.0209e-03 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 2.1633e+01 1.0 3.00e+08 1.0 4.6e+02 3.3e+04 2.2e+02 61 57 21 2 51 97 57 82 44 73 55 SFBcastBegin 1 1.0 1.9069e-0318.0 0.00e+00 0.0 3.0e+01 1.7e+04 1.0e+00 0 0 1 0 0 0 0 5 1 0 0 SFBcastEnd 1 1.0 2.2197e-04 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 87 87 35923148 0. IS L to G Mapping 3 3 18881016 0. Section 70 53 35616 0. Vector 15 45 72396112 0. Vector Scatter 2 7 6928024 0. Matrix 0 5 32297372 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 76832 0. IS L to G Mapping 4 0 0 0. Vector 114 72 5529184 0. Vector Scatter 13 2 2192 0. Matrix 26 8 26014168 0. Preconditioner 6 1 896 0. 
      Krylov Solver     6              1         1352     0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 1.38283e-06
Average time for zero size MPI_Send(): 1.72853e-06
-----------------------------------------
=================
 hypre 40 1
=================
Discretization: RT
MPI processes 8: solving...
((143102, 1161600), (143102, 1161600))
Solver time: 1.735006e+01
Solver iterations: 15
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 8 processors, by jychang48 Wed Mar 2 17:36:26 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           3.019e+01      1.00018   3.019e+01
Objects:              5.080e+02      1.06723   4.808e+02
Flops:                2.751e+08      1.02555   2.702e+08  2.162e+09
Flops/sec:            9.112e+06      1.02543   8.951e+06  7.161e+07
MPI Messages:         1.488e+03      1.92936   9.162e+02  7.330e+03
MPI Message Lengths:  2.303e+08      3.35599   9.783e+04  7.171e+08
MPI Reductions:       4.220e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.2840e+01  42.5%  0.0000e+00   0.0%  5.296e+03  72.3%  9.268e+04       94.7%  1.250e+02  29.6%
 1:             FEM: 1.7350e+01  57.5%  2.1619e+09 100.0%  2.034e+03  27.7%  5.144e+03        5.3%  2.960e+02  70.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 9.9128e-0138.8 0.00e+00 0.0 1.2e+03 4.0e+00 4.4e+01 3 0 17 0 10 7 0 23 0 35 0 VecScatterBegin 2 1.0 2.8896e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 4.2915e-06 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.8551e+00 1.1 0.00e+00 0.0 1.4e+03 4.1e+04 2.1e+01 6 0 19 8 5 14 0 26 8 17 0 Mesh Migration 2 1.0 6.9572e-01 1.0 0.00e+00 0.0 3.4e+03 1.6e+05 5.4e+01 2 0 47 78 13 5 0 65 82 43 0 DMPlexInterp 1 1.0 2.1258e+0061070.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 DMPlexDistribute 1 1.0 2.0482e+00 1.1 0.00e+00 0.0 1.0e+03 3.2e+05 2.5e+01 7 0 14 44 6 16 0 19 47 20 0 DMPlexDistCones 2 1.0 1.6333e-01 1.0 0.00e+00 0.0 4.9e+02 3.8e+05 4.0e+00 1 0 7 26 1 1 0 9 27 3 0 DMPlexDistLabels 2 1.0 3.9636e-01 1.0 0.00e+00 0.0 2.1e+03 1.5e+05 2.2e+01 1 0 29 44 5 3 0 40 47 18 0 DMPlexDistribOL 1 1.0 5.2188e-01 1.0 0.00e+00 0.0 3.9e+03 8.8e+04 5.0e+01 2 0 53 47 12 4 0 73 50 40 0 DMPlexDistField 3 1.0 2.2661e-02 1.2 0.00e+00 0.0 6.4e+02 3.8e+04 1.2e+01 0 0 9 3 3 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.7732e-0152.9 0.00e+00 0.0 8.5e+02 3.0e+04 6.0e+00 3 0 12 4 1 6 0 16 4 5 0 DMPlexStratify 6 1.5 6.0215e-01 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 SFSetGraph 51 1.0 1.4261e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 SFBcastBegin 95 1.0 1.0311e+00 6.1 0.00e+00 0.0 5.0e+03 1.3e+05 4.1e+01 3 0 69 91 10 7 0 95 96 33 0 SFBcastEnd 95 1.0 3.0112e-01 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 3.5210e-03 3.5 0.00e+00 0.0 1.8e+02 8.2e+04 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 6.8002e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 6.3896e-0510.7 0.00e+00 0.0 1.9e+01 7.0e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 2.3389e-04 2.4 0.00e+00 0.0 1.9e+01 7.0e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.9801e-0368.1 0.00e+00 0.0 3.8e+01 4.0e+00 1.0e+00 0 0 1 0 0 0 0 2 0 0 0 VecMDot 15 1.0 2.0872e-02 1.2 3.52e+07 1.0 0.0e+00 0.0e+00 1.5e+01 0 13 0 0 4 0 13 0 0 5 13357 VecNorm 16 1.0 4.1130e-03 1.4 4.69e+06 1.0 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 9038 VecScale 32 1.0 2.5165e-03 1.1 3.92e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 12329 VecCopy 1 1.0 3.9196e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 104 1.0 5.7840e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 2.0885e-04 1.1 2.93e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11124 VecMAXPY 16 1.0 2.0167e-02 1.1 3.95e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 15552 VecScatterBegin 146 1.0 9.3625e-03 1.1 0.00e+00 0.0 1.4e+03 5.6e+03 0.0e+00 0 0 20 1 0 0 0 71 22 0 0 VecScatterEnd 146 1.0 1.7571e-03 1.8 0.00e+00 0.0 0.0e+00 
0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 16 1.0 5.5876e-03 1.2 7.03e+06 1.0 0.0e+00 0.0e+00 1.6e+01 0 3 0 0 4 0 3 0 0 5 9979 MatMult 31 1.0 5.3636e-02 1.0 4.25e+07 1.0 1.4e+03 5.6e+03 1.2e+02 0 16 20 1 28 0 16 71 22 41 6288 MatMultAdd 60 1.0 4.1250e-02 1.0 3.64e+07 1.0 1.1e+03 6.0e+03 0.0e+00 0 13 16 1 0 0 13 56 18 0 6989 MatSolve 16 1.0 2.9535e-02 1.1 2.00e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 5344 MatLUFactorNum 1 1.0 9.0630e-03 1.0 2.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1993 MatILUFactorSym 1 1.0 3.5720e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 7.2582e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.1430e-03 1.1 6.70e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4666 MatAssemblyBegin 12 1.0 1.4740e-0230.9 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 2.9862e-02 1.1 0.00e+00 0.0 3.0e+02 1.3e+03 4.8e+01 0 0 4 0 11 0 0 15 1 16 0 MatGetRow 96000 1.0 3.8036e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 MatGetRowIJ 3 1.0 7.3910e-06 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 5.1844e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 3.9101e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 7.6591e-01 1.0 0.00e+00 0.0 7.6e+01 1.1e+03 1.2e+01 3 0 1 0 3 4 0 4 0 4 0 MatMatMult 1 1.0 4.1730e-02 1.0 1.24e+06 1.0 1.5e+02 3.7e+03 1.6e+01 0 0 2 0 4 0 0 7 1 5 237 MatMatMultSym 1 1.0 3.6349e-02 1.1 0.00e+00 0.0 1.3e+02 2.9e+03 1.4e+01 0 0 2 0 3 0 0 7 1 5 0 MatMatMultNum 1 1.0 5.3639e-03 1.0 1.24e+06 1.0 1.9e+01 9.1e+03 2.0e+00 0 0 0 0 0 0 0 1 0 1 1846 MatGetLocalMat 2 1.0 6.2211e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 2.3620e-03 3.6 0.00e+00 0.0 7.6e+01 6.3e+03 0.0e+00 0 0 1 0 0 0 0 4 1 0 0 PCSetUp 4 1.0 1.3550e+01 1.0 4.22e+06 1.0 2.6e+02 3.8e+04 6.6e+01 45 2 4 1 16 78 2 13 26 22 2 PCSetUpOnBlocks 16 1.0 1.3087e-02 1.1 2.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1380 PCApply 16 1.0 1.6049e+01 1.0 3.01e+07 1.0 3.0e+02 4.5e+03 4.0e+00 53 11 4 0 1 92 11 15 4 1 15 KSPGMRESOrthog 15 1.0 3.8775e-02 1.1 7.03e+07 1.0 0.0e+00 0.0e+00 1.5e+01 0 26 0 0 4 0 26 0 0 5 14380 KSPSetUp 4 1.0 2.0099e-03 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 1.6968e+01 1.0 1.50e+08 1.0 1.7e+03 1.1e+04 2.2e+02 56 55 23 3 51 98 55 84 48 73 70 SFBcastBegin 1 1.0 2.1381e-0310.9 0.00e+00 0.0 1.1e+02 7.1e+03 1.0e+00 0 0 2 0 0 0 0 6 2 0 0 SFBcastEnd 1 1.0 1.9217e-04 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 103 103 29164164 0. IS L to G Mapping 3 3 16320748 0. Section 70 53 35616 0. Vector 15 45 38486192 0. Vector Scatter 2 7 3442120 0. Matrix 0 5 16044008 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 72488 0. IS L to G Mapping 4 0 0 0. Vector 114 72 2819096 0. Vector Scatter 13 2 2192 0. Matrix 26 8 12996772 0. Preconditioner 6 1 896 0. 
      Krylov Solver     6              1         1352     0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 2.57492e-06
Average time for zero size MPI_Send(): 1.63913e-06
-----------------------------------------
=================
 hypre 40 1
=================
Discretization: RT
MPI processes 16: solving...
((70996, 1161600), (70996, 1161600))
Solver time: 1.058687e+01
Solver iterations: 15
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 16 processors, by jychang48 Wed Mar 2 17:36:54 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           2.499e+01      1.00048   2.498e+01
Objects:              5.300e+02      1.11345   4.844e+02
Flops:                1.457e+08      1.07257   1.405e+08  2.248e+09
Flops/sec:            5.832e+06      1.07236   5.625e+06  8.999e+07
MPI Messages:         2.200e+03      2.65801   1.275e+03  2.041e+04
MPI Message Lengths:  2.006e+08      5.65486   3.779e+04  7.712e+08
MPI Reductions:       4.220e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.4394e+01  57.6%  0.0000e+00   0.0%  1.470e+04  72.0%  3.566e+04       94.4%  1.250e+02  29.6%
 1:             FEM: 1.0587e+01  42.4%  2.2481e+09 100.0%  5.706e+03  28.0%  2.128e+03        5.6%  2.960e+02  70.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1248e+0011.8 0.00e+00 0.0 3.5e+03 4.0e+00 4.4e+01 4 0 17 0 10 7 0 24 0 35 0 VecScatterBegin 2 1.0 9.4891e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 5.0068e-06 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.9227e+00 1.1 0.00e+00 0.0 4.4e+03 1.4e+04 2.1e+01 8 0 22 8 5 13 0 30 9 17 0 Mesh Migration 2 1.0 5.1665e-01 1.0 0.00e+00 0.0 8.9e+03 6.6e+04 5.4e+01 2 0 44 77 13 4 0 61 81 43 0 DMPlexInterp 1 1.0 2.1289e+0057607.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 DMPlexDistribute 1 1.0 2.1160e+00 1.1 0.00e+00 0.0 2.9e+03 1.1e+05 2.5e+01 8 0 14 42 6 15 0 20 44 20 0 DMPlexDistCones 2 1.0 1.2104e-01 1.1 0.00e+00 0.0 1.3e+03 1.5e+05 4.0e+00 0 0 6 26 1 1 0 9 27 3 0 DMPlexDistLabels 2 1.0 3.1131e-01 1.0 0.00e+00 0.0 5.5e+03 6.1e+04 2.2e+01 1 0 27 43 5 2 0 38 46 18 0 DMPlexDistribOL 1 1.0 3.4250e-01 1.0 0.00e+00 0.0 1.1e+04 3.5e+04 5.0e+01 1 0 52 49 12 2 0 72 52 40 0 DMPlexDistField 3 1.0 2.6409e-02 1.7 0.00e+00 0.0 1.7e+03 1.5e+04 1.2e+01 0 0 8 3 3 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.8067e-0166.0 0.00e+00 0.0 2.9e+03 9.6e+03 6.0e+00 4 0 14 4 1 6 0 20 4 5 0 DMPlexStratify 6 1.5 5.7033e-0116.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 8.5479e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 SFBcastBegin 95 1.0 1.1410e+00 4.2 0.00e+00 0.0 1.4e+04 5.0e+04 4.1e+01 4 0 69 91 10 7 0 95 97 33 0 SFBcastEnd 95 1.0 2.9869e-01 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 7.5610e-0311.6 0.00e+00 0.0 5.0e+02 3.0e+04 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.0378e-03 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.8876e-0517.1 0.00e+00 0.0 5.4e+01 3.9e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 2.0194e-04 2.8 0.00e+00 0.0 5.4e+01 3.9e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.9481e-0357.1 0.00e+00 0.0 1.1e+02 4.0e+00 1.0e+00 0 0 1 0 0 0 0 2 0 0 0 VecMDot 15 1.0 1.0908e-02 1.3 1.77e+07 1.0 0.0e+00 0.0e+00 1.5e+01 0 12 0 0 4 0 12 0 0 5 25557 VecNorm 16 1.0 2.3785e-03 1.4 2.36e+06 1.0 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 15628 VecScale 32 1.0 1.2298e-03 1.1 1.97e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 25230 VecCopy 1 1.0 2.2292e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 104 1.0 2.8174e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 1.0705e-04 1.2 1.47e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 21702 VecMAXPY 16 1.0 7.5836e-03 1.1 1.99e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 14 0 0 0 0 14 0 0 0 41357 VecScatterBegin 146 1.0 5.2605e-03 1.2 0.00e+00 0.0 4.1e+03 3.1e+03 0.0e+00 0 0 20 2 0 0 0 72 30 0 0 VecScatterEnd 146 1.0 1.6868e-03 2.5 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 16 1.0 3.2125e-03 1.3 3.53e+06 1.0 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 17356 MatMult 31 1.0 2.9088e-02 1.1 2.15e+07 1.0 4.1e+03 3.1e+03 1.2e+02 0 15 20 2 28 0 15 72 30 41 11595 MatMultAdd 60 1.0 2.0502e-02 1.1 1.84e+07 1.0 3.2e+03 3.3e+03 0.0e+00 0 13 16 1 0 0 13 57 25 0 14061 MatSolve 16 1.0 1.4782e-02 1.2 1.00e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 10598 MatLUFactorNum 1 1.0 4.4949e-03 1.1 1.17e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4012 MatILUFactorSym 1 1.0 1.7769e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 3.6948e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.8299e-03 4.0 3.37e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2915 MatAssemblyBegin 12 1.0 6.6197e-0314.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 1.8342e-02 1.2 0.00e+00 0.0 8.6e+02 7.2e+02 4.8e+01 0 0 4 0 11 0 0 15 1 16 0 MatGetRow 48000 1.0 1.9010e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 MatGetRowIJ 3 1.0 7.3910e-06 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 2.5330e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 1.8501e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 3.8416e-01 1.0 0.00e+00 0.0 2.2e+02 6.2e+02 1.2e+01 2 0 1 0 3 4 0 4 0 4 0 MatMatMult 1 1.0 2.2542e-02 1.0 6.22e+05 1.0 4.3e+02 2.1e+03 1.6e+01 0 0 2 0 4 0 0 8 2 5 439 MatMatMultSym 1 1.0 1.9671e-02 1.0 0.00e+00 0.0 3.8e+02 1.6e+03 1.4e+01 0 0 2 0 3 0 0 7 1 5 0 MatMatMultNum 1 1.0 2.8560e-03 1.0 6.22e+05 1.0 5.4e+01 5.1e+03 2.0e+00 0 0 0 0 0 0 0 1 1 1 3467 MatGetLocalMat 2 1.0 3.1629e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 1.3680e-03 3.8 0.00e+00 0.0 2.2e+02 3.5e+03 0.0e+00 0 0 1 0 0 0 0 4 2 0 0 PCSetUp 4 1.0 8.6555e+00 1.0 2.13e+06 1.0 7.1e+02 1.4e+04 6.6e+01 35 1 3 1 16 82 1 12 24 22 4 PCSetUpOnBlocks 16 1.0 6.4955e-03 1.1 1.17e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2776 PCApply 16 1.0 9.8637e+00 1.0 1.51e+07 1.1 8.6e+02 2.5e+03 4.0e+00 39 11 4 0 1 93 11 15 5 1 24 KSPGMRESOrthog 15 1.0 1.7411e-02 1.1 3.53e+07 1.0 0.0e+00 0.0e+00 1.5e+01 0 25 0 0 4 0 25 0 0 5 32024 KSPSetUp 4 1.0 1.0309e-03 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 1.0327e+01 1.0 7.56e+07 1.0 4.8e+03 4.8e+03 2.2e+02 41 53 24 3 51 98 53 84 53 73 115 SFBcastBegin 1 1.0 2.0330e-0312.4 0.00e+00 0.0 3.3e+02 3.9e+03 1.0e+00 0 0 2 0 0 0 0 6 3 0 0 SFBcastEnd 1 1.0 6.5589e-0418.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 125 125 25601172 0. IS L to G Mapping 3 3 15014432 0. Section 70 53 35616 0. Vector 15 45 21647864 0. Vector Scatter 2 7 1711576 0. Matrix 0 5 7939056 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 62680 0. IS L to G Mapping 4 0 0 0. Vector 114 72 1468496 0. Vector Scatter 13 2 2192 0. Matrix 26 8 6483848 0. 
     Preconditioner     6              1          896     0.
      Krylov Solver     6              1         1352     0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 5.19753e-06
Average time for zero size MPI_Send(): 1.74344e-06
-----------------------------------------
=================
 hypre 40 1
=================
Discretization: RT
MPI processes 24: solving...
((47407, 1161600), (47407, 1161600))
Solver time: 7.910459e+00
Solver iterations: 15
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 24 processors, by jychang48 Wed Mar 2 17:37:20 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           2.242e+01      1.00049   2.241e+01
Objects:              5.440e+02      1.14286   4.867e+02
Flops:                1.007e+08      1.09540   9.605e+07  2.305e+09
Flops/sec:            4.496e+06      1.09573   4.286e+06  1.029e+08
MPI Messages:         2.382e+03      2.69553   1.502e+03  3.605e+04
MPI Message Lengths:  1.887e+08      7.68458   2.238e+04  8.069e+08
MPI Reductions:       4.220e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.4502e+01  64.7%  0.0000e+00   0.0%  2.618e+04  72.6%  2.108e+04       94.2%  1.250e+02  29.6%
 1:             FEM: 7.9106e+00  35.3%  2.3052e+09 100.0%  9.866e+03  27.4%  1.304e+03        5.8%  2.960e+02  70.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1278e+00 9.6 0.00e+00 0.0 6.2e+03 4.0e+00 4.4e+01 5 0 17 0 10 7 0 24 0 35 0 VecScatterBegin 2 1.0 5.1975e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 6.1989e-06 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.9991e+00 1.1 0.00e+00 0.0 8.7e+03 7.7e+03 2.1e+01 9 0 24 8 5 14 0 33 9 17 0 Mesh Migration 2 1.0 4.5798e-01 1.0 0.00e+00 0.0 1.5e+04 4.0e+04 5.4e+01 2 0 42 76 13 3 0 58 81 43 0 DMPlexInterp 1 1.0 2.1214e+0062660.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 DMPlexDistribute 1 1.0 2.2150e+00 1.1 0.00e+00 0.0 5.7e+03 5.7e+04 2.5e+01 10 0 16 40 6 15 0 22 43 20 0 DMPlexDistCones 2 1.0 1.0382e-01 1.1 0.00e+00 0.0 2.2e+03 9.3e+04 4.0e+00 0 0 6 25 1 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.8698e-01 1.0 0.00e+00 0.0 9.3e+03 3.7e+04 2.2e+01 1 0 26 43 5 2 0 35 46 18 0 DMPlexDistribOL 1 1.0 2.6150e-01 1.0 0.00e+00 0.0 1.8e+04 2.2e+04 5.0e+01 1 0 51 50 12 2 0 70 53 40 0 DMPlexDistField 3 1.0 2.8048e-02 1.9 0.00e+00 0.0 3.0e+03 9.4e+03 1.2e+01 0 0 8 3 3 0 0 11 4 10 0 DMPlexDistData 2 1.0 1.0004e+0029.4 0.00e+00 0.0 6.0e+03 5.0e+03 6.0e+00 4 0 17 4 1 6 0 23 4 5 0 DMPlexStratify 6 1.5 5.5870e-0122.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 6.0882e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1378e+00 4.0 0.00e+00 0.0 2.5e+04 2.9e+04 4.1e+01 5 0 69 91 10 7 0 96 97 33 0 SFBcastEnd 95 1.0 2.9645e-01 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 7.4639e-0312.3 0.00e+00 0.0 8.7e+02 1.8e+04 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.2810e-03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 5.1975e-0527.2 0.00e+00 0.0 9.4e+01 2.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.4901e-04 3.2 0.00e+00 0.0 9.4e+01 2.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.2400e-0322.1 0.00e+00 0.0 1.9e+02 4.0e+00 1.0e+00 0 0 1 0 0 0 0 2 0 0 0 VecMDot 15 1.0 7.2665e-03 1.2 1.18e+07 1.0 0.0e+00 0.0e+00 1.5e+01 0 12 0 0 4 0 12 0 0 5 38365 VecNorm 16 1.0 1.5225e-03 1.3 1.58e+06 1.0 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 24414 VecScale 32 1.0 8.1635e-04 1.1 1.32e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 38007 VecCopy 1 1.0 1.5211e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 104 1.0 1.9157e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 9.1076e-05 1.5 9.87e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 25508 VecMAXPY 16 1.0 4.0472e-03 1.1 1.33e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 14 0 0 0 0 14 0 0 0 77494 VecScatterBegin 146 1.0 3.9029e-03 1.3 0.00e+00 0.0 7.1e+03 2.2e+03 0.0e+00 0 0 20 2 0 0 0 72 34 0 0 VecScatterEnd 146 1.0 1.7533e-03 3.0 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 16 1.0 2.1329e-03 1.2 2.37e+06 1.0 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 26141 MatMult 31 1.0 2.0300e-02 1.1 1.43e+07 1.0 7.1e+03 2.2e+03 1.2e+02 0 15 20 2 28 0 15 72 34 41 16614 MatMultAdd 60 1.0 1.4215e-02 1.1 1.22e+07 1.0 5.6e+03 2.3e+03 0.0e+00 0 13 16 2 0 0 13 57 28 0 20280 MatSolve 16 1.0 1.0126e-02 1.1 6.68e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 15400 MatLUFactorNum 1 1.0 2.9531e-03 1.1 7.77e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 6088 MatILUFactorSym 1 1.0 1.2059e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 2.4192e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 2.8942e-0310.6 2.25e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1842 MatAssemblyBegin 12 1.0 3.4149e-0312.4 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 1.3027e-02 1.2 0.00e+00 0.0 1.5e+03 5.2e+02 4.8e+01 0 0 4 0 11 0 0 15 2 16 0 MatGetRow 32000 1.0 1.2811e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 MatGetRowIJ 3 1.0 9.0599e-06 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 1.8091e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 1.3399e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 2.5847e-01 1.0 0.00e+00 0.0 3.7e+02 4.4e+02 1.2e+01 1 0 1 0 3 3 0 4 0 4 0 MatMatMult 1 1.0 1.5638e-02 1.0 4.15e+05 1.0 7.4e+02 1.5e+03 1.6e+01 0 0 2 0 4 0 0 8 2 5 633 MatMatMultSym 1 1.0 1.3800e-02 1.0 0.00e+00 0.0 6.5e+02 1.2e+03 1.4e+01 0 0 2 0 3 0 0 7 2 5 0 MatMatMultNum 1 1.0 1.8721e-03 1.0 4.15e+05 1.0 9.3e+01 3.6e+03 2.0e+00 0 0 0 0 0 0 0 1 1 1 5287 MatGetLocalMat 2 1.0 2.0990e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 8.5115e-04 3.2 0.00e+00 0.0 3.7e+02 2.5e+03 0.0e+00 0 0 1 0 0 0 0 4 2 0 0 PCSetUp 4 1.0 6.6768e+00 1.0 1.42e+06 1.1 1.2e+03 8.7e+03 6.6e+01 30 1 3 1 16 84 1 12 22 22 5 PCSetUpOnBlocks 16 1.0 4.3478e-03 1.1 7.77e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4135 PCApply 16 1.0 7.3809e+00 1.0 1.01e+07 1.1 1.5e+03 1.8e+03 4.0e+00 33 10 4 0 1 93 10 15 6 1 32 KSPGMRESOrthog 15 1.0 1.0740e-02 1.2 2.37e+07 1.0 0.0e+00 0.0e+00 1.5e+01 0 24 0 0 4 0 24 0 0 5 51917 KSPSetUp 4 1.0 7.2980e-04 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 7.6940e+00 1.0 5.05e+07 1.0 8.3e+03 3.2e+03 2.2e+02 34 52 23 3 51 97 52 85 56 73 155 SFBcastBegin 1 1.0 1.3211e-03 7.3 0.00e+00 0.0 5.8e+02 2.8e+03 1.0e+00 0 0 2 0 0 0 0 6 3 0 0 SFBcastEnd 1 1.0 4.4298e-0422.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 139 139 24064612 0. IS L to G Mapping 3 3 14448024 0. Section 70 53 35616 0. Vector 15 45 16129832 0. Vector Scatter 2 7 1145440 0. Matrix 0 5 5296216 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 48256 0. IS L to G Mapping 4 0 0 0. Vector 114 72 1020040 0. Vector Scatter 13 2 2192 0. Matrix 26 8 4329832 0. 
Preconditioner 6 1 896 0. Krylov Solver 6 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 9.01222e-06 Average time for zero size MPI_Send(): 1.41064e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type hypre -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps 
-lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= hypre 40 1 ================= Discretization: RT MPI processes 32: solving... ((35155, 1161600), (35155, 1161600)) Solver time: 6.555492e+00 Solver iterations: 15 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 32 processors, by jychang48 Wed Mar 2 17:37:44 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 2.096e+01 1.00042 2.095e+01 Objects: 5.700e+02 1.19748 4.884e+02 Flops: 7.783e+07 1.11880 7.364e+07 2.357e+09 Flops/sec: 3.714e+06 1.11873 3.514e+06 1.125e+08 MPI Messages: 3.398e+03 3.61735 1.672e+03 5.349e+04 MPI Message Lengths: 1.852e+08 9.83329 1.568e+04 8.389e+08 MPI Reductions: 4.220e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4399e+01 68.7% 0.0000e+00 0.0% 3.925e+04 73.4% 1.474e+04 94.0% 1.250e+02 29.6% 1: FEM: 6.5554e+00 31.3% 2.3565e+09 100.0% 1.424e+04 26.6% 9.392e+02 6.0% 2.960e+02 70.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1524e+0013.3 0.00e+00 0.0 9.3e+03 4.0e+00 4.4e+01 5 0 17 0 10 7 0 24 0 35 0 VecScatterBegin 2 1.0 4.9114e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 1.0014e-05 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.0450e+00 1.1 0.00e+00 0.0 1.4e+04 5.1e+03 2.1e+01 10 0 26 8 5 14 0 36 9 17 0 Mesh Migration 2 1.0 4.3641e-01 1.0 0.00e+00 0.0 2.2e+04 2.9e+04 5.4e+01 2 0 41 75 13 3 0 56 80 43 0 DMPlexInterp 1 1.0 2.1204e+0066370.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.2712e+00 1.1 0.00e+00 0.0 9.4e+03 3.5e+04 2.5e+01 11 0 18 39 6 16 0 24 41 20 0 DMPlexDistCones 2 1.0 1.0140e-01 1.2 0.00e+00 0.0 3.2e+03 6.6e+04 4.0e+00 0 0 6 25 1 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.7727e-01 1.0 0.00e+00 0.0 1.3e+04 2.7e+04 2.2e+01 1 0 25 43 5 2 0 34 45 18 0 DMPlexDistribOL 1 1.0 2.2697e-01 1.0 0.00e+00 0.0 2.7e+04 1.6e+04 5.0e+01 1 0 50 51 12 2 0 68 54 40 0 DMPlexDistField 3 1.0 2.7891e-02 2.0 0.00e+00 0.0 4.3e+03 6.8e+03 1.2e+01 0 0 8 3 3 0 0 11 4 10 0 DMPlexDistData 2 1.0 9.9838e-0173.6 0.00e+00 0.0 1.0e+04 3.2e+03 6.0e+00 4 0 19 4 1 7 0 26 4 5 0 DMPlexStratify 6 1.5 5.4583e-0129.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 5.1974e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1583e+00 4.5 0.00e+00 0.0 3.8e+04 2.0e+04 4.1e+01 5 0 70 91 10 7 0 96 97 33 0 SFBcastEnd 95 1.0 3.0140e-01 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 8.4107e-0318.3 0.00e+00 0.0 1.3e+03 1.3e+04 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.6839e-03 5.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.0054e-0512.9 0.00e+00 0.0 1.4e+02 2.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.4400e-04 2.1 0.00e+00 0.0 1.4e+02 2.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.5190e-0325.4 0.00e+00 0.0 2.9e+02 4.0e+00 1.0e+00 0 0 1 0 0 0 0 2 0 0 0 VecMDot 15 1.0 7.0019e-03 1.3 8.90e+06 1.1 0.0e+00 0.0e+00 1.5e+01 0 12 0 0 4 0 12 0 0 5 39815 VecNorm 16 1.0 1.7715e-03 1.5 1.19e+06 1.1 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 20983 VecScale 32 1.0 6.4635e-04 1.2 9.95e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 48004 VecCopy 1 1.0 1.8501e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 104 1.0 1.4009e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 7.7963e-05 1.6 7.42e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 29799 VecMAXPY 16 1.0 2.8646e-03 1.2 1.00e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 13 0 0 0 0 13 0 0 0 109485 VecScatterBegin 146 1.0 3.1700e-03 1.3 0.00e+00 0.0 1.0e+04 1.8e+03 0.0e+00 0 0 19 2 0 0 0 72 37 0 0 VecScatterEnd 146 1.0 1.4212e-03 2.8 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 16 1.0 2.2540e-03 1.3 1.78e+06 1.1 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 24737 MatMult 31 1.0 1.6098e-02 1.1 1.08e+07 1.1 1.0e+04 1.8e+03 1.2e+02 0 14 19 2 28 0 14 72 37 41 20950 MatMultAdd 60 1.0 1.1012e-02 1.2 9.23e+06 1.1 8.1e+03 1.9e+03 0.0e+00 0 12 15 2 0 0 12 57 31 0 26178 MatSolve 16 1.0 7.5953e-03 1.2 5.01e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 20444 MatLUFactorNum 1 1.0 2.1951e-03 1.1 5.87e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 8182 MatILUFactorSym 1 1.0 9.3198e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 1.9350e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.6952e-03 8.8 1.69e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3146 MatAssemblyBegin 12 1.0 4.2899e-0319.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 1.1258e-02 1.2 0.00e+00 0.0 2.2e+03 4.2e+02 4.8e+01 0 0 4 0 11 0 0 15 2 16 0 MatGetRow 24000 1.0 9.5220e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 3 1.0 9.0599e-06 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 1.2531e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 1.2612e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.9306e-01 1.0 0.00e+00 0.0 5.4e+02 3.6e+02 1.2e+01 1 0 1 0 3 3 0 4 0 4 0 MatMatMult 1 1.0 1.2406e-02 1.0 3.11e+05 1.0 1.1e+03 1.2e+03 1.6e+01 0 0 2 0 4 0 0 8 3 5 798 MatMatMultSym 1 1.0 1.1072e-02 1.0 0.00e+00 0.0 9.4e+02 9.4e+02 1.4e+01 0 0 2 0 3 0 0 7 2 5 0 MatMatMultNum 1 1.0 1.3359e-03 1.0 3.11e+05 1.0 1.4e+02 2.9e+03 2.0e+00 0 0 0 0 0 0 0 1 1 1 7412 MatGetLocalMat 2 1.0 1.5900e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 7.4410e-04 2.9 0.00e+00 0.0 5.4e+02 2.0e+03 0.0e+00 0 0 1 0 0 0 0 4 2 0 0 PCSetUp 4 1.0 5.6156e+00 1.0 1.07e+06 1.1 1.7e+03 6.2e+03 6.6e+01 27 1 3 1 16 86 1 12 21 22 6 PCSetUpOnBlocks 16 1.0 3.2737e-03 1.1 5.87e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 5486 PCApply 16 1.0 6.1178e+00 1.0 7.57e+06 1.1 2.2e+03 1.4e+03 4.0e+00 29 10 4 0 1 93 10 15 6 1 38 KSPGMRESOrthog 15 1.0 9.2309e-03 1.2 1.78e+07 1.1 0.0e+00 0.0e+00 1.5e+01 0 24 0 0 4 0 24 0 0 5 60402 KSPSetUp 4 1.0 5.3430e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 6.3550e+00 1.0 3.80e+07 1.1 1.2e+04 2.4e+03 2.2e+02 30 50 23 3 51 97 50 85 58 73 187 SFBcastBegin 1 1.0 1.6100e-03 9.0 0.00e+00 0.0 8.6e+02 2.2e+03 1.0e+00 0 0 2 0 0 0 0 6 4 0 0 SFBcastEnd 1 1.0 2.7430e-03162.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 165 165 23651716 0. IS L to G Mapping 3 3 14326164 0. Section 70 53 35616 0. Vector 15 45 13270832 0. Vector Scatter 2 7 851392 0. Matrix 0 5 3930504 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 50224 0. IS L to G Mapping 4 0 0 0. Vector 114 72 792920 0. Vector Scatter 13 2 2192 0. Matrix 26 8 3242072 0. 
Preconditioner 6 1 896 0.
Krylov Solver 6 1 1352 0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 7.58171e-06
Average time for zero size MPI_Send(): 1.68383e-06
[PETSc option table, compiler/sizeof report, configure options, machine characteristics, include paths, and link line omitted: identical to the full listing reproduced for the first run above.]
-----------------------------------------
================= hypre 40 1 =================
Discretization: RT
MPI processes 40: solving...
((27890, 1161600), (27890, 1161600))
Solver time: 5.808753e+00
Solver iterations: 15
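A quick way to read these logs side by side is to tabulate the reported "Solver time" and "Solver iterations" against the number of MPI processes and compute the strong-scaling efficiency relative to the smallest run. The short Python sketch below does this for the 32- and 40-process runs above; the numbers are copied verbatim from the summaries, and the later runs further down this log can be appended to the dictionary in the same way. It is only an illustration for reading the log, not part of Darcy_FE.py.

# Tabulate solver time / iterations vs. MPI process count for the runs
# reported in this log, plus speedup and efficiency relative to the
# smallest run included in the table.
runs = {
    32: (6.555492, 15),   # (solver time in seconds, solver iterations)
    40: (5.808753, 15),
}

base = min(runs)
base_time = runs[base][0]

print("procs   time(s)  iters  speedup  efficiency")
for procs in sorted(runs):
    time, iters = runs[procs]
    speedup = base_time / time
    ideal = float(procs) / base
    print("%5d  %8.3f  %5d  %7.2f  %9.1f%%"
          % (procs, time, iters, speedup, 100.0 * speedup / ideal))

As a sanity check on the headline numbers, the 32-process summary above is internally consistent: 2.357e+09 total flops over roughly 2.10e+01 s gives the reported total rate of 1.125e+08 flops/s.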
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 40 processors, by jychang48 Wed Mar 2 17:38:08 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           2.025e+01      1.00036   2.025e+01
Objects:              5.920e+02      1.24895   4.896e+02
Flops:                6.409e+07      1.14442   5.989e+07  2.396e+09
Flops/sec:            3.165e+06      1.14437   2.958e+06  1.183e+08
MPI Messages:         4.088e+03      4.83560   1.815e+03  7.260e+04
MPI Message Lengths:  1.828e+08     12.01197   1.189e+04  8.633e+08
MPI Reductions:       4.220e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.4441e+01  71.3%  0.0000e+00   0.0%  5.371e+04  74.0%  1.117e+04       93.9%  1.250e+02  29.6%
 1:             FEM: 5.8089e+00  28.7%  2.3956e+09 100.0%  1.888e+04  26.0%  7.254e+02        6.1%  2.960e+02  70.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1936e+0013.3 0.00e+00 0.0 1.3e+04 4.0e+00 4.4e+01 5 0 18 0 10 8 0 24 0 35 0 VecScatterBegin 2 1.0 3.5048e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 8.1062e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1145e+00 1.1 0.00e+00 0.0 2.0e+04 3.6e+03 2.1e+01 10 0 28 9 5 15 0 38 9 17 0 Mesh Migration 2 1.0 4.1762e-01 1.0 0.00e+00 0.0 2.9e+04 2.2e+04 5.4e+01 2 0 40 75 13 3 0 54 80 43 0 DMPlexInterp 1 1.0 2.1110e+0060232.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.3499e+00 1.1 0.00e+00 0.0 1.4e+04 2.3e+04 2.5e+01 12 0 19 38 6 16 0 26 40 20 0 DMPlexDistCones 2 1.0 9.5561e-02 1.2 0.00e+00 0.0 4.3e+03 5.1e+04 4.0e+00 0 0 6 25 1 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.6967e-01 1.0 0.00e+00 0.0 1.7e+04 2.1e+04 2.2e+01 1 0 24 42 5 2 0 32 45 18 0 DMPlexDistribOL 1 1.0 2.0232e-01 1.0 0.00e+00 0.0 3.6e+04 1.2e+04 5.0e+01 1 0 49 51 12 1 0 66 55 40 0 DMPlexDistField 3 1.0 3.1459e-02 2.2 0.00e+00 0.0 5.7e+03 5.2e+03 1.2e+01 0 0 8 3 3 0 0 11 4 10 0 DMPlexDistData 2 1.0 1.0449e+0078.5 0.00e+00 0.0 1.5e+04 2.2e+03 6.0e+00 5 0 21 4 1 7 0 28 4 5 0 DMPlexStratify 6 1.5 5.4727e-0136.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 4.2525e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1982e+00 4.5 0.00e+00 0.0 5.1e+04 1.5e+04 4.1e+01 5 0 71 91 10 8 0 96 97 33 0 SFBcastEnd 95 1.0 3.0828e-01 6.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 8.9126e-0319.8 0.00e+00 0.0 1.7e+03 9.4e+03 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.0048e-03 6.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.0054e-0514.0 0.00e+00 0.0 1.9e+02 1.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.1182e-04 2.3 0.00e+00 0.0 1.9e+02 1.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.0710e-0319.5 0.00e+00 0.0 3.8e+02 4.0e+00 1.0e+00 0 0 1 0 0 0 0 2 0 0 0 VecMDot 15 1.0 4.8821e-03 1.3 7.14e+06 1.1 0.0e+00 0.0e+00 1.5e+01 0 12 0 0 4 0 12 0 0 5 57102 VecNorm 16 1.0 1.3859e-03 1.6 9.52e+05 1.1 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 26820 VecScale 32 1.0 5.1856e-04 1.2 7.98e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 59833 VecCopy 1 1.0 9.8944e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 104 1.0 1.1306e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 5.8174e-05 1.6 5.95e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 39935 VecMAXPY 16 1.0 2.0962e-03 1.2 8.03e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 13 0 0 0 0 13 0 0 0 149621 VecScatterBegin 146 1.0 2.9225e-03 1.6 0.00e+00 0.0 1.4e+04 1.5e+03 0.0e+00 0 0 19 2 0 0 0 72 39 0 0 VecScatterEnd 146 1.0 1.6057e-03 3.6 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 16 1.0 1.7741e-03 1.4 1.43e+06 1.1 0.0e+00 0.0e+00 1.6e+01 0 2 0 0 4 0 2 0 0 5 31429 MatMult 31 1.0 1.4111e-02 1.2 8.66e+06 1.1 1.4e+04 1.5e+03 1.2e+02 0 14 19 2 28 0 14 72 39 41 23901 MatMultAdd 60 1.0 9.0714e-03 1.2 7.40e+06 1.1 1.1e+04 1.6e+03 0.0e+00 0 12 15 2 0 0 12 57 32 0 31779 MatSolve 16 1.0 6.0453e-03 1.2 4.01e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 25604 MatLUFactorNum 1 1.0 1.7362e-03 1.1 4.73e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 10319 MatILUFactorSym 1 1.0 7.0596e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 1.5569e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.7138e-0311.6 1.35e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3112 MatAssemblyBegin 12 1.0 4.5278e-0321.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 9.7792e-03 1.2 0.00e+00 0.0 2.9e+03 3.5e+02 4.8e+01 0 0 4 0 11 0 0 15 2 16 0 MatGetRow 19200 1.0 7.6635e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 3 1.0 9.5367e-06 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 1.1559e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 8.6069e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.5613e-01 1.0 0.00e+00 0.0 7.2e+02 3.0e+02 1.2e+01 1 0 1 0 3 3 0 4 0 4 0 MatMatMult 1 1.0 1.0730e-02 1.0 2.49e+05 1.0 1.4e+03 9.9e+02 1.6e+01 0 0 2 0 4 0 0 8 3 5 923 MatMatMultSym 1 1.0 9.6390e-03 1.0 0.00e+00 0.0 1.3e+03 7.8e+02 1.4e+01 0 0 2 0 3 0 0 7 2 5 0 MatMatMultNum 1 1.0 1.0788e-03 1.1 2.49e+05 1.0 1.8e+02 2.4e+03 2.0e+00 0 0 0 0 0 0 0 1 1 1 9176 MatGetLocalMat 2 1.0 1.2472e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 6.6495e-04 3.2 0.00e+00 0.0 7.2e+02 1.7e+03 0.0e+00 0 0 1 0 0 0 0 4 2 0 0 PCSetUp 4 1.0 5.0245e+00 1.0 8.56e+05 1.1 2.3e+03 4.7e+03 6.6e+01 25 1 3 1 16 86 1 12 21 22 7 PCSetUpOnBlocks 16 1.0 2.5797e-03 1.1 4.73e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 6945 PCApply 16 1.0 5.4329e+00 1.0 6.07e+06 1.1 2.9e+03 1.2e+03 4.0e+00 27 10 4 0 1 94 10 15 6 1 43 KSPGMRESOrthog 15 1.0 6.5861e-03 1.2 1.43e+07 1.1 0.0e+00 0.0e+00 1.5e+01 0 23 0 0 4 0 23 0 0 5 84658 KSPSetUp 4 1.0 4.7898e-04 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 5.6257e+00 1.0 3.05e+07 1.1 1.6e+04 2.0e+03 2.2e+02 28 50 22 4 51 97 50 85 60 73 211 SFBcastBegin 1 1.0 1.1621e-03 6.0 0.00e+00 0.0 1.2e+03 1.8e+03 1.0e+00 0 0 2 0 0 0 0 6 4 0 0 SFBcastEnd 1 1.0 6.1488e-0436.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 187 187 23331236 0. IS L to G Mapping 3 3 14094724 0. Section 70 53 35616 0. Vector 15 45 11572992 0. Vector Scatter 2 7 677032 0. Matrix 0 5 3124972 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 48720 0. IS L to G Mapping 4 0 0 0. Vector 114 72 656808 0. Vector Scatter 13 2 2192 0. Matrix 26 8 2592528 0. 
Preconditioner 6 1 896 0.
Krylov Solver 6 1 1352 0.
========================================================================================================================
Average time to get PetscTime(): 6.91414e-07
Average time for MPI_Barrier(): 9.63211e-06
Average time for zero size MPI_Send(): 1.42455e-06
[PETSc option table, compiler/sizeof report, configure options, machine characteristics, include paths, and link line omitted: identical to the full listing reproduced for the first run above.]
-----------------------------------------
================= hypre 40 1 =================
Discretization: RT
MPI processes 48: solving...
((23365, 1161600), (23365, 1161600))
Solver time: 5.547698e+00
Solver iterations: 16
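Every entry in the option table shown earlier in this log carries the "solver_" prefix, so it is picked up by whichever KSP object Darcy_FE.py registers under that prefix. The petsc4py sketch below loads the same Schur-complement fieldsplit configuration into a stand-in KSP; it assumes petsc4py is installed, and it is only meant to show how the logged options map onto a solver object, not to reproduce the actual solver set up inside Darcy_FE.py.

import sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

# Put the options from the logged option table into the options database,
# under the same "solver_" prefix used by the runs above.
opts = PETSc.Options('solver_')
opts['ksp_type'] = 'gmres'
opts['ksp_rtol'] = '1e-7'
opts['pc_type'] = 'fieldsplit'
opts['pc_fieldsplit_type'] = 'schur'
opts['pc_fieldsplit_schur_fact_type'] = 'upper'
opts['pc_fieldsplit_schur_precondition'] = 'selfp'
opts['fieldsplit_0_ksp_type'] = 'preonly'
opts['fieldsplit_0_pc_type'] = 'bjacobi'
opts['fieldsplit_1_ksp_type'] = 'preonly'
opts['fieldsplit_1_pc_type'] = 'hypre'

# A stand-in KSP with the matching prefix picks these up via setFromOptions();
# the bjacobi/hypre sub-solves are only built once operators are attached.
ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
ksp.setOptionsPrefix('solver_')
ksp.setFromOptions()
print("%s + %s" % (ksp.getType(), ksp.getPC().getType()))   # gmres + fieldsplit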
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 48 processors, by jychang48 Wed Mar 2 17:38:33 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           2.058e+01      1.00033   2.058e+01
Objects:              6.040e+02      1.26360   4.931e+02
Flops:                5.665e+07      1.15192   5.290e+07  2.539e+09
Flops/sec:            2.753e+06      1.15212   2.571e+06  1.234e+08
MPI Messages:         3.949e+03      4.19214   1.888e+03  9.063e+04
MPI Message Lengths:  1.799e+08     13.98669   9.756e+03  8.842e+08
MPI Reductions:       4.320e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.5032e+01  73.0%  0.0000e+00   0.0%  6.669e+04  73.6%  9.138e+03       93.7%  1.250e+02  28.9%
 1:             FEM: 5.5477e+00  27.0%  2.5394e+09 100.0%  2.394e+04  26.4%  6.183e+02        6.3%  3.060e+02  70.8%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.3304e+0082.7 0.00e+00 0.0 1.6e+04 4.0e+00 4.4e+01 6 0 18 0 10 8 0 24 0 35 0 VecScatterBegin 2 1.0 2.3127e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 8.8215e-06 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1502e+00 1.1 0.00e+00 0.0 2.7e+04 2.8e+03 2.1e+01 10 0 30 9 5 14 0 40 9 17 0 Mesh Migration 2 1.0 4.0710e-01 1.0 0.00e+00 0.0 3.4e+04 1.9e+04 5.4e+01 2 0 38 75 12 3 0 51 80 43 0 DMPlexInterp 1 1.0 2.1277e+0062846.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.4059e+00 1.1 0.00e+00 0.0 1.9e+04 1.7e+04 2.5e+01 12 0 21 37 6 16 0 29 40 20 0 DMPlexDistCones 2 1.0 9.4815e-02 1.2 0.00e+00 0.0 5.1e+03 4.3e+04 4.0e+00 0 0 6 25 1 1 0 8 26 3 0 DMPlexDistLabels 2 1.0 2.6481e-01 1.0 0.00e+00 0.0 2.1e+04 1.8e+04 2.2e+01 1 0 23 42 5 2 0 31 45 18 0 DMPlexDistribOL 1 1.0 1.7179e-01 1.1 0.00e+00 0.0 4.2e+04 1.1e+04 5.0e+01 1 0 47 52 12 1 0 64 55 40 0 DMPlexDistField 3 1.0 3.1440e-02 2.3 0.00e+00 0.0 6.8e+03 4.5e+03 1.2e+01 0 0 8 3 3 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0401e+0067.8 0.00e+00 0.0 2.1e+04 1.7e+03 6.0e+00 5 0 23 4 1 7 0 31 4 5 0 DMPlexStratify 6 1.5 5.3913e-0141.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 3.6406e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.3376e+00 6.7 0.00e+00 0.0 6.4e+04 1.3e+04 4.1e+01 6 0 71 91 9 8 0 96 97 33 0 SFBcastEnd 95 1.0 3.0287e-01 6.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.4006e-0323.4 0.00e+00 0.0 2.0e+03 8.2e+03 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.3119e-03 7.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.3855e-0515.8 0.00e+00 0.0 2.2e+02 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.2398e-04 2.5 0.00e+00 0.0 2.2e+02 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.3649e-0317.7 0.00e+00 0.0 4.6e+02 4.0e+00 1.0e+00 0 0 1 0 0 0 0 2 0 0 0 VecMDot 16 1.0 4.8552e-03 1.3 6.76e+06 1.1 0.0e+00 0.0e+00 1.6e+01 0 12 0 0 4 0 12 0 0 5 65075 VecNorm 17 1.0 1.2991e-03 1.5 8.45e+05 1.1 0.0e+00 0.0e+00 1.7e+01 0 2 0 0 4 0 2 0 0 6 30400 VecScale 34 1.0 4.4942e-04 1.2 7.09e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 73353 VecCopy 1 1.0 1.1802e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 110 1.0 9.8443e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 7.2956e-05 2.3 4.97e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 31844 VecMAXPY 17 1.0 1.9226e-03 1.2 7.56e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 14 0 0 0 0 14 0 0 0 183671 VecScatterBegin 155 1.0 2.6774e-03 1.6 0.00e+00 0.0 1.8e+04 1.3e+03 0.0e+00 0 0 19 3 0 0 0 74 42 0 0 VecScatterEnd 155 1.0 1.3506e-03 3.4 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 17 1.0 1.6842e-03 1.3 1.27e+06 1.1 0.0e+00 0.0e+00 1.7e+01 0 2 0 0 4 0 2 0 0 6 35175 MatMult 33 1.0 1.3046e-02 1.2 7.76e+06 1.1 1.8e+04 1.3e+03 1.3e+02 0 14 19 3 30 0 14 74 42 42 27561 MatMultAdd 64 1.0 8.1303e-03 1.2 6.63e+06 1.1 1.4e+04 1.4e+03 0.0e+00 0 12 15 2 0 0 12 58 35 0 37821 MatSolve 17 1.0 5.3048e-03 1.1 3.55e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 30925 MatLUFactorNum 1 1.0 1.4539e-03 1.1 3.97e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 12306 MatILUFactorSym 1 1.0 5.9080e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 1.2629e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.9281e-0315.4 1.13e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2766 MatAssemblyBegin 12 1.0 3.2587e-0318.4 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 8.6644e-03 1.2 0.00e+00 0.0 3.5e+03 3.1e+02 4.8e+01 0 0 4 0 11 0 0 15 2 16 0 MatGetRow 16000 1.0 6.3950e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 3 1.0 1.0729e-05 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 9.5296e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 7.7963e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.3025e-01 1.0 0.00e+00 0.0 8.6e+02 2.7e+02 1.2e+01 1 0 1 0 3 2 0 4 0 4 0 MatMatMult 1 1.0 9.7330e-03 1.0 2.08e+05 1.0 1.7e+03 8.9e+02 1.6e+01 0 0 2 0 4 0 0 7 3 5 1017 MatMatMultSym 1 1.0 8.7850e-03 1.0 0.00e+00 0.0 1.5e+03 7.0e+02 1.4e+01 0 0 2 0 3 0 0 6 2 5 0 MatMatMultNum 1 1.0 9.3794e-04 1.0 2.08e+05 1.0 2.2e+02 2.2e+03 2.0e+00 0 0 0 0 0 0 0 1 1 1 10554 MatGetLocalMat 2 1.0 1.0867e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 6.5875e-04 3.9 0.00e+00 0.0 8.6e+02 1.5e+03 0.0e+00 0 0 1 0 0 0 0 4 2 0 0 PCSetUp 4 1.0 4.8043e+00 1.0 7.17e+05 1.1 2.8e+03 4.0e+03 6.6e+01 23 1 3 1 15 87 1 12 20 22 7 PCSetUpOnBlocks 17 1.0 2.2023e-03 1.1 3.97e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 8124 PCApply 17 1.0 5.2019e+00 1.0 5.36e+06 1.1 3.7e+03 1.1e+03 4.0e+00 25 10 4 0 1 94 10 15 7 1 48 KSPGMRESOrthog 16 1.0 6.3984e-03 1.2 1.35e+07 1.1 0.0e+00 0.0e+00 1.6e+01 0 25 0 0 4 0 25 0 0 5 98759 KSPSetUp 4 1.0 3.8409e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 5.3660e+00 1.0 2.80e+07 1.1 2.0e+04 1.7e+03 2.3e+02 26 51 23 4 53 97 51 85 62 74 242 SFBcastBegin 1 1.0 1.4529e-03 8.4 0.00e+00 0.0 1.4e+03 1.7e+03 1.0e+00 0 0 2 0 0 0 0 6 4 0 0 SFBcastEnd 1 1.0 7.1597e-0450.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 195 195 22892240 0. IS L to G Mapping 3 3 14076928 0. Section 70 53 35616 0. Vector 15 45 10511976 0. Vector Scatter 2 7 568432 0. Matrix 0 5 2614408 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 42440 0. IS L to G Mapping 4 0 0 0. Vector 118 76 575296 0. Vector Scatter 13 2 2192 0. Matrix 26 8 2166428 0. 
Preconditioner 6 1 896 0.
Krylov Solver 6 1 1352 0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 8.96454e-06
Average time for zero size MPI_Send(): 1.35601e-06
[PETSc option table, compiler/sizeof report, configure options, machine characteristics, include paths, and link line omitted: identical to the full listing reproduced for the first run above.]
-----------------------------------------
================= hypre 40 1 =================
Discretization: RT
MPI processes 56: solving...
((20104, 1161600), (20104, 1161600))
Solver time: 5.245087e+00
Solver iterations: 16
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS.
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 56 processors, by jychang48 Wed Mar 2 17:38:56 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.992e+01 1.00032 1.991e+01 Objects: 6.220e+02 1.29583 4.939e+02 Flops: 4.935e+07 1.17708 4.599e+07 2.575e+09 Flops/sec: 2.478e+06 1.17720 2.310e+06 1.293e+08 MPI Messages: 4.414e+03 3.94944 2.001e+03 1.120e+05 MPI Message Lengths: 1.783e+08 16.37358 8.092e+03 9.066e+08 MPI Reductions: 4.320e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4666e+01 73.7% 0.0000e+00 0.0% 8.318e+04 74.3% 7.573e+03 93.6% 1.250e+02 28.9% 1: FEM: 5.2451e+00 26.3% 2.5753e+09 100.0% 2.884e+04 25.7% 5.196e+02 6.4% 3.060e+02 70.8% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.2130e+0012.1 0.00e+00 0.0 2.0e+04 4.0e+00 4.4e+01 6 0 18 0 10 8 0 24 0 35 0 VecScatterBegin 2 1.0 2.5034e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 1.0014e-05 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1797e+00 1.1 0.00e+00 0.0 3.5e+04 2.2e+03 2.1e+01 11 0 31 9 5 15 0 42 9 17 0 Mesh Migration 2 1.0 3.9948e-01 1.0 0.00e+00 0.0 4.1e+04 1.6e+04 5.4e+01 2 0 37 74 12 3 0 50 79 43 0 DMPlexInterp 1 1.0 2.1081e+0065986.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.4434e+00 1.1 0.00e+00 0.0 2.6e+04 1.3e+04 2.5e+01 12 0 23 36 6 17 0 31 39 20 0 DMPlexDistCones 2 1.0 9.4779e-02 1.2 0.00e+00 0.0 6.2e+03 3.6e+04 4.0e+00 0 0 5 25 1 1 0 7 26 3 0 DMPlexDistLabels 2 1.0 2.6069e-01 1.0 0.00e+00 0.0 2.5e+04 1.5e+04 2.2e+01 1 0 22 42 5 2 0 30 45 18 0 DMPlexDistribOL 1 1.0 1.5735e-01 1.1 0.00e+00 0.0 5.1e+04 9.2e+03 5.0e+01 1 0 46 52 12 1 0 62 56 40 0 DMPlexDistField 3 1.0 3.0454e-02 2.4 0.00e+00 0.0 8.3e+03 3.8e+03 1.2e+01 0 0 7 3 3 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0585e+0051.5 0.00e+00 0.0 2.8e+04 1.3e+03 6.0e+00 5 0 25 4 1 7 0 33 4 5 0 DMPlexStratify 6 1.5 5.3799e-0149.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 3.2726e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.2201e+00 4.0 0.00e+00 0.0 8.0e+04 1.0e+04 4.1e+01 6 0 71 91 9 8 0 96 97 33 0 SFBcastEnd 95 1.0 2.9608e-01 8.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.1550e-0324.3 0.00e+00 0.0 2.5e+03 6.8e+03 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 1.0157e-02 8.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.2902e-0517.2 0.00e+00 0.0 2.7e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.2207e-04 2.7 0.00e+00 0.0 2.7e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.4939e-0332.5 0.00e+00 0.0 5.6e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 2 0 0 0 VecMDot 16 1.0 4.8904e-03 1.5 5.80e+06 1.1 0.0e+00 0.0e+00 1.6e+01 0 12 0 0 4 0 12 0 0 5 64605 VecNorm 17 1.0 1.1060e-03 1.5 7.25e+05 1.1 0.0e+00 0.0e+00 1.7e+01 0 2 0 0 4 0 2 0 0 6 35708 VecScale 34 1.0 4.6110e-04 1.4 6.08e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 71495 VecCopy 1 1.0 7.5102e-05 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 110 1.0 8.4519e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 4.8161e-05 1.8 4.26e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 48239 VecMAXPY 17 1.0 1.5872e-03 1.2 6.48e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 14 0 0 0 0 14 0 0 0 222491 VecScatterBegin 155 1.0 2.4736e-03 1.6 0.00e+00 0.0 2.1e+04 1.2e+03 0.0e+00 0 0 19 3 0 0 0 74 43 0 0 VecScatterEnd 155 1.0 1.1752e-03 2.6 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 17 1.0 1.4255e-03 1.4 1.09e+06 1.1 0.0e+00 0.0e+00 1.7e+01 0 2 0 0 4 0 2 0 0 6 41558 MatMult 33 1.0 1.2118e-02 1.2 6.64e+06 1.1 2.1e+04 1.2e+03 1.3e+02 0 14 19 3 30 0 14 74 43 42 29670 MatMultAdd 64 1.0 7.1990e-03 1.3 5.67e+06 1.1 1.7e+04 1.3e+03 0.0e+00 0 12 15 2 0 0 12 58 36 0 42713 MatSolve 17 1.0 4.6377e-03 1.1 3.05e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 35279 MatLUFactorNum 1 1.0 1.2581e-03 1.1 3.40e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 14188 MatILUFactorSym 1 1.0 5.0306e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 1.0850e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.8859e-0317.4 9.68e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2827 MatAssemblyBegin 12 1.0 3.0339e-0318.7 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 8.4910e-03 1.4 0.00e+00 0.0 4.2e+03 2.8e+02 4.8e+01 0 0 4 0 11 0 0 15 2 16 0 MatGetRow 13716 1.0 5.4852e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 3 1.0 9.2983e-06 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 8.1325e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 6.8188e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.1231e-01 1.0 0.00e+00 0.0 1.0e+03 2.4e+02 1.2e+01 1 0 1 0 3 2 0 4 0 4 0 MatMatMult 1 1.0 8.4488e-03 1.0 1.78e+05 1.0 2.1e+03 8.0e+02 1.6e+01 0 0 2 0 4 0 0 7 3 5 1171 MatMatMultSym 1 1.0 7.6680e-03 1.1 0.00e+00 0.0 1.8e+03 6.3e+02 1.4e+01 0 0 2 0 3 0 0 6 2 5 0 MatMatMultNum 1 1.0 7.8893e-04 1.0 1.78e+05 1.0 2.6e+02 2.0e+03 2.0e+00 0 0 0 0 0 0 0 1 1 1 12545 MatGetLocalMat 2 1.0 9.2196e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 5.2214e-04 3.5 0.00e+00 0.0 1.0e+03 1.4e+03 0.0e+00 0 0 1 0 0 0 0 4 2 0 0 PCSetUp 4 1.0 4.5953e+00 1.0 6.15e+05 1.1 3.4e+03 3.3e+03 6.6e+01 23 1 3 1 15 88 1 12 19 22 7 PCSetUpOnBlocks 17 1.0 1.9102e-03 1.1 3.40e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 9345 PCApply 17 1.0 4.9382e+00 1.0 4.60e+06 1.1 4.4e+03 9.4e+02 4.0e+00 25 10 4 0 1 94 10 15 7 1 50 KSPGMRESOrthog 16 1.0 6.3207e-03 1.3 1.16e+07 1.1 0.0e+00 0.0e+00 1.6e+01 0 25 0 0 4 0 25 0 0 5 99973 KSPSetUp 4 1.0 3.2091e-04 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 5.0805e+00 1.0 2.39e+07 1.1 2.5e+04 1.5e+03 2.3e+02 26 50 22 4 53 97 50 85 63 74 256 SFBcastBegin 1 1.0 1.5471e-0310.0 0.00e+00 0.0 1.7e+03 1.5e+03 1.0e+00 0 0 1 0 0 0 0 6 4 0 0 SFBcastEnd 1 1.0 6.0892e-0451.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 213 213 22655968 0. IS L to G Mapping 3 3 12801292 0. Section 70 53 35616 0. Vector 15 45 9747824 0. Vector Scatter 2 7 490168 0. Matrix 0 5 2241752 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 38312 0. IS L to G Mapping 4 0 0 0. Vector 118 76 511680 0. Vector Scatter 13 2 2192 0. Matrix 26 8 1858040 0. 
Preconditioner 6 1 896 0. Krylov Solver 6 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 1.07765e-05 Average time for zero size MPI_Send(): 1.37516e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type hypre -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps 
-lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= hypre 40 1 ================= Discretization: RT MPI processes 64: solving... ((17544, 1161600), (17544, 1161600)) Solver time: 5.107411e+00 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 64 processors, by jychang48 Wed Mar 2 17:39:20 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.985e+01 1.00035 1.984e+01 Objects: 6.360e+02 1.33054 4.947e+02 Flops: 4.357e+07 1.18779 4.072e+07 2.606e+09 Flops/sec: 2.196e+06 1.18785 2.052e+06 1.314e+08 MPI Messages: 4.678e+03 4.58129 2.100e+03 1.344e+05 MPI Message Lengths: 1.774e+08 18.56783 6.891e+03 9.262e+08 MPI Reductions: 4.320e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4735e+01 74.3% 0.0000e+00 0.0% 1.007e+05 74.9% 6.442e+03 93.5% 1.250e+02 28.9% 1: FEM: 5.1075e+00 25.7% 2.6063e+09 100.0% 3.371e+04 25.1% 4.482e+02 6.5% 3.060e+02 70.8% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.2411e+0012.0 0.00e+00 0.0 2.4e+04 4.0e+00 4.4e+01 6 0 18 0 10 8 0 24 0 35 0 VecScatterBegin 2 1.0 5.1975e-05 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 8.1062e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.2346e+00 1.1 0.00e+00 0.0 4.4e+04 1.8e+03 2.1e+01 11 0 33 9 5 15 0 44 9 17 0 Mesh Migration 2 1.0 3.9620e-01 1.0 0.00e+00 0.0 4.9e+04 1.4e+04 5.4e+01 2 0 36 74 12 3 0 48 79 43 0 DMPlexInterp 1 1.0 2.1139e+0058718.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.5038e+00 1.1 0.00e+00 0.0 3.3e+04 1.0e+04 2.5e+01 13 0 25 36 6 17 0 33 38 20 0 DMPlexDistCones 2 1.0 9.2579e-02 1.2 0.00e+00 0.0 7.2e+03 3.1e+04 4.0e+00 0 0 5 25 1 1 0 7 26 3 0 DMPlexDistLabels 2 1.0 2.5964e-01 1.0 0.00e+00 0.0 2.9e+04 1.3e+04 2.2e+01 1 0 22 42 5 2 0 29 45 18 0 DMPlexDistribOL 1 1.0 1.4357e-01 1.1 0.00e+00 0.0 6.1e+04 8.0e+03 5.0e+01 1 0 45 52 12 1 0 60 56 40 0 DMPlexDistField 3 1.0 3.1811e-02 2.4 0.00e+00 0.0 9.7e+03 3.4e+03 1.2e+01 0 0 7 4 3 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0804e+0051.6 0.00e+00 0.0 3.5e+04 1.0e+03 6.0e+00 5 0 26 4 1 7 0 35 4 5 0 DMPlexStratify 6 1.5 5.3848e-0156.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 2.8280e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.2476e+00 4.0 0.00e+00 0.0 9.7e+04 8.7e+03 4.1e+01 6 0 72 91 9 8 0 96 97 33 0 SFBcastEnd 95 1.0 3.0134e-0110.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.5589e-0322.9 0.00e+00 0.0 2.9e+03 5.8e+03 3.0e+00 0 0 2 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.6300e-03 8.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.2902e-0515.3 0.00e+00 0.0 3.2e+02 1.3e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.2994e-04 3.1 0.00e+00 0.0 3.2e+02 1.3e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.2991e-0318.9 0.00e+00 0.0 6.5e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 2 0 0 0 VecMDot 16 1.0 5.5118e-03 1.9 5.08e+06 1.1 0.0e+00 0.0e+00 1.6e+01 0 12 0 0 4 0 12 0 0 5 57322 VecNorm 17 1.0 1.2081e-03 1.3 6.35e+05 1.1 0.0e+00 0.0e+00 1.7e+01 0 2 0 0 4 0 2 0 0 6 32692 VecScale 34 1.0 3.6025e-04 1.2 5.33e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 91510 VecCopy 1 1.0 5.5075e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 110 1.0 7.6127e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 4.4823e-05 1.8 3.73e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 51831 VecMAXPY 17 1.0 1.3549e-03 1.1 5.68e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 14 0 0 0 0 14 0 0 0 260623 VecScatterBegin 155 1.0 2.3611e-03 1.6 0.00e+00 0.0 2.5e+04 1.1e+03 0.0e+00 0 0 18 3 0 0 0 74 45 0 0 VecScatterEnd 155 1.0 1.4179e-03 3.1 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 17 1.0 1.5237e-03 1.3 9.52e+05 1.1 0.0e+00 0.0e+00 1.7e+01 0 2 0 0 4 0 2 0 0 6 38879 MatMult 33 1.0 1.1678e-02 1.2 5.80e+06 1.1 2.5e+04 1.1e+03 1.3e+02 0 14 18 3 30 0 14 74 45 42 30787 MatMultAdd 64 1.0 6.7151e-03 1.3 4.96e+06 1.1 2.0e+04 1.1e+03 0.0e+00 0 12 15 2 0 0 12 58 37 0 45791 MatSolve 17 1.0 4.0982e-03 1.1 2.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 39822 MatLUFactorNum 1 1.0 1.1230e-03 1.1 2.96e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 15869 MatILUFactorSym 1 1.0 4.8018e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 9.8610e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 2.9418e-0330.9 8.46e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1812 MatAssemblyBegin 12 1.0 2.3770e-0314.8 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatAssemblyEnd 12 1.0 7.4003e-03 1.4 0.00e+00 0.0 4.9e+03 2.5e+02 4.8e+01 0 0 4 0 11 0 0 15 2 16 0 MatGetRow 12000 1.0 4.7642e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 3 1.0 1.1206e-05 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 6.6686e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0 MatGetOrdering 1 1.0 6.3181e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 9.9198e-02 1.0 0.00e+00 0.0 1.2e+03 2.2e+02 1.2e+01 0 0 1 0 3 2 0 4 0 4 0 MatMatMult 1 1.0 7.7269e-03 1.0 1.56e+05 1.0 2.4e+03 7.3e+02 1.6e+01 0 0 2 0 4 0 0 7 3 5 1281 MatMatMultSym 1 1.0 7.0100e-03 1.0 0.00e+00 0.0 2.1e+03 5.8e+02 1.4e+01 0 0 2 0 3 0 0 6 2 5 0 MatMatMultNum 1 1.0 7.3886e-04 1.1 1.56e+05 1.0 3.1e+02 1.8e+03 2.0e+00 0 0 0 0 0 0 0 1 1 1 13397 MatGetLocalMat 2 1.0 8.2898e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 4.1103e-04 2.6 0.00e+00 0.0 1.2e+03 1.2e+03 0.0e+00 0 0 1 0 0 0 0 4 3 0 0 PCSetUp 4 1.0 4.4948e+00 1.0 5.36e+05 1.1 3.9e+03 2.9e+03 6.6e+01 23 1 3 1 15 88 1 12 19 22 7 PCSetUpOnBlocks 17 1.0 1.6925e-03 1.1 2.96e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 10529 PCApply 17 1.0 4.8126e+00 1.0 4.02e+06 1.1 5.2e+03 8.5e+02 4.0e+00 24 9 4 0 1 94 9 15 7 1 51 KSPGMRESOrthog 16 1.0 6.7701e-03 1.6 1.02e+07 1.1 0.0e+00 0.0e+00 1.6e+01 0 24 0 0 4 0 24 0 0 5 93337 KSPSetUp 4 1.0 2.9278e-04 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 4.9404e+00 1.0 2.09e+07 1.1 2.9e+04 1.3e+03 2.3e+02 25 50 21 4 53 97 50 85 64 74 263 SFBcastBegin 1 1.0 1.3502e-03 8.5 0.00e+00 0.0 2.0e+03 1.4e+03 1.0e+00 0 0 1 0 0 0 0 6 4 0 0 SFBcastEnd 1 1.0 6.3281e-03530.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 227 227 22503312 0. IS L to G Mapping 3 3 12541056 0. Section 70 53 35616 0. Vector 15 45 9148440 0. Vector Scatter 2 7 428728 0. Matrix 0 5 1962204 0. Preconditioner 0 5 5176 0. Krylov Solver 0 5 23264 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 31 24 36680 0. IS L to G Mapping 4 0 0 0. Vector 118 76 463200 0. Vector Scatter 13 2 2192 0. Matrix 26 8 1629940 0. 
Preconditioner 6 1 896 0. Krylov Solver 6 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 6.19888e-07 Average time for MPI_Barrier(): 1.2207e-05 Average time for zero size MPI_Send(): 1.64285e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type hypre -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps 
-lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl -----------------------------------------
================= ml 40 1 ================= Discretization: RT MPI processes 1: solving... ((1161600, 1161600), (1161600, 1161600)) Solver time: 5.434000e+00 Solver iterations: 14 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS.
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 1 processor, by jychang48 Wed Mar 2 17:40:07 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.920e+01 1.00000 1.920e+01 Objects: 3.260e+02 1.00000 3.260e+02 Flops: 3.005e+09 1.00000 3.005e+09 3.005e+09 Flops/sec: 1.565e+08 1.00000 1.565e+08 1.565e+08 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3767e+01 71.7% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: FEM: 5.4340e+00 28.3% 3.0049e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 8 1.0 1.6064e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 2 1.0 3.6058e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexInterp 1 1.0 2.0997e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 11 0 0 0 0 15 0 0 0 0 0 DMPlexStratify 4 1.0 5.1062e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0 SFSetGraph 7 1.0 2.6127e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM VecMDot 14 1.0 8.3214e-02 1.0 2.44e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 2 8 0 0 0 2931 VecNorm 15 1.0 1.8275e-02 1.0 3.48e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1907 VecScale 30 1.0 1.8777e-02 1.0 2.91e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1549 VecCopy 1 1.0 2.6550e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 231 1.0 2.2175e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 VecAXPY 1 1.0 1.4780e-03 1.0 2.32e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1572 VecAYPX 75 1.0 7.5226e-03 1.0 6.78e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 901 VecMAXPY 15 1.0 9.9054e-02 1.0 2.76e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 9 0 0 0 2 9 0 0 0 2791 VecScatterBegin 66 1.0 7.1086e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecNormalize 15 1.0 2.8637e-02 1.0 5.23e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 1825 MatMult 189 1.0 4.6065e-01 1.0 5.58e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 19 0 0 0 8 19 0 0 0 1212 MatMultAdd 131 1.0 2.5532e-01 1.0 3.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 10 0 0 0 5 10 0 0 0 1215 MatSolve 30 1.0 1.6508e-01 1.0 1.50e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 3 5 0 0 0 908 MatSOR 150 1.0 7.3408e-01 1.0 8.00e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 27 0 0 0 14 27 0 0 0 1089 MatLUFactorSym 1 1.0 1.0014e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 6.5596e-02 1.0 1.81e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 276 MatILUFactorSym 1 1.0 4.8129e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatConvert 1 1.0 1.8516e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 7.8988e-03 1.0 5.34e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 676 MatResidual 75 1.0 1.0117e-01 1.0 1.45e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 2 5 0 0 0 1436 MatAssemblyBegin 25 1.0 1.6451e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 25 1.0 1.1168e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 MatGetRow 768000 1.0 4.7904e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 9 0 0 0 0 0 MatGetRowIJ 2 1.0 5.0068e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 4 1.0 4.8298e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetOrdering 2 1.0 
3.0270e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.0552e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 19 0 0 0 0 0 MatMatMult 1 1.0 1.2038e-01 1.0 1.33e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 111 MatMatMultSym 1 1.0 8.1955e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 MatMatMultNum 1 1.0 3.8406e-02 1.0 1.33e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 347 PCSetUp 4 1.0 2.2227e+00 1.0 1.01e+08 1.0 0.0e+00 0.0e+00 0.0e+00 12 3 0 0 0 41 3 0 0 0 46 PCSetUpOnBlocks 15 1.0 1.1679e-01 1.0 1.81e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 2 1 0 0 0 155 PCApply 15 1.0 2.1770e+00 1.0 1.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00 11 44 0 0 0 40 44 0 0 0 605 KSPGMRESOrthog 14 1.0 1.7076e-01 1.0 4.88e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 16 0 0 0 3 16 0 0 0 2857 KSPSetUp 10 1.0 2.5997e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 4.0425e+00 1.0 2.18e+09 1.0 0.0e+00 0.0e+00 0.0e+00 21 73 0 0 0 74 73 0 0 0 539 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 22 25 38642032 0. Section 26 8 5376 0. Vector 13 68 289326248 0. Vector Scatter 2 6 3984 0. Matrix 0 19 155400352 0. Preconditioner 0 11 11008 0. Krylov Solver 0 11 30976 0. Distributed Mesh 10 4 19256 0. GraphPartitioner 4 3 1836 0. Star Forest Bipartite Graph 23 12 9696 0. Discrete System 10 4 3456 0. --- Event Stage 1: FEM Index Set 22 12 9408 0. IS L to G Mapping 4 0 0 0. Vector 127 63 19225304 0. Vector Scatter 6 0 0 0. Matrix 26 2 37023836 0. Preconditioner 12 1 1016 0. Krylov Solver 12 1 1352 0. 
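As a quick sanity check on the single-process ml numbers above: the flop counting convention stated in the summary (VecAXPY on a real vector of length N counts 2N flops) and the Mflop/s column (flops divided by the max time, scaled to millions of flops per second) can be reproduced by hand. A short Python sketch, with every input copied from the log above:

# Arithmetic behind a few columns of the 1-process "ml" log above.
N = 1161600                      # global vector length, from "((1161600, 1161600), ...)"
vecaxpy_flops = 2 * N            # VecAXPY convention: 2N flops for a real vector of length N
print(vecaxpy_flops)             # 2323200, i.e. the 2.32e+06 reported for VecAXPY

vecaxpy_time = 1.4780e-03        # max VecAXPY time from the log
print(1e-6 * vecaxpy_flops / vecaxpy_time)   # ~1572, matching the Mflop/s column

total_flops = 3.005e+09          # "Flops:" total for the run
total_time = 1.920e+01           # "Time (sec):" max
print(total_flops / total_time)  # ~1.565e+08, matching the reported total "Flops/sec"

The agreement holds to the precision printed in the log.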
======================================================================================================================== Average time to get PetscTime(): 6.19888e-07 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib 
-L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 2: solving... ((579051, 1161600), (579051, 1161600)) Solver time: 3.572446e+00 Solver iterations: 16 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 2 processors, by jychang48 Wed Mar 2 17:40:26 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.773e+01 1.00036 1.773e+01 Objects: 7.890e+02 1.01806 7.820e+02 Flops: 1.659e+09 1.00160 1.658e+09 3.315e+09 Flops/sec: 9.356e+07 1.00124 9.350e+07 1.870e+08 MPI Messages: 7.360e+02 1.10345 7.015e+02 1.403e+03 MPI Message Lengths: 4.089e+08 1.61162 4.723e+05 6.626e+08 MPI Reductions: 6.180e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4155e+01 79.8% 0.0000e+00 0.0% 5.020e+02 35.8% 4.447e+05 94.2% 1.250e+02 20.2% 1: FEM: 3.5725e+00 20.2% 3.3150e+09 100.0% 9.010e+02 64.2% 2.757e+04 5.8% 4.920e+02 79.6% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 8.3753e-0110.1 0.00e+00 0.0 1.2e+02 4.0e+00 4.4e+01 3 0 8 0 7 3 0 24 0 35 0 VecScatterBegin 2 1.0 1.5290e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 3.0994e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.6361e+00 1.1 0.00e+00 0.0 9.2e+01 5.5e+05 2.1e+01 9 0 7 8 3 11 0 18 8 17 0 Mesh Migration 2 1.0 1.7862e+00 1.0 0.00e+00 0.0 3.7e+02 1.4e+06 5.4e+01 10 0 27 78 9 13 0 75 83 43 0 DMPlexInterp 1 1.0 2.0356e+0049929.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 7 0 0 0 0 0 DMPlexDistribute 1 1.0 2.2571e+00 1.1 0.00e+00 0.0 1.7e+02 1.9e+06 2.5e+01 12 0 12 47 4 15 0 33 50 20 0 DMPlexDistCones 2 1.0 3.6382e-01 1.0 0.00e+00 0.0 5.4e+01 3.2e+06 4.0e+00 2 0 4 26 1 3 0 11 28 3 0 DMPlexDistLabels 2 1.0 9.6086e-01 1.0 0.00e+00 0.0 2.4e+02 1.2e+06 2.2e+01 5 0 17 44 4 7 0 48 47 18 0 DMPlexDistribOL 1 1.0 1.1848e+00 1.0 0.00e+00 0.0 3.1e+02 9.6e+05 5.0e+01 7 0 22 45 8 8 0 61 48 40 0 DMPlexDistField 3 1.0 4.3099e-02 1.1 0.00e+00 0.0 6.2e+01 3.5e+05 1.2e+01 0 0 4 3 2 0 0 12 3 10 0 DMPlexDistData 2 1.0 8.3743e-0126.3 0.00e+00 0.0 5.4e+01 4.0e+05 6.0e+00 2 0 4 3 1 3 0 11 3 5 0 DMPlexStratify 6 1.5 7.6924e-01 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0 SFSetGraph 51 1.0 4.1922e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 SFBcastBegin 95 1.0 9.3978e-01 3.2 0.00e+00 0.0 4.8e+02 1.2e+06 4.1e+01 3 0 34 91 7 4 0 96 96 33 0 SFBcastEnd 95 1.0 4.0925e-01 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 3.2341e-03 1.3 0.00e+00 0.0 1.1e+01 1.3e+06 3.0e+00 0 0 1 2 0 0 0 2 2 2 0 SFReduceEnd 4 1.0 5.1863e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.0994e-05 7.6 0.00e+00 0.0 1.0e+00 4.2e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 4.2391e-04 2.8 0.00e+00 0.0 1.0e+00 4.2e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.8229e-03121.4 0.00e+00 0.0 2.0e+00 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 16 1.0 5.0983e-02 1.0 1.58e+08 1.0 0.0e+00 0.0e+00 1.6e+01 0 10 0 0 3 1 10 0 0 3 6197 VecNorm 17 1.0 1.0093e-02 1.0 1.98e+07 1.0 0.0e+00 0.0e+00 1.7e+01 0 1 0 0 3 0 1 0 0 3 3913 VecScale 289 1.0 1.0229e-02 1.0 1.69e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 3286 VecCopy 1 1.0 1.1420e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 238 1.0 3.8023e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecAXPY 86 1.0 4.3805e-03 1.0 8.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4027 VecAYPX 85 1.0 3.9430e-03 1.0 3.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1942 VecMAXPY 17 1.0 5.8177e-02 1.0 1.77e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 2 11 0 0 0 6070 VecScatterBegin 541 1.0 3.9975e-02 1.0 0.00e+00 0.0 8.2e+02 1.3e+04 
0.0e+00 0 0 58 2 0 1 0 91 27 0 0 VecScatterEnd 541 1.0 1.5631e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 17 1.0 1.6049e-02 1.0 2.97e+07 1.0 0.0e+00 0.0e+00 1.7e+01 0 2 0 0 3 0 2 0 0 3 3691 MatMult 213 1.0 2.6564e-01 1.0 2.90e+08 1.0 2.5e+02 1.8e+04 1.3e+02 1 17 18 1 21 7 17 28 12 26 2183 MatMultAdd 149 1.0 1.5978e-01 1.0 1.62e+08 1.0 6.4e+01 3.6e+04 0.0e+00 1 10 5 0 0 4 10 7 6 0 2020 MatSolve 34 1.0 1.0493e-01 1.0 8.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 3 5 0 0 0 1612 MatSOR 170 1.0 4.4878e-01 1.0 4.49e+08 1.0 5.1e+02 1.0e+04 0.0e+00 3 27 36 1 0 13 27 57 13 0 1996 MatLUFactorSym 1 1.0 1.0967e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 3.4249e-02 1.0 9.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 527 MatILUFactorSym 1 1.0 2.4367e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatConvert 2 1.0 1.0835e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 3.6101e-03 1.0 2.67e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1477 MatResidual 85 1.0 5.9523e-02 1.0 8.19e+07 1.0 1.7e+02 1.0e+04 0.0e+00 0 5 12 0 0 2 5 19 4 0 2746 MatAssemblyBegin 19 1.0 1.9948e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 0 0 0 0 3 0 0 0 0 3 0 MatAssemblyEnd 19 1.0 1.0086e-01 1.0 0.00e+00 0.0 3.6e+01 4.1e+03 8.8e+01 1 0 3 0 14 3 0 4 0 18 0 MatGetRow 384000 1.0 3.7113e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 10 0 0 0 0 0 MatGetRowIJ 2 1.0 5.9605e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 1.2000e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 2.1678e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 1 0 0 0 1 0 MatGetOrdering 2 1.0 1.4310e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 7.6797e-01 1.0 0.00e+00 0.0 4.0e+00 6.7e+03 1.2e+01 4 0 0 0 2 21 0 0 0 2 0 MatMatMult 1 1.0 1.4012e-01 1.0 4.95e+06 1.0 8.0e+00 2.2e+04 1.6e+01 1 0 1 0 3 4 0 1 0 3 71 MatMatMultSym 1 1.0 1.2285e-01 1.0 0.00e+00 0.0 7.0e+00 1.8e+04 1.4e+01 1 0 0 0 2 3 0 1 0 3 0 MatMatMultNum 1 1.0 1.7299e-02 1.0 4.95e+06 1.0 1.0e+00 5.5e+04 2.0e+00 0 0 0 0 0 0 0 0 0 0 572 MatRedundantMat 1 1.0 1.2240e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 2.2047e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetBrAoCol 2 1.0 1.2829e-03 2.9 0.00e+00 0.0 4.0e+00 3.8e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCSetUp 4 1.0 1.4806e+00 1.0 4.89e+07 1.0 7.6e+01 1.3e+05 2.5e+02 8 3 5 2 41 41 3 8 26 51 66 PCSetUpOnBlocks 17 1.0 6.0061e-02 1.0 9.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 2 1 0 0 0 301 PCApply 17 1.0 1.2664e+00 1.0 6.96e+08 1.0 7.9e+02 1.0e+04 1.9e+02 7 42 56 1 31 35 42 87 21 39 1099 KSPGMRESOrthog 16 1.0 1.0304e-01 1.0 3.17e+08 1.0 0.0e+00 0.0e+00 1.6e+01 1 19 0 0 3 3 19 0 0 3 6133 KSPSetUp 11 1.0 9.7177e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 2 0 0 0 0 2 0 KSPSolve 1 1.0 2.5623e+00 1.0 1.22e+09 1.0 8.7e+02 2.3e+04 4.1e+02 14 74 62 3 67 72 74 97 51 84 954 SFBcastBegin 1 1.0 1.8821e-0324.7 0.00e+00 0.0 6.0e+00 4.1e+04 1.0e+00 0 0 0 0 0 0 0 1 1 0 0 SFBcastEnd 1 1.0 7.5102e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
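Since each run in these attachments prints a "Solver time" and "Solver iterations" line just before its log, the scaling numbers can be tabulated rather than read off by hand. A throwaway Python sketch; the filename is only a placeholder for wherever a local copy of the attached logs was saved:

import re

with open("ml_logs.txt") as f:   # hypothetical local copy of this attachment
    text = f.read()

procs = re.findall(r"MPI processes (\d+):", text)
times = re.findall(r"Solver time: ([0-9.eE+-]+)", text)
iters = re.findall(r"Solver iterations: (\d+)", text)

for p, t, it in zip(procs, times, iters):
    print(f"{int(p):>4} procs  {float(t):8.3f} s  {int(it):3d} iterations")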
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 79 84 49128276 0. IS L to G Mapping 3 3 23945692 0. Section 70 53 35616 0. Vector 15 96 153254352 0. Vector Scatter 2 14 13912584 0. Matrix 0 34 80629960 0. Preconditioner 0 12 11944 0. Krylov Solver 0 12 32144 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 52 40 97704 0. IS L to G Mapping 4 0 0 0. Vector 348 255 71197824 0. Vector Scatter 20 2 2192 0. Matrix 55 8 52067772 0. Preconditioner 13 1 896 0. Krylov Solver 13 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 1.00136e-06 Average time for zero size MPI_Send(): 2.5034e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 4: solving... ((288348, 1161600), (288348, 1161600)) Solver time: 2.355974e+00 Solver iterations: 17 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 4 processors, by jychang48 Wed Mar 2 17:40:43 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.540e+01 1.00012 1.540e+01 Objects: 7.540e+02 1.02725 7.395e+02 Flops: 8.911e+08 1.00740 8.865e+08 3.546e+09 Flops/sec: 5.788e+07 1.00742 5.759e+07 2.303e+08 MPI Messages: 1.734e+03 1.43406 1.438e+03 5.750e+03 MPI Message Lengths: 2.936e+08 2.18390 1.217e+05 7.001e+08 MPI Reductions: 5.940e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3039e+01 84.7% 0.0000e+00 0.0% 1.654e+03 28.8% 1.129e+05 92.7% 1.250e+02 21.0% 1: FEM: 2.3563e+00 15.3% 3.5462e+09 100.0% 4.096e+03 71.2% 8.839e+03 7.3% 4.680e+02 78.8% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 9.1547e-0122.2 0.00e+00 0.0 3.9e+02 4.0e+00 4.4e+01 4 0 7 0 7 5 0 24 0 35 0 VecScatterBegin 2 1.0 6.8903e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 4.0531e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.7451e+00 1.1 0.00e+00 0.0 3.8e+02 1.4e+05 2.1e+01 11 0 7 8 4 13 0 23 8 17 0 Mesh Migration 2 1.0 1.0095e+00 1.0 0.00e+00 0.0 1.1e+03 4.7e+05 5.4e+01 7 0 20 77 9 8 0 69 83 43 0 DMPlexInterp 1 1.0 2.0528e+0051557.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0 DMPlexDistribute 1 1.0 2.0368e+00 1.1 0.00e+00 0.0 3.9e+02 8.1e+05 2.5e+01 13 0 7 45 4 15 0 23 49 20 0 DMPlexDistCones 2 1.0 2.3249e-01 1.0 0.00e+00 0.0 1.6e+02 1.1e+06 4.0e+00 2 0 3 26 1 2 0 10 28 3 0 DMPlexDistLabels 2 1.0 5.3011e-01 1.0 0.00e+00 0.0 7.2e+02 4.2e+05 2.2e+01 3 0 13 43 4 4 0 44 47 18 0 DMPlexDistribOL 1 1.0 7.3405e-01 1.0 0.00e+00 0.0 1.2e+03 2.8e+05 5.0e+01 5 0 20 45 8 6 0 70 49 40 0 DMPlexDistField 3 1.0 2.9428e-02 1.1 0.00e+00 0.0 2.0e+02 1.1e+05 1.2e+01 0 0 4 3 2 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.2139e-0141.2 0.00e+00 0.0 2.2e+02 1.0e+05 6.0e+00 4 0 4 3 1 5 0 14 4 5 0 DMPlexStratify 6 1.5 6.4224e-01 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFSetGraph 51 1.0 2.3842e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFBcastBegin 95 1.0 9.6980e-01 4.8 0.00e+00 0.0 1.6e+03 4.0e+05 4.1e+01 5 0 27 89 7 6 0 95 96 33 0 SFBcastEnd 95 1.0 3.1558e-01 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 2.9893e-03 2.1 0.00e+00 0.0 4.9e+01 2.9e+05 3.0e+00 0 0 1 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 6.0470e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.6955e-0512.9 0.00e+00 0.0 5.0e+00 1.7e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 3.9387e-04 2.3 0.00e+00 0.0 5.0e+00 1.7e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.7231e-0396.4 0.00e+00 0.0 1.0e+01 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 17 1.0 3.9099e-02 1.2 8.94e+07 1.0 0.0e+00 0.0e+00 1.7e+01 0 10 0 0 3 1 10 0 0 4 9091 VecNorm 18 1.0 8.1410e-03 1.3 1.05e+07 1.0 0.0e+00 0.0e+00 1.8e+01 0 1 0 0 3 0 1 0 0 4 5137 VecScale 252 1.0 5.7852e-03 1.0 9.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 6272 VecCopy 1 1.0 6.0797e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 232 1.0 1.7792e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecAXPY 73 1.0 2.5489e-03 1.0 4.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 7271 VecAYPX 72 1.0 2.3601e-03 1.1 2.03e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3434 VecMAXPY 18 1.0 3.7059e-02 1.0 9.93e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 2 11 0 0 0 10657 VecScatterBegin 500 1.0 2.3368e-02 1.0 0.00e+00 0.0 3.8e+03 5.8e+03 
0.0e+00 0 0 66 3 0 1 0 93 43 0 0 VecScatterEnd 500 1.0 8.3990e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 18 1.0 1.1338e-02 1.2 1.58e+07 1.0 0.0e+00 0.0e+00 1.8e+01 0 2 0 0 3 0 2 0 0 4 5532 MatMult 189 1.0 1.5278e-01 1.0 1.54e+08 1.0 1.2e+03 8.0e+03 1.4e+02 1 17 21 1 23 6 17 29 19 29 4009 MatMultAdd 140 1.0 9.3157e-02 1.0 8.62e+07 1.0 3.4e+02 1.4e+04 0.0e+00 1 10 6 1 0 4 10 8 10 0 3681 MatSolve 36 1.0 5.9907e-02 1.1 4.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 2 5 0 0 0 2978 MatSOR 144 1.0 2.5068e-01 1.0 2.40e+08 1.0 2.3e+03 4.9e+03 0.0e+00 2 27 39 2 0 11 27 55 22 0 3799 MatLUFactorSym 1 1.0 1.3113e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.8109e-02 1.0 4.57e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 997 MatILUFactorSym 1 1.0 1.3030e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatConvert 2 1.0 5.7888e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 2.0142e-03 1.1 1.34e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2647 MatResidual 72 1.0 3.2858e-02 1.0 4.39e+07 1.0 7.6e+02 4.9e+03 0.0e+00 0 5 13 1 0 1 5 18 7 0 5307 MatAssemblyBegin 18 1.0 1.1560e-02 9.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 0 0 0 0 3 0 MatAssemblyEnd 18 1.0 5.6518e-02 1.1 0.00e+00 0.0 1.7e+02 1.8e+03 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 192000 1.0 3.7683e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 16 0 0 0 0 0 MatGetRowIJ 2 1.0 5.9605e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 6.4135e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 1.0420e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 7.2408e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 7.6580e-01 1.0 0.00e+00 0.0 2.0e+01 2.7e+03 1.2e+01 5 0 0 0 2 32 0 0 0 3 0 MatMatMult 1 1.0 7.8246e-02 1.0 2.48e+06 1.0 4.0e+01 8.9e+03 1.6e+01 0 0 1 0 3 3 0 1 1 3 126 MatMatMultSym 1 1.0 6.8802e-02 1.1 0.00e+00 0.0 3.5e+01 7.0e+03 1.4e+01 0 0 1 0 2 3 0 1 0 3 0 MatMatMultNum 1 1.0 9.4202e-03 1.0 2.48e+06 1.0 5.0e+00 2.2e+04 2.0e+00 0 0 0 0 0 0 0 0 0 0 1050 MatRedundantMat 1 1.0 8.8930e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 1.1519e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 4.1471e-03 5.9 0.00e+00 0.0 2.0e+01 1.5e+04 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 1.1437e+00 1.0 2.45e+07 1.0 3.2e+02 3.4e+04 2.2e+02 7 3 6 2 37 48 3 8 22 47 85 PCSetUpOnBlocks 18 1.0 3.1880e-02 1.1 4.57e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 566 PCApply 18 1.0 6.9520e-01 1.0 3.70e+08 1.0 3.6e+03 4.8e+03 1.6e+02 4 42 62 2 26 29 42 87 33 33 2119 KSPGMRESOrthog 17 1.0 7.2115e-02 1.1 1.79e+08 1.0 0.0e+00 0.0e+00 1.7e+01 0 20 0 0 3 3 20 0 0 4 9858 KSPSetUp 10 1.0 5.5130e-03 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 1.7607e+00 1.0 6.60e+08 1.0 4.0e+03 7.9e+03 3.9e+02 11 74 69 5 65 75 74 98 62 83 1494 SFBcastBegin 1 1.0 1.7951e-0317.1 0.00e+00 0.0 3.0e+01 1.7e+04 1.0e+00 0 0 1 0 0 0 0 1 1 0 0 SFBcastEnd 1 1.0 1.2398e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
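Taken together, the ml runs report solver times of about 5.43 s on 1 process, 3.57 s on 2, and 2.36 s on 4 (the 8-process run follows below), so strong-scaling speedup and parallel efficiency follow directly, for example:

# Strong-scaling summary for the "ml" solver times reported in these logs.
times = {1: 5.434000, 2: 3.572446, 4: 2.355974}   # seconds, from the "Solver time:" lines

t1 = times[1]
for p, t in sorted(times.items()):
    speedup = t1 / t
    print(f"{p} procs: speedup {speedup:.2f}, efficiency {speedup / p:.0%}")

The sub-linear speedup is what one would expect for strong scaling of a fixed 1,161,600-unknown problem.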
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 87 92 35927100 0. IS L to G Mapping 3 3 18881016 0. Section 70 53 35616 0. Vector 15 87 78953080 0. Vector Scatter 2 13 6934616 0. Matrix 0 29 40306524 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 100832 0. IS L to G Mapping 4 0 0 0. Vector 315 231 37564192 0. Vector Scatter 19 2 2192 0. Matrix 50 8 26014168 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 6.19888e-07 Average time for MPI_Barrier(): 1.19209e-06 Average time for zero size MPI_Send(): 1.54972e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 8: solving... ((143102, 1161600), (143102, 1161600)) Solver time: 1.725371e+00 Solver iterations: 17 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 8 processors, by jychang48 Wed Mar 2 17:40:59 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           1.389e+01      1.00022   1.389e+01
Objects:              7.700e+02      1.04336   7.428e+02
Flops:                4.567e+08      1.01904   4.507e+08  3.605e+09
Flops/sec:            3.287e+07      1.01910   3.244e+07  2.595e+08
MPI Messages:         3.824e+03      1.59612   2.718e+03  2.174e+04
MPI Message Lengths:  2.346e+08      3.28110   3.422e+04  7.441e+08
MPI Reductions:       5.940e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg        %Total   counts   %Total
 0:      Main Stage: 1.2167e+01  87.6%  0.0000e+00   0.0%  5.296e+03  24.4%  3.124e+04      91.3%  1.250e+02  21.0%
 1:             FEM: 1.7258e+00  12.4%  3.6052e+09 100.0%  1.645e+04  75.6%  2.978e+03       8.7%  4.680e+02  78.8%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 9.9127e-0135.3 0.00e+00 0.0 1.2e+03 4.0e+00 4.4e+01 6 0 6 0 7 7 0 23 0 35 0 VecScatterBegin 2 1.0 2.7800e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 3.3379e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.8539e+00 1.1 0.00e+00 0.0 1.4e+03 4.1e+04 2.1e+01 13 0 6 8 4 15 0 26 8 17 0 Mesh Migration 2 1.0 6.9888e-01 1.0 0.00e+00 0.0 3.4e+03 1.6e+05 5.4e+01 5 0 16 75 9 6 0 65 82 43 0 DMPlexInterp 1 1.0 2.1155e+0064296.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 DMPlexDistribute 1 1.0 2.0479e+00 1.1 0.00e+00 0.0 1.0e+03 3.2e+05 2.5e+01 15 0 5 43 4 17 0 19 47 20 0 DMPlexDistCones 2 1.0 1.6353e-01 1.0 0.00e+00 0.0 4.9e+02 3.8e+05 4.0e+00 1 0 2 25 1 1 0 9 27 3 0 DMPlexDistLabels 2 1.0 3.9835e-01 1.0 0.00e+00 0.0 2.1e+03 1.5e+05 2.2e+01 3 0 10 42 4 3 0 40 47 18 0 DMPlexDistribOL 1 1.0 5.2452e-01 1.0 0.00e+00 0.0 3.9e+03 8.8e+04 5.0e+01 4 0 18 46 8 4 0 73 50 40 0 DMPlexDistField 3 1.0 2.3566e-02 1.3 0.00e+00 0.0 6.4e+02 3.8e+04 1.2e+01 0 0 3 3 2 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.7306e-0152.6 0.00e+00 0.0 8.5e+02 3.0e+04 6.0e+00 6 0 4 3 1 7 0 16 4 5 0 DMPlexStratify 6 1.5 5.9495e-01 8.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 SFSetGraph 51 1.0 1.4286e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 SFBcastBegin 95 1.0 1.0315e+00 6.1 0.00e+00 0.0 5.0e+03 1.3e+05 4.1e+01 6 0 23 88 7 7 0 95 96 33 0 SFBcastEnd 95 1.0 2.9712e-01 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 3.7477e-03 3.8 0.00e+00 0.0 1.8e+02 8.2e+04 3.0e+00 0 0 1 2 1 0 0 3 2 2 0 SFReduceEnd 4 1.0 6.8834e-03 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 6.5088e-0513.0 0.00e+00 0.0 1.9e+01 7.0e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 2.9087e-04 2.9 0.00e+00 0.0 1.9e+01 7.0e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.9469e-0377.8 0.00e+00 0.0 3.8e+01 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 17 1.0 2.6954e-02 1.1 4.48e+07 1.0 0.0e+00 0.0e+00 1.7e+01 0 10 0 0 3 1 10 0 0 4 13187 VecNorm 18 1.0 1.0048e-02 2.8 5.27e+06 1.0 0.0e+00 0.0e+00 1.8e+01 0 1 0 0 3 0 1 0 0 4 4162 VecScale 252 1.0 3.3975e-03 1.1 4.70e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 10932 VecCopy 1 1.0 3.8004e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 232 1.0 6.7997e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 73 1.0 1.5988e-03 1.1 2.33e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 11593 VecAYPX 72 1.0 1.4379e-03 1.1 1.02e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5638 VecMAXPY 18 1.0 2.6705e-02 1.1 4.98e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 1 11 0 0 0 14789 VecScatterBegin 500 1.0 1.6459e-02 1.1 0.00e+00 0.0 1.5e+04 2.3e+03 
0.0e+00 0 0 71 5 0 1 0 93 54 0 0 VecScatterEnd 500 1.0 8.3652e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 18 1.0 1.1821e-02 2.2 7.91e+06 1.0 0.0e+00 0.0e+00 1.8e+01 0 2 0 0 3 0 2 0 0 4 5307 MatMult 189 1.0 1.0047e-01 1.1 7.71e+07 1.0 4.7e+03 3.2e+03 1.4e+02 1 17 22 2 23 6 17 29 23 29 6111 MatMultAdd 140 1.0 5.8617e-02 1.0 4.32e+07 1.0 1.3e+03 6.0e+03 0.0e+00 0 10 6 1 0 3 10 8 12 0 5850 MatSolve 36 1.0 3.2941e-02 1.1 2.25e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 2 5 0 0 0 5391 MatSOR 144 1.0 1.4923e-01 1.0 1.20e+08 1.0 9.2e+03 1.9e+03 0.0e+00 1 27 42 2 0 9 27 56 28 0 6406 MatLUFactorSym 1 1.0 1.4782e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 9.5110e-03 1.1 2.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 1900 MatILUFactorSym 1 1.0 6.8221e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 3.1312e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.1721e-03 1.1 6.70e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4550 MatResidual 72 1.0 1.9800e-02 1.0 2.22e+07 1.0 3.1e+03 1.9e+03 0.0e+00 0 5 14 1 0 1 5 19 9 0 8879 MatAssemblyBegin 18 1.0 1.0952e-0212.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 0 0 0 0 3 0 MatAssemblyEnd 18 1.0 3.2517e-02 1.1 0.00e+00 0.0 6.8e+02 7.0e+02 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 96000 1.0 3.8074e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 22 0 0 0 0 0 MatGetRowIJ 2 1.0 9.0599e-06 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 1.1802e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 5.2059e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 3.7909e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 7.6783e-01 1.0 0.00e+00 0.0 7.6e+01 1.1e+03 1.2e+01 6 0 0 0 2 44 0 0 0 3 0 MatMatMult 1 1.0 4.1842e-02 1.0 1.24e+06 1.0 1.5e+02 3.7e+03 1.6e+01 0 0 1 0 3 2 0 1 1 3 237 MatMatMultSym 1 1.0 3.6508e-02 1.1 0.00e+00 0.0 1.3e+02 2.9e+03 1.4e+01 0 0 1 0 2 2 0 1 1 3 0 MatMatMultNum 1 1.0 5.3141e-03 1.0 1.24e+06 1.0 1.9e+01 9.1e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 1863 MatRedundantMat 1 1.0 1.4997e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 6.3159e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 2.3229e-03 3.7 0.00e+00 0.0 7.6e+01 6.3e+03 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 9.7671e-01 1.0 1.23e+07 1.0 1.2e+03 9.8e+03 2.2e+02 7 3 6 2 37 57 3 8 19 47 100 PCSetUpOnBlocks 18 1.0 1.6695e-02 1.3 2.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 1082 PCApply 18 1.0 4.0179e-01 1.0 1.86e+08 1.0 1.5e+04 1.9e+03 1.6e+02 3 41 67 4 26 23 41 89 42 33 3677 KSPGMRESOrthog 17 1.0 5.0506e-02 1.1 8.96e+07 1.0 0.0e+00 0.0e+00 1.7e+01 0 20 0 0 3 3 20 0 0 4 14075 KSPSetUp 10 1.0 2.3572e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 1.3470e+00 1.0 3.31e+08 1.0 1.6e+04 2.8e+03 3.9e+02 10 73 74 6 65 78 73 98 70 83 1956 SFBcastBegin 1 1.0 2.0890e-0312.0 0.00e+00 0.0 1.1e+02 7.1e+03 1.0e+00 0 0 1 0 0 0 0 1 1 0 0 SFBcastEnd 1 1.0 1.4305e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 103 108 29168148 0. IS L to G Mapping 3 3 16320748 0. Section 70 53 35616 0. Vector 15 87 41816936 0. Vector Scatter 2 13 3448712 0. Matrix 0 29 20176708 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 96316 0. IS L to G Mapping 4 0 0 0. Vector 315 231 18936288 0. Vector Scatter 19 2 2192 0. Matrix 50 8 12996772 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 2.00272e-06 Average time for zero size MPI_Send(): 1.63913e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 16: solving... ((70996, 1161600), (70996, 1161600)) Solver time: 9.724491e-01 Solver iterations: 18 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 16 processors, by jychang48 Wed Mar 2 17:41:17 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           1.549e+01      1.00008   1.549e+01
Objects:              8.040e+02      1.07200   7.584e+02
Flops:                2.486e+08      1.05022   2.426e+08  3.882e+09
Flops/sec:            1.605e+07      1.05015   1.567e+07  2.507e+08
MPI Messages:         6.216e+03      1.99327   4.443e+03  7.109e+04
MPI Message Lengths:  2.044e+08      5.47991   1.150e+04  8.175e+08
MPI Reductions:       6.040e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg        %Total   counts   %Total
 0:      Main Stage: 1.4514e+01  93.7%  0.0000e+00   0.0%  1.470e+04  20.7%  1.024e+04      89.0%  1.250e+02  20.7%
 1:             FEM: 9.7263e-01   6.3%  3.8821e+09 100.0%  5.639e+04  79.3%  1.263e+03      11.0%  4.780e+02  79.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1980e+00 7.6 0.00e+00 0.0 3.5e+03 4.0e+00 4.4e+01 6 0 5 0 7 7 0 24 0 35 0 VecScatterBegin 2 1.0 8.1062e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 5.0068e-06 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.9332e+00 1.1 0.00e+00 0.0 4.4e+03 1.4e+04 2.1e+01 12 0 6 8 3 13 0 30 9 17 0 Mesh Migration 2 1.0 5.2375e-01 1.0 0.00e+00 0.0 8.9e+03 6.6e+04 5.4e+01 3 0 13 72 9 4 0 61 81 43 0 DMPlexInterp 1 1.0 2.1157e+0055461.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 DMPlexDistribute 1 1.0 2.1336e+00 1.1 0.00e+00 0.0 2.9e+03 1.1e+05 2.5e+01 14 0 4 39 4 15 0 20 44 20 0 DMPlexDistCones 2 1.0 1.2215e-01 1.0 0.00e+00 0.0 1.3e+03 1.5e+05 4.0e+00 1 0 2 24 1 1 0 9 27 3 0 DMPlexDistLabels 2 1.0 3.1635e-01 1.0 0.00e+00 0.0 5.5e+03 6.1e+04 2.2e+01 2 0 8 41 4 2 0 38 46 18 0 DMPlexDistribOL 1 1.0 3.4299e-01 1.0 0.00e+00 0.0 1.1e+04 3.5e+04 5.0e+01 2 0 15 46 8 2 0 72 52 40 0 DMPlexDistField 3 1.0 2.6731e-02 1.7 0.00e+00 0.0 1.7e+03 1.5e+04 1.2e+01 0 0 2 3 2 0 0 12 4 10 0 DMPlexDistData 2 1.0 9.9244e-0167.5 0.00e+00 0.0 2.9e+03 9.6e+03 6.0e+00 6 0 4 3 1 6 0 20 4 5 0 DMPlexStratify 6 1.5 5.6545e-0115.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 8.5791e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 SFBcastBegin 95 1.0 1.2138e+00 3.6 0.00e+00 0.0 1.4e+04 5.0e+04 4.1e+01 7 0 20 86 7 7 0 95 97 33 0 SFBcastEnd 95 1.0 3.0706e-01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 7.5688e-0311.7 0.00e+00 0.0 5.0e+02 3.0e+04 3.0e+00 0 0 1 2 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.1227e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 5.1022e-0526.8 0.00e+00 0.0 5.4e+01 3.9e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 2.3007e-04 2.4 0.00e+00 0.0 5.4e+01 3.9e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.2939e-0336.9 0.00e+00 0.0 1.1e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 18 1.0 1.5401e-02 1.3 2.52e+07 1.0 0.0e+00 0.0e+00 1.8e+01 0 10 0 0 3 1 10 0 0 4 25795 VecNorm 19 1.0 4.7302e-03 2.3 2.80e+06 1.0 0.0e+00 0.0e+00 1.9e+01 0 1 0 0 3 0 1 0 0 4 9332 VecScale 266 1.0 2.0359e-03 1.1 2.62e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 19960 VecCopy 1 1.0 2.0003e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 244 1.0 3.7987e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 77 1.0 9.3222e-04 1.1 1.22e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 20859 VecAYPX 76 1.0 6.7663e-04 1.2 5.38e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12652 VecMAXPY 19 1.0 1.0955e-02 1.1 2.78e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 1 11 0 0 0 40080 VecScatterBegin 527 1.0 1.1807e-02 1.3 0.00e+00 0.0 5.3e+04 1.1e+03 
0.0e+00 0 0 74 7 0 1 0 94 66 0 0 VecScatterEnd 527 1.0 1.1447e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecNormalize 19 1.0 5.7294e-03 1.9 4.20e+06 1.0 0.0e+00 0.0e+00 1.9e+01 0 2 0 0 3 0 2 0 0 4 11556 MatMult 199 1.0 5.5462e-02 1.1 4.11e+07 1.0 1.5e+04 1.7e+03 1.4e+02 0 17 22 3 24 5 17 27 28 30 11672 MatMultAdd 148 1.0 3.2288e-02 1.1 2.31e+07 1.0 3.9e+03 3.3e+03 0.0e+00 0 9 5 2 0 3 9 7 14 0 11244 MatSolve 38 1.0 1.7801e-02 1.2 1.19e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 2 5 0 0 0 10460 MatSOR 152 1.0 7.6650e-02 1.0 6.41e+07 1.0 3.2e+04 9.6e+02 0.0e+00 0 26 44 4 0 8 26 56 34 0 13202 MatLUFactorSym 1 1.0 2.9087e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 4.6940e-03 1.1 1.17e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3851 MatILUFactorSym 1 1.0 3.3820e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 1.5450e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 2.5489e-03 5.7 3.37e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2093 MatResidual 76 1.0 1.2002e-02 1.2 1.20e+07 1.0 1.1e+04 9.6e+02 0.0e+00 0 5 15 1 0 1 5 19 11 0 15609 MatAssemblyBegin 18 1.0 6.2559e-0314.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 0 0 0 0 3 0 MatAssemblyEnd 18 1.0 2.1353e-02 1.2 0.00e+00 0.0 2.2e+03 3.4e+02 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 48000 1.0 1.9073e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 19 0 0 0 0 0 MatGetRowIJ 2 1.0 4.4990e-04117.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 1.5783e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 2.5711e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 6.6710e-04 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 3.8572e-01 1.0 0.00e+00 0.0 2.2e+02 6.2e+02 1.2e+01 2 0 0 0 2 40 0 0 0 3 0 MatMatMult 1 1.0 2.2668e-02 1.0 6.22e+05 1.0 4.3e+02 2.1e+03 1.6e+01 0 0 1 0 3 2 0 1 1 3 437 MatMatMultSym 1 1.0 1.9739e-02 1.1 0.00e+00 0.0 3.8e+02 1.6e+03 1.4e+01 0 0 1 0 2 2 0 1 1 3 0 MatMatMultNum 1 1.0 2.9190e-03 1.0 6.22e+05 1.0 5.4e+01 5.1e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 3392 MatRedundantMat 1 1.0 1.8787e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 3.1598e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 1.4391e-03 3.9 0.00e+00 0.0 2.2e+02 3.5e+03 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 5.0388e-01 1.0 6.18e+06 1.0 3.9e+03 3.5e+03 2.2e+02 3 3 6 2 36 52 3 7 15 46 194 PCSetUpOnBlocks 19 1.0 8.1487e-03 1.3 1.17e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 2213 PCApply 19 1.0 2.2397e-01 1.0 9.86e+07 1.0 5.1e+04 9.1e+02 1.6e+02 1 40 72 6 26 23 40 90 52 33 6956 KSPGMRESOrthog 18 1.0 2.4910e-02 1.2 5.04e+07 1.0 0.0e+00 0.0e+00 1.8e+01 0 20 0 0 3 2 20 0 0 4 31896 KSPSetUp 10 1.0 1.2152e-03 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 7.0285e-01 1.0 1.79e+08 1.0 5.6e+04 1.3e+03 4.0e+02 5 73 78 9 66 72 73 98 77 83 4018 SFBcastBegin 1 1.0 1.3649e-03 8.3 0.00e+00 0.0 3.3e+02 3.9e+03 1.0e+00 0 0 0 0 0 0 0 1 1 0 0 SFBcastEnd 1 1.0 6.1879e-03167.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 125 130 25605220 0. IS L to G Mapping 3 3 15014432 0. Section 70 53 35616 0. Vector 15 87 23370536 0. Vector Scatter 2 13 1718168 0. Matrix 0 29 10124824 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 84652 0. IS L to G Mapping 4 0 0 0. Vector 327 243 10154536 0. Vector Scatter 19 2 2192 0. Matrix 50 8 6483848 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 5.00679e-06 Average time for zero size MPI_Send(): 1.80304e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 24: solving... ((47407, 1161600), (47407, 1161600)) Solver time: 7.069969e-01 Solver iterations: 18 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 24 processors, by jychang48 Wed Mar 2 17:41:35 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           1.503e+01      1.00014   1.503e+01
Objects:              8.180e+02      1.09067   7.607e+02
Flops:                1.700e+08      1.06571   1.644e+08  3.945e+09
Flops/sec:            1.131e+07      1.06582   1.094e+07  2.625e+08
MPI Messages:         8.270e+03      2.25324   5.616e+03  1.348e+05
MPI Message Lengths:  1.913e+08      7.34285   6.417e+03  8.650e+08
MPI Reductions:       6.040e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg        %Total   counts   %Total
 0:      Main Stage: 1.4322e+01  95.3%  0.0000e+00   0.0%  2.618e+04  19.4%  5.638e+03      87.8%  1.250e+02  20.7%
 1:             FEM: 7.0697e-01   4.7%  3.9447e+09 100.0%  1.086e+05  80.6%  7.798e+02      12.2%  4.780e+02  79.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1264e+00 9.5 0.00e+00 0.0 6.2e+03 4.0e+00 4.4e+01 7 0 5 0 7 7 0 24 0 35 0 VecScatterBegin 2 1.0 5.2214e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 1.8835e-05 9.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 1.9900e+00 1.1 0.00e+00 0.0 8.7e+03 7.7e+03 2.1e+01 13 0 6 8 3 14 0 33 9 17 0 Mesh Migration 2 1.0 4.5752e-01 1.0 0.00e+00 0.0 1.5e+04 4.0e+04 5.4e+01 3 0 11 71 9 3 0 58 81 43 0 DMPlexInterp 1 1.0 2.1216e+0058932.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 DMPlexDistribute 1 1.0 2.2043e+00 1.1 0.00e+00 0.0 5.7e+03 5.7e+04 2.5e+01 15 0 4 37 4 15 0 22 43 20 0 DMPlexDistCones 2 1.0 1.0450e-01 1.1 0.00e+00 0.0 2.2e+03 9.3e+04 4.0e+00 1 0 2 24 1 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.8710e-01 1.0 0.00e+00 0.0 9.3e+03 3.7e+04 2.2e+01 2 0 7 40 4 2 0 35 46 18 0 DMPlexDistribOL 1 1.0 2.6298e-01 1.0 0.00e+00 0.0 1.8e+04 2.2e+04 5.0e+01 2 0 14 47 8 2 0 70 53 40 0 DMPlexDistField 3 1.0 2.8758e-02 2.1 0.00e+00 0.0 3.0e+03 9.4e+03 1.2e+01 0 0 2 3 2 0 0 11 4 10 0 DMPlexDistData 2 1.0 9.9861e-0127.8 0.00e+00 0.0 6.0e+03 5.0e+03 6.0e+00 6 0 4 3 1 6 0 23 4 5 0 DMPlexStratify 6 1.5 5.6285e-0122.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 6.0824e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1366e+00 4.0 0.00e+00 0.0 2.5e+04 2.9e+04 4.1e+01 7 0 19 85 7 7 0 96 97 33 0 SFBcastEnd 95 1.0 3.0235e-01 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 7.5159e-0313.3 0.00e+00 0.0 8.7e+02 1.8e+04 3.0e+00 0 0 1 2 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.3404e-03 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.4107e-0523.1 0.00e+00 0.0 9.4e+01 2.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.7095e-04 3.6 0.00e+00 0.0 9.4e+01 2.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.7259e-0328.3 0.00e+00 0.0 1.9e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 18 1.0 1.1039e-02 1.3 1.69e+07 1.0 0.0e+00 0.0e+00 1.8e+01 0 10 0 0 3 1 10 0 0 4 35989 VecNorm 19 1.0 3.1846e-03 2.2 1.88e+06 1.0 0.0e+00 0.0e+00 1.9e+01 0 1 0 0 3 0 1 0 0 4 13861 VecScale 266 1.0 1.4791e-03 1.1 1.81e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 28126 VecCopy 1 1.0 1.4901e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 244 1.0 2.5969e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 77 1.0 6.2633e-04 1.1 8.16e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 31062 VecAYPX 76 1.0 4.6229e-04 1.2 3.59e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 18529 VecMAXPY 19 1.0 6.4399e-03 1.1 1.87e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 1 11 0 0 0 68182 VecScatterBegin 527 1.0 1.0387e-02 1.5 0.00e+00 0.0 1.0e+05 7.2e+02 
0.0e+00 0 0 76 9 0 1 0 94 70 0 0 VecScatterEnd 527 1.0 1.1114e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecNormalize 19 1.0 3.8595e-03 1.8 2.81e+06 1.0 0.0e+00 0.0e+00 1.9e+01 0 2 0 0 3 0 2 0 0 4 17155 MatMult 199 1.0 3.9954e-02 1.1 2.74e+07 1.0 2.9e+04 1.1e+03 1.4e+02 0 16 21 4 24 5 16 26 30 30 16239 MatMultAdd 148 1.0 2.3000e-02 1.1 1.54e+07 1.0 6.8e+03 2.3e+03 0.0e+00 0 9 5 2 0 3 9 6 15 0 15786 MatSolve 38 1.0 1.2058e-02 1.1 7.95e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 2 5 0 0 0 15399 MatSOR 152 1.0 5.3896e-02 1.0 4.33e+07 1.0 6.0e+04 6.3e+02 0.0e+00 0 26 45 4 0 7 26 56 36 0 18843 MatLUFactorSym 1 1.0 3.1948e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 3.0081e-03 1.1 7.86e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6048 MatILUFactorSym 1 1.0 2.2931e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 1.0121e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 2.3041e-03 8.9 2.25e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2314 MatResidual 76 1.0 8.7066e-03 1.2 8.11e+06 1.1 2.0e+04 6.3e+02 0.0e+00 0 5 15 1 0 1 5 19 12 0 21688 MatAssemblyBegin 18 1.0 4.5638e-03 9.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 0 0 0 0 3 0 MatAssemblyEnd 18 1.0 1.5963e-02 1.2 0.00e+00 0.0 4.3e+03 2.2e+02 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 32000 1.0 1.2688e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 18 0 0 0 0 0 MatGetRowIJ 2 1.0 5.6219e-04112.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 1.5306e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 1.8289e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 7.2408e-04 5.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 2.5769e-01 1.0 0.00e+00 0.0 3.7e+02 4.4e+02 1.2e+01 2 0 0 0 2 36 0 0 0 3 0 MatMatMult 1 1.0 1.5805e-02 1.0 4.15e+05 1.0 7.4e+02 1.5e+03 1.6e+01 0 0 1 0 3 2 0 1 1 3 626 MatMatMultSym 1 1.0 1.3917e-02 1.0 0.00e+00 0.0 6.5e+02 1.2e+03 1.4e+01 0 0 0 0 2 2 0 1 1 3 0 MatMatMultNum 1 1.0 1.8940e-03 1.0 4.15e+05 1.0 9.3e+01 3.6e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 5226 MatRedundantMat 1 1.0 1.8001e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 2.1231e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 8.0800e-04 3.2 0.00e+00 0.0 3.7e+02 2.5e+03 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 3.4841e-01 1.0 4.13e+06 1.0 7.5e+03 2.0e+03 2.2e+02 2 2 6 2 36 49 2 7 14 46 281 PCSetUpOnBlocks 19 1.0 5.4433e-03 1.4 7.77e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 3303 PCApply 19 1.0 1.6812e-01 1.0 6.60e+07 1.0 9.9e+04 5.9e+02 1.6e+02 1 40 74 7 26 24 40 91 55 33 9296 KSPGMRESOrthog 18 1.0 1.6622e-02 1.2 3.38e+07 1.0 0.0e+00 0.0e+00 1.8e+01 0 20 0 0 3 2 20 0 0 4 47800 KSPSetUp 10 1.0 8.8477e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 4.9091e-01 1.0 1.19e+08 1.0 1.1e+05 7.9e+02 4.0e+02 3 72 79 10 66 69 72 99 80 83 5762 SFBcastBegin 1 1.0 1.8461e-0310.3 0.00e+00 0.0 5.8e+02 2.8e+03 1.0e+00 0 0 0 0 0 0 0 1 2 0 0 SFBcastEnd 1 1.0 6.1107e-0451.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 139 144 24068724 0. IS L to G Mapping 3 3 14448024 0. Section 70 53 35616 0. Vector 15 87 17303640 0. Vector Scatter 2 13 1152032 0. Matrix 0 29 6806588 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 67128 0. IS L to G Mapping 4 0 0 0. Vector 327 243 6910640 0. Vector Scatter 19 2 2192 0. Matrix 50 8 4329832 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 9.82285e-06 Average time for zero size MPI_Send(): 1.37091e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 32: solving... ((35155, 1161600), (35155, 1161600)) Solver time: 5.819740e-01 Solver iterations: 18 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 32 processors, by jychang48 Wed Mar 2 17:41:54 2016
Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5  GIT Date: 2016-01-01 10:01:13 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           1.505e+01      1.00013   1.505e+01
Objects:              8.440e+02      1.12533   7.624e+02
Flops:                1.300e+08      1.08171   1.250e+08  3.999e+09
Flops/sec:            8.641e+06      1.08171   8.307e+06  2.658e+08
MPI Messages:         1.004e+04      2.37869   6.666e+03  2.133e+05
MPI Message Lengths:  1.880e+08      9.33983   4.255e+03  9.076e+08
MPI Reductions:       6.040e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg        %Total   counts   %Total
 0:      Main Stage: 1.4463e+01  96.1%  0.0000e+00   0.0%  3.925e+04  18.4%  3.698e+03      86.9%  1.250e+02  20.7%
 1:             FEM: 5.8202e-01   3.9%  3.9993e+09 100.0%  1.740e+05  81.6%  5.576e+02      13.1%  4.780e+02  79.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.1604e+0012.6 0.00e+00 0.0 9.3e+03 4.0e+00 4.4e+01 7 0 4 0 7 7 0 24 0 35 0 VecScatterBegin 2 1.0 3.6001e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 1.6928e-0517.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.0413e+00 1.1 0.00e+00 0.0 1.4e+04 5.1e+03 2.1e+01 14 0 7 8 3 14 0 36 9 17 0 Mesh Migration 2 1.0 4.3615e-01 1.0 0.00e+00 0.0 2.2e+04 2.9e+04 5.4e+01 3 0 10 70 9 3 0 56 80 43 0 DMPlexInterp 1 1.0 2.1172e+0064348.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.2697e+00 1.1 0.00e+00 0.0 9.4e+03 3.5e+04 2.5e+01 15 0 4 36 4 16 0 24 41 20 0 DMPlexDistCones 2 1.0 1.0138e-01 1.1 0.00e+00 0.0 3.2e+03 6.6e+04 4.0e+00 1 0 2 23 1 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.7745e-01 1.0 0.00e+00 0.0 1.3e+04 2.7e+04 2.2e+01 2 0 6 39 4 2 0 34 45 18 0 DMPlexDistribOL 1 1.0 2.2781e-01 1.0 0.00e+00 0.0 2.7e+04 1.6e+04 5.0e+01 1 0 13 47 8 2 0 68 54 40 0 DMPlexDistField 3 1.0 3.1382e-02 2.1 0.00e+00 0.0 4.3e+03 6.8e+03 1.2e+01 0 0 2 3 2 0 0 11 4 10 0 DMPlexDistData 2 1.0 1.0031e+0069.7 0.00e+00 0.0 1.0e+04 3.2e+03 6.0e+00 6 0 5 4 1 7 0 26 4 5 0 DMPlexStratify 6 1.5 5.4998e-0129.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 5.1585e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.1656e+00 4.3 0.00e+00 0.0 3.8e+04 2.0e+04 4.1e+01 7 0 18 84 7 7 0 96 97 33 0 SFBcastEnd 95 1.0 2.9936e-01 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 8.7662e-0318.1 0.00e+00 0.0 1.3e+03 1.3e+04 3.0e+00 0 0 1 2 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 8.8432e-03 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.0054e-0518.7 0.00e+00 0.0 1.4e+02 2.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.4091e-04 2.2 0.00e+00 0.0 1.4e+02 2.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.3969e-0331.2 0.00e+00 0.0 2.9e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 18 1.0 8.9464e-03 1.4 1.27e+07 1.1 0.0e+00 0.0e+00 1.8e+01 0 10 0 0 3 1 10 0 0 4 44405 VecNorm 19 1.0 3.4175e-03 2.4 1.41e+06 1.1 0.0e+00 0.0e+00 1.9e+01 0 1 0 0 3 0 1 0 0 4 12916 VecScale 266 1.0 1.2009e-03 1.2 1.39e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 35370 VecCopy 1 1.0 1.2088e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 244 1.0 2.0475e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 77 1.0 4.8566e-04 1.2 6.13e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 40100 VecAYPX 76 1.0 3.9840e-04 1.4 2.70e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 21526 VecMAXPY 19 1.0 4.6327e-03 1.4 1.40e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 1 11 0 0 0 94779 VecScatterBegin 527 1.0 9.9692e-03 1.6 0.00e+00 0.0 1.6e+05 5.3e+02 
0.0e+00 0 0 76 10 0 1 0 94 73 0 0 VecScatterEnd 527 1.0 1.0046e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecNormalize 19 1.0 3.9191e-03 2.0 2.11e+06 1.1 0.0e+00 0.0e+00 1.9e+01 0 2 0 0 3 0 2 0 0 4 16894 MatMult 199 1.0 3.2215e-02 1.1 2.07e+07 1.0 4.5e+04 8.3e+02 1.4e+02 0 16 21 4 24 5 16 26 31 30 20169 MatMultAdd 148 1.0 1.8383e-02 1.1 1.16e+07 1.1 9.8e+03 1.9e+03 0.0e+00 0 9 5 2 0 3 9 6 16 0 19751 MatSolve 38 1.0 9.1224e-03 1.2 5.99e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 1 5 0 0 0 20347 MatSOR 152 1.0 4.2917e-02 1.0 3.26e+07 1.1 9.7e+04 4.7e+02 0.0e+00 0 25 45 5 0 7 25 55 38 0 23694 MatLUFactorSym 1 1.0 4.6015e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 2.2459e-03 1.1 6.08e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8301 MatILUFactorSym 1 1.0 1.7860e-03 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 8.0991e-04 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.7459e-03 8.5 1.69e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3055 MatResidual 76 1.0 7.5746e-03 1.2 6.14e+06 1.1 3.2e+04 4.7e+02 0.0e+00 0 5 15 2 0 1 5 18 13 0 25052 MatAssemblyBegin 18 1.0 3.6287e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 0 0 0 0 3 0 MatAssemblyEnd 18 1.0 1.3054e-02 1.2 0.00e+00 0.0 7.0e+03 1.6e+02 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 24000 1.0 9.4720e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 16 0 0 0 0 0 MatGetRowIJ 2 1.0 5.1618e-04108.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 1.8001e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 1.2710e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 6.5613e-04 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.9316e-01 1.0 0.00e+00 0.0 5.4e+02 3.6e+02 1.2e+01 1 0 0 0 2 33 0 0 0 3 0 MatMatMult 1 1.0 1.2574e-02 1.0 3.11e+05 1.0 1.1e+03 1.2e+03 1.6e+01 0 0 1 0 3 2 0 1 1 3 787 MatMatMultSym 1 1.0 1.1223e-02 1.0 0.00e+00 0.0 9.4e+02 9.4e+02 1.4e+01 0 0 0 0 2 2 0 1 1 3 0 MatMatMultNum 1 1.0 1.3480e-03 1.0 3.11e+05 1.0 1.4e+02 2.9e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 7345 MatRedundantMat 1 1.0 2.0981e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 1.5619e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 7.3791e-04 2.4 0.00e+00 0.0 5.4e+02 2.0e+03 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 2.6482e-01 1.0 3.11e+06 1.0 1.2e+04 1.3e+03 2.2e+02 2 2 6 2 36 45 2 7 13 46 371 PCSetUpOnBlocks 19 1.0 4.1590e-03 1.4 5.87e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 4318 PCApply 19 1.0 1.3506e-01 1.0 4.98e+07 1.0 1.6e+05 4.3e+02 1.6e+02 1 39 75 8 26 23 39 92 58 33 11590 KSPGMRESOrthog 18 1.0 1.2306e-02 1.3 2.54e+07 1.1 0.0e+00 0.0e+00 1.8e+01 0 20 0 0 3 2 20 0 0 4 64564 KSPSetUp 10 1.0 7.0786e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 3.7938e-01 1.0 9.02e+07 1.0 1.7e+05 5.7e+02 4.0e+02 3 71 81 11 66 65 71 99 82 83 7463 SFBcastBegin 1 1.0 1.4801e-03 9.9 0.00e+00 0.0 8.6e+02 2.2e+03 1.0e+00 0 0 0 0 0 0 0 0 2 0 0 SFBcastEnd 1 1.0 4.5705e-0426.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 165 170 23655892 0. IS L to G Mapping 3 3 14326164 0. Section 70 53 35616 0. Vector 15 87 14178840 0. Vector Scatter 2 13 857984 0. Matrix 0 29 5131328 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 69996 0. IS L to G Mapping 4 0 0 0. Vector 327 243 5282368 0. Vector Scatter 19 2 2192 0. Matrix 50 8 3242072 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 6.19888e-07 Average time for MPI_Barrier(): 9.01222e-06 Average time for zero size MPI_Send(): 1.65403e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 40: solving... ((27890, 1161600), (27890, 1161600)) Solver time: 5.010350e-01 Solver iterations: 18 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 40 processors, by jychang48 Wed Mar 2 17:42:13 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.486e+01 1.00013 1.486e+01 Objects: 8.660e+02 1.15775 7.635e+02 Flops: 1.061e+08 1.10163 1.011e+08 4.043e+09 Flops/sec: 7.138e+06 1.10159 6.800e+06 2.720e+08 MPI Messages: 1.235e+04 2.82415 7.572e+03 3.029e+05 MPI Message Lengths: 1.855e+08 11.36833 3.104e+03 9.402e+08 MPI Reductions: 6.040e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4362e+01 96.6% 0.0000e+00 0.0% 5.371e+04 17.7% 2.676e+03 86.2% 1.250e+02 20.7% 1: FEM: 5.0110e-01 3.4% 4.0430e+09 100.0% 2.492e+05 82.3% 4.278e+02 13.8% 4.780e+02 79.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
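The 2N and 8N figures in the flop counting convention above follow directly from applying y <- alpha*x + y entrywise; writing it out,

    \[
    y_i \leftarrow \alpha x_i + y_i,\quad i=1,\dots,N:\qquad
    N\ \text{mult} + N\ \text{add} = 2N \ \text{(real)},\qquad
    6N\ (\alpha x_i) + 2N\ (+\,y_i) = 8N \ \text{(complex)},
    \]

since one complex multiply costs 4 real multiplies plus 2 real adds, and one complex add costs 2 real adds.
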
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.2159e+0011.9 0.00e+00 0.0 1.3e+04 4.0e+00 4.4e+01 7 0 4 0 7 8 0 24 0 35 0 VecScatterBegin 2 1.0 3.5763e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 7.8678e-06 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1240e+00 1.1 0.00e+00 0.0 2.0e+04 3.6e+03 2.1e+01 14 0 7 8 3 15 0 38 9 17 0 Mesh Migration 2 1.0 4.1766e-01 1.0 0.00e+00 0.0 2.9e+04 2.2e+04 5.4e+01 3 0 10 69 9 3 0 54 80 43 0 DMPlexInterp 1 1.0 2.1100e+0061887.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.3591e+00 1.1 0.00e+00 0.0 1.4e+04 2.3e+04 2.5e+01 16 0 5 35 4 16 0 26 40 20 0 DMPlexDistCones 2 1.0 9.5764e-02 1.1 0.00e+00 0.0 4.3e+03 5.1e+04 4.0e+00 1 0 1 23 1 1 0 8 27 3 0 DMPlexDistLabels 2 1.0 2.6905e-01 1.0 0.00e+00 0.0 1.7e+04 2.1e+04 2.2e+01 2 0 6 39 4 2 0 32 45 18 0 DMPlexDistribOL 1 1.0 2.0288e-01 1.0 0.00e+00 0.0 3.6e+04 1.2e+04 5.0e+01 1 0 12 47 8 1 0 66 55 40 0 DMPlexDistField 3 1.0 2.9222e-02 2.1 0.00e+00 0.0 5.7e+03 5.2e+03 1.2e+01 0 0 2 3 2 0 0 11 4 10 0 DMPlexDistData 2 1.0 1.0574e+0077.8 0.00e+00 0.0 1.5e+04 2.2e+03 6.0e+00 7 0 5 4 1 7 0 28 4 5 0 DMPlexStratify 6 1.5 5.4311e-0135.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 4.2634e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.2205e+00 4.4 0.00e+00 0.0 5.1e+04 1.5e+04 4.1e+01 8 0 17 83 7 8 0 96 97 33 0 SFBcastEnd 95 1.0 3.0737e-01 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.1023e-0319.9 0.00e+00 0.0 1.7e+03 9.4e+03 3.0e+00 0 0 1 2 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.7229e-03 7.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 4.1008e-0521.5 0.00e+00 0.0 1.9e+02 1.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.1396e-04 2.1 0.00e+00 0.0 1.9e+02 1.8e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.7800e-0332.3 0.00e+00 0.0 3.8e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 18 1.0 7.3071e-03 1.4 1.02e+07 1.1 0.0e+00 0.0e+00 1.8e+01 0 10 0 0 3 1 10 0 0 4 54367 VecNorm 19 1.0 1.8599e-03 1.8 1.13e+06 1.1 0.0e+00 0.0e+00 1.9e+01 0 1 0 0 3 0 1 0 0 4 23733 VecScale 266 1.0 1.0359e-03 1.2 1.14e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 41655 VecCopy 1 1.0 1.0204e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 244 1.0 1.7505e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 77 1.0 4.1318e-04 1.3 4.91e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 47153 VecAYPX 76 1.0 3.1829e-04 1.3 2.16e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 26956 VecMAXPY 19 1.0 2.7597e-03 1.2 1.12e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 159106 VecScatterBegin 527 1.0 9.4907e-03 1.8 0.00e+00 0.0 2.3e+05 4.2e+02 
0.0e+00 0 0 77 10 0 1 0 94 75 0 0 VecScatterEnd 527 1.0 1.1055e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 VecNormalize 19 1.0 2.2883e-03 1.5 1.70e+06 1.1 0.0e+00 0.0e+00 1.9e+01 0 2 0 0 3 0 2 0 0 4 28934 MatMult 199 1.0 2.8809e-02 1.1 1.66e+07 1.0 6.2e+04 6.6e+02 1.4e+02 0 16 21 4 24 6 16 25 32 30 22581 MatMultAdd 148 1.0 1.5773e-02 1.2 9.31e+06 1.1 1.3e+04 1.6e+03 0.0e+00 0 9 4 2 0 3 9 5 16 0 23019 MatSolve 38 1.0 7.4098e-03 1.1 4.82e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 1 5 0 0 0 25130 MatSOR 152 1.0 3.6705e-02 1.0 2.62e+07 1.1 1.4e+05 3.7e+02 0.0e+00 0 25 45 5 0 7 25 55 39 0 27750 MatLUFactorSym 1 1.0 5.2929e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.8580e-03 1.1 5.15e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 10544 MatILUFactorSym 1 1.0 1.2209e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 6.3920e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 2.2421e-0315.3 1.35e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2378 MatResidual 76 1.0 6.9928e-03 1.2 4.96e+06 1.1 4.6e+04 3.7e+02 0.0e+00 0 5 15 2 0 1 5 18 13 0 27255 MatAssemblyBegin 18 1.0 3.5076e-0310.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 0 0 0 0 3 0 MatAssemblyEnd 18 1.0 1.1627e-02 1.2 0.00e+00 0.0 1.0e+04 1.2e+02 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 19200 1.0 7.6632e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 15 0 0 0 0 0 MatGetRowIJ 2 1.0 5.0902e-04101.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 2.1482e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 1.1673e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 6.2823e-04 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.5566e-01 1.0 0.00e+00 0.0 7.2e+02 3.0e+02 1.2e+01 1 0 0 0 2 31 0 0 0 3 0 MatMatMult 1 1.0 1.0531e-02 1.0 2.49e+05 1.0 1.4e+03 9.9e+02 1.6e+01 0 0 0 0 3 2 0 1 1 3 940 MatMatMultSym 1 1.0 9.4621e-03 1.0 0.00e+00 0.0 1.3e+03 7.8e+02 1.4e+01 0 0 0 0 2 2 0 1 1 3 0 MatMatMultNum 1 1.0 1.0662e-03 1.0 2.49e+05 1.0 1.8e+02 2.4e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 9284 MatRedundantMat 1 1.0 2.4199e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 1.2550e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 5.9271e-04 2.9 0.00e+00 0.0 7.2e+02 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 2.1873e-01 1.0 2.52e+06 1.0 1.7e+04 9.5e+02 2.2e+02 1 2 6 2 36 44 2 7 13 46 454 PCSetUpOnBlocks 19 1.0 3.0646e-03 1.3 4.73e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 5846 PCApply 19 1.0 1.1932e-01 1.0 4.00e+07 1.0 2.3e+05 3.3e+02 1.6e+02 1 39 76 8 26 24 39 93 59 33 13154 KSPGMRESOrthog 18 1.0 9.8231e-03 1.3 2.03e+07 1.1 0.0e+00 0.0e+00 1.8e+01 0 20 0 0 3 2 20 0 0 4 80884 KSPSetUp 10 1.0 6.3419e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 3.1758e-01 1.0 7.21e+07 1.0 2.5e+05 4.4e+02 4.0e+02 2 70 81 12 66 63 70 99 84 83 8928 SFBcastBegin 1 1.0 1.8940e-0310.1 0.00e+00 0.0 1.2e+03 1.8e+03 1.0e+00 0 0 0 0 0 0 0 0 2 0 0 SFBcastEnd 1 1.0 6.5684e-0443.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 187 192 23335476 0. IS L to G Mapping 3 3 14094724 0. Section 70 53 35616 0. Vector 15 87 12319944 0. Vector Scatter 2 13 683624 0. Matrix 0 29 4148000 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 68468 0. IS L to G Mapping 4 0 0 0. Vector 327 243 4309184 0. Vector Scatter 19 2 2192 0. Matrix 50 8 2592528 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 9.01222e-06 Average time for zero size MPI_Send(): 1.40071e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 48: solving... ((23365, 1161600), (23365, 1161600)) Solver time: 4.549189e-01 Solver iterations: 18 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 48 processors, by jychang48 Wed Mar 2 17:42:31 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.494e+01 1.00013 1.494e+01 Objects: 8.740e+02 1.16845 7.631e+02 Flops: 8.941e+07 1.11020 8.501e+07 4.081e+09 Flops/sec: 5.986e+06 1.11026 5.692e+06 2.732e+08 MPI Messages: 1.241e+04 2.76359 8.136e+03 3.905e+05 MPI Message Lengths: 1.820e+08 13.08995 2.475e+03 9.666e+08 MPI Reductions: 6.040e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4481e+01 97.0% 0.0000e+00 0.0% 6.669e+04 17.1% 2.121e+03 85.7% 1.250e+02 20.7% 1: FEM: 4.5505e-01 3.0% 4.0806e+09 100.0% 3.239e+05 82.9% 3.544e+02 14.3% 4.780e+02 79.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
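To read the last column: per the formula in the header ("Total Mflop/s: 10e-6 * ...", i.e. a factor of 10^-6), the rate is the event's global flop count divided by its maximum time. A rough check against the 40-process KSPSolve row above (the percentages are rounded, so only two or three digits can be expected):

    \[
    \frac{0.70 \times 4.043\times 10^{9}\ \text{flops}}{3.1758\times 10^{-1}\ \text{s}} \times 10^{-6} \approx 8.9\times 10^{3}\ \text{Mflop/s},
    \]

consistent with the reported 8928.
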
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.2041e+0013.4 0.00e+00 0.0 1.6e+04 4.0e+00 4.4e+01 7 0 4 0 7 8 0 24 0 35 0 VecScatterBegin 2 1.0 3.1948e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 9.0599e-06 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1464e+00 1.1 0.00e+00 0.0 2.7e+04 2.8e+03 2.1e+01 14 0 7 8 3 15 0 40 9 17 0 Mesh Migration 2 1.0 4.0869e-01 1.0 0.00e+00 0.0 3.4e+04 1.9e+04 5.4e+01 3 0 9 68 9 3 0 51 80 43 0 DMPlexInterp 1 1.0 2.1102e+0064135.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.4017e+00 1.1 0.00e+00 0.0 1.9e+04 1.7e+04 2.5e+01 16 0 5 34 4 17 0 29 40 20 0 DMPlexDistCones 2 1.0 9.4518e-02 1.2 0.00e+00 0.0 5.1e+03 4.3e+04 4.0e+00 1 0 1 23 1 1 0 8 26 3 0 DMPlexDistLabels 2 1.0 2.6648e-01 1.0 0.00e+00 0.0 2.1e+04 1.8e+04 2.2e+01 2 0 5 39 4 2 0 31 45 18 0 DMPlexDistribOL 1 1.0 1.7164e-01 1.1 0.00e+00 0.0 4.2e+04 1.1e+04 5.0e+01 1 0 11 47 8 1 0 64 55 40 0 DMPlexDistField 3 1.0 2.8804e-02 2.2 0.00e+00 0.0 6.8e+03 4.5e+03 1.2e+01 0 0 2 3 2 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0458e+0067.3 0.00e+00 0.0 2.1e+04 1.7e+03 6.0e+00 7 0 5 4 1 7 0 31 4 5 0 DMPlexStratify 6 1.5 5.3865e-0141.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 3.6523e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.2136e+00 4.5 0.00e+00 0.0 6.4e+04 1.3e+04 4.1e+01 8 0 16 83 7 8 0 96 97 33 0 SFBcastEnd 95 1.0 3.0116e-01 6.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.5670e-0323.0 0.00e+00 0.0 2.0e+03 8.2e+03 3.0e+00 0 0 1 2 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.3460e-03 7.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.5048e-0518.4 0.00e+00 0.0 2.2e+02 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.1182e-04 2.5 0.00e+00 0.0 2.2e+02 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 3.1829e-0358.8 0.00e+00 0.0 4.6e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 18 1.0 6.1665e-03 1.5 8.50e+06 1.1 0.0e+00 0.0e+00 1.8e+01 0 10 0 0 3 1 10 0 0 4 64423 VecNorm 19 1.0 1.2932e-03 1.5 9.45e+05 1.1 0.0e+00 0.0e+00 1.9e+01 0 1 0 0 3 0 1 0 0 4 34133 VecScale 266 1.0 9.3555e-04 1.2 9.80e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 46730 VecCopy 1 1.0 7.7009e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 244 1.0 1.3967e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 77 1.0 3.7766e-04 1.4 4.09e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 51622 VecAYPX 76 1.0 3.3188e-04 1.6 1.80e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 25871 VecMAXPY 19 1.0 2.2688e-03 1.2 9.40e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 193532 VecScatterBegin 527 1.0 9.3722e-03 1.9 0.00e+00 0.0 3.0e+05 3.5e+02 
0.0e+00 0 0 78 11 0 2 0 94 76 0 0 VecScatterEnd 527 1.0 1.1200e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 VecNormalize 19 1.0 1.7309e-03 1.3 1.42e+06 1.1 0.0e+00 0.0e+00 1.9e+01 0 2 0 0 3 0 2 0 0 4 38252 MatMult 199 1.0 2.5053e-02 1.1 1.40e+07 1.1 7.9e+04 5.7e+02 1.4e+02 0 16 20 5 24 5 16 24 32 30 25999 MatMultAdd 148 1.0 1.3936e-02 1.2 7.82e+06 1.1 1.6e+04 1.4e+03 0.0e+00 0 9 4 2 0 3 9 5 16 0 26055 MatSolve 38 1.0 6.1853e-03 1.1 4.06e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 1 5 0 0 0 30315 MatSOR 152 1.0 3.2173e-02 1.1 2.18e+07 1.1 1.8e+05 3.1e+02 0.0e+00 0 25 45 6 0 7 25 54 40 0 31732 MatLUFactorSym 1 1.0 6.9857e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.5531e-03 1.1 4.69e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 13764 MatILUFactorSym 1 1.0 6.4588e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 5.5194e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.5218e-0312.3 1.13e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3504 MatResidual 76 1.0 6.3055e-03 1.2 4.14e+06 1.1 5.9e+04 3.1e+02 0.0e+00 0 5 15 2 0 1 5 18 13 0 30359 MatAssemblyBegin 18 1.0 3.1672e-03 7.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 0 0 0 0 3 0 MatAssemblyEnd 18 1.0 1.0310e-02 1.2 0.00e+00 0.0 1.3e+04 1.0e+02 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 16000 1.0 6.3519e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 14 0 0 0 0 0 MatGetRowIJ 2 1.0 4.5180e-0475.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 2.7895e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 9.4771e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 5.5814e-04 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.2936e-01 1.0 0.00e+00 0.0 8.6e+02 2.7e+02 1.2e+01 1 0 0 0 2 28 0 0 0 3 0 MatMatMult 1 1.0 9.1429e-03 1.0 2.08e+05 1.0 1.7e+03 8.9e+02 1.6e+01 0 0 0 0 3 2 0 1 1 3 1083 MatMatMultSym 1 1.0 8.2440e-03 1.0 0.00e+00 0.0 1.5e+03 7.0e+02 1.4e+01 0 0 0 0 2 2 0 0 1 3 0 MatMatMultNum 1 1.0 8.9502e-04 1.0 2.08e+05 1.0 2.2e+02 2.2e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 11060 MatRedundantMat 1 1.0 3.0804e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 1.0741e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 5.8627e-04 3.3 0.00e+00 0.0 8.6e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 1.8617e-01 1.0 2.14e+06 1.0 2.3e+04 7.5e+02 2.2e+02 1 2 6 2 36 41 2 7 12 46 543 PCSetUpOnBlocks 19 1.0 2.2240e-03 1.1 3.97e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8045 PCApply 19 1.0 1.0738e-01 1.0 3.36e+07 1.0 3.0e+05 2.8e+02 1.6e+02 1 39 77 9 26 23 39 93 61 33 14674 KSPGMRESOrthog 18 1.0 8.1601e-03 1.3 1.70e+07 1.1 0.0e+00 0.0e+00 1.8e+01 0 19 0 0 3 2 19 0 0 4 97367 KSPSetUp 10 1.0 5.8508e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 2.7342e-01 1.0 6.07e+07 1.0 3.2e+05 3.6e+02 4.0e+02 2 70 82 12 66 60 70 99 84 83 10394 SFBcastBegin 1 1.0 3.2558e-0320.1 0.00e+00 0.0 1.4e+03 1.7e+03 1.0e+00 0 0 0 0 0 0 0 0 2 0 0 SFBcastEnd 1 1.0 2.1031e-03163.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 195 200 22896544 0. IS L to G Mapping 3 3 14076928 0. Section 70 53 35616 0. Vector 15 87 11144984 0. Vector Scatter 2 13 575024 0. Matrix 0 29 3506540 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 60332 0. IS L to G Mapping 4 0 0 0. Vector 327 243 3661024 0. Vector Scatter 19 2 2192 0. Matrix 50 8 2166428 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 9.63211e-06 Average time for zero size MPI_Send(): 1.35601e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 56: solving... ((20104, 1161600), (20104, 1161600)) Solver time: 4.187219e-01 Solver iterations: 18 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 56 processors, by jychang48 Wed Mar 2 17:42:50 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.527e+01 1.00017 1.527e+01 Objects: 8.920e+02 1.18933 7.639e+02 Flops: 7.757e+07 1.12535 7.363e+07 4.123e+09 Flops/sec: 5.079e+06 1.12536 4.821e+06 2.700e+08 MPI Messages: 1.388e+04 2.76367 8.987e+03 5.033e+05 MPI Message Lengths: 1.801e+08 15.41287 1.981e+03 9.969e+08 MPI Reductions: 6.040e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4853e+01 97.3% 0.0000e+00 0.0% 8.318e+04 16.5% 1.686e+03 85.1% 1.250e+02 20.7% 1: FEM: 4.1873e-01 2.7% 4.1231e+09 100.0% 4.201e+05 83.5% 2.951e+02 14.9% 4.780e+02 79.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
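The "#PETSc Option Table entries" block repeated in each summary fully specifies the solver: GMRES with rtol 1e-7, preconditioned by a Schur-complement fieldsplit (upper factorization, "selfp" approximation of the Schur block), block Jacobi on split 0 and ML on split 1, all under the "solver_" prefix. A petsc4py sketch of supplying the same options programmatically (illustrative only -- Darcy_FE.py is not shown, it may simply pass these on the command line as logged, and defining the velocity/pressure splits on the PC is omitted here):

    from petsc4py import PETSc

    # mirror the "#PETSc Option Table entries" above (prefix "solver_")
    opts = PETSc.Options()
    opts["solver_ksp_type"] = "gmres"
    opts["solver_ksp_rtol"] = 1.0e-7
    opts["solver_pc_type"] = "fieldsplit"
    opts["solver_pc_fieldsplit_type"] = "schur"
    opts["solver_pc_fieldsplit_schur_fact_type"] = "upper"
    opts["solver_pc_fieldsplit_schur_precondition"] = "selfp"
    opts["solver_fieldsplit_0_ksp_type"] = "preonly"
    opts["solver_fieldsplit_0_pc_type"] = "bjacobi"
    opts["solver_fieldsplit_1_ksp_type"] = "preonly"
    opts["solver_fieldsplit_1_pc_type"] = "ml"

    ksp = PETSc.KSP().create()
    ksp.setOptionsPrefix("solver_")   # matches the option prefix above
    # ksp.setOperators(A)             # operator and field ISes not shown
    ksp.setFromOptions()
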
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.2200e+0012.6 0.00e+00 0.0 2.0e+04 4.0e+00 4.4e+01 7 0 4 0 7 8 0 24 0 35 0 VecScatterBegin 2 1.0 2.7418e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 9.0599e-06 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.1967e+00 1.1 0.00e+00 0.0 3.5e+04 2.2e+03 2.1e+01 14 0 7 8 3 15 0 42 9 17 0 Mesh Migration 2 1.0 4.0311e-01 1.0 0.00e+00 0.0 4.1e+04 1.6e+04 5.4e+01 3 0 8 68 9 3 0 50 79 43 0 DMPlexInterp 1 1.0 2.1118e+0061941.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.4639e+00 1.1 0.00e+00 0.0 2.6e+04 1.3e+04 2.5e+01 16 0 5 33 4 17 0 31 39 20 0 DMPlexDistCones 2 1.0 9.4431e-02 1.2 0.00e+00 0.0 6.2e+03 3.6e+04 4.0e+00 1 0 1 22 1 1 0 7 26 3 0 DMPlexDistLabels 2 1.0 2.6321e-01 1.0 0.00e+00 0.0 2.5e+04 1.5e+04 2.2e+01 2 0 5 38 4 2 0 30 45 18 0 DMPlexDistribOL 1 1.0 1.5705e-01 1.1 0.00e+00 0.0 5.1e+04 9.2e+03 5.0e+01 1 0 10 47 8 1 0 62 56 40 0 DMPlexDistField 3 1.0 3.1058e-02 2.4 0.00e+00 0.0 8.3e+03 3.8e+03 1.2e+01 0 0 2 3 2 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0646e+0054.2 0.00e+00 0.0 2.8e+04 1.3e+03 6.0e+00 7 0 6 4 1 7 0 33 4 5 0 DMPlexStratify 6 1.5 5.4248e-0150.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 3.2736e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.2264e+00 4.1 0.00e+00 0.0 8.0e+04 1.0e+04 4.1e+01 8 0 16 82 7 8 0 96 97 33 0 SFBcastEnd 95 1.0 2.9831e-01 9.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.1031e-0324.5 0.00e+00 0.0 2.5e+03 6.8e+03 3.0e+00 0 0 0 2 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 9.3877e-03 7.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.6001e-0518.9 0.00e+00 0.0 2.7e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.2803e-04 3.4 0.00e+00 0.0 2.7e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.4439e-0329.0 0.00e+00 0.0 5.6e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 18 1.0 6.0709e-03 1.5 7.29e+06 1.1 0.0e+00 0.0e+00 1.8e+01 0 10 0 0 3 1 10 0 0 4 65437 VecNorm 19 1.0 1.3041e-03 1.4 8.10e+05 1.1 0.0e+00 0.0e+00 1.9e+01 0 1 0 0 3 0 1 0 0 4 33846 VecScale 266 1.0 8.6451e-04 1.2 8.50e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 51332 VecCopy 1 1.0 7.3910e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 244 1.0 1.2527e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 77 1.0 3.4428e-04 1.5 3.51e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 56644 VecAYPX 76 1.0 2.8181e-04 1.5 1.54e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 30478 VecMAXPY 19 1.0 1.9228e-03 1.2 8.05e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 228352 VecScatterBegin 527 1.0 9.7260e-03 2.0 0.00e+00 0.0 3.9e+05 2.9e+02 
0.0e+00 0 0 78 12 0 2 0 94 78 0 0 VecScatterEnd 527 1.0 1.1311e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 VecNormalize 19 1.0 1.6813e-03 1.3 1.21e+06 1.1 0.0e+00 0.0e+00 1.9e+01 0 2 0 0 3 0 2 0 0 4 39380 MatMult 199 1.0 2.4066e-02 1.1 1.19e+07 1.1 1.0e+05 4.8e+02 1.4e+02 0 16 20 5 24 6 16 24 33 30 27088 MatMultAdd 148 1.0 1.2908e-02 1.2 6.69e+06 1.1 1.9e+04 1.3e+03 0.0e+00 0 9 4 2 0 3 9 5 16 0 28129 MatSolve 38 1.0 5.4102e-03 1.1 3.52e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 1 5 0 0 0 35022 MatSOR 152 1.0 3.0248e-02 1.1 1.88e+07 1.1 2.3e+05 2.6e+02 0.0e+00 0 25 45 6 0 7 25 54 41 0 33772 MatLUFactorSym 1 1.0 1.0395e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.3649e-03 1.1 4.56e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 17818 MatILUFactorSym 1 1.0 5.8317e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 5.0116e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 2.1529e-0319.4 9.68e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2476 MatResidual 76 1.0 6.1171e-03 1.2 3.58e+06 1.1 7.6e+04 2.6e+02 0.0e+00 0 5 15 2 0 1 5 18 14 0 31389 MatAssemblyBegin 18 1.0 3.7098e-03 7.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 1 0 0 0 3 0 MatAssemblyEnd 18 1.0 1.0228e-02 1.2 0.00e+00 0.0 1.7e+04 8.6e+01 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 13716 1.0 5.4721e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 MatGetRowIJ 2 1.0 5.5599e-0493.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 3.1495e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 8.2636e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 6.3801e-04 8.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.1296e-01 1.0 0.00e+00 0.0 1.0e+03 2.4e+02 1.2e+01 1 0 0 0 2 27 0 0 0 3 0 MatMatMult 1 1.0 8.5900e-03 1.0 1.78e+05 1.0 2.1e+03 8.0e+02 1.6e+01 0 0 0 0 3 2 0 0 1 3 1152 MatMatMultSym 1 1.0 7.7760e-03 1.0 0.00e+00 0.0 1.8e+03 6.3e+02 1.4e+01 0 0 0 0 2 2 0 0 1 3 0 MatMatMultNum 1 1.0 8.0490e-04 1.0 1.78e+05 1.0 2.6e+02 2.0e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 12296 MatRedundantMat 1 1.0 3.4714e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 9.2196e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 4.9806e-04 2.9 0.00e+00 0.0 1.0e+03 1.4e+03 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 1.6844e-01 1.0 1.89e+06 1.0 2.9e+04 6.0e+02 2.2e+02 1 3 6 2 36 40 3 7 12 46 617 PCSetUpOnBlocks 19 1.0 1.9562e-03 1.1 3.40e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9125 PCApply 19 1.0 1.0325e-01 1.0 2.89e+07 1.0 3.9e+05 2.3e+02 1.6e+02 1 38 78 9 26 24 38 94 62 33 15320 KSPGMRESOrthog 18 1.0 7.8821e-03 1.3 1.46e+07 1.1 0.0e+00 0.0e+00 1.8e+01 0 19 0 0 3 2 19 0 0 4 100801 KSPSetUp 10 1.0 5.4169e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 2.5057e-01 1.0 5.19e+07 1.0 4.2e+05 3.0e+02 4.0e+02 2 69 83 13 66 60 69 99 85 83 11366 SFBcastBegin 1 1.0 1.5159e-03 8.7 0.00e+00 0.0 1.7e+03 1.5e+03 1.0e+00 0 0 0 0 0 0 0 0 2 0 0 SFBcastEnd 1 1.0 4.6587e-0431.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 213 218 22660336 0. IS L to G Mapping 3 3 12801292 0. Section 70 53 35616 0. Vector 15 87 10301560 0. Vector Scatter 2 13 496760 0. Matrix 0 29 3056328 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 55528 0. IS L to G Mapping 4 0 0 0. Vector 327 243 3202296 0. Vector Scatter 19 2 2192 0. Matrix 50 8 1858040 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 6.19888e-07 Average time for MPI_Barrier(): 9.39369e-06 Average time for zero size MPI_Send(): 1.51566e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- ================= ml 40 1 ================= Discretization: RT MPI processes 64: solving... ((17544, 1161600), (17544, 1161600)) Solver time: 3.889170e-01 Solver iterations: 18 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Darcy_FE.py on a arch-linux2-c-opt named wf153.localdomain with 64 processors, by jychang48 Wed Mar 2 17:43:09 2016 Using Petsc Development GIT revision: v3.6.3-1924-ge972cd5 GIT Date: 2016-01-01 10:01:13 -0600 Max Max/Min Avg Total Time (sec): 1.513e+01 1.00019 1.513e+01 Objects: 9.060e+02 1.21123 7.647e+02 Flops: 6.855e+07 1.13507 6.507e+07 4.164e+09 Flops/sec: 4.532e+06 1.13513 4.302e+06 2.753e+08 MPI Messages: 1.488e+04 2.83578 9.700e+03 6.208e+05 MPI Message Lengths: 1.790e+08 17.44347 1.649e+03 1.024e+09 MPI Reductions: 6.040e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4737e+01 97.4% 0.0000e+00 0.0% 1.007e+05 16.2% 1.395e+03 84.6% 1.250e+02 20.7% 1: FEM: 3.8892e-01 2.6% 4.1645e+09 100.0% 5.201e+05 83.8% 2.538e+02 15.4% 4.780e+02 79.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 44 1.0 1.2409e+0012.9 0.00e+00 0.0 2.4e+04 4.0e+00 4.4e+01 8 0 4 0 7 8 0 24 0 35 0 VecScatterBegin 2 1.0 1.8835e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterEnd 2 1.0 1.0729e-05 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh Partition 2 1.0 2.2316e+00 1.1 0.00e+00 0.0 4.4e+04 1.8e+03 2.1e+01 15 0 7 8 3 15 0 44 9 17 0 Mesh Migration 2 1.0 3.9511e-01 1.0 0.00e+00 0.0 4.9e+04 1.4e+04 5.4e+01 3 0 8 67 9 3 0 48 79 43 0 DMPlexInterp 1 1.0 2.1213e+0062219.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexDistribute 1 1.0 2.5055e+00 1.1 0.00e+00 0.0 3.3e+04 1.0e+04 2.5e+01 17 0 5 32 4 17 0 33 38 20 0 DMPlexDistCones 2 1.0 9.2890e-02 1.2 0.00e+00 0.0 7.2e+03 3.1e+04 4.0e+00 1 0 1 22 1 1 0 7 26 3 0 DMPlexDistLabels 2 1.0 2.5924e-01 1.0 0.00e+00 0.0 2.9e+04 1.3e+04 2.2e+01 2 0 5 38 4 2 0 29 45 18 0 DMPlexDistribOL 1 1.0 1.4374e-01 1.1 0.00e+00 0.0 6.1e+04 8.0e+03 5.0e+01 1 0 10 47 8 1 0 60 56 40 0 DMPlexDistField 3 1.0 3.1577e-02 2.4 0.00e+00 0.0 9.7e+03 3.4e+03 1.2e+01 0 0 2 3 2 0 0 10 4 10 0 DMPlexDistData 2 1.0 1.0807e+0052.5 0.00e+00 0.0 3.5e+04 1.0e+03 6.0e+00 7 0 6 4 1 7 0 35 4 5 0 DMPlexStratify 6 1.5 5.4088e-0157.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 51 1.0 2.7965e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 95 1.0 1.2455e+00 4.1 0.00e+00 0.0 9.7e+04 8.7e+03 4.1e+01 8 0 16 82 7 8 0 96 97 33 0 SFBcastEnd 95 1.0 3.0530e-0110.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 SFReduceBegin 4 1.0 9.7501e-0323.5 0.00e+00 0.0 2.9e+03 5.8e+03 3.0e+00 0 0 0 2 0 0 0 3 2 2 0 SFReduceEnd 4 1.0 1.0731e-02 8.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpBegin 1 1.0 3.3140e-0517.4 0.00e+00 0.0 3.2e+02 1.3e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFFetchOpEnd 1 1.0 1.0800e-04 2.5 0.00e+00 0.0 3.2e+02 1.3e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: FEM BuildTwoSided 1 1.0 1.4801e-0327.5 0.00e+00 0.0 6.5e+02 4.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 18 1.0 5.7650e-03 1.6 6.39e+06 1.1 0.0e+00 0.0e+00 1.8e+01 0 10 0 0 3 1 10 0 0 4 68909 VecNorm 19 1.0 1.2131e-03 1.4 7.10e+05 1.1 0.0e+00 0.0e+00 1.9e+01 0 1 0 0 3 0 1 0 0 4 36388 VecScale 266 1.0 8.2517e-04 1.3 7.55e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 54483 VecCopy 1 1.0 6.3181e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 244 1.0 1.0672e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 77 1.0 3.1257e-04 1.5 3.08e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 62425 VecAYPX 76 1.0 2.6488e-04 1.5 1.35e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 32446 VecMAXPY 19 1.0 1.6687e-03 1.1 7.06e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 263131 VecScatterBegin 527 1.0 9.8925e-03 2.1 0.00e+00 0.0 4.9e+05 2.5e+02 
0.0e+00 0 0 78 12 0 2 0 93 79 0 0 VecScatterEnd 527 1.0 1.3103e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 VecNormalize 19 1.0 1.5395e-03 1.3 1.06e+06 1.1 0.0e+00 0.0e+00 1.9e+01 0 2 0 0 3 0 2 0 0 4 43009 MatMult 199 1.0 2.2541e-02 1.1 1.04e+07 1.0 1.2e+05 4.3e+02 1.4e+02 0 16 20 5 24 6 16 23 33 30 28955 MatMultAdd 148 1.0 1.1942e-02 1.2 5.84e+06 1.1 2.2e+04 1.1e+03 0.0e+00 0 9 4 2 0 3 9 4 16 0 30407 MatSolve 38 1.0 4.7653e-03 1.1 3.14e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 1 5 0 0 0 40351 MatSOR 152 1.0 2.8343e-02 1.1 1.65e+07 1.1 2.8e+05 2.3e+02 0.0e+00 0 25 45 6 0 7 25 54 41 0 36104 MatLUFactorSym 1 1.0 1.1396e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 1.2221e-03 1.1 4.69e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 23627 MatILUFactorSym 1 1.0 5.0211e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 2 1.0 4.3702e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 2 1.0 1.7388e-0318.5 8.46e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3066 MatResidual 76 1.0 6.0871e-03 1.2 3.14e+06 1.1 9.4e+04 2.3e+02 0.0e+00 0 5 15 2 0 1 5 18 14 0 31668 MatAssemblyBegin 18 1.0 3.1993e-03 6.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 1 0 0 0 3 0 MatAssemblyEnd 18 1.0 8.8515e-03 1.2 0.00e+00 0.0 2.2e+04 7.4e+01 8.0e+01 0 0 3 0 13 2 0 4 1 17 0 MatGetRow 12000 1.0 4.8211e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 12 0 0 0 0 0 MatGetRowIJ 2 1.0 5.0211e-0484.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 4.3511e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetSubMatrix 4 1.0 7.3576e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 2 1.0 5.8365e-04 7.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 9.8725e-02 1.0 0.00e+00 0.0 1.2e+03 2.2e+02 1.2e+01 1 0 0 0 2 25 0 0 0 3 0 MatMatMult 1 1.0 7.8750e-03 1.0 1.56e+05 1.0 2.4e+03 7.3e+02 1.6e+01 0 0 0 0 3 2 0 0 1 3 1257 MatMatMultSym 1 1.0 7.1521e-03 1.0 0.00e+00 0.0 2.1e+03 5.8e+02 1.4e+01 0 0 0 0 2 2 0 0 1 3 0 MatMatMultNum 1 1.0 7.3504e-04 1.1 1.56e+05 1.0 3.1e+02 1.8e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 13467 MatRedundantMat 1 1.0 4.6396e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatGetLocalMat 2 1.0 8.2898e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 2 1.0 4.4298e-04 2.5 0.00e+00 0.0 1.2e+03 1.2e+03 0.0e+00 0 0 0 0 0 0 0 0 1 0 0 PCSetUp 4 1.0 1.4942e-01 1.0 1.72e+06 1.0 3.6e+04 5.0e+02 2.2e+02 1 3 6 2 36 38 3 7 12 46 726 PCSetUpOnBlocks 19 1.0 1.7056e-03 1.1 2.96e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 10448 PCApply 19 1.0 9.6567e-02 1.0 2.55e+07 1.1 4.9e+05 2.0e+02 1.6e+02 1 38 79 10 26 25 38 94 63 33 16484 KSPGMRESOrthog 18 1.0 7.3059e-03 1.5 1.28e+07 1.1 0.0e+00 0.0e+00 1.8e+01 0 19 0 0 3 2 19 0 0 4 108752 KSPSetUp 10 1.0 4.7541e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 KSPSolve 1 1.0 2.2602e-01 1.0 4.55e+07 1.0 5.2e+05 2.6e+02 4.0e+02 1 69 83 13 66 58 69 99 86 83 12644 SFBcastBegin 1 1.0 1.5361e-0312.2 0.00e+00 0.0 2.0e+03 1.4e+03 1.0e+00 0 0 0 0 0 0 0 0 2 0 0 SFBcastEnd 1 1.0 8.6808e-0472.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 6 3 1728 0. Viewer 1 0 0 0. Index Set 227 232 22507744 0. IS L to G Mapping 3 3 12541056 0. Section 70 53 35616 0. Vector 15 87 9643080 0. Vector Scatter 2 13 435320 0. Matrix 0 29 2741220 0. Preconditioner 0 11 10960 0. Krylov Solver 0 11 30856 0. Distributed Mesh 14 8 38248 0. GraphPartitioner 6 5 3060 0. Star Forest Bipartite Graph 74 63 53256 0. Discrete System 14 8 6912 0. --- Event Stage 1: FEM Index Set 50 38 53548 0. IS L to G Mapping 4 0 0 0. Vector 327 243 2852408 0. Vector Scatter 19 2 2192 0. Matrix 50 8 1629940 0. Preconditioner 12 1 896 0. Krylov Solver 12 1 1352 0. ======================================================================================================================== Average time to get PetscTime(): 5.96046e-07 Average time for MPI_Barrier(): 9.20296e-06 Average time for zero size MPI_Send(): 1.65403e-06 #PETSc Option Table entries: -log_summary -solver_fieldsplit_0_ksp_type preonly -solver_fieldsplit_0_pc_type bjacobi -solver_fieldsplit_1_ksp_type preonly -solver_fieldsplit_1_pc_type ml -solver_ksp_rtol 1e-7 -solver_ksp_type gmres -solver_pc_fieldsplit_schur_fact_type upper -solver_pc_fieldsplit_schur_precondition selfp -solver_pc_fieldsplit_type schur -solver_pc_type fieldsplit #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco=/users/jychang48/externalpackages/Chaco-2.2-p2.tar.gz --download-ctetgen=/users/jychang48/externalpackages/ctetgen-0.4.tar.gz --download-exodusii=/users/jychang48/externalpackages/exodus-5.24.tar.bz2 --download-fblaslapack=/users/jychang48/externalpackages/fblaslapack-3.4.2.tar.gz --download-hdf5=/users/jychang48/externalpackages/hdf5-1.8.12.tar.gz --download-hypre=/users/jychang48/externalpackages/hypre-2.10.0b-p1.tar.gz --download-metis=/users/jychang48/externalpackages/metis-5.1.0-p1.tar.gz --download-ml=/users/jychang48/externalpackages/ml-6.2-p3.tar.gz --download-mumps=/users/jychang48/externalpackages/MUMPS_5.0.1-p1.tar.gz --download-netcdf=/users/jychang48/externalpackages/netcdf-4.3.2.tar.gz --download-parmetis=/users/jychang48/externalpackages/parmetis-4.0.3-p2.tar.gz --download-scalapack=/users/jychang48/externalpackages/scalapack-2.0.2.tgz --download-superlu_dist --download-triangle=/users/jychang48/externalpackages/Triangle.tar.gz --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-papi=/usr/projects/hpcsoft/toss2/common/papi/5.4.1 --with-shared-libraries COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt ----------------------------------------- Libraries compiled on Fri Jan 1 21:44:06 2016 on wf-fe2.lanl.gov Machine characteristics: Linux-2.6.32-573.8.1.2chaos.ch5.4.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /users/jychang48/petsc Using PETSc arch: arch-linux2-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/users/jychang48/petsc/arch-linux2-c-opt/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/include -I/users/jychang48/petsc/arch-linux2-c-opt/include 
-I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include -I/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/include/openmpi ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/users/jychang48/petsc/arch-linux2-c-opt/lib -L/users/jychang48/petsc/arch-linux2-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_4.2 -lHYPRE -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -lmpi_cxx -lstdc++ -lml -lmpi_cxx -lstdc++ -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -L/usr/projects/hpcsoft/toss2.2/wolf/openmpi/1.6.5-gcc-4.8/lib -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc/x86_64-unknown-linux-gnu/4.8.2 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib/gcc -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib64 -Wl,-rpath,/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -L/turquoise/usr/projects/hpcsoft/toss2/common/gcc/4.8.2/lib -ldl -lmpi -losmcomp -lrdmacm -libverbs -lsctp -lrt -lnsl -lutil -lpsm_infinipath -lpmi -lnuma -lgcc_s -lpthread -ldl ----------------------------------------- From knepley at gmail.com Wed Mar 2 19:48:20 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 2 Mar 2016 19:48:20 -0600 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <84C7405D-8471-4A13-8D9F-B841EECC243F@mcs.anl.gov> Message-ID: On Wed, Mar 2, 2016 at 7:15 PM, Justin Chang wrote: > Barry, > > Attached are the log_summary output for each preconditioner. > MatPtAP takes all the time. It looks like there is no coarsening at all at the first level. Mark, can you see what is going on here? Matt > Thanks, > Justin > > > On Wednesday, March 2, 2016, Barry Smith wrote: > >> >> Justin, >> >> Do you have the -log_summary output for these runs? 
>> >> Barry >> >> > On Mar 2, 2016, at 4:28 PM, Justin Chang wrote: >> > >> > Dear all, >> > >> > Using the firedrake project, I am solving this simple mixed poisson >> problem: >> > >> > mesh = UnitCubeMesh(40,40,40) >> > V = FunctionSpace(mesh,"RT",1) >> > Q = FunctionSpace(mesh,"DG",0) >> > W = V*Q >> > >> > v, p = TrialFunctions(W) >> > w, q = TestFunctions(W) >> > >> > f = Function(Q) >> > >> f.interpolate(Expression("12*pi*pi*sin(pi*x[0]*2)*sin(pi*x[1]*2)*sin(2*pi*x[2])")) >> > >> > a = dot(v,w)*dx - p*div(w)*dx + div(v)*q*dx >> > L = f*q*dx >> > >> > u = Function(W) >> > solve(a==L,u,solver_parameters={...}) >> > >> > This problem has 1161600 degrees of freedom. The solver_parameters are: >> > >> > -ksp_type gmres >> > -pc_type fieldsplit >> > -pc_fieldsplit_type schur >> > -pc_fieldsplit_schur_fact_type: upper >> > -pc_fieldsplit_schur_precondition selfp >> > -fieldsplit_0_ksp_type preonly >> > -fieldsplit_0_pc_type bjacobi >> > -fieldsplit_1_ksp_type preonly >> > -fieldsplit_1_pc_type hypre/ml/gamg >> > >> > for the last option, I compared the wall-clock timings for hypre, >> ml,and gamg. Here are the strong-scaling results (across 64 cores, 8 cores >> per Intel Xeon E5-2670 node) for hypre, ml, and gamg: >> > >> > hypre: >> > 1 core: 47.5 s, 12 solver iters >> > 2 cores: 34.1 s, 15 solver iters >> > 4 cores: 21.5 s, 15 solver iters >> > 8 cores: 16.6 s, 15 solver iters >> > 16 cores: 10.2 s, 15 solver iters >> > 24 cores: 7.66 s, 15 solver iters >> > 32 cores: 6.31 s, 15 solver iters >> > 40 cores: 5.68 s, 15 solver iters >> > 48 cores: 5.36 s, 16 solver iters >> > 56 cores: 5.12 s, 16 solver iters >> > 64 cores: 4.99 s, 16 solver iters >> > >> > ml: >> > 1 core: 4.44 s, 14 solver iters >> > 2 cores: 2.85 s, 16 solver iters >> > 4 cores: 1.6 s, 17 solver iters >> > 8 cores: 0.966 s, 17 solver iters >> > 16 cores: 0.585 s, 18 solver iters >> > 24 cores: 0.440 s, 18 solver iters >> > 32 cores: 0.375 s, 18 solver iters >> > 40 cores: 0.332 s, 18 solver iters >> > 48 cores: 0.307 s, 17 solver iters >> > 56 cores: 0.290 s, 18 solver iters >> > 64 cores: 0.281 s, 18 solver items >> > >> > gamg: >> > 1 core: 613 s, 12 solver iters >> > 2 cores: 204 s, 15 solver iters >> > 4 cores: 77.1 s, 15 solver iters >> > 8 cores: 38.1 s, 15 solver iters >> > 16 cores: 15.9 s, 16 solver iters >> > 24 cores: 9.24 s, 16 solver iters >> > 32 cores: 5.92 s, 16 solver iters >> > 40 cores: 4.72 s, 16 solver iters >> > 48 cores: 3.89 s, 16 solver iters >> > 56 cores: 3.65 s, 16 solver iters >> > 64 cores: 3.46 s, 16 solver iters >> > >> > The performance difference between ML and HYPRE makes sense to me, but >> what I am really confused about is GAMG. It seems GAMG is really slow on a >> single core but something internally is causing it to speed up >> super-linearly as I increase the number of MPI processes. Shouldn't ML and >> GAMG have the same performance? I am not sure what log outputs to give you >> guys, but for starters, below is -ksp_view for the single core case with >> GAMG >> > >> > KSP Object:(solver_) 1 MPI processes >> > type: gmres >> > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> > GMRES: happy breakdown tolerance 1e-30 >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-07, absolute=1e-50, divergence=10000. 
>> > left preconditioning >> > using PRECONDITIONED norm type for convergence test >> > PC Object:(solver_) 1 MPI processes >> > type: fieldsplit >> > FieldSplit with Schur preconditioner, factorization UPPER >> > Preconditioner for the Schur complement formed from Sp, an >> assembled approximation to S, which uses (lumped, if requested) A00's >> diagonal's inverse >> > Split info: >> > Split number 0 Defined by IS >> > Split number 1 Defined by IS >> > KSP solver for A00 block >> > KSP Object: (solver_fieldsplit_0_) 1 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_0_) 1 MPI processes >> > type: bjacobi >> > block Jacobi: number of blocks = 1 >> > Local solve is same for all blocks, in the following KSP and >> PC objects: >> > KSP Object: (solver_fieldsplit_0_sub_) 1 >> MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_0_sub_) 1 >> MPI processes >> > type: ilu >> > ILU: out-of-place factorization >> > 0 levels of fill >> > tolerance for zero pivot 2.22045e-14 >> > matrix ordering: natural >> > factor fill ratio given 1., needed 1. >> > Factored matrix follows: >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=777600, cols=777600 >> > package used to perform factorization: petsc >> > total: nonzeros=5385600, allocated nonzeros=5385600 >> > total number of mallocs used during MatSetValues >> calls =0 >> > not using I-node routines >> > linear system matrix = precond matrix: >> > Mat Object: (solver_fieldsplit_0_) 1 >> MPI processes >> > type: seqaij >> > rows=777600, cols=777600 >> > total: nonzeros=5385600, allocated nonzeros=5385600 >> > total number of mallocs used during MatSetValues calls =0 >> > not using I-node routines >> > linear system matrix = precond matrix: >> > Mat Object: (solver_fieldsplit_0_) 1 MPI >> processes >> > type: seqaij >> > rows=777600, cols=777600 >> > total: nonzeros=5385600, allocated nonzeros=5385600 >> > total number of mallocs used during MatSetValues calls =0 >> > not using I-node routines >> > KSP solver for S = A11 - A10 inv(A00) A01 >> > KSP Object: (solver_fieldsplit_1_) 1 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_1_) 1 MPI processes >> > type: gamg >> > MG: type is MULTIPLICATIVE, levels=5 cycles=v >> > Cycles per PCApply=1 >> > Using Galerkin computed coarse grid matrices >> > GAMG specific options >> > Threshold for dropping small values from graph 0. >> > AGG specific options >> > Symmetric graph false >> > Coarse grid solver -- level ------------------------------- >> > KSP Object: (solver_fieldsplit_1_mg_coarse_) >> 1 MPI processes >> > type: preonly >> > maximum iterations=1, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. 
>> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_1_mg_coarse_) >> 1 MPI processes >> > type: bjacobi >> > block Jacobi: number of blocks = 1 >> > Local solve is same for all blocks, in the following KSP >> and PC objects: >> > KSP Object: >> (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes >> > type: preonly >> > maximum iterations=1, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: >> (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes >> > type: lu >> > LU: out-of-place factorization >> > tolerance for zero pivot 2.22045e-14 >> > using diagonal shift on blocks to prevent zero pivot >> [INBLOCKS] >> > matrix ordering: nd >> > factor fill ratio given 5., needed 1. >> > Factored matrix follows: >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=9, cols=9 >> > package used to perform factorization: petsc >> > total: nonzeros=81, allocated nonzeros=81 >> > total number of mallocs used during >> MatSetValues calls =0 >> > using I-node routines: found 2 nodes, limit >> used is 5 >> > linear system matrix = precond matrix: >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=9, cols=9 >> > total: nonzeros=81, allocated nonzeros=81 >> > total number of mallocs used during MatSetValues >> calls =0 >> > using I-node routines: found 2 nodes, limit used is >> 5 >> > linear system matrix = precond matrix: >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=9, cols=9 >> > total: nonzeros=81, allocated nonzeros=81 >> > total number of mallocs used during MatSetValues calls =0 >> > using I-node routines: found 2 nodes, limit used is 5 >> > Down solver (pre-smoother) on level 1 >> ------------------------------- >> > KSP Object: (solver_fieldsplit_1_mg_levels_1_) >> 1 MPI processes >> > type: chebyshev >> > Chebyshev: eigenvalue estimates: min = 0.0999525, max = >> 1.09948 >> > Chebyshev: eigenvalues estimated using gmres with >> translations [0. 0.1; 0. 1.1] >> > KSP Object: >> (solver_fieldsplit_1_mg_levels_1_esteig_) 1 MPI processes >> > type: gmres >> > GMRES: restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> > GMRES: happy breakdown tolerance 1e-30 >> > maximum iterations=10, initial guess is zero >> > tolerances: relative=1e-12, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using PRECONDITIONED norm type for convergence test >> > maximum iterations=2 >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using nonzero initial guess >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_1_mg_levels_1_) >> 1 MPI processes >> > type: sor >> > SOR: type = local_symmetric, iterations = 1, local >> iterations = 1, omega = 1. 
>> > linear system matrix = precond matrix: >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=207, cols=207 >> > total: nonzeros=42849, allocated nonzeros=42849 >> > total number of mallocs used during MatSetValues calls =0 >> > using I-node routines: found 42 nodes, limit used is 5 >> > Up solver (post-smoother) same as down solver (pre-smoother) >> > Down solver (pre-smoother) on level 2 >> ------------------------------- >> > KSP Object: (solver_fieldsplit_1_mg_levels_2_) >> 1 MPI processes >> > type: chebyshev >> > Chebyshev: eigenvalue estimates: min = 0.0996628, max = >> 1.09629 >> > Chebyshev: eigenvalues estimated using gmres with >> translations [0. 0.1; 0. 1.1] >> > KSP Object: >> (solver_fieldsplit_1_mg_levels_2_esteig_) 1 MPI processes >> > type: gmres >> > GMRES: restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> > GMRES: happy breakdown tolerance 1e-30 >> > maximum iterations=10, initial guess is zero >> > tolerances: relative=1e-12, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using PRECONDITIONED norm type for convergence test >> > maximum iterations=2 >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using nonzero initial guess >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_1_mg_levels_2_) >> 1 MPI processes >> > type: sor >> > SOR: type = local_symmetric, iterations = 1, local >> iterations = 1, omega = 1. >> > linear system matrix = precond matrix: >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=5373, cols=5373 >> > total: nonzeros=28852043, allocated nonzeros=28852043 >> > total number of mallocs used during MatSetValues calls =0 >> > using I-node routines: found 1481 nodes, limit used is 5 >> > Up solver (post-smoother) same as down solver (pre-smoother) >> > Down solver (pre-smoother) on level 3 >> ------------------------------- >> > KSP Object: (solver_fieldsplit_1_mg_levels_3_) >> 1 MPI processes >> > type: chebyshev >> > Chebyshev: eigenvalue estimates: min = 0.0994294, max = >> 1.09372 >> > Chebyshev: eigenvalues estimated using gmres with >> translations [0. 0.1; 0. 1.1] >> > KSP Object: >> (solver_fieldsplit_1_mg_levels_3_esteig_) 1 MPI processes >> > type: gmres >> > GMRES: restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> > GMRES: happy breakdown tolerance 1e-30 >> > maximum iterations=10, initial guess is zero >> > tolerances: relative=1e-12, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using PRECONDITIONED norm type for convergence test >> > maximum iterations=2 >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using nonzero initial guess >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_1_mg_levels_3_) >> 1 MPI processes >> > type: sor >> > SOR: type = local_symmetric, iterations = 1, local >> iterations = 1, omega = 1. 
>> > linear system matrix = precond matrix: >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=52147, cols=52147 >> > total: nonzeros=38604909, allocated nonzeros=38604909 >> > total number of mallocs used during MatSetValues calls =2 >> > not using I-node routines >> > Up solver (post-smoother) same as down solver (pre-smoother) >> > Down solver (pre-smoother) on level 4 >> ------------------------------- >> > KSP Object: (solver_fieldsplit_1_mg_levels_4_) >> 1 MPI processes >> > type: chebyshev >> > Chebyshev: eigenvalue estimates: min = 0.158979, max = >> 1.74876 >> > Chebyshev: eigenvalues estimated using gmres with >> translations [0. 0.1; 0. 1.1] >> > KSP Object: >> (solver_fieldsplit_1_mg_levels_4_esteig_) 1 MPI processes >> > type: gmres >> > GMRES: restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> > GMRES: happy breakdown tolerance 1e-30 >> > maximum iterations=10, initial guess is zero >> > tolerances: relative=1e-12, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using PRECONDITIONED norm type for convergence test >> > maximum iterations=2 >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using nonzero initial guess >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_1_mg_levels_4_) >> 1 MPI processes >> > type: sor >> > SOR: type = local_symmetric, iterations = 1, local >> iterations = 1, omega = 1. >> > linear system matrix followed by preconditioner matrix: >> > Mat Object: (solver_fieldsplit_1_) 1 >> MPI processes >> > type: schurcomplement >> > rows=384000, cols=384000 >> > Schur complement A11 - A10 inv(A00) A01 >> > A11 >> > Mat Object: (solver_fieldsplit_1_) >> 1 MPI processes >> > type: seqaij >> > rows=384000, cols=384000 >> > total: nonzeros=384000, allocated nonzeros=384000 >> > total number of mallocs used during MatSetValues >> calls =0 >> > not using I-node routines >> > A10 >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=384000, cols=777600 >> > total: nonzeros=1919999, allocated nonzeros=1919999 >> > total number of mallocs used during MatSetValues >> calls =0 >> > not using I-node routines >> > KSP of A00 >> > KSP Object: (solver_fieldsplit_0_) >> 1 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_0_) >> 1 MPI processes >> > type: bjacobi >> > block Jacobi: number of blocks = 1 >> > Local solve is same for all blocks, in the >> following KSP and PC objects: >> > KSP Object: >> (solver_fieldsplit_0_sub_) 1 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: >> (solver_fieldsplit_0_sub_) 1 MPI processes >> > type: ilu >> > ILU: out-of-place factorization >> > 0 levels of fill >> > tolerance for zero pivot 2.22045e-14 >> > matrix ordering: natural >> > factor fill ratio given 1., needed 1. 
>> > Factored matrix follows: >> > Mat Object: >> 1 MPI processes >> > type: seqaij >> > rows=777600, cols=777600 >> > package used to perform factorization: >> petsc >> > total: nonzeros=5385600, allocated >> nonzeros=5385600 >> > total number of mallocs used during >> MatSetValues calls =0 >> > not using I-node routines >> > linear system matrix = precond matrix: >> > Mat Object: >> (solver_fieldsplit_0_) 1 MPI processes >> > type: seqaij >> > rows=777600, cols=777600 >> > total: nonzeros=5385600, allocated >> nonzeros=5385600 >> > total number of mallocs used during >> MatSetValues calls =0 >> > not using I-node routines >> > linear system matrix = precond matrix: >> > Mat Object: >> (solver_fieldsplit_0_) 1 MPI processes >> > type: seqaij >> > rows=777600, cols=777600 >> > total: nonzeros=5385600, allocated >> nonzeros=5385600 >> > total number of mallocs used during MatSetValues >> calls =0 >> > not using I-node routines >> > A01 >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=777600, cols=384000 >> > total: nonzeros=1919999, allocated nonzeros=1919999 >> > total number of mallocs used during MatSetValues >> calls =0 >> > not using I-node routines >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=384000, cols=384000 >> > total: nonzeros=3416452, allocated nonzeros=3416452 >> > total number of mallocs used during MatSetValues calls =0 >> > not using I-node routines >> > Up solver (post-smoother) same as down solver (pre-smoother) >> > linear system matrix followed by preconditioner matrix: >> > Mat Object: (solver_fieldsplit_1_) 1 MPI >> processes >> > type: schurcomplement >> > rows=384000, cols=384000 >> > Schur complement A11 - A10 inv(A00) A01 >> > A11 >> > Mat Object: (solver_fieldsplit_1_) >> 1 MPI processes >> > type: seqaij >> > rows=384000, cols=384000 >> > total: nonzeros=384000, allocated nonzeros=384000 >> > total number of mallocs used during MatSetValues calls >> =0 >> > not using I-node routines >> > A10 >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=384000, cols=777600 >> > total: nonzeros=1919999, allocated nonzeros=1919999 >> > total number of mallocs used during MatSetValues calls >> =0 >> > not using I-node routines >> > KSP of A00 >> > KSP Object: (solver_fieldsplit_0_) >> 1 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (solver_fieldsplit_0_) >> 1 MPI processes >> > type: bjacobi >> > block Jacobi: number of blocks = 1 >> > Local solve is same for all blocks, in the following >> KSP and PC objects: >> > KSP Object: >> (solver_fieldsplit_0_sub_) 1 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: >> (solver_fieldsplit_0_sub_) 1 MPI processes >> > type: ilu >> > ILU: out-of-place factorization >> > 0 levels of fill >> > tolerance for zero pivot 2.22045e-14 >> > matrix ordering: natural >> > factor fill ratio given 1., needed 1. 
>> > Factored matrix follows: >> > Mat Object: 1 MPI >> processes >> > type: seqaij >> > rows=777600, cols=777600 >> > package used to perform factorization: petsc >> > total: nonzeros=5385600, allocated >> nonzeros=5385600 >> > total number of mallocs used during >> MatSetValues calls =0 >> > not using I-node routines >> > linear system matrix = precond matrix: >> > Mat Object: >> (solver_fieldsplit_0_) 1 MPI processes >> > type: seqaij >> > rows=777600, cols=777600 >> > total: nonzeros=5385600, allocated >> nonzeros=5385600 >> > total number of mallocs used during MatSetValues >> calls =0 >> > not using I-node routines >> > linear system matrix = precond matrix: >> > Mat Object: (solver_fieldsplit_0_) >> 1 MPI processes >> > type: seqaij >> > rows=777600, cols=777600 >> > total: nonzeros=5385600, allocated nonzeros=5385600 >> > total number of mallocs used during MatSetValues >> calls =0 >> > not using I-node routines >> > A01 >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=777600, cols=384000 >> > total: nonzeros=1919999, allocated nonzeros=1919999 >> > total number of mallocs used during MatSetValues calls >> =0 >> > not using I-node routines >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=384000, cols=384000 >> > total: nonzeros=3416452, allocated nonzeros=3416452 >> > total number of mallocs used during MatSetValues calls =0 >> > not using I-node routines >> > linear system matrix = precond matrix: >> > Mat Object: 1 MPI processes >> > type: nest >> > rows=1161600, cols=1161600 >> > Matrix object: >> > type=nest, rows=2, cols=2 >> > MatNest structure: >> > (0,0) : prefix="solver_fieldsplit_0_", type=seqaij, >> rows=777600, cols=777600 >> > (0,1) : type=seqaij, rows=777600, cols=384000 >> > (1,0) : type=seqaij, rows=384000, cols=777600 >> > (1,1) : prefix="solver_fieldsplit_1_", type=seqaij, >> rows=384000, cols=384000 >> > >> > Any insight as to what's happening? Btw this firedrake/petsc-mapdes is >> from way back in october 2015 (yes much has changed since but >> reinstalling/updating firedrake and petsc on LANL's firewall HPC machines >> is a big pain in the ass). >> > >> > Thanks, >> > Justin >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Thu Mar 3 03:36:01 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 3 Mar 2016 09:36:01 +0000 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: Message-ID: <56D80581.8030903@imperial.ac.uk> On 02/03/16 22:28, Justin Chang wrote: ... > Down solver (pre-smoother) on level 3 > > KSP Object: (solver_fieldsplit_1_mg_levels_3_) > linear system matrix = precond matrix: ... > Mat Object: 1 MPI processes > > type: seqaij > > rows=52147, cols=52147 > > total: nonzeros=38604909, allocated nonzeros=38604909 > > total number of mallocs used during MatSetValues calls =2 > > not using I-node routines > > Down solver (pre-smoother) on level 4 > > KSP Object: (solver_fieldsplit_1_mg_levels_4_) > linear system matrix followed by preconditioner matrix: > > Mat Object: (solver_fieldsplit_1_) ... > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=3416452, allocated nonzeros=3416452 This looks pretty suspicious to me. 
The original matrix on the finest level has 3.8e5 rows and ~3.4e6 nonzeros. The next level up, the coarsening produces 5.2e4 rows, but 38e6 nonzeros. FWIW, although Justin's PETSc is from Oct 2015, I get the same behaviour with: ad5697c (Master as of 1st March). If I compare with the coarse operators that ML produces on the same problem: The original matrix has, again: Mat Object: 1 MPI processes type: seqaij rows=384000, cols=384000 total: nonzeros=3416452, allocated nonzeros=3416452 total number of mallocs used during MatSetValues calls=0 not using I-node routines While the next finest level has: Mat Object: 1 MPI processes type: seqaij rows=65258, cols=65258 total: nonzeros=1318400, allocated nonzeros=1318400 total number of mallocs used during MatSetValues calls=0 not using I-node routines So we have 6.5e4 rows and 1.3e6 nonzeros, which seems more plausible. Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: OpenPGP digital signature URL: From Sander.Arens at UGent.be Thu Mar 3 06:20:21 2016 From: Sander.Arens at UGent.be (Sander Arens) Date: Thu, 3 Mar 2016 13:20:21 +0100 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J != P, which caused to write the mass matrix into the (otherwise zero) (1,1) block of the Jacobian and which was the reason for the linesearch to fail. However, after fixing that and trying to solve it with FieldSplit with LU factorization for the (0,0) block it failed because there were zero pivots for all rows. Anyway, I found out that attaching the mass matrix to the Lagrange multiplier field also worked. Another related question for my elasticity problem: after creating the rigid body modes with DMPlexCreateRigidBody and attaching it to the displacement field, does the matrix block size of the (0,0) block still have to be set for good performance with gamg? If so, how can I do this? Thanks, Sander On 2 March 2016 at 12:25, Matthew Knepley wrote: > On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens > wrote: > >> Hi, >> >> I'm trying to set a mass matrix preconditioner for the Schur complement >> of an incompressible finite elasticity problem. I tried using the command >> PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, NULL, NULL, >> NULL) (field 1 is the Lagrange multiplier field). >> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in the >> function evaluation after Newton iteration 1. (Btw, I'm using the next >> branch). >> >> Is this because I didn't use PetscDSSetJacobianPreconditioner for the >> other blocks (which uses the Jacobian itself for preconditioning)? If so, >> how can I tell Petsc to use the Jacobian for those blocks? 
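For readers reconstructing the J != P fix described in the message above, a minimal sketch of the call sequence might look like the following. The function name UseSeparatePmat and the surrounding DM/SNES setup are assumptions of the sketch, not the poster's actual code; it only illustrates the separate Amat/Pmat idea discussed in this thread.

#include <petsc.h>

/* Minimal sketch: the DM and SNES are assumed to be configured elsewhere by
 * the application.  Destroy J and P at teardown in real code. */
static PetscErrorCode UseSeparatePmat(DM dm, SNES snes)
{
  Mat            J, P;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMCreateMatrix(dm, &J);CHKERRQ(ierr);   /* Amat: the true Jacobian         */
  ierr = DMCreateMatrix(dm, &P);CHKERRQ(ierr);   /* Pmat: what the PC is built from */
  /* NULL callback and context keep whatever Jacobian routine is already
   * attached to the SNES; only the matrix slots change.  With J != P, extra
   * preconditioner terms such as the pressure mass matrix land in P instead
   * of overwriting the (1,1) block of J. */
  ierr = SNESSetJacobian(snes, J, P, NULL, NULL);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
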
>> > > 1) I put that code in very recently, and do not even have sufficient test, > so it may be buggy > > 2) If you are using FieldSplit, you can control which blocks come from A > and which come from the preconditioner P > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat > > >> I guess when using PetscDSSetJacobianPreconditioner the preconditioner is >> recomputed at every Newton step, so for a constant mass matrix this might >> not be ideal. How can I avoid recomputing this at every Newton iteration? >> > > Maybe we need another flag like > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html > > or we need to expand > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html > > to separately cover the preconditioner matrix. However, both matrices are > computed by one call so this would > involve interface changes to user code, which we do not like to do. Right > now it seems like a small optimization. > I would want to wait and see whether it would really be maningful. > > Thanks, > > Matt > > >> Thanks, >> Sander >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 3 07:21:38 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 3 Mar 2016 07:21:38 -0600 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens wrote: > Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J != P, > which caused to write the mass matrix into the (otherwise zero) (1,1) block > of the Jacobian and which was the reason for the linesearch to fail. > However, after fixing that and trying to solve it with FieldSplit with LU > factorization for the (0,0) block it failed because there were zero pivots > for all rows. > > Anyway, I found out that attaching the mass matrix to the Lagrange > multiplier field also worked. > > Another related question for my elasticity problem: after creating the > rigid body modes with DMPlexCreateRigidBody and attaching it to the > displacement field, does the matrix block size of the (0,0) block still > have to be set for good performance with gamg? If so, how can I do this? > Yes, it should be enough to set the block size of the preconditioner matrix. Matt > Thanks, > Sander > > On 2 March 2016 at 12:25, Matthew Knepley wrote: > >> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens >> wrote: >> >>> Hi, >>> >>> I'm trying to set a mass matrix preconditioner for the Schur complement >>> of an incompressible finite elasticity problem. I tried using the command >>> PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, NULL, NULL, >>> NULL) (field 1 is the Lagrange multiplier field). >>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in the >>> function evaluation after Newton iteration 1. (Btw, I'm using the next >>> branch). 
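As a concrete illustration of the Amat/Pmat control described by the two man pages linked above, the calls can be sketched as follows; the KSP is assumed to already carry a PCFIELDSPLIT preconditioner, and the function name UseAmatForSplitBlocks is a placeholder.

#include <petsc.h>

/* Sketch: draw the fieldsplit blocks from the operator (Amat) rather than
 * from the preconditioning matrix (Pmat). */
static PetscErrorCode UseAmatForSplitBlocks(KSP ksp)
{
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCFieldSplitSetDiagUseAmat(pc, PETSC_TRUE);CHKERRQ(ierr);    /* diagonal blocks from Amat    */
  ierr = PCFieldSplitSetOffDiagUseAmat(pc, PETSC_TRUE);CHKERRQ(ierr); /* off-diagonal blocks likewise */
  PetscFunctionReturn(0);
}

The equivalent command-line options are -pc_fieldsplit_diag_use_amat and -pc_fieldsplit_off_diag_use_amat, with the solver's option prefix prepended where one is in use.
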
>>> >>> Is this because I didn't use PetscDSSetJacobianPreconditioner for the >>> other blocks (which uses the Jacobian itself for preconditioning)? If so, >>> how can I tell Petsc to use the Jacobian for those blocks? >>> >> >> 1) I put that code in very recently, and do not even have sufficient >> test, so it may be buggy >> >> 2) If you are using FieldSplit, you can control which blocks come from A >> and which come from the preconditioner P >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >> >> >>> I guess when using PetscDSSetJacobianPreconditioner the preconditioner >>> is recomputed at every Newton step, so for a constant mass matrix this >>> might not be ideal. How can I avoid recomputing this at every Newton >>> iteration? >>> >> >> Maybe we need another flag like >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >> >> or we need to expand >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >> >> to separately cover the preconditioner matrix. However, both matrices are >> computed by one call so this would >> involve interface changes to user code, which we do not like to do. Right >> now it seems like a small optimization. >> I would want to wait and see whether it would really be maningful. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Sander >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Sander.Arens at UGent.be Thu Mar 3 07:49:48 2016 From: Sander.Arens at UGent.be (Sander Arens) Date: Thu, 3 Mar 2016 14:49:48 +0100 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: And how can I do this? Because when I look at all the options with -help I can strangely enough only find -fieldsplit_pressure_mat_block_size and not -fieldsplit_displacement_mat_block_size. Thanks, Sander On 3 March 2016 at 14:21, Matthew Knepley wrote: > On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens > wrote: > >> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J != P, >> which caused to write the mass matrix into the (otherwise zero) (1,1) block >> of the Jacobian and which was the reason for the linesearch to fail. >> However, after fixing that and trying to solve it with FieldSplit with LU >> factorization for the (0,0) block it failed because there were zero pivots >> for all rows. >> >> Anyway, I found out that attaching the mass matrix to the Lagrange >> multiplier field also worked. >> >> Another related question for my elasticity problem: after creating the >> rigid body modes with DMPlexCreateRigidBody and attaching it to the >> displacement field, does the matrix block size of the (0,0) block still >> have to be set for good performance with gamg? If so, how can I do this? >> > > Yes, it should be enough to set the block size of the preconditioner > matrix. 
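To make "set the block size of the preconditioner matrix" concrete, a schematic sketch of preparing the displacement block for GAMG might read as below. P00, dm, and dim are placeholders, and the exact calling sequence of DMPlexCreateRigidBody has changed across PETSc versions, so treat this as an assumption of the sketch rather than the thread's actual code.

#include <petsc.h>

/* Schematic only: P00 is the (0,0) (displacement) block of the
 * preconditioning matrix, dm the Plex DM carrying the displacement field. */
static PetscErrorCode PrepareDisplacementBlock(DM dm, Mat P00, PetscInt dim)
{
  MatNullSpace   rigid;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatSetBlockSize(P00, dim);CHKERRQ(ierr);          /* dim displacement components per node  */
  ierr = DMPlexCreateRigidBody(dm, &rigid);CHKERRQ(ierr);  /* rigid-body modes as a near null space */
  ierr = MatSetNearNullSpace(P00, rigid);CHKERRQ(ierr);    /* GAMG uses this to build coarse spaces */
  ierr = MatNullSpaceDestroy(&rigid);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
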
> > Matt > > >> Thanks, >> Sander >> >> On 2 March 2016 at 12:25, Matthew Knepley wrote: >> >>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens >>> wrote: >>> >>>> Hi, >>>> >>>> I'm trying to set a mass matrix preconditioner for the Schur complement >>>> of an incompressible finite elasticity problem. I tried using the command >>>> PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, NULL, NULL, >>>> NULL) (field 1 is the Lagrange multiplier field). >>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in the >>>> function evaluation after Newton iteration 1. (Btw, I'm using the next >>>> branch). >>>> >>>> Is this because I didn't use PetscDSSetJacobianPreconditioner for the >>>> other blocks (which uses the Jacobian itself for preconditioning)? If so, >>>> how can I tell Petsc to use the Jacobian for those blocks? >>>> >>> >>> 1) I put that code in very recently, and do not even have sufficient >>> test, so it may be buggy >>> >>> 2) If you are using FieldSplit, you can control which blocks come from A >>> and which come from the preconditioner P >>> >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>> >>> >>>> I guess when using PetscDSSetJacobianPreconditioner the preconditioner >>>> is recomputed at every Newton step, so for a constant mass matrix this >>>> might not be ideal. How can I avoid recomputing this at every Newton >>>> iteration? >>>> >>> >>> Maybe we need another flag like >>> >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>> >>> or we need to expand >>> >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>> >>> to separately cover the preconditioner matrix. However, both matrices >>> are computed by one call so this would >>> involve interface changes to user code, which we do not like to do. >>> Right now it seems like a small optimization. >>> I would want to wait and see whether it would really be maningful. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Sander >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 3 07:52:12 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 3 Mar 2016 07:52:12 -0600 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens wrote: > And how can I do this? Because when I look at all the options with -help I > can strangely enough only find -fieldsplit_pressure_mat_block_size and not > -fieldsplit_displacement_mat_block_size. > Maybe I am missing something here. The matrix from which you calculate the preconditioner using GAMG must be created somewhere. Why is the block size not specified at creation time? 
If the matrix is created by one of the assembly functions in Plex, and the PetscSection has a number of field components, then the block size will already be set. Thanks, Matt > Thanks, > Sander > > On 3 March 2016 at 14:21, Matthew Knepley wrote: > >> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens >> wrote: >> >>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J != >>> P, which caused to write the mass matrix into the (otherwise zero) (1,1) >>> block of the Jacobian and which was the reason for the linesearch to fail. >>> However, after fixing that and trying to solve it with FieldSplit with >>> LU factorization for the (0,0) block it failed because there were zero >>> pivots for all rows. >>> >>> Anyway, I found out that attaching the mass matrix to the Lagrange >>> multiplier field also worked. >>> >>> Another related question for my elasticity problem: after creating the >>> rigid body modes with DMPlexCreateRigidBody and attaching it to the >>> displacement field, does the matrix block size of the (0,0) block still >>> have to be set for good performance with gamg? If so, how can I do this? >>> >> >> Yes, it should be enough to set the block size of the preconditioner >> matrix. >> >> Matt >> >> >>> Thanks, >>> Sander >>> >>> On 2 March 2016 at 12:25, Matthew Knepley wrote: >>> >>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>> complement of an incompressible finite elasticity problem. I tried using >>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). >>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in >>>>> the function evaluation after Newton iteration 1. (Btw, I'm using the next >>>>> branch). >>>>> >>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner for the >>>>> other blocks (which uses the Jacobian itself for preconditioning)? If so, >>>>> how can I tell Petsc to use the Jacobian for those blocks? >>>>> >>>> >>>> 1) I put that code in very recently, and do not even have sufficient >>>> test, so it may be buggy >>>> >>>> 2) If you are using FieldSplit, you can control which blocks come from >>>> A and which come from the preconditioner P >>>> >>>> >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>> >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>> >>>> >>>>> I guess when using PetscDSSetJacobianPreconditioner the preconditioner >>>>> is recomputed at every Newton step, so for a constant mass matrix this >>>>> might not be ideal. How can I avoid recomputing this at every Newton >>>>> iteration? >>>>> >>>> >>>> Maybe we need another flag like >>>> >>>> >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>> >>>> or we need to expand >>>> >>>> >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>> >>>> to separately cover the preconditioner matrix. However, both matrices >>>> are computed by one call so this would >>>> involve interface changes to user code, which we do not like to do. >>>> Right now it seems like a small optimization. >>>> I would want to wait and see whether it would really be maningful. 
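For reference, the existing lagging controls mentioned in the links quoted above look roughly like this in code; as the reply notes, they lag the Jacobian/preconditioner rebuilds as a whole and do not lag assembly of the Pmat on its own. The lag values shown are examples, and their exact semantics should be checked against the man pages for the release in use.

#include <petsc.h>

/* Sketch: reuse matrices across Newton steps instead of rebuilding them at
 * every iteration. */
static PetscErrorCode LagRebuilds(SNES snes)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESSetLagJacobian(snes, 2);CHKERRQ(ierr);        /* rebuild the Jacobian every 2nd iteration */
  ierr = SNESSetLagPreconditioner(snes, -1);CHKERRQ(ierr); /* never rebuild the preconditioner again   */
  PetscFunctionReturn(0);
}

The command-line equivalents are -snes_lag_jacobian 2 and -snes_lag_preconditioner -1.
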
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Sander >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Sander.Arens at UGent.be Thu Mar 3 08:05:49 2016 From: Sander.Arens at UGent.be (Sander Arens) Date: Thu, 3 Mar 2016 15:05:49 +0100 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: Yes, the matrix is created by one of the assembly functions in Plex, so that answers my question. I was confused by this because I couldn't see this information when using -snes_view. Thanks for the helpful replies, Sander On 3 March 2016 at 14:52, Matthew Knepley wrote: > On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens > wrote: > >> And how can I do this? Because when I look at all the options with -help >> I can strangely enough only find -fieldsplit_pressure_mat_block_size and >> not -fieldsplit_displacement_mat_block_size. >> > > Maybe I am missing something here. The matrix from which you calculate the > preconditioner using GAMG must be created somewhere. Why > is the block size not specified at creation time? If the matrix is created > by one of the assembly functions in Plex, and the PetscSection has > a number of field components, then the block size will already be set. > > Thanks, > > Matt > > >> Thanks, >> Sander >> >> On 3 March 2016 at 14:21, Matthew Knepley wrote: >> >>> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens >>> wrote: >>> >>>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J != >>>> P, which caused to write the mass matrix into the (otherwise zero) (1,1) >>>> block of the Jacobian and which was the reason for the linesearch to fail. >>>> However, after fixing that and trying to solve it with FieldSplit with >>>> LU factorization for the (0,0) block it failed because there were zero >>>> pivots for all rows. >>>> >>>> Anyway, I found out that attaching the mass matrix to the Lagrange >>>> multiplier field also worked. >>>> >>>> Another related question for my elasticity problem: after creating the >>>> rigid body modes with DMPlexCreateRigidBody and attaching it to the >>>> displacement field, does the matrix block size of the (0,0) block still >>>> have to be set for good performance with gamg? If so, how can I do this? >>>> >>> >>> Yes, it should be enough to set the block size of the preconditioner >>> matrix. >>> >>> Matt >>> >>> >>>> Thanks, >>>> Sander >>>> >>>> On 2 March 2016 at 12:25, Matthew Knepley wrote: >>>> >>>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>>> complement of an incompressible finite elasticity problem. I tried using >>>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). 
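For context, an integrand such as the g0_pre_mass_pp named above typically reduces to a single statement. The sketch below follows the PetscDS pointwise Jacobian form of roughly this era, but the exact argument list varies between PETSc versions and should be taken from petscds.h rather than from this message; only the body is the point.

#include <petsc.h>

/* Schematic pressure/multiplier mass-matrix integrand; signature is
 * illustrative only and version-dependent. */
static void g0_pre_mass_pp(PetscInt dim, PetscInt Nf, PetscInt NfAux,
                           const PetscInt uOff[], const PetscInt uOff_x[],
                           const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[],
                           const PetscInt aOff[], const PetscInt aOff_x[],
                           const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[],
                           PetscReal t, PetscReal u_tShift, const PetscReal x[],
                           PetscScalar g0[])
{
  g0[0] = 1.0; /* (q, p) mass term for the scalar Lagrange multiplier field */
}
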
>>>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in >>>>>> the function evaluation after Newton iteration 1. (Btw, I'm using the next >>>>>> branch). >>>>>> >>>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner for the >>>>>> other blocks (which uses the Jacobian itself for preconditioning)? If so, >>>>>> how can I tell Petsc to use the Jacobian for those blocks? >>>>>> >>>>> >>>>> 1) I put that code in very recently, and do not even have sufficient >>>>> test, so it may be buggy >>>>> >>>>> 2) If you are using FieldSplit, you can control which blocks come from >>>>> A and which come from the preconditioner P >>>>> >>>>> >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>>> >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>>> >>>>> >>>>>> I guess when using PetscDSSetJacobianPreconditioner the >>>>>> preconditioner is recomputed at every Newton step, so for a constant mass >>>>>> matrix this might not be ideal. How can I avoid recomputing this at every >>>>>> Newton iteration? >>>>>> >>>>> >>>>> Maybe we need another flag like >>>>> >>>>> >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>>> >>>>> or we need to expand >>>>> >>>>> >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>>> >>>>> to separately cover the preconditioner matrix. However, both matrices >>>>> are computed by one call so this would >>>>> involve interface changes to user code, which we do not like to do. >>>>> Right now it seems like a small optimization. >>>>> I would want to wait and see whether it would really be maningful. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> Sander >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 3 08:33:08 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 3 Mar 2016 08:33:08 -0600 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: On Thu, Mar 3, 2016 at 8:05 AM, Sander Arens wrote: > Yes, the matrix is created by one of the assembly functions in Plex, so > that answers my question. I was confused by this because I couldn't see > this information when using -snes_view. > Hmm, then something is wrong because the block size should be printed for the matrix at the end of the solver in -snes_view, Can you send that to me? Thanks, Matt > Thanks for the helpful replies, > Sander > > On 3 March 2016 at 14:52, Matthew Knepley wrote: > >> On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens >> wrote: >> >>> And how can I do this? 
Because when I look at all the options with -help >>> I can strangely enough only find -fieldsplit_pressure_mat_block_size and >>> not -fieldsplit_displacement_mat_block_size. >>> >> >> Maybe I am missing something here. The matrix from which you calculate >> the preconditioner using GAMG must be created somewhere. Why >> is the block size not specified at creation time? If the matrix is >> created by one of the assembly functions in Plex, and the PetscSection has >> a number of field components, then the block size will already be set. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Sander >>> >>> On 3 March 2016 at 14:21, Matthew Knepley wrote: >>> >>>> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens >>>> wrote: >>>> >>>>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J != >>>>> P, which caused to write the mass matrix into the (otherwise zero) (1,1) >>>>> block of the Jacobian and which was the reason for the linesearch to fail. >>>>> However, after fixing that and trying to solve it with FieldSplit with >>>>> LU factorization for the (0,0) block it failed because there were zero >>>>> pivots for all rows. >>>>> >>>>> Anyway, I found out that attaching the mass matrix to the Lagrange >>>>> multiplier field also worked. >>>>> >>>>> Another related question for my elasticity problem: after creating the >>>>> rigid body modes with DMPlexCreateRigidBody and attaching it to the >>>>> displacement field, does the matrix block size of the (0,0) block still >>>>> have to be set for good performance with gamg? If so, how can I do this? >>>>> >>>> >>>> Yes, it should be enough to set the block size of the preconditioner >>>> matrix. >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Sander >>>>> >>>>> On 2 March 2016 at 12:25, Matthew Knepley wrote: >>>>> >>>>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>>>> complement of an incompressible finite elasticity problem. I tried using >>>>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). >>>>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in >>>>>>> the function evaluation after Newton iteration 1. (Btw, I'm using the next >>>>>>> branch). >>>>>>> >>>>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner for >>>>>>> the other blocks (which uses the Jacobian itself for preconditioning)? If >>>>>>> so, how can I tell Petsc to use the Jacobian for those blocks? >>>>>>> >>>>>> >>>>>> 1) I put that code in very recently, and do not even have sufficient >>>>>> test, so it may be buggy >>>>>> >>>>>> 2) If you are using FieldSplit, you can control which blocks come >>>>>> from A and which come from the preconditioner P >>>>>> >>>>>> >>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>>>> >>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>>>> >>>>>> >>>>>>> I guess when using PetscDSSetJacobianPreconditioner the >>>>>>> preconditioner is recomputed at every Newton step, so for a constant mass >>>>>>> matrix this might not be ideal. How can I avoid recomputing this at every >>>>>>> Newton iteration? 
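On the "recomputed at every Newton step" question just above: until there is a separate lag for the preconditioning matrix, the existing SNES lagging controls can at least reduce how often the pair is rebuilt, with the caveat from this thread that both matrices come from one assembly call and are therefore lagged together. A sketch with an arbitrary lag of 2 (the same effect should be available as -snes_lag_jacobian 2 -snes_lag_preconditioner 2 from the options database); the function name is made up:

#include <petscsnes.h>

/* Sketch: reassemble the Jacobian (and with it the preconditioning
   matrix) only every second Newton iteration, and refactor the
   preconditioner every second time the Jacobian is rebuilt. This is not
   a separate lag for the preconditioning matrix alone. */
PetscErrorCode LagJacobianRebuilds(SNES snes)
{
  PetscErrorCode ierr;

  ierr = SNESSetLagJacobian(snes, 2);CHKERRQ(ierr);
  ierr = SNESSetLagPreconditioner(snes, 2);CHKERRQ(ierr);
  return 0;
}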
>>>>>>> >>>>>> >>>>>> Maybe we need another flag like >>>>>> >>>>>> >>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>>>> >>>>>> or we need to expand >>>>>> >>>>>> >>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>>>> >>>>>> to separately cover the preconditioner matrix. However, both matrices >>>>>> are computed by one call so this would >>>>>> involve interface changes to user code, which we do not like to do. >>>>>> Right now it seems like a small optimization. >>>>>> I would want to wait and see whether it would really be maningful. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Sander >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Sander.Arens at UGent.be Thu Mar 3 10:28:23 2016 From: Sander.Arens at UGent.be (Sander Arens) Date: Thu, 3 Mar 2016 17:28:23 +0100 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: Sure, here it is. Thanks, Sander On 3 March 2016 at 15:33, Matthew Knepley wrote: > On Thu, Mar 3, 2016 at 8:05 AM, Sander Arens > wrote: > >> Yes, the matrix is created by one of the assembly functions in Plex, so >> that answers my question. I was confused by this because I couldn't see >> this information when using -snes_view. >> > > Hmm, then something is wrong because the block size should be printed for > the matrix at the end of the solver in -snes_view, Can you > send that to me? > > Thanks, > > Matt > > >> Thanks for the helpful replies, >> Sander >> >> On 3 March 2016 at 14:52, Matthew Knepley wrote: >> >>> On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens >>> wrote: >>> >>>> And how can I do this? Because when I look at all the options with >>>> -help I can strangely enough only find -fieldsplit_pressure_mat_block_size >>>> and not -fieldsplit_displacement_mat_block_size. >>>> >>> >>> Maybe I am missing something here. The matrix from which you calculate >>> the preconditioner using GAMG must be created somewhere. Why >>> is the block size not specified at creation time? If the matrix is >>> created by one of the assembly functions in Plex, and the PetscSection has >>> a number of field components, then the block size will already be set. 
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Sander >>>> >>>> On 3 March 2016 at 14:21, Matthew Knepley wrote: >>>> >>>>> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens >>>>> wrote: >>>>> >>>>>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J >>>>>> != P, which caused to write the mass matrix into the (otherwise zero) (1,1) >>>>>> block of the Jacobian and which was the reason for the linesearch to fail. >>>>>> However, after fixing that and trying to solve it with FieldSplit >>>>>> with LU factorization for the (0,0) block it failed because there were zero >>>>>> pivots for all rows. >>>>>> >>>>>> Anyway, I found out that attaching the mass matrix to the Lagrange >>>>>> multiplier field also worked. >>>>>> >>>>>> Another related question for my elasticity problem: after creating >>>>>> the rigid body modes with DMPlexCreateRigidBody and attaching it to the >>>>>> displacement field, does the matrix block size of the (0,0) block still >>>>>> have to be set for good performance with gamg? If so, how can I do this? >>>>>> >>>>> >>>>> Yes, it should be enough to set the block size of the preconditioner >>>>> matrix. >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> Sander >>>>>> >>>>>> On 2 March 2016 at 12:25, Matthew Knepley wrote: >>>>>> >>>>>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>>>>> complement of an incompressible finite elasticity problem. I tried using >>>>>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). >>>>>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf in >>>>>>>> the function evaluation after Newton iteration 1. (Btw, I'm using the next >>>>>>>> branch). >>>>>>>> >>>>>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner for >>>>>>>> the other blocks (which uses the Jacobian itself for preconditioning)? If >>>>>>>> so, how can I tell Petsc to use the Jacobian for those blocks? >>>>>>>> >>>>>>> >>>>>>> 1) I put that code in very recently, and do not even have sufficient >>>>>>> test, so it may be buggy >>>>>>> >>>>>>> 2) If you are using FieldSplit, you can control which blocks come >>>>>>> from A and which come from the preconditioner P >>>>>>> >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>>>>> >>>>>>> >>>>>>>> I guess when using PetscDSSetJacobianPreconditioner the >>>>>>>> preconditioner is recomputed at every Newton step, so for a constant mass >>>>>>>> matrix this might not be ideal. How can I avoid recomputing this at every >>>>>>>> Newton iteration? >>>>>>>> >>>>>>> >>>>>>> Maybe we need another flag like >>>>>>> >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>>>>> >>>>>>> or we need to expand >>>>>>> >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>>>>> >>>>>>> to separately cover the preconditioner matrix. However, both >>>>>>> matrices are computed by one call so this would >>>>>>> involve interface changes to user code, which we do not like to do. >>>>>>> Right now it seems like a small optimization. 
>>>>>>> I would want to wait and see whether it would really be maningful. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Sander >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: snes Type: application/octet-stream Size: 20713 bytes Desc: not available URL: From knepley at gmail.com Thu Mar 3 10:29:19 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 3 Mar 2016 10:29:19 -0600 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: You can see the block size in the output Mat Object: 1 MPI processes type: seqaij rows=12, cols=12, bs=6 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12, bs=6 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 3 nodes, limit used is 5 Thanks, Matt On Thu, Mar 3, 2016 at 10:28 AM, Sander Arens wrote: > Sure, here it is. > > Thanks, > Sander > > On 3 March 2016 at 15:33, Matthew Knepley wrote: > >> On Thu, Mar 3, 2016 at 8:05 AM, Sander Arens >> wrote: >> >>> Yes, the matrix is created by one of the assembly functions in Plex, so >>> that answers my question. I was confused by this because I couldn't see >>> this information when using -snes_view. >>> >> >> Hmm, then something is wrong because the block size should be printed for >> the matrix at the end of the solver in -snes_view, Can you >> send that to me? >> >> Thanks, >> >> Matt >> >> >>> Thanks for the helpful replies, >>> Sander >>> >>> On 3 March 2016 at 14:52, Matthew Knepley wrote: >>> >>>> On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens >>>> wrote: >>>> >>>>> And how can I do this? Because when I look at all the options with >>>>> -help I can strangely enough only find -fieldsplit_pressure_mat_block_size >>>>> and not -fieldsplit_displacement_mat_block_size. >>>>> >>>> >>>> Maybe I am missing something here. The matrix from which you calculate >>>> the preconditioner using GAMG must be created somewhere. Why >>>> is the block size not specified at creation time? If the matrix is >>>> created by one of the assembly functions in Plex, and the PetscSection has >>>> a number of field components, then the block size will already be set. 
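If -snes_view does not show it, the block size that GAMG actually receives can also be checked directly on the split's operators once the FieldSplit PC has been set up (for example after the first solve). A sketch, assuming a Schur FieldSplit in which split 0 is the displacement block; the function name is invented:

#include <petscsnes.h>

/* Sketch: pull the KSP of split 0 out of the FieldSplit PC and print the
   block size of the matrix its preconditioner is built from. Call after
   the PC has been set up so the splits exist. */
PetscErrorCode CheckDisplacementBlockSize(SNES snes)
{
  KSP            ksp, *subksp;
  PC             pc;
  Mat            A, P;
  PetscInt       nsplits, bs;
  PetscErrorCode ierr;

  ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);CHKERRQ(ierr);
  ierr = KSPGetOperators(subksp[0], &A, &P);CHKERRQ(ierr);
  ierr = MatGetBlockSize(P, &bs);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "(0,0) block size: %D\n", bs);CHKERRQ(ierr);
  ierr = PetscFree(subksp);CHKERRQ(ierr);   /* free the array, not the KSPs */
  return 0;
}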
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Sander >>>>> >>>>> On 3 March 2016 at 14:21, Matthew Knepley wrote: >>>>> >>>>>> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens >>>>>> wrote: >>>>>> >>>>>>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J >>>>>>> != P, which caused to write the mass matrix into the (otherwise zero) (1,1) >>>>>>> block of the Jacobian and which was the reason for the linesearch to fail. >>>>>>> However, after fixing that and trying to solve it with FieldSplit >>>>>>> with LU factorization for the (0,0) block it failed because there were zero >>>>>>> pivots for all rows. >>>>>>> >>>>>>> Anyway, I found out that attaching the mass matrix to the Lagrange >>>>>>> multiplier field also worked. >>>>>>> >>>>>>> Another related question for my elasticity problem: after creating >>>>>>> the rigid body modes with DMPlexCreateRigidBody and attaching it to the >>>>>>> displacement field, does the matrix block size of the (0,0) block still >>>>>>> have to be set for good performance with gamg? If so, how can I do this? >>>>>>> >>>>>> >>>>>> Yes, it should be enough to set the block size of the preconditioner >>>>>> matrix. >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Sander >>>>>>> >>>>>>> On 2 March 2016 at 12:25, Matthew Knepley wrote: >>>>>>> >>>>>>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens >>>>>>> > wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>>>>>> complement of an incompressible finite elasticity problem. I tried using >>>>>>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>>>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). >>>>>>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf >>>>>>>>> in the function evaluation after Newton iteration 1. (Btw, I'm using the >>>>>>>>> next branch). >>>>>>>>> >>>>>>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner for >>>>>>>>> the other blocks (which uses the Jacobian itself for preconditioning)? If >>>>>>>>> so, how can I tell Petsc to use the Jacobian for those blocks? >>>>>>>>> >>>>>>>> >>>>>>>> 1) I put that code in very recently, and do not even have >>>>>>>> sufficient test, so it may be buggy >>>>>>>> >>>>>>>> 2) If you are using FieldSplit, you can control which blocks come >>>>>>>> from A and which come from the preconditioner P >>>>>>>> >>>>>>>> >>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>>>>>> >>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>>>>>> >>>>>>>> >>>>>>>>> I guess when using PetscDSSetJacobianPreconditioner the >>>>>>>>> preconditioner is recomputed at every Newton step, so for a constant mass >>>>>>>>> matrix this might not be ideal. How can I avoid recomputing this at every >>>>>>>>> Newton iteration? >>>>>>>>> >>>>>>>> >>>>>>>> Maybe we need another flag like >>>>>>>> >>>>>>>> >>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>>>>>> >>>>>>>> or we need to expand >>>>>>>> >>>>>>>> >>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>>>>>> >>>>>>>> to separately cover the preconditioner matrix. 
However, both >>>>>>>> matrices are computed by one call so this would >>>>>>>> involve interface changes to user code, which we do not like to do. >>>>>>>> Right now it seems like a small optimization. >>>>>>>> I would want to wait and see whether it would really be maningful. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sander >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Sander.Arens at UGent.be Thu Mar 3 10:48:20 2016 From: Sander.Arens at UGent.be (Sander Arens) Date: Thu, 3 Mar 2016 17:48:20 +0100 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: So is the block size here refering to the dimension of the near-nullspace or has it something to do with the coarsening? I would have thought to also see the block size in this part: Mat Object: 1 MPI processes type: seqaij rows=1701, cols=1701 total: nonzeros=109359, allocated nonzeros=109359 total number of mallocs used during MatSetValues calls =0 has attached near null space using I-node routines: found 567 nodes, limit used is 5 Thanks, Sander On 3 March 2016 at 17:29, Matthew Knepley wrote: > You can see the block size in the output > > Mat Object: 1 MPI processes > type: seqaij > rows=12, cols=12, bs=6 > total: nonzeros=144, allocated nonzeros=144 > total number of mallocs used during MatSetValues calls > =0 > using I-node routines: found 3 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=12, cols=12, bs=6 > total: nonzeros=144, allocated nonzeros=144 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 3 nodes, limit used is 5 > > Thanks, > > Matt > > On Thu, Mar 3, 2016 at 10:28 AM, Sander Arens > wrote: > >> Sure, here it is. >> >> Thanks, >> Sander >> >> On 3 March 2016 at 15:33, Matthew Knepley wrote: >> >>> On Thu, Mar 3, 2016 at 8:05 AM, Sander Arens >>> wrote: >>> >>>> Yes, the matrix is created by one of the assembly functions in Plex, so >>>> that answers my question. I was confused by this because I couldn't see >>>> this information when using -snes_view. 
>>>> >>> >>> Hmm, then something is wrong because the block size should be printed >>> for the matrix at the end of the solver in -snes_view, Can you >>> send that to me? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks for the helpful replies, >>>> Sander >>>> >>>> On 3 March 2016 at 14:52, Matthew Knepley wrote: >>>> >>>>> On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens >>>>> wrote: >>>>> >>>>>> And how can I do this? Because when I look at all the options with >>>>>> -help I can strangely enough only find -fieldsplit_pressure_mat_block_size >>>>>> and not -fieldsplit_displacement_mat_block_size. >>>>>> >>>>> >>>>> Maybe I am missing something here. The matrix from which you calculate >>>>> the preconditioner using GAMG must be created somewhere. Why >>>>> is the block size not specified at creation time? If the matrix is >>>>> created by one of the assembly functions in Plex, and the PetscSection has >>>>> a number of field components, then the block size will already be set. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> Sander >>>>>> >>>>>> On 3 March 2016 at 14:21, Matthew Knepley wrote: >>>>>> >>>>>>> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens >>>>>>> wrote: >>>>>>> >>>>>>>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with J >>>>>>>> != P, which caused to write the mass matrix into the (otherwise zero) (1,1) >>>>>>>> block of the Jacobian and which was the reason for the linesearch to fail. >>>>>>>> However, after fixing that and trying to solve it with FieldSplit >>>>>>>> with LU factorization for the (0,0) block it failed because there were zero >>>>>>>> pivots for all rows. >>>>>>>> >>>>>>>> Anyway, I found out that attaching the mass matrix to the Lagrange >>>>>>>> multiplier field also worked. >>>>>>>> >>>>>>>> Another related question for my elasticity problem: after creating >>>>>>>> the rigid body modes with DMPlexCreateRigidBody and attaching it to the >>>>>>>> displacement field, does the matrix block size of the (0,0) block still >>>>>>>> have to be set for good performance with gamg? If so, how can I do this? >>>>>>>> >>>>>>> >>>>>>> Yes, it should be enough to set the block size of the preconditioner >>>>>>> matrix. >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Sander >>>>>>>> >>>>>>>> On 2 March 2016 at 12:25, Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens < >>>>>>>>> Sander.Arens at ugent.be> wrote: >>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>>>>>>> complement of an incompressible finite elasticity problem. I tried using >>>>>>>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>>>>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). >>>>>>>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf >>>>>>>>>> in the function evaluation after Newton iteration 1. (Btw, I'm using the >>>>>>>>>> next branch). >>>>>>>>>> >>>>>>>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner for >>>>>>>>>> the other blocks (which uses the Jacobian itself for preconditioning)? If >>>>>>>>>> so, how can I tell Petsc to use the Jacobian for those blocks? 
>>>>>>>>>> >>>>>>>>> >>>>>>>>> 1) I put that code in very recently, and do not even have >>>>>>>>> sufficient test, so it may be buggy >>>>>>>>> >>>>>>>>> 2) If you are using FieldSplit, you can control which blocks come >>>>>>>>> from A and which come from the preconditioner P >>>>>>>>> >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>>>>>>> >>>>>>>>> >>>>>>>>>> I guess when using PetscDSSetJacobianPreconditioner the >>>>>>>>>> preconditioner is recomputed at every Newton step, so for a constant mass >>>>>>>>>> matrix this might not be ideal. How can I avoid recomputing this at every >>>>>>>>>> Newton iteration? >>>>>>>>>> >>>>>>>>> >>>>>>>>> Maybe we need another flag like >>>>>>>>> >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>>>>>>> >>>>>>>>> or we need to expand >>>>>>>>> >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>>>>>>> >>>>>>>>> to separately cover the preconditioner matrix. However, both >>>>>>>>> matrices are computed by one call so this would >>>>>>>>> involve interface changes to user code, which we do not like to >>>>>>>>> do. Right now it seems like a small optimization. >>>>>>>>> I would want to wait and see whether it would really be maningful. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Sander >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 3 11:00:13 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 3 Mar 2016 11:00:13 -0600 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: On Thu, Mar 3, 2016 at 10:48 AM, Sander Arens wrote: > So is the block size here refering to the dimension of the near-nullspace > or has it something to do with the coarsening? > > I would have thought to also see the block size in this part: > Dang, that is a problem. 
It should have the correct block size when FS pulls it out of the overall matrix. This should be passed down using the block size of the IS. Something has been broken here. I will put it on my list of things to look at. Thanks, Matt > Mat Object: 1 MPI processes > type: seqaij > rows=1701, cols=1701 > total: nonzeros=109359, allocated nonzeros=109359 > total number of mallocs used during MatSetValues calls =0 > has attached near null space > using I-node routines: found 567 nodes, limit used is > 5 > > Thanks, > Sander > > > On 3 March 2016 at 17:29, Matthew Knepley wrote: > >> You can see the block size in the output >> >> Mat Object: 1 MPI processes >> type: seqaij >> rows=12, cols=12, bs=6 >> total: nonzeros=144, allocated nonzeros=144 >> total number of mallocs used during MatSetValues >> calls =0 >> using I-node routines: found 3 nodes, limit used is >> 5 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=12, cols=12, bs=6 >> total: nonzeros=144, allocated nonzeros=144 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 3 nodes, limit used is 5 >> >> Thanks, >> >> Matt >> >> On Thu, Mar 3, 2016 at 10:28 AM, Sander Arens >> wrote: >> >>> Sure, here it is. >>> >>> Thanks, >>> Sander >>> >>> On 3 March 2016 at 15:33, Matthew Knepley wrote: >>> >>>> On Thu, Mar 3, 2016 at 8:05 AM, Sander Arens >>>> wrote: >>>> >>>>> Yes, the matrix is created by one of the assembly functions in Plex, >>>>> so that answers my question. I was confused by this because I couldn't see >>>>> this information when using -snes_view. >>>>> >>>> >>>> Hmm, then something is wrong because the block size should be printed >>>> for the matrix at the end of the solver in -snes_view, Can you >>>> send that to me? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks for the helpful replies, >>>>> Sander >>>>> >>>>> On 3 March 2016 at 14:52, Matthew Knepley wrote: >>>>> >>>>>> On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens >>>>>> wrote: >>>>>> >>>>>>> And how can I do this? Because when I look at all the options with >>>>>>> -help I can strangely enough only find -fieldsplit_pressure_mat_block_size >>>>>>> and not -fieldsplit_displacement_mat_block_size. >>>>>>> >>>>>> >>>>>> Maybe I am missing something here. The matrix from which you >>>>>> calculate the preconditioner using GAMG must be created somewhere. Why >>>>>> is the block size not specified at creation time? If the matrix is >>>>>> created by one of the assembly functions in Plex, and the PetscSection has >>>>>> a number of field components, then the block size will already be set. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Sander >>>>>>> >>>>>>> On 3 March 2016 at 14:21, Matthew Knepley wrote: >>>>>>> >>>>>>>> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens >>>>>>> > wrote: >>>>>>>> >>>>>>>>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with >>>>>>>>> J != P, which caused to write the mass matrix into the (otherwise zero) >>>>>>>>> (1,1) block of the Jacobian and which was the reason for the linesearch to >>>>>>>>> fail. >>>>>>>>> However, after fixing that and trying to solve it with FieldSplit >>>>>>>>> with LU factorization for the (0,0) block it failed because there were zero >>>>>>>>> pivots for all rows. >>>>>>>>> >>>>>>>>> Anyway, I found out that attaching the mass matrix to the Lagrange >>>>>>>>> multiplier field also worked. 
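While that propagation is broken, a possible workaround in the spirit of "passed down using the block size of the IS" is to make sure the index set defining the displacement split carries the block size before FieldSplit extracts the (0,0) submatrix. This only applies when the splits are defined by hand rather than by the DM; the split name, the block size of 3, and the function name below are all assumptions:

#include <petscksp.h>

/* Sketch: attach the block size to the IS that defines the displacement
   split; the submatrix FieldSplit extracts for GAMG is then meant to
   inherit it. 'is_disp' is assumed to list the displacement dofs. */
PetscErrorCode DefineDisplacementSplit(PC pc, IS is_disp)
{
  PetscErrorCode ierr;

  ierr = ISSetBlockSize(is_disp, 3);CHKERRQ(ierr);   /* 3 dofs per node in 3D */
  ierr = PCFieldSplitSetIS(pc, "displacement", is_disp);CHKERRQ(ierr);
  return 0;
}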
>>>>>>>>> >>>>>>>>> Another related question for my elasticity problem: after creating >>>>>>>>> the rigid body modes with DMPlexCreateRigidBody and attaching it to the >>>>>>>>> displacement field, does the matrix block size of the (0,0) block still >>>>>>>>> have to be set for good performance with gamg? If so, how can I do this? >>>>>>>>> >>>>>>>> >>>>>>>> Yes, it should be enough to set the block size of the >>>>>>>> preconditioner matrix. >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sander >>>>>>>>> >>>>>>>>> On 2 March 2016 at 12:25, Matthew Knepley >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens < >>>>>>>>>> Sander.Arens at ugent.be> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>>>>>>>> complement of an incompressible finite elasticity problem. I tried using >>>>>>>>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>>>>>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). >>>>>>>>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or Inf >>>>>>>>>>> in the function evaluation after Newton iteration 1. (Btw, I'm using the >>>>>>>>>>> next branch). >>>>>>>>>>> >>>>>>>>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner >>>>>>>>>>> for the other blocks (which uses the Jacobian itself for preconditioning)? >>>>>>>>>>> If so, how can I tell Petsc to use the Jacobian for those blocks? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 1) I put that code in very recently, and do not even have >>>>>>>>>> sufficient test, so it may be buggy >>>>>>>>>> >>>>>>>>>> 2) If you are using FieldSplit, you can control which blocks come >>>>>>>>>> from A and which come from the preconditioner P >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>>>>>>>> >>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> I guess when using PetscDSSetJacobianPreconditioner the >>>>>>>>>>> preconditioner is recomputed at every Newton step, so for a constant mass >>>>>>>>>>> matrix this might not be ideal. How can I avoid recomputing this at every >>>>>>>>>>> Newton iteration? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Maybe we need another flag like >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>>>>>>>> >>>>>>>>>> or we need to expand >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>>>>>>>> >>>>>>>>>> to separately cover the preconditioner matrix. However, both >>>>>>>>>> matrices are computed by one call so this would >>>>>>>>>> involve interface changes to user code, which we do not like to >>>>>>>>>> do. Right now it seems like a small optimization. >>>>>>>>>> I would want to wait and see whether it would really be maningful. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Sander >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. 
>>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Sander.Arens at UGent.be Thu Mar 3 11:02:44 2016 From: Sander.Arens at UGent.be (Sander Arens) Date: Thu, 3 Mar 2016 18:02:44 +0100 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: Ok, thanks for clearing that up! On 3 March 2016 at 18:00, Matthew Knepley wrote: > On Thu, Mar 3, 2016 at 10:48 AM, Sander Arens > wrote: > >> So is the block size here refering to the dimension of the near-nullspace >> or has it something to do with the coarsening? >> >> I would have thought to also see the block size in this part: >> > > Dang, that is a problem. It should have the correct block size when FS > pulls it out of the overall matrix. This should be > passed down using the block size of the IS. Something has been broken > here. I will put it on my list of things to look at. > > Thanks, > > Matt > > >> Mat Object: 1 MPI processes >> type: seqaij >> rows=1701, cols=1701 >> total: nonzeros=109359, allocated nonzeros=109359 >> total number of mallocs used during MatSetValues calls >> =0 >> has attached near null space >> using I-node routines: found 567 nodes, limit used >> is 5 >> >> Thanks, >> Sander >> >> >> On 3 March 2016 at 17:29, Matthew Knepley wrote: >> >>> You can see the block size in the output >>> >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=12, cols=12, bs=6 >>> total: nonzeros=144, allocated nonzeros=144 >>> total number of mallocs used during MatSetValues >>> calls =0 >>> using I-node routines: found 3 nodes, limit used >>> is 5 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=12, cols=12, bs=6 >>> total: nonzeros=144, allocated nonzeros=144 >>> total number of mallocs used during MatSetValues calls =0 >>> using I-node routines: found 3 nodes, limit used is 5 >>> >>> Thanks, >>> >>> Matt >>> >>> On Thu, Mar 3, 2016 at 10:28 AM, Sander Arens >>> wrote: >>> >>>> Sure, here it is. >>>> >>>> Thanks, >>>> Sander >>>> >>>> On 3 March 2016 at 15:33, Matthew Knepley wrote: >>>> >>>>> On Thu, Mar 3, 2016 at 8:05 AM, Sander Arens >>>>> wrote: >>>>> >>>>>> Yes, the matrix is created by one of the assembly functions in Plex, >>>>>> so that answers my question. 
I was confused by this because I couldn't see >>>>>> this information when using -snes_view. >>>>>> >>>>> >>>>> Hmm, then something is wrong because the block size should be printed >>>>> for the matrix at the end of the solver in -snes_view, Can you >>>>> send that to me? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks for the helpful replies, >>>>>> Sander >>>>>> >>>>>> On 3 March 2016 at 14:52, Matthew Knepley wrote: >>>>>> >>>>>>> On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens >>>>>>> wrote: >>>>>>> >>>>>>>> And how can I do this? Because when I look at all the options with >>>>>>>> -help I can strangely enough only find -fieldsplit_pressure_mat_block_size >>>>>>>> and not -fieldsplit_displacement_mat_block_size. >>>>>>>> >>>>>>> >>>>>>> Maybe I am missing something here. The matrix from which you >>>>>>> calculate the preconditioner using GAMG must be created somewhere. Why >>>>>>> is the block size not specified at creation time? If the matrix is >>>>>>> created by one of the assembly functions in Plex, and the PetscSection has >>>>>>> a number of field components, then the block size will already be >>>>>>> set. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Sander >>>>>>>> >>>>>>>> On 3 March 2016 at 14:21, Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens < >>>>>>>>> Sander.Arens at ugent.be> wrote: >>>>>>>>> >>>>>>>>>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) with >>>>>>>>>> J != P, which caused to write the mass matrix into the (otherwise zero) >>>>>>>>>> (1,1) block of the Jacobian and which was the reason for the linesearch to >>>>>>>>>> fail. >>>>>>>>>> However, after fixing that and trying to solve it with FieldSplit >>>>>>>>>> with LU factorization for the (0,0) block it failed because there were zero >>>>>>>>>> pivots for all rows. >>>>>>>>>> >>>>>>>>>> Anyway, I found out that attaching the mass matrix to the >>>>>>>>>> Lagrange multiplier field also worked. >>>>>>>>>> >>>>>>>>>> Another related question for my elasticity problem: after >>>>>>>>>> creating the rigid body modes with DMPlexCreateRigidBody and attaching it >>>>>>>>>> to the displacement field, does the matrix block size of the (0,0) block >>>>>>>>>> still have to be set for good performance with gamg? If so, how can I do >>>>>>>>>> this? >>>>>>>>>> >>>>>>>>> >>>>>>>>> Yes, it should be enough to set the block size of the >>>>>>>>> preconditioner matrix. >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Sander >>>>>>>>>> >>>>>>>>>> On 2 March 2016 at 12:25, Matthew Knepley >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens < >>>>>>>>>>> Sander.Arens at ugent.be> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>>>>>>>>> complement of an incompressible finite elasticity problem. I tried using >>>>>>>>>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>>>>>>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). >>>>>>>>>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or >>>>>>>>>>>> Inf in the function evaluation after Newton iteration 1. (Btw, I'm using >>>>>>>>>>>> the next branch). >>>>>>>>>>>> >>>>>>>>>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner >>>>>>>>>>>> for the other blocks (which uses the Jacobian itself for preconditioning)? 
>>>>>>>>>>>> If so, how can I tell Petsc to use the Jacobian for those blocks? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 1) I put that code in very recently, and do not even have >>>>>>>>>>> sufficient test, so it may be buggy >>>>>>>>>>> >>>>>>>>>>> 2) If you are using FieldSplit, you can control which blocks >>>>>>>>>>> come from A and which come from the preconditioner P >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>>>>>>>>> >>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> I guess when using PetscDSSetJacobianPreconditioner the >>>>>>>>>>>> preconditioner is recomputed at every Newton step, so for a constant mass >>>>>>>>>>>> matrix this might not be ideal. How can I avoid recomputing this at every >>>>>>>>>>>> Newton iteration? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Maybe we need another flag like >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>>>>>>>>> >>>>>>>>>>> or we need to expand >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>>>>>>>>> >>>>>>>>>>> to separately cover the preconditioner matrix. However, both >>>>>>>>>>> matrices are computed by one call so this would >>>>>>>>>>> involve interface changes to user code, which we do not like to >>>>>>>>>>> do. Right now it seems like a small optimization. >>>>>>>>>>> I would want to wait and see whether it would really be >>>>>>>>>>> maningful. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Sander >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Thu Mar 3 14:19:11 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 3 Mar 2016 15:19:11 -0500 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: Message-ID: On Wed, Mar 2, 2016 at 5:28 PM, Justin Chang wrote: > Dear all, > > Using the firedrake project, I am solving this simple mixed poisson > problem: > > mesh = UnitCubeMesh(40,40,40) > V = FunctionSpace(mesh,"RT",1) > Q = FunctionSpace(mesh,"DG",0) > W = V*Q > > v, p = TrialFunctions(W) > w, q = TestFunctions(W) > > f = Function(Q) > > f.interpolate(Expression("12*pi*pi*sin(pi*x[0]*2)*sin(pi*x[1]*2)*sin(2*pi*x[2])")) > > a = dot(v,w)*dx - p*div(w)*dx + div(v)*q*dx > L = f*q*dx > > u = Function(W) > solve(a==L,u,solver_parameters={...}) > > This problem has 1161600 degrees of freedom. The solver_parameters are: > > -ksp_type gmres > -pc_type fieldsplit > -pc_fieldsplit_type schur > -pc_fieldsplit_schur_fact_type: upper > -pc_fieldsplit_schur_precondition selfp > -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type bjacobi > -fieldsplit_1_ksp_type preonly > -fieldsplit_1_pc_type hypre/ml/gamg > > for the last option, I compared the wall-clock timings for hypre, ml,and > gamg. Here are the strong-scaling results (across 64 cores, 8 cores per > Intel Xeon E5-2670 node) for hypre, ml, and gamg: > > hypre: > 1 core: 47.5 s, 12 solver iters > 2 cores: 34.1 s, 15 solver iters > 4 cores: 21.5 s, 15 solver iters > 8 cores: 16.6 s, 15 solver iters > 16 cores: 10.2 s, 15 solver iters > 24 cores: 7.66 s, 15 solver iters > 32 cores: 6.31 s, 15 solver iters > 40 cores: 5.68 s, 15 solver iters > 48 cores: 5.36 s, 16 solver iters > 56 cores: 5.12 s, 16 solver iters > 64 cores: 4.99 s, 16 solver iters > > ml: > 1 core: 4.44 s, 14 solver iters > 2 cores: 2.85 s, 16 solver iters > 4 cores: 1.6 s, 17 solver iters > 8 cores: 0.966 s, 17 solver iters > 16 cores: 0.585 s, 18 solver iters > 24 cores: 0.440 s, 18 solver iters > 32 cores: 0.375 s, 18 solver iters > 40 cores: 0.332 s, 18 solver iters > 48 cores: 0.307 s, 17 solver iters > 56 cores: 0.290 s, 18 solver iters > 64 cores: 0.281 s, 18 solver items > > gamg: > 1 core: 613 s, 12 solver iters > 2 cores: 204 s, 15 solver iters > 4 cores: 77.1 s, 15 solver iters > 8 cores: 38.1 s, 15 solver iters > 16 cores: 15.9 s, 16 solver iters > 24 cores: 9.24 s, 16 solver iters > 32 cores: 5.92 s, 16 solver iters > 40 cores: 4.72 s, 16 solver iters > 48 cores: 3.89 s, 16 solver iters > 56 cores: 3.65 s, 16 solver iters > 64 cores: 3.46 s, 16 solver iters > > The performance difference between ML and HYPRE makes sense to me, but > what I am really confused about is GAMG. It seems GAMG is really slow on a > single core but something internally is causing it to speed up > super-linearly as I increase the number of MPI processes. Shouldn't ML and > GAMG have the same performance? I am not sure what log outputs to give you > guys, but for starters, below is -ksp_view for the single core case with > GAMG > > KSP Object:(solver_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-07, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object:(solver_) 1 MPI processes > > type: fieldsplit > > FieldSplit with Schur preconditioner, factorization UPPER > > Preconditioner for the Schur complement formed from Sp, an assembled > approximation to S, which uses (lumped, if requested) A00's diagonal's > inverse > > Split info: > > Split number 0 Defined by IS > > Split number 1 Defined by IS > > KSP solver for A00 block > > KSP Object: (solver_fieldsplit_0_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_) 1 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the following KSP and PC > objects: > > KSP Object: (solver_fieldsplit_0_sub_) 1 MPI > processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_sub_) 1 MPI > processes > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 1., needed 1. > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > package used to perform factorization: petsc > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: (solver_fieldsplit_0_) 1 > MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: (solver_fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP solver for S = A11 - A10 inv(A00) A01 > > KSP Object: (solver_fieldsplit_1_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_) 1 MPI processes > > type: gamg > > MG: type is MULTIPLICATIVE, levels=5 cycles=v > > Cycles per PCApply=1 > > Using Galerkin computed coarse grid matrices > > GAMG specific options > > Threshold for dropping small values from graph 0. > > AGG specific options > > Symmetric graph false > > Coarse grid solver -- level ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_coarse_) > 1 MPI processes > > type: preonly > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_coarse_) 1 > MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the following KSP and > PC objects: > > KSP Object: > (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_coarse_sub_) > 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > > matrix ordering: nd > > factor fill ratio given 5., needed 1. > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=9, cols=9 > > package used to perform factorization: petsc > > total: nonzeros=81, allocated nonzeros=81 > > total number of mallocs used during MatSetValues > calls =0 > > using I-node routines: found 2 nodes, limit used > is 5 > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=9, cols=9 > > total: nonzeros=81, allocated nonzeros=81 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 2 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=9, cols=9 > > total: nonzeros=81, allocated nonzeros=81 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 2 nodes, limit used is 5 > > Down solver (pre-smoother) on level 1 > ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_levels_1_) > 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.0999525, max = > 1.09948 > > Chebyshev: eigenvalues estimated using gmres with > translations [0. 0.1; 0. 1.1] > > KSP Object: > (solver_fieldsplit_1_mg_levels_1_esteig_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_levels_1_) > 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=207, cols=207 > > total: nonzeros=42849, allocated nonzeros=42849 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 42 nodes, limit used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 2 > ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_levels_2_) > 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.0996628, max = > 1.09629 > > Chebyshev: eigenvalues estimated using gmres with > translations [0. 0.1; 0. 
1.1] > > KSP Object: > (solver_fieldsplit_1_mg_levels_2_esteig_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_levels_2_) > 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=5373, cols=5373 > > total: nonzeros=28852043, allocated nonzeros=28852043 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 1481 nodes, limit used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 3 > ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_levels_3_) > 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.0994294, max = > 1.09372 > > Chebyshev: eigenvalues estimated using gmres with > translations [0. 0.1; 0. 1.1] > > KSP Object: > (solver_fieldsplit_1_mg_levels_3_esteig_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_levels_3_) > 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=52147, cols=52147 > > total: nonzeros=38604909, allocated nonzeros=38604909 > > total number of mallocs used during MatSetValues calls =2 > > not using I-node routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 4 > ------------------------------- > > KSP Object: (solver_fieldsplit_1_mg_levels_4_) > 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.158979, max = > 1.74876 > > Chebyshev: eigenvalues estimated using gmres with > translations [0. 0.1; 0. 1.1] > > KSP Object: > (solver_fieldsplit_1_mg_levels_4_esteig_) 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, > divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_1_mg_levels_4_) > 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1. > > linear system matrix followed by preconditioner matrix: > > Mat Object: (solver_fieldsplit_1_) 1 > MPI processes > > type: schurcomplement > > rows=384000, cols=384000 > > Schur complement A11 - A10 inv(A00) A01 > > A11 > > Mat Object: (solver_fieldsplit_1_) > 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=384000, allocated nonzeros=384000 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > A10 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=777600 > > total: nonzeros=1919999, allocated nonzeros=1919999 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > KSP of A00 > > KSP Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the following > KSP and PC objects: > > KSP Object: > (solver_fieldsplit_0_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: > (solver_fieldsplit_0_sub_) 1 MPI processes > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 1., needed 1. 
> > Factored matrix follows: > > Mat Object: 1 > MPI processes > > type: seqaij > > rows=777600, cols=777600 > > package used to perform factorization: > petsc > > total: nonzeros=5385600, allocated > nonzeros=5385600 > > total number of mallocs used during > MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: > (solver_fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated > nonzeros=5385600 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > A01 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=777600, cols=384000 > > total: nonzeros=1919999, allocated nonzeros=1919999 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=3416452, allocated nonzeros=3416452 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > linear system matrix followed by preconditioner matrix: > > Mat Object: (solver_fieldsplit_1_) 1 MPI processes > > type: schurcomplement > > rows=384000, cols=384000 > > Schur complement A11 - A10 inv(A00) A01 > > A11 > > Mat Object: (solver_fieldsplit_1_) > 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=384000, allocated nonzeros=384000 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > A10 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=777600 > > total: nonzeros=1919999, allocated nonzeros=1919999 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP of A00 > > KSP Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the following KSP > and PC objects: > > KSP Object: (solver_fieldsplit_0_sub_) > 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (solver_fieldsplit_0_sub_) > 1 MPI processes > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 1., needed 1. 
> > Factored matrix follows: > > Mat Object: 1 MPI > processes > > type: seqaij > > rows=777600, cols=777600 > > package used to perform factorization: petsc > > total: nonzeros=5385600, allocated > nonzeros=5385600 > > total number of mallocs used during > MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues > calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: (solver_fieldsplit_0_) > 1 MPI processes > > type: seqaij > > rows=777600, cols=777600 > > total: nonzeros=5385600, allocated nonzeros=5385600 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > A01 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=777600, cols=384000 > > total: nonzeros=1919999, allocated nonzeros=1919999 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Mat Object: 1 MPI processes > > type: seqaij > > rows=384000, cols=384000 > > total: nonzeros=3416452, allocated nonzeros=3416452 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: nest > > rows=1161600, cols=116160 > > Matrix object: > > type=nest, rows=2, cols=2 > > MatNest structure: > > (0,0) : prefix="solver_fieldsplit_0_", type=seqaij, rows=777600, > cols=777600 > > (0,1) : type=seqaij, rows=777600, cols=384000 > > (1,0) : type=seqaij, rows=384000, cols=777600 > > (1,1) : prefix="solver_fieldsplit_1_", type=seqaij, rows=384000, > cols=384000 > > Any insight as to what's happening? Btw this firedrake/petsc-mapdes is > from way back in october 2015 (yes much has > This should not be a problem. > changed since but reinstalling/updating firedrake and petsc on LANL's > firewall HPC machines is a big pain in the ass). > > Thanks, > Justin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Mar 3 14:48:53 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 3 Mar 2016 15:48:53 -0500 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: <56D80581.8030903@imperial.ac.uk> References: <56D80581.8030903@imperial.ac.uk> Message-ID: You have a very sparse 3D problem, with 9 non-zeros per row. It is coarsening very slowly and creating huge coarse grids. which are expensive to construct. The superlinear speedup is from cache effects, most likely. First try with: -pc_gamg_square_graph 10 ML must have some AI in there to do this automatically, because gamg are pretty similar algorithmically. There is a threshold parameter that is important (-pc_gamg_threshold <0.0>) and I think ML has the same default. ML is doing OK, but I would guess that if you use like 0.02 for MLs threshold you would see some improvement. Hypre is doing pretty bad also. I suspect that it is getting confused as well. I know less about how to deal with hypre. If you use -info and grep on GAMG you will see about 20 lines that will tell you the number of equations on level and the average number of non-zeros per row. In 3D the reduction per level should be -- very approximately -- 30x and the number of non-zeros per row should not explode, but getting up to several hundred is OK. 
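Concretely, that is something like the following (untested; "./myapp" is just a stand-in for whatever your executable is called):

./myapp [your usual solver options] -info | grep GAMG

That filters the very noisy -info output down to just the GAMG setup lines (PCSetUp_GAMG, PCGAMGFilterGraph, and so on), which report the number of equations and the average non-zeros per row on each level.
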
If you care to test this we should be able to get ML and GAMG to agree pretty well. ML is a nice solver, but our core numerics should be about the same. I tested this on a 3D elasticity problem a few years ago. That said, I think your ML solve is pretty good. Mark On Thu, Mar 3, 2016 at 4:36 AM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > On 02/03/16 22:28, Justin Chang wrote: > ... > > > > Down solver (pre-smoother) on level 3 > > > > KSP Object: (solver_fieldsplit_1_mg_levels_3_) > > linear system matrix = precond matrix: > ... > > Mat Object: 1 MPI processes > > > > type: seqaij > > > > rows=52147, cols=52147 > > > > total: nonzeros=38604909, allocated nonzeros=38604909 > > > > total number of mallocs used during MatSetValues calls =2 > > > > not using I-node routines > > > > Down solver (pre-smoother) on level 4 > > > > KSP Object: (solver_fieldsplit_1_mg_levels_4_) > > linear system matrix followed by preconditioner matrix: > > > > Mat Object: (solver_fieldsplit_1_) > > ... > > > > Mat Object: 1 MPI processes > > > > type: seqaij > > > > rows=384000, cols=384000 > > > > total: nonzeros=3416452, allocated nonzeros=3416452 > > > This looks pretty suspicious to me. The original matrix on the finest > level has 3.8e5 rows and ~3.4e6 nonzeros. The next level up, the > coarsening produces 5.2e4 rows, but 38e6 nonzeros. > > FWIW, although Justin's PETSc is from Oct 2015, I get the same > behaviour with: > > ad5697c (Master as of 1st March). > > If I compare with the coarse operators that ML produces on the same > problem: > > The original matrix has, again: > > Mat Object: 1 MPI processes > type: seqaij > rows=384000, cols=384000 > total: nonzeros=3416452, allocated nonzeros=3416452 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > While the next finest level has: > > Mat Object: 1 MPI processes > type: seqaij > rows=65258, cols=65258 > total: nonzeros=1318400, allocated nonzeros=1318400 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > So we have 6.5e4 rows and 1.3e6 nonzeros, which seems more plausible. > > Cheers, > > Lawrence > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From praveenpetsc at gmail.com Thu Mar 3 23:19:17 2016 From: praveenpetsc at gmail.com (praveen kumar) Date: Fri, 4 Mar 2016 10:49:17 +0530 Subject: [petsc-users] user defined grids In-Reply-To: <13BD5997-57EE-4095-9D76-56BB1B308C22@mcs.anl.gov> References: <13BD5997-57EE-4095-9D76-56BB1B308C22@mcs.anl.gov> Message-ID: Thanks Barry. I think I didn?t put the question properly. I want to generate a 3D non uniform grid. I went through the archieves and read about DMSetCoordinates. As I have large No. of grid points, it is difficult to assign coordinates manually. Thanks, Praveen On Thu, Mar 3, 2016 at 12:21 AM, Barry Smith wrote: > > > On Mar 2, 2016, at 9:01 AM, praveen kumar > wrote: > > > > Dear all, > > > > I am employing PETSc for DD in a serial fortran code and I would be > using the solver from serial code itself. In my programme there is > subroutine which reads an input file for the grids. I have two questions: > > > > > > > > 1. The input file contains : No. of segments in each direction, Length > of each segment, Grid expansion ratio for each segment, dx(min) and dx(max) > (min and max size of sub-division for each segment), > > I don't understand what this grid expansion ratio means. > > > No. of uniform sub-divisions in each segment. 
Will I be able to include > all these details in DMDAcreate3D? > > The DMDACreate3d() only defines the topology of the mesh (number of > mesh points in each direction etc), not the geometry. If you want to > provide geometry information (i.e. length of grid segments) then you use > DMDAGetCoordinateArray() to set the local values. See DMDAVecGetArray for > the form of the "coordinates" array. > > > Is there any example? If no, then is there any way to retain the input > file section and still use PETSc? > > Should be possible but you will need to do a little programming and > poking around to use all the information. > > Barry > > > > > > > > > 2. Moreover application requires that, I call the Grid subroutine after > some fixed number of iterations. > > > > > > > > Would you suggest how to fix the above two? > > > > > > > > Thanks, > > > > Praveen > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Mar 3 23:35:06 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 3 Mar 2016 23:35:06 -0600 Subject: [petsc-users] user defined grids In-Reply-To: References: <13BD5997-57EE-4095-9D76-56BB1B308C22@mcs.anl.gov> Message-ID: <5BE88259-66B4-4E99-9F41-2714943462FE@mcs.anl.gov> > On Mar 3, 2016, at 11:19 PM, praveen kumar wrote: > > Thanks Barry. I think I didn?t put the question properly. > > I want to generate a 3D non uniform grid. I went through the archieves and read about DMSetCoordinates. As I have large No. of grid points, it is difficult to assign coordinates manually. Is it a grid that is just stretched in the x, y, and z directions (so that the actual amount of data is just the nx, ny, and nz coordinate values) or is the grid twisted so that the amount of data in your file is nx*ny*nz (that is each grid point must be defined in the file)? If it is just stretched then have process zero read the x, y, and z spaces and send them all to all the processes (this is just nx + ny + nz data so small). Then have each process fill up its coordinate values using the part of the sent values that it needs. It can determine that part it needs by using DMDAGetCorners() to tell the start and end in each coordinate direction of the values. Barry > > > Thanks, > > Praveen > > > On Thu, Mar 3, 2016 at 12:21 AM, Barry Smith wrote: > > > On Mar 2, 2016, at 9:01 AM, praveen kumar wrote: > > > > Dear all, > > > > I am employing PETSc for DD in a serial fortran code and I would be using the solver from serial code itself. In my programme there is subroutine which reads an input file for the grids. I have two questions: > > > > > > > > 1. The input file contains : No. of segments in each direction, Length of each segment, Grid expansion ratio for each segment, dx(min) and dx(max) (min and max size of sub-division for each segment), > > I don't understand what this grid expansion ratio means. > > > No. of uniform sub-divisions in each segment. Will I be able to include all these details in DMDAcreate3D? > > The DMDACreate3d() only defines the topology of the mesh (number of mesh points in each direction etc), not the geometry. If you want to provide geometry information (i.e. length of grid segments) then you use DMDAGetCoordinateArray() to set the local values. See DMDAVecGetArray for the form of the "coordinates" array. > > > Is there any example? If no, then is there any way to retain the input file section and still use PETSc? > > Should be possible but you will need to do a little programming and poking around to use all the information. 
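To make that "little programming" concrete for the stretched-grid case described above, here is a rough, untested C sketch. The function name is made up; it assumes da came from DMDACreate3d() and that every rank already holds the full 1d coordinate arrays xc[], yc[], zc[] after the broadcast from process zero. The same sequence of calls is available from Fortran.

#include <petscdmda.h>

/* Fill the DMDA coordinates of a stretched (tensor-product) grid.
   Each rank writes only the portion of the grid it owns. */
PetscErrorCode SetStretchedCoordinates(DM da, const PetscReal *xc, const PetscReal *yc, const PetscReal *zc)
{
  DM             cda;
  Vec            coords;
  DMDACoor3d  ***c;
  PetscInt       i, j, k, xs, ys, zs, xm, ym, zm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMGetCoordinateDM(da, &cda);CHKERRQ(ierr);
  ierr = DMCreateGlobalVector(cda, &coords);CHKERRQ(ierr);
  ierr = DMDAVecGetArray(cda, coords, &c);CHKERRQ(ierr);
  /* start and extent of this rank's part of the grid */
  ierr = DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm);CHKERRQ(ierr);
  for (k = zs; k < zs + zm; k++) {
    for (j = ys; j < ys + ym; j++) {
      for (i = xs; i < xs + xm; i++) {
        c[k][j][i].x = xc[i];   /* global index i picks the x location */
        c[k][j][i].y = yc[j];
        c[k][j][i].z = zc[k];
      }
    }
  }
  ierr = DMDAVecRestoreArray(cda, coords, &c);CHKERRQ(ierr);
  ierr = DMSetCoordinates(da, coords);CHKERRQ(ierr);
  ierr = VecDestroy(&coords);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

For the twisted-grid case the loop body would instead pick up the (i,j,k) entry of the full nx*ny*nz coordinate data, but the DMDAGetCorners()/DMDAVecGetArray() pattern stays the same.
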
> > Barry > > > > > > > > > 2. Moreover application requires that, I call the Grid subroutine after some fixed number of iterations. > > > > > > > > Would you suggest how to fix the above two? > > > > > > > > Thanks, > > > > Praveen > > > > From jychang48 at gmail.com Fri Mar 4 01:05:50 2016 From: jychang48 at gmail.com (Justin Chang) Date: Fri, 4 Mar 2016 01:05:50 -0600 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: Mark, Using "-pc_gamg_square_graph 10" didn't change anything. I used values of 1, 10, 100, and 1000 and the performance seemed unaffected. Changing the threshold of -pc_gamg_threshold to 0.8 did decrease wall-clock time but it required more iterations. I am not really sure how I go about tinkering around with GAMG or even ML. Do you have a manual/reference/paper/etc that describes what's going on within gamg? Thanks, Justin On Thursday, March 3, 2016, Mark Adams wrote: > You have a very sparse 3D problem, with 9 non-zeros per row. It is > coarsening very slowly and creating huge coarse grids. which are expensive > to construct. The superlinear speedup is from cache effects, most likely. > First try with: > > -pc_gamg_square_graph 10 > > ML must have some AI in there to do this automatically, because gamg are > pretty similar algorithmically. There is a threshold parameter that is > important (-pc_gamg_threshold <0.0>) and I think ML has the same default. > ML is doing OK, but I would guess that if you use like 0.02 for MLs > threshold you would see some improvement. > > Hypre is doing pretty bad also. I suspect that it is getting confused as > well. I know less about how to deal with hypre. > > If you use -info and grep on GAMG you will see about 20 lines that will > tell you the number of equations on level and the average number of > non-zeros per row. In 3D the reduction per level should be -- very > approximately -- 30x and the number of non-zeros per row should not > explode, but getting up to several hundred is OK. > > If you care to test this we should be able to get ML and GAMG to agree > pretty well. ML is a nice solver, but our core numerics should be about > the same. I tested this on a 3D elasticity problem a few years ago. That > said, I think your ML solve is pretty good. > > Mark > > > > > On Thu, Mar 3, 2016 at 4:36 AM, Lawrence Mitchell < > lawrence.mitchell at imperial.ac.uk > > wrote: > >> On 02/03/16 22:28, Justin Chang wrote: >> ... >> >> >> > Down solver (pre-smoother) on level 3 >> > >> > KSP Object: (solver_fieldsplit_1_mg_levels_3_) >> > linear system matrix = precond matrix: >> ... >> > Mat Object: 1 MPI processes >> > >> > type: seqaij >> > >> > rows=52147, cols=52147 >> > >> > total: nonzeros=38604909, allocated nonzeros=38604909 >> > >> > total number of mallocs used during MatSetValues calls =2 >> > >> > not using I-node routines >> > >> > Down solver (pre-smoother) on level 4 >> > >> > KSP Object: (solver_fieldsplit_1_mg_levels_4_) >> > linear system matrix followed by preconditioner matrix: >> > >> > Mat Object: (solver_fieldsplit_1_) >> >> ... >> > >> > Mat Object: 1 MPI processes >> > >> > type: seqaij >> > >> > rows=384000, cols=384000 >> > >> > total: nonzeros=3416452, allocated nonzeros=3416452 >> >> >> This looks pretty suspicious to me. The original matrix on the finest >> level has 3.8e5 rows and ~3.4e6 nonzeros. The next level up, the >> coarsening produces 5.2e4 rows, but 38e6 nonzeros. 
>> >> FWIW, although Justin's PETSc is from Oct 2015, I get the same >> behaviour with: >> >> ad5697c (Master as of 1st March). >> >> If I compare with the coarse operators that ML produces on the same >> problem: >> >> The original matrix has, again: >> >> Mat Object: 1 MPI processes >> type: seqaij >> rows=384000, cols=384000 >> total: nonzeros=3416452, allocated nonzeros=3416452 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> >> While the next finest level has: >> >> Mat Object: 1 MPI processes >> type: seqaij >> rows=65258, cols=65258 >> total: nonzeros=1318400, allocated nonzeros=1318400 >> total number of mallocs used during MatSetValues calls=0 >> not using I-node routines >> >> So we have 6.5e4 rows and 1.3e6 nonzeros, which seems more plausible. >> >> Cheers, >> >> Lawrence >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 4 07:40:50 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 4 Mar 2016 08:40:50 -0500 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: On Fri, Mar 4, 2016 at 2:05 AM, Justin Chang wrote: > Mark, > > Using "-pc_gamg_square_graph 10" didn't change anything. I used values of > 1, 10, 100, and 1000 and the performance seemed unaffected. > Humm. please run with -info and grep on GAMG, This will be about 20 lines but -info is very noisy. Perhaps if you can do the same thing for ML (I'm not sure if the ML interface supports verbose output like this) that would be useful. BTW, -pc_gamg_square_graph N is the number of levels to square the graph. You will see in the verbose output the number of non-zeros per row in your problems starts at 9 and goes up to ~740, and ~5370, and then 207 the the coarse grid where there are 207 columns. > > Changing the threshold of -pc_gamg_threshold to 0.8 did decrease > wall-clock time but it required more iterations. > That is very large. A more reasonable scan would be: 0, 0.01, 0.04, 0.08. > > I am not really sure how I go about tinkering around with GAMG or even ML. > Do you have a manual/reference/paper/etc that describes what's going on > within gamg? > There is a section in the manual. It goes through some of these troubleshooting techniques. > > Thanks, > Justin > > > On Thursday, March 3, 2016, Mark Adams wrote: > >> You have a very sparse 3D problem, with 9 non-zeros per row. It is >> coarsening very slowly and creating huge coarse grids. which are expensive >> to construct. The superlinear speedup is from cache effects, most likely. >> First try with: >> >> -pc_gamg_square_graph 10 >> >> ML must have some AI in there to do this automatically, because gamg are >> pretty similar algorithmically. There is a threshold parameter that is >> important (-pc_gamg_threshold <0.0>) and I think ML has the same default. >> ML is doing OK, but I would guess that if you use like 0.02 for MLs >> threshold you would see some improvement. >> >> Hypre is doing pretty bad also. I suspect that it is getting confused as >> well. I know less about how to deal with hypre. >> >> If you use -info and grep on GAMG you will see about 20 lines that will >> tell you the number of equations on level and the average number of >> non-zeros per row. In 3D the reduction per level should be -- very >> approximately -- 30x and the number of non-zeros per row should not >> explode, but getting up to several hundred is OK. 
>> >> If you care to test this we should be able to get ML and GAMG to agree >> pretty well. ML is a nice solver, but our core numerics should be about >> the same. I tested this on a 3D elasticity problem a few years ago. That >> said, I think your ML solve is pretty good. >> >> Mark >> >> >> >> >> On Thu, Mar 3, 2016 at 4:36 AM, Lawrence Mitchell < >> lawrence.mitchell at imperial.ac.uk> wrote: >> >>> On 02/03/16 22:28, Justin Chang wrote: >>> ... >>> >>> >>> > Down solver (pre-smoother) on level 3 >>> > >>> > KSP Object: (solver_fieldsplit_1_mg_levels_3_) >>> > linear system matrix = precond matrix: >>> ... >>> > Mat Object: 1 MPI processes >>> > >>> > type: seqaij >>> > >>> > rows=52147, cols=52147 >>> > >>> > total: nonzeros=38604909, allocated nonzeros=38604909 >>> > >>> > total number of mallocs used during MatSetValues calls =2 >>> > >>> > not using I-node routines >>> > >>> > Down solver (pre-smoother) on level 4 >>> > >>> > KSP Object: (solver_fieldsplit_1_mg_levels_4_) >>> > linear system matrix followed by preconditioner matrix: >>> > >>> > Mat Object: (solver_fieldsplit_1_) >>> >>> ... >>> > >>> > Mat Object: 1 MPI processes >>> > >>> > type: seqaij >>> > >>> > rows=384000, cols=384000 >>> > >>> > total: nonzeros=3416452, allocated nonzeros=3416452 >>> >>> >>> This looks pretty suspicious to me. The original matrix on the finest >>> level has 3.8e5 rows and ~3.4e6 nonzeros. The next level up, the >>> coarsening produces 5.2e4 rows, but 38e6 nonzeros. >>> >>> FWIW, although Justin's PETSc is from Oct 2015, I get the same >>> behaviour with: >>> >>> ad5697c (Master as of 1st March). >>> >>> If I compare with the coarse operators that ML produces on the same >>> problem: >>> >>> The original matrix has, again: >>> >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=384000, cols=384000 >>> total: nonzeros=3416452, allocated nonzeros=3416452 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> >>> While the next finest level has: >>> >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=65258, cols=65258 >>> total: nonzeros=1318400, allocated nonzeros=1318400 >>> total number of mallocs used during MatSetValues calls=0 >>> not using I-node routines >>> >>> So we have 6.5e4 rows and 1.3e6 nonzeros, which seems more plausible. >>> >>> Cheers, >>> >>> Lawrence >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From aotero at fi.uba.ar Fri Mar 4 08:25:03 2016 From: aotero at fi.uba.ar (Alejandro D Otero) Date: Fri, 4 Mar 2016 11:25:03 -0300 Subject: [petsc-users] Saving vector with hdf5 viewer Message-ID: Hello, I am trying to save some field stored in a vector which has values associated with vertexes, edges and cells in a DMPlex. This vector was created (using petsc4py) from a petcs section, setting this as default section and then creating the vector from the DMPlex. The thing is that when I save that vector using the hdf5 viewer I found 2 entities inside it. One corresponding to the values for all DoF in the DMPlex (under the group fields), and another with values only for nodes (under the group: vertex_fields). That meaning a useless duplication of data, is it possible to set the viewer only to store one of those groups? Preferably the complete one, but in the future only the vertex one could be usefull. Thanks in advance, Alejandro -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jychang48 at gmail.com Fri Mar 4 09:24:05 2016 From: jychang48 at gmail.com (Justin Chang) Date: Fri, 4 Mar 2016 09:24:05 -0600 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: So with -pc_gamg_square_graph 10 I get the following: [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold 0., 8.79533 nnz ave. (N=48000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 min=1.040410e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 active pes [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with threshold 0., 623.135 nnz ave. (N=6672) [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 min=2.474586e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 active pes [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with threshold 0., 724. nnz ave. (N=724) [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 min=2.759552e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 active pes [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold 0., 8.79533 nnz ave. (N=48000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 min=1.040410e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 active pes [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with threshold 0., 623.135 nnz ave. (N=6672) [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 min=2.474586e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 active pes [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with threshold 0., 724. nnz ave. (N=724) [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 min=2.759552e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 active pes [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold 0., 8.863 nnz ave. (N=162000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 min=8.260696e-03 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 active pes [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold 0., 704.128 nnz ave. 
(N=22085) [0] PC*GAMG*Prolongator_AGG(): New grid 2283 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 min=1.484874e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=2283, n data cols=1, nnz/row (ave)=2283, 1 active pes [0] PC*GAMG*FilterGraph(): 3.64497% nnz after filtering, with threshold 0., 2283. nnz ave. (N=2283) [0] PC*GAMG*Prolongator_AGG(): New grid 97 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.043254e+00 min=1.321528e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=97, n data cols=1, nnz/row (ave)=97, 1 active pes [0] PC*GAMG*FilterGraph(): 66.8403% nnz after filtering, with threshold 0., 97. nnz ave. (N=97) [0] PC*GAMG*Prolongator_AGG(): New grid 5 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.653762e+00 min=4.460582e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 4) N=5, n data cols=1, nnz/row (ave)=5, 1 active pes [0] PCSetUp_*GAMG*(): 5 levels, grid complexity = 15.4673 Btw i did a smaller problem. Unit cube of 30x30x30 not 40x40x40. I used those smaller threshold values you mentioned but nothing really changed Thanks, Justin On Fri, Mar 4, 2016 at 7:40 AM, Mark Adams wrote: > > > On Fri, Mar 4, 2016 at 2:05 AM, Justin Chang wrote: > >> Mark, >> >> Using "-pc_gamg_square_graph 10" didn't change anything. I used values of >> 1, 10, 100, and 1000 and the performance seemed unaffected. >> > > Humm. please run with -info and grep on GAMG, This will be about 20 lines > but -info is very noisy. Perhaps if you can do the same thing for ML (I'm > not sure if the ML interface supports verbose output like this) that would > be useful. > > BTW, -pc_gamg_square_graph N is the number of levels to square the graph. > You will see in the verbose output the number of non-zeros per row in your > problems starts at 9 and goes up to ~740, and ~5370, and then 207 the the > coarse grid where there are 207 columns. > > >> >> Changing the threshold of -pc_gamg_threshold to 0.8 did decrease >> wall-clock time but it required more iterations. >> > > That is very large. A more reasonable scan would be: 0, 0.01, 0.04, 0.08. > > >> >> I am not really sure how I go about tinkering around with GAMG or even >> ML. Do you have a manual/reference/paper/etc that describes what's going on >> within gamg? >> > > There is a section in the manual. It goes through some of these > troubleshooting techniques. > > >> >> Thanks, >> Justin >> >> >> On Thursday, March 3, 2016, Mark Adams wrote: >> >>> You have a very sparse 3D problem, with 9 non-zeros per row. It is >>> coarsening very slowly and creating huge coarse grids. which are expensive >>> to construct. The superlinear speedup is from cache effects, most likely. >>> First try with: >>> >>> -pc_gamg_square_graph 10 >>> >>> ML must have some AI in there to do this automatically, because gamg are >>> pretty similar algorithmically. There is a threshold parameter that is >>> important (-pc_gamg_threshold <0.0>) and I think ML has the same default. >>> ML is doing OK, but I would guess that if you use like 0.02 for MLs >>> threshold you would see some improvement. >>> >>> Hypre is doing pretty bad also. I suspect that it is getting confused >>> as well. I know less about how to deal with hypre. >>> >>> If you use -info and grep on GAMG you will see about 20 lines that will >>> tell you the number of equations on level and the average number of >>> non-zeros per row. 
In 3D the reduction per level should be -- very >>> approximately -- 30x and the number of non-zeros per row should not >>> explode, but getting up to several hundred is OK. >>> >>> If you care to test this we should be able to get ML and GAMG to agree >>> pretty well. ML is a nice solver, but our core numerics should be about >>> the same. I tested this on a 3D elasticity problem a few years ago. That >>> said, I think your ML solve is pretty good. >>> >>> Mark >>> >>> >>> >>> >>> On Thu, Mar 3, 2016 at 4:36 AM, Lawrence Mitchell < >>> lawrence.mitchell at imperial.ac.uk> wrote: >>> >>>> On 02/03/16 22:28, Justin Chang wrote: >>>> ... >>>> >>>> >>>> > Down solver (pre-smoother) on level 3 >>>> > >>>> > KSP Object: (solver_fieldsplit_1_mg_levels_3_) >>>> > linear system matrix = precond matrix: >>>> ... >>>> > Mat Object: 1 MPI processes >>>> > >>>> > type: seqaij >>>> > >>>> > rows=52147, cols=52147 >>>> > >>>> > total: nonzeros=38604909, allocated nonzeros=38604909 >>>> > >>>> > total number of mallocs used during MatSetValues calls >>>> =2 >>>> > >>>> > not using I-node routines >>>> > >>>> > Down solver (pre-smoother) on level 4 >>>> > >>>> > KSP Object: (solver_fieldsplit_1_mg_levels_4_) >>>> > linear system matrix followed by preconditioner matrix: >>>> > >>>> > Mat Object: (solver_fieldsplit_1_) >>>> >>>> ... >>>> > >>>> > Mat Object: 1 MPI processes >>>> > >>>> > type: seqaij >>>> > >>>> > rows=384000, cols=384000 >>>> > >>>> > total: nonzeros=3416452, allocated nonzeros=3416452 >>>> >>>> >>>> This looks pretty suspicious to me. The original matrix on the finest >>>> level has 3.8e5 rows and ~3.4e6 nonzeros. The next level up, the >>>> coarsening produces 5.2e4 rows, but 38e6 nonzeros. >>>> >>>> FWIW, although Justin's PETSc is from Oct 2015, I get the same >>>> behaviour with: >>>> >>>> ad5697c (Master as of 1st March). >>>> >>>> If I compare with the coarse operators that ML produces on the same >>>> problem: >>>> >>>> The original matrix has, again: >>>> >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=384000, cols=384000 >>>> total: nonzeros=3416452, allocated nonzeros=3416452 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> >>>> While the next finest level has: >>>> >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=65258, cols=65258 >>>> total: nonzeros=1318400, allocated nonzeros=1318400 >>>> total number of mallocs used during MatSetValues calls=0 >>>> not using I-node routines >>>> >>>> So we have 6.5e4 rows and 1.3e6 nonzeros, which seems more plausible. >>>> >>>> Cheers, >>>> >>>> Lawrence >>>> >>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Fri Mar 4 09:31:11 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Fri, 4 Mar 2016 15:31:11 +0000 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: > On 4 Mar 2016, at 15:24, Justin Chang wrote: > > So with -pc_gamg_square_graph 10 I get the following: Because you're using gamg inside the fieldsplit, I think you need: -fieldsplit_1_pc_gamg_square_graph 10 > [0] PCSetUp_GAMG(): level 0) N=48000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 > [0] PCGAMGFilterGraph(): 55.7114% nnz after filtering, with threshold 0., 8.79533 nnz ave. 
(N=48000) > [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square ^^^^^ Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: Message signed with OpenPGP using GPGMail URL: From jychang48 at gmail.com Fri Mar 4 09:35:20 2016 From: jychang48 at gmail.com (Justin Chang) Date: Fri, 4 Mar 2016 08:35:20 -0700 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: You're right. This is what I have: [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold 0., 8.79533 nnz ave. (N=48000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 min=1.040410e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 active pes [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with threshold 0., 623.135 nnz ave. (N=6672) [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 min=2.474586e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 active pes [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with threshold 0., 724. nnz ave. (N=724) [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 min=2.759552e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 active pes [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold 0., 8.79533 nnz ave. (N=48000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 min=1.040410e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 active pes [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with threshold 0., 623.135 nnz ave. (N=6672) [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 min=2.474586e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 active pes [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with threshold 0., 724. nnz ave. (N=724) [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 min=2.759552e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 active pes [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold 0., 8.863 nnz ave. 
(N=162000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 min=8.260696e-03 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 active pes [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold 0., 704.128 nnz ave. (N=22085) [0] PC*GAMG*Prolongator_AGG(): New grid 2283 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 min=1.484874e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=2283, n data cols=1, nnz/row (ave)=2283, 1 active pes [0] PC*GAMG*FilterGraph(): 3.64497% nnz after filtering, with threshold 0., 2283. nnz ave. (N=2283) [0] PC*GAMG*Prolongator_AGG(): New grid 97 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.043254e+00 min=1.321528e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=97, n data cols=1, nnz/row (ave)=97, 1 active pes [0] PC*GAMG*FilterGraph(): 66.8403% nnz after filtering, with threshold 0., 97. nnz ave. (N=97) [0] PC*GAMG*Prolongator_AGG(): New grid 5 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.653762e+00 min=4.460582e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 4) N=5, n data cols=1, nnz/row (ave)=5, 1 active pes [0] PCSetUp_*GAMG*(): 5 levels, grid complexity = 15.4673 [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold 0., 8.863 nnz ave. (N=162000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 min=8.260696e-03 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 active pes [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold 0., 704.128 nnz ave. (N=22085) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 2 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 545 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 min=1.484874e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=545, n data cols=1, nnz/row (ave)=545, 1 active pes [0] PC*GAMG*FilterGraph(): 7.55997% nnz after filtering, with threshold 0., 545. nnz ave. (N=545) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 3 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 11 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.368729e+00 min=1.563750e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=11, n data cols=1, nnz/row (ave)=11, 1 active pes [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0376 On Fri, Mar 4, 2016 at 8:31 AM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > > > On 4 Mar 2016, at 15:24, Justin Chang wrote: > > > > So with -pc_gamg_square_graph 10 I get the following: > > Because you're using gamg inside the fieldsplit, I think you need: > > -fieldsplit_1_pc_gamg_square_graph 10 > > > > > [0] PCSetUp_GAMG(): level 0) N=48000, n data rows=1, n data cols=1, > nnz/row (ave)=9, np=1 > > [0] PCGAMGFilterGraph(): 55.7114% nnz after filtering, with > threshold 0., 8.79533 nnz ave. (N=48000) > > [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square > ^^^^^ > > Cheers, > > Lawrence > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jychang48 at gmail.com Fri Mar 4 09:38:56 2016 From: jychang48 at gmail.com (Justin Chang) Date: Fri, 4 Mar 2016 08:38:56 -0700 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: Time to solution went from 100 seconds to 30 seconds once i used 10 graphs. Using 20 graphs started to increase in time slightly On Fri, Mar 4, 2016 at 8:35 AM, Justin Chang wrote: > You're right. This is what I have: > > [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, > nnz/row (ave)=9, np=1 > > [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold > 0., 8.79533 nnz ave. (N=48000) > > [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square > > [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 > min=1.040410e-02 PC=jacobi > > [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 > active pes > > [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with threshold > 0., 623.135 nnz ave. (N=6672) > > [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 > min=2.474586e-02 PC=jacobi > > [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 > active pes > > [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with threshold > 0., 724. nnz ave. (N=724) > > [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 > min=2.759552e-01 PC=jacobi > > [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 active > pes > > [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 > > [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, > nnz/row (ave)=9, np=1 > > [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold > 0., 8.79533 nnz ave. (N=48000) > > [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square > > [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 > min=1.040410e-02 PC=jacobi > > [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 > active pes > > [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with threshold > 0., 623.135 nnz ave. (N=6672) > > [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 > min=2.474586e-02 PC=jacobi > > [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 > active pes > > [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with threshold > 0., 724. nnz ave. (N=724) > > [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 > min=2.759552e-01 PC=jacobi > > [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 active > pes > > [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 > > [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, > nnz/row (ave)=9, np=1 > > [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold > 0., 8.863 nnz ave. 
(N=162000) > > [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square > > [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 > min=8.260696e-03 PC=jacobi > > [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 > active pes > > [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold > 0., 704.128 nnz ave. (N=22085) > > [0] PC*GAMG*Prolongator_AGG(): New grid 2283 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 > min=1.484874e-02 PC=jacobi > > [0] PCSetUp_*GAMG*(): 2) N=2283, n data cols=1, nnz/row (ave)=2283, 1 > active pes > > [0] PC*GAMG*FilterGraph(): 3.64497% nnz after filtering, with threshold > 0., 2283. nnz ave. (N=2283) > > [0] PC*GAMG*Prolongator_AGG(): New grid 97 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.043254e+00 > min=1.321528e-01 PC=jacobi > > [0] PCSetUp_*GAMG*(): 3) N=97, n data cols=1, nnz/row (ave)=97, 1 active > pes > > [0] PC*GAMG*FilterGraph(): 66.8403% nnz after filtering, with threshold > 0., 97. nnz ave. (N=97) > > [0] PC*GAMG*Prolongator_AGG(): New grid 5 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.653762e+00 > min=4.460582e-01 PC=jacobi > > [0] PCSetUp_*GAMG*(): 4) N=5, n data cols=1, nnz/row (ave)=5, 1 active pes > > [0] PCSetUp_*GAMG*(): 5 levels, grid complexity = 15.4673 > > [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, > nnz/row (ave)=9, np=1 > > [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold > 0., 8.863 nnz ave. (N=162000) > > [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 10 to square > > [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 > min=8.260696e-03 PC=jacobi > > [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 > active pes > > [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold > 0., 704.128 nnz ave. (N=22085) > > [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 2 of 10 to square > > [0] PC*GAMG*Prolongator_AGG(): New grid 545 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 > min=1.484874e-02 PC=jacobi > > [0] PCSetUp_*GAMG*(): 2) N=545, n data cols=1, nnz/row (ave)=545, 1 > active pes > > [0] PC*GAMG*FilterGraph(): 7.55997% nnz after filtering, with threshold > 0., 545. nnz ave. (N=545) > > [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 3 of 10 to square > > [0] PC*GAMG*Prolongator_AGG(): New grid 11 nodes > > [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.368729e+00 > min=1.563750e-01 PC=jacobi > > [0] PCSetUp_*GAMG*(): 3) N=11, n data cols=1, nnz/row (ave)=11, 1 active > pes > > [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0376 > > On Fri, Mar 4, 2016 at 8:31 AM, Lawrence Mitchell < > lawrence.mitchell at imperial.ac.uk> wrote: > >> >> > On 4 Mar 2016, at 15:24, Justin Chang wrote: >> > >> > So with -pc_gamg_square_graph 10 I get the following: >> >> Because you're using gamg inside the fieldsplit, I think you need: >> >> -fieldsplit_1_pc_gamg_square_graph 10 >> >> >> >> > [0] PCSetUp_GAMG(): level 0) N=48000, n data rows=1, n data cols=1, >> nnz/row (ave)=9, np=1 >> > [0] PCGAMGFilterGraph(): 55.7114% nnz after filtering, with >> threshold 0., 8.79533 nnz ave. 
(N=48000) >> > [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square >> ^^^^^ >> >> Cheers, >> >> Lawrence >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 4 10:17:53 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 4 Mar 2016 10:17:53 -0600 Subject: [petsc-users] Saving vector with hdf5 viewer In-Reply-To: References: Message-ID: On Fri, Mar 4, 2016 at 8:25 AM, Alejandro D Otero wrote: > Hello, I am trying to save some field stored in a vector which has values > associated with vertexes, edges and cells in a DMPlex. > This vector was created (using petsc4py) from a petcs section, setting > this as default section and then creating the vector from the DMPlex. > The thing is that when I save that vector using the hdf5 viewer I found 2 > entities inside it. One corresponding to the values for all DoF in the > DMPlex (under the group fields), and another with values only for nodes > (under the group: vertex_fields). > That meaning a useless duplication of data, is it possible to set the > viewer only to store one of those groups? Preferably the complete one, but > in the future only the vertex one could be usefull. > The problem here is that hdf5 is being used for two different purposes: serialization and visualization. I automatically downsample both the grid and the fields so that they can be visualized using Paraview. I understand that you need control over which form is output. I will put this in soon. My first objective was just to get visualization working with exotic elements and multiple fields. Thanks, Matt > Thanks in advance, > > Alejandro > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 4 16:04:05 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 4 Mar 2016 17:04:05 -0500 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: You seem to have 3 of one type of solve that is give 'square_graph 1': 0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square This has 9 nnz-row and 44% are zero: [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold 0., 8.79533 nnz ave. So you want to use a _negative_ threshold. This will keep zeros in the graph and help it to coarsen faster. And you could try adding more levels to square the graph. The second type of solve has the 'square_graph 10'. It looks like the first solve. It should also use a negative threshold also. ML has a default of zero for the threshold, but it seems to keep zeros whereas GAMG does not. Mark On Fri, Mar 4, 2016 at 10:38 AM, Justin Chang wrote: > Time to solution went from 100 seconds to 30 seconds once i used 10 > graphs. Using 20 graphs started to increase in time slightly > > On Fri, Mar 4, 2016 at 8:35 AM, Justin Chang wrote: > >> You're right. This is what I have: >> >> [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, >> nnz/row (ave)=9, np=1 >> >> [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold >> 0., 8.79533 nnz ave. 
(N=48000) >> >> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 >> min=1.040410e-02 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 >> active pes >> >> [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with threshold >> 0., 623.135 nnz ave. (N=6672) >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 >> min=2.474586e-02 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 >> active pes >> >> [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with threshold >> 0., 724. nnz ave. (N=724) >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 >> min=2.759552e-01 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 active >> pes >> >> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 >> >> [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, >> nnz/row (ave)=9, np=1 >> >> [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold >> 0., 8.79533 nnz ave. (N=48000) >> >> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 >> min=1.040410e-02 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 >> active pes >> >> [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with threshold >> 0., 623.135 nnz ave. (N=6672) >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 >> min=2.474586e-02 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 >> active pes >> >> [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with threshold >> 0., 724. nnz ave. (N=724) >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 >> min=2.759552e-01 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 active >> pes >> >> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 >> >> [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, >> nnz/row (ave)=9, np=1 >> >> [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold >> 0., 8.863 nnz ave. (N=162000) >> >> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 >> min=8.260696e-03 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 >> active pes >> >> [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold >> 0., 704.128 nnz ave. (N=22085) >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 2283 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 >> min=1.484874e-02 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 2) N=2283, n data cols=1, nnz/row (ave)=2283, 1 >> active pes >> >> [0] PC*GAMG*FilterGraph(): 3.64497% nnz after filtering, with threshold >> 0., 2283. nnz ave. 
(N=2283) >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 97 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.043254e+00 >> min=1.321528e-01 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 3) N=97, n data cols=1, nnz/row (ave)=97, 1 active >> pes >> >> [0] PC*GAMG*FilterGraph(): 66.8403% nnz after filtering, with threshold >> 0., 97. nnz ave. (N=97) >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 5 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.653762e+00 >> min=4.460582e-01 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 4) N=5, n data cols=1, nnz/row (ave)=5, 1 active >> pes >> >> [0] PCSetUp_*GAMG*(): 5 levels, grid complexity = 15.4673 >> >> [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, >> nnz/row (ave)=9, np=1 >> >> [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold >> 0., 8.863 nnz ave. (N=162000) >> >> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 10 to square >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 >> min=8.260696e-03 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 >> active pes >> >> [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold >> 0., 704.128 nnz ave. (N=22085) >> >> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 2 of 10 to square >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 545 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 >> min=1.484874e-02 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 2) N=545, n data cols=1, nnz/row (ave)=545, 1 >> active pes >> >> [0] PC*GAMG*FilterGraph(): 7.55997% nnz after filtering, with threshold >> 0., 545. nnz ave. (N=545) >> >> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 3 of 10 to square >> >> [0] PC*GAMG*Prolongator_AGG(): New grid 11 nodes >> >> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.368729e+00 >> min=1.563750e-01 PC=jacobi >> >> [0] PCSetUp_*GAMG*(): 3) N=11, n data cols=1, nnz/row (ave)=11, 1 active >> pes >> >> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0376 >> >> On Fri, Mar 4, 2016 at 8:31 AM, Lawrence Mitchell < >> lawrence.mitchell at imperial.ac.uk> wrote: >> >>> >>> > On 4 Mar 2016, at 15:24, Justin Chang wrote: >>> > >>> > So with -pc_gamg_square_graph 10 I get the following: >>> >>> Because you're using gamg inside the fieldsplit, I think you need: >>> >>> -fieldsplit_1_pc_gamg_square_graph 10 >>> >>> >>> >>> > [0] PCSetUp_GAMG(): level 0) N=48000, n data rows=1, n data cols=1, >>> nnz/row (ave)=9, np=1 >>> > [0] PCGAMGFilterGraph(): 55.7114% nnz after filtering, with >>> threshold 0., 8.79533 nnz ave. (N=48000) >>> > [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square >>> ^^^^^ >>> >>> Cheers, >>> >>> Lawrence >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 4 16:36:27 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 4 Mar 2016 17:36:27 -0500 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: And it looks like you have a well behaved Laplacian here (M-matrix) so I would guess 'richardson' would be faster as the smoother, instead of 'chebyshev'. 
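A minimal sketch of the runtime options this advice amounts to (not taken from the thread itself; it assumes GAMG sits on the second field split, hence the fieldsplit_1_ prefix as in Lawrence's earlier reply, and the threshold value is only an illustrative placeholder):

  -fieldsplit_1_pc_gamg_threshold -0.02
  -fieldsplit_1_pc_gamg_square_graph 10
  -fieldsplit_1_mg_levels_ksp_type richardson

The negative threshold keeps the zero entries in the filtered graph, square_graph sets on how many levels the graph is squared, and the last option switches the level smoother's KSP from Chebyshev to Richardson.
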
On Fri, Mar 4, 2016 at 5:04 PM, Mark Adams wrote: > You seem to have 3 of one type of solve that is give 'square_graph 1': > > 0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square > > This has 9 nnz-row and 44% are zero: > > [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with threshold > 0., 8.79533 nnz ave. > > So you want to use a _negative_ threshold. This will keep zeros in the > graph and help it to coarsen faster. And you could try adding more levels > to square the graph. > > The second type of solve has the 'square_graph 10'. It looks like the > first solve. It should also use a negative threshold also. > > ML has a default of zero for the threshold, but it seems to keep zeros > whereas GAMG does not. > > Mark > > > On Fri, Mar 4, 2016 at 10:38 AM, Justin Chang wrote: > >> Time to solution went from 100 seconds to 30 seconds once i used 10 >> graphs. Using 20 graphs started to increase in time slightly >> >> On Fri, Mar 4, 2016 at 8:35 AM, Justin Chang wrote: >> >>> You're right. This is what I have: >>> >>> [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, >>> nnz/row (ave)=9, np=1 >>> >>> [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with >>> threshold 0., 8.79533 nnz ave. (N=48000) >>> >>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 >>> min=1.040410e-02 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with >>> threshold 0., 623.135 nnz ave. (N=6672) >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 >>> min=2.474586e-02 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with >>> threshold 0., 724. nnz ave. (N=724) >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 >>> min=2.759552e-01 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 >>> active pes >>> >>> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 >>> >>> [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, >>> nnz/row (ave)=9, np=1 >>> >>> [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with >>> threshold 0., 8.79533 nnz ave. (N=48000) >>> >>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 >>> min=1.040410e-02 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with >>> threshold 0., 623.135 nnz ave. (N=6672) >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 >>> min=2.474586e-02 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with >>> threshold 0., 724. nnz ave. 
(N=724) >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 >>> min=2.759552e-01 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 >>> active pes >>> >>> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 >>> >>> [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, >>> nnz/row (ave)=9, np=1 >>> >>> [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with >>> threshold 0., 8.863 nnz ave. (N=162000) >>> >>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 >>> min=8.260696e-03 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold >>> 0., 704.128 nnz ave. (N=22085) >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 2283 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 >>> min=1.484874e-02 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 2) N=2283, n data cols=1, nnz/row (ave)=2283, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 3.64497% nnz after filtering, with >>> threshold 0., 2283. nnz ave. (N=2283) >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 97 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.043254e+00 >>> min=1.321528e-01 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 3) N=97, n data cols=1, nnz/row (ave)=97, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 66.8403% nnz after filtering, with >>> threshold 0., 97. nnz ave. (N=97) >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 5 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.653762e+00 >>> min=4.460582e-01 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 4) N=5, n data cols=1, nnz/row (ave)=5, 1 active >>> pes >>> >>> [0] PCSetUp_*GAMG*(): 5 levels, grid complexity = 15.4673 >>> >>> [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, >>> nnz/row (ave)=9, np=1 >>> >>> [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with >>> threshold 0., 8.863 nnz ave. (N=162000) >>> >>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 10 to square >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 >>> min=8.260696e-03 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold >>> 0., 704.128 nnz ave. (N=22085) >>> >>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 2 of 10 to square >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 545 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 >>> min=1.484874e-02 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 2) N=545, n data cols=1, nnz/row (ave)=545, 1 >>> active pes >>> >>> [0] PC*GAMG*FilterGraph(): 7.55997% nnz after filtering, with >>> threshold 0., 545. nnz ave. 
(N=545) >>> >>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 3 of 10 to square >>> >>> [0] PC*GAMG*Prolongator_AGG(): New grid 11 nodes >>> >>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.368729e+00 >>> min=1.563750e-01 PC=jacobi >>> >>> [0] PCSetUp_*GAMG*(): 3) N=11, n data cols=1, nnz/row (ave)=11, 1 >>> active pes >>> >>> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0376 >>> >>> On Fri, Mar 4, 2016 at 8:31 AM, Lawrence Mitchell < >>> lawrence.mitchell at imperial.ac.uk> wrote: >>> >>>> >>>> > On 4 Mar 2016, at 15:24, Justin Chang wrote: >>>> > >>>> > So with -pc_gamg_square_graph 10 I get the following: >>>> >>>> Because you're using gamg inside the fieldsplit, I think you need: >>>> >>>> -fieldsplit_1_pc_gamg_square_graph 10 >>>> >>>> >>>> >>>> > [0] PCSetUp_GAMG(): level 0) N=48000, n data rows=1, n data cols=1, >>>> nnz/row (ave)=9, np=1 >>>> > [0] PCGAMGFilterGraph(): 55.7114% nnz after filtering, with >>>> threshold 0., 8.79533 nnz ave. (N=48000) >>>> > [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square >>>> ^^^^^ >>>> >>>> Cheers, >>>> >>>> Lawrence >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bikash at umich.edu Sat Mar 5 05:06:50 2016 From: bikash at umich.edu (Bikash Kanungo) Date: Sat, 5 Mar 2016 06:06:50 -0500 Subject: [petsc-users] SLEPc BV data structure Message-ID: Hi, I was wondering if the BV data structure support ghosted vectors i.e., Vec created with VecCreateGhost. Specifically, if I choose to populate the BV data structure through BVInsertVecs, perform orthogonalization and then retrieve the Vecs using BVGetColumn, then will the retrieved Vecs have the extra ghost nodes in them? Regards, Bikash -- Bikash S. Kanungo PhD Student Computational Materials Physics Group Mechanical Engineering University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From aero.aju at gmail.com Sat Mar 5 09:23:08 2016 From: aero.aju at gmail.com (Ajit Desai) Date: Sat, 5 Mar 2016 10:23:08 -0500 Subject: [petsc-users] Using MUMPS solver with PETSc Message-ID: Hello everyone, I am trying to use PETSc-MUMPS solver to solve linear problem of type "A x = b " I have wrote a subroutine in Fortran 90 for uni-processor. given below. I am calling this subroutine inside iterative solver. This solver is very fast for first few iterations but then becomes slow and if matrix is of large-scale, it becomes very-very slow. If somebody can help me to understand, if I am doing something wrong here? Thanks you. !!!************************************************* !!! Subroutine SUBROUTINE PETScMUMPS(ksp,pc,A,x,b,ierr) #include #include #include #include #include Vec x,b Mat A KSP ksp PC pc Mat F PetscInt ival,icntl PetscErrorCode ierr call KSPCreate(PETSC_COMM_SELF,ksp,ierr) call KSPSetOperators(ksp,A,A,ierr) call KSPSetType(ksp,KSPPREONLY,ierr) call KSPGetPC(ksp,pc,ierr) !call PCSetType(pc,PCLU,ierr) !! LU Factorization call PCSetType(pc,PCCHOLESKY,ierr) !! Cholesky Factorization call PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr) call PCFactorSetUpMatSolverPackage(pc,ierr) call PCFactorGetMatrix(pc,F,ierr) !! 
sequential ordering icntl = 7 ival = 2 call MatMumpsSetIcntl(F,icntl,ival,ierr) call KSPSetFromOptions(ksp,ierr) call KSPGetPC(ksp,pc,ierr) call KSPSolve(ksp,x,b,ierr) END SUBROUTINE PETScMUMPS !!!************************************************* *Ajit Desai* *--* * PhD Scholar, Carleton University * * Ottawa, Canada* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sat Mar 5 11:21:04 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 5 Mar 2016 18:21:04 +0100 Subject: [petsc-users] SLEPc BV data structure In-Reply-To: References: Message-ID: > El 5 mar 2016, a las 12:06, Bikash Kanungo escribi?: > > Hi, > > I was wondering if the BV data structure support ghosted vectors i.e., Vec created with VecCreateGhost. Specifically, if I choose to populate the BV data structure through BVInsertVecs, perform orthogonalization and then retrieve the Vecs using BVGetColumn, then will the retrieved Vecs have the extra ghost nodes in them? > > Regards, > Bikash > Try the following: 1. Create a ghosted vector of the appropriate dimension. 2. Create a BV of type BVVECS 3. Set the dimensions of the BV via BVSetSizesFromVec() passing the vector from 1. Then BVInsertVecs() should work as you expect (provided that VecDuplicate() keeps ghost nodes). Note that this only works for BVVECS, which is slower than the default type. Jose From hzhang at mcs.anl.gov Sat Mar 5 11:42:45 2016 From: hzhang at mcs.anl.gov (Hong) Date: Sat, 5 Mar 2016 11:42:45 -0600 Subject: [petsc-users] Using MUMPS solver with PETSc In-Reply-To: References: Message-ID: Ajit : > > I am trying to use PETSc-MUMPS solver to solve linear problem of type "A x > = b " > I have wrote a subroutine in Fortran 90 for uni-processor. given below. > I am calling this subroutine inside iterative solver. > Code looks fine. > > This solver is very fast for first few iterations but then becomes slow > Run your code with runtime option '-log_summary', which displays which routine dominates computation. > and if matrix is of large-scale, it becomes very-very slow. > Direct solvers are not scalable in general. Matrix factors might become very dense. You may experiment different matrix orderings. Again, run with '-log_summary'. Hong > > If somebody can help me to understand, if I am doing something wrong here? > > Thanks you. > > !!!************************************************* > > !!! Subroutine > > SUBROUTINE PETScMUMPS(ksp,pc,A,x,b,ierr) > > > #include > > #include > > #include > > #include > > #include > > > Vec x,b > > Mat A > > KSP ksp > > PC pc > > > Mat F > > PetscInt ival,icntl > > > PetscErrorCode ierr > > > call KSPCreate(PETSC_COMM_SELF,ksp,ierr) > > call KSPSetOperators(ksp,A,A,ierr) > > call KSPSetType(ksp,KSPPREONLY,ierr) > > call KSPGetPC(ksp,pc,ierr) > > > !call PCSetType(pc,PCLU,ierr) !! LU Factorization > > call PCSetType(pc,PCCHOLESKY,ierr) !! Cholesky Factorization > > > call PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr) > > call PCFactorSetUpMatSolverPackage(pc,ierr) > > call PCFactorGetMatrix(pc,F,ierr) > > > !! sequential ordering > > icntl = 7 > > ival = 2 > > call MatMumpsSetIcntl(F,icntl,ival,ierr) > > > call KSPSetFromOptions(ksp,ierr) > > call KSPGetPC(ksp,pc,ierr) > > > call KSPSolve(ksp,x,b,ierr) > > > END SUBROUTINE PETScMUMPS > > !!!************************************************* > > > > *Ajit Desai* > *--* > * PhD Scholar, Carleton University * > * Ottawa, Canada* > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jroman at dsic.upv.es Sat Mar 5 12:01:29 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 5 Mar 2016 19:01:29 +0100 Subject: [petsc-users] Using MUMPS solver with PETSc In-Reply-To: References: Message-ID: <518D5E81-79B0-4AEA-B05B-5BCA4B567844@dsic.upv.es> Also, if you call this subroutine several times, make sure to call KSPDestroy() before exit. Alternatively, reuse the KSP object from one call to the next. Jose > El 5 mar 2016, a las 18:42, Hong escribi?: > > Ajit : > I am trying to use PETSc-MUMPS solver to solve linear problem of type "A x = b " > I have wrote a subroutine in Fortran 90 for uni-processor. given below. > I am calling this subroutine inside iterative solver. > Code looks fine. > > This solver is very fast for first few iterations but then becomes slow > Run your code with runtime option '-log_summary', which displays which routine dominates computation. > > and if matrix is of large-scale, it becomes very-very slow. > Direct solvers are not scalable in general. Matrix factors might become very dense. You may experiment different matrix orderings. Again, run with '-log_summary'. > > Hong > > If somebody can help me to understand, if I am doing something wrong here? > > Thanks you. > > !!!************************************************* > !!! Subroutine > SUBROUTINE PETScMUMPS(ksp,pc,A,x,b,ierr) > > #include > #include > #include > #include > #include > > Vec x,b > Mat A > KSP ksp > PC pc > > Mat F > PetscInt ival,icntl > > PetscErrorCode ierr > > call KSPCreate(PETSC_COMM_SELF,ksp,ierr) > call KSPSetOperators(ksp,A,A,ierr) > call KSPSetType(ksp,KSPPREONLY,ierr) > call KSPGetPC(ksp,pc,ierr) > > !call PCSetType(pc,PCLU,ierr) !! LU Factorization > call PCSetType(pc,PCCHOLESKY,ierr) !! Cholesky Factorization > > call PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr) > call PCFactorSetUpMatSolverPackage(pc,ierr) > call PCFactorGetMatrix(pc,F,ierr) > > !! sequential ordering > icntl = 7 > ival = 2 > call MatMumpsSetIcntl(F,icntl,ival,ierr) > > call KSPSetFromOptions(ksp,ierr) > call KSPGetPC(ksp,pc,ierr) > > call KSPSolve(ksp,x,b,ierr) > > END SUBROUTINE PETScMUMPS > !!!************************************************* > > > Ajit Desai > -- > PhD Scholar, Carleton University > Ottawa, Canada > From bikash at umich.edu Sat Mar 5 12:17:36 2016 From: bikash at umich.edu (Bikash Kanungo) Date: Sat, 5 Mar 2016 13:17:36 -0500 Subject: [petsc-users] SLEPc BV data structure In-Reply-To: References: Message-ID: Thanks Jose. I'll try your suggestion and let you know if it works. On Sat, Mar 5, 2016 at 12:21 PM, Jose E. Roman wrote: > > > El 5 mar 2016, a las 12:06, Bikash Kanungo escribi?: > > > > Hi, > > > > I was wondering if the BV data structure support ghosted vectors i.e., > Vec created with VecCreateGhost. Specifically, if I choose to populate the > BV data structure through BVInsertVecs, perform orthogonalization and then > retrieve the Vecs using BVGetColumn, then will the retrieved Vecs have the > extra ghost nodes in them? > > > > Regards, > > Bikash > > > > Try the following: > > 1. Create a ghosted vector of the appropriate dimension. > 2. Create a BV of type BVVECS > 3. Set the dimensions of the BV via BVSetSizesFromVec() passing the vector > from 1. > > Then BVInsertVecs() should work as you expect (provided that > VecDuplicate() keeps ghost nodes). Note that this only works for BVVECS, > which is slower than the default type. > > Jose > > -- Bikash S. 
Kanungo PhD Student Computational Materials Physics Group Mechanical Engineering University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From davydden at gmail.com Mon Mar 7 04:21:21 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 7 Mar 2016 11:21:21 +0100 Subject: [petsc-users] MatIsSymmetric / MatIsHermitian issues for MPI+complex Message-ID: <7CFE9B1A-41F2-4FC9-B420-7D6A7B984D91@gmail.com> Dear all, I call MatIsSymmetric and MatIsHermitian for MPI complex-valued matrix in PETSc. For a moment, i store a mass matrix (real-valued, symmetric) in a matrix. However, the result of both functions is NOT PETSC_TRUE. The same program works fine with a single MPI core. Are there any known issues of MatIsSymmetric / MatIsHermitian for COMPLEX+MPI matrices? I am quite certain that my matrix is indeed symmetric (i also test vmult and Tvmult on a vector and the results is the same). Kind regards, Denis From davydden at gmail.com Mon Mar 7 04:26:27 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 7 Mar 2016 11:26:27 +0100 Subject: [petsc-users] MatIsSymmetric / MatIsHermitian issues for MPI+complex In-Reply-To: <7CFE9B1A-41F2-4FC9-B420-7D6A7B984D91@gmail.com> References: <7CFE9B1A-41F2-4FC9-B420-7D6A7B984D91@gmail.com> Message-ID: <3BA41C86-F5C8-475A-AAF0-BEE8F5D76EB2@gmail.com> Nevermind, i should have check the error code: 45: [1]PETSC ERROR: No support for this operation for this object type 45: [1]PETSC ERROR: Matrix of type does not support checking for symmetric > On 7 Mar 2016, at 11:21, Denis Davydov wrote: > > Dear all, > > I call MatIsSymmetric and MatIsHermitian for MPI complex-valued matrix in PETSc. > For a moment, i store a mass matrix (real-valued, symmetric) in a matrix. > However, the result of both functions is NOT PETSC_TRUE. > The same program works fine with a single MPI core. > > Are there any known issues of MatIsSymmetric / MatIsHermitian for COMPLEX+MPI matrices? > I am quite certain that my matrix is indeed symmetric (i also test vmult and Tvmult on a vector and the results is the same). > > Kind regards, > Denis > From Lukasz.Kaczmarczyk at glasgow.ac.uk Mon Mar 7 06:58:45 2016 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Mon, 7 Mar 2016 12:58:45 +0000 Subject: [petsc-users] multigrid preconditioning and adaptivity Message-ID: Hello, I run multi-grid solver, with adaptivity, works well, however It is some space for improving efficiency. I using hierarchical approximation basis, for which construction of interpolation operators is simple, it is simple injection. After each refinement level (increase of order of approximation on some element) I rebuild multigrid pre-conditioner with additional level. It is a way to add dynamically new levels without need of rebuilding whole MG pre-conditioner. Looking at execution profile I noticed that 50%-60% of time is spent on MatPtAP function during PCSetUP stage. Kind regards, Lukasz From Lukasz.Kaczmarczyk at glasgow.ac.uk Mon Mar 7 07:36:26 2016 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Mon, 7 Mar 2016 13:36:26 +0000 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: References: Message-ID: > On 7 Mar 2016, at 12:58, Lukasz Kaczmarczyk wrote: > > Hello, > > I run multi-grid solver, with adaptivity, works well, however It is some space for improving efficiency. I using hierarchical approximation basis, for which > construction of interpolation operators is simple, it is simple injection. 
> > After each refinement level (increase of order of approximation on some element) I rebuild multigrid pre-conditioner with additional level. It is a way to add dynamically new levels without need of rebuilding whole MG pre-conditioner. > > Looking at execution profile I noticed that 50%-60% of time is spent on MatPtAP function during PCSetUP stage. > > Kind regards, > Lukasz Hello Again, I going to have solution, such that temporally change matrix operator for MatPtAP when MG pre-conditioner is constructed, and have shell interoperation matrix P, such that in my MatPtAP, the MatGetSubMatrix is used, where needed IS for sub matrix is in the context of P matrix. In the context of P matrix I could keep as well PtAP matrix, and reused it if operation is repeated for the same level at next adaptivity step. Is such solution is looks sound for you? Or it can be done in less complex way? Kind regards, Lukasz From knepley at gmail.com Mon Mar 7 07:50:19 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 7 Mar 2016 07:50:19 -0600 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: References: Message-ID: On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk < Lukasz.Kaczmarczyk at glasgow.ac.uk> wrote: > Hello, > > I run multi-grid solver, with adaptivity, works well, however It is some > space for improving efficiency. I using hierarchical approximation basis, > for which > construction of interpolation operators is simple, it is simple injection. > > After each refinement level (increase of order of approximation on some > element) I rebuild multigrid pre-conditioner with additional level. It is a > way to add dynamically new levels without need of rebuilding whole MG > pre-conditioner. > That does not currently exist, however it would not be hard to add, since the MG structure jsut consists of arrays of pointers. > Looking at execution profile I noticed that 50%-60% of time is spent on > MatPtAP function during PCSetUP stage. > Which means you are using a Galerkin projection to define the coarse operator. Do you have a direct way of defining this operator (rediscretization)? Thanks, Matt > Kind regards, > Lukasz -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Lukasz.Kaczmarczyk at glasgow.ac.uk Mon Mar 7 08:16:21 2016 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Mon, 7 Mar 2016 14:16:21 +0000 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: References: Message-ID: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> On 7 Mar 2016, at 13:50, Matthew Knepley > wrote: On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk > wrote: Hello, I run multi-grid solver, with adaptivity, works well, however It is some space for improving efficiency. I using hierarchical approximation basis, for which construction of interpolation operators is simple, it is simple injection. After each refinement level (increase of order of approximation on some element) I rebuild multigrid pre-conditioner with additional level. It is a way to add dynamically new levels without need of rebuilding whole MG pre-conditioner. That does not currently exist, however it would not be hard to add, since the MG structure jsut consists of arrays of pointers. 
Looking at execution profile I noticed that 50%-60% of time is spent on MatPtAP function during PCSetUP stage. Which means you are using a Galerkin projection to define the coarse operator. Do you have a direct way of defining this operator (rediscretization)? Matt, Thanks for swift response. You are right, I using Galerkin projection. Yes, I have a way to get directly coarse operator, it is some sub matrix of whole matrix. I taking advantage here form hierarchical approximation. I could reimplement PCSetUp_MG to set the MG structure directly, but this probably not good approach, since my implementation which will work with current petsc version could be incompatible which future changes in native MG data structures. The alternative option is to hack MatPtAP itself, and until petsc MG will use this, whatever changes you will make in MG in the future my code will work. PS. Current, not efficient enough implementation is here, http://userweb.eng.gla.ac.uk/lukasz.kaczmarczyk/MoFem/html/_p_c_m_g_set_up_via_approx_orders_8cpp_source.html Kind regards, Lukasz -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Mon Mar 7 08:21:15 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Mon, 7 Mar 2016 14:21:15 +0000 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> References: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> Message-ID: <56DD8E5B.2000108@imperial.ac.uk> On 07/03/16 14:16, Lukasz Kaczmarczyk wrote: > >> On 7 Mar 2016, at 13:50, Matthew Knepley > > wrote: >> >> On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk >> > > wrote: >> >> Hello, >> >> I run multi-grid solver, with adaptivity, works well, however It >> is some space for improving efficiency. I using hierarchical >> approximation basis, for which >> construction of interpolation operators is simple, it is simple >> injection. >> >> After each refinement level (increase of order of approximation >> on some element) I rebuild multigrid pre-conditioner with >> additional level. It is a way to add dynamically new levels >> without need of rebuilding whole MG pre-conditioner. >> >> >> That does not currently exist, however it would not be hard to add, >> since the MG structure jsut consists of >> arrays of pointers. >> >> >> Looking at execution profile I noticed that 50%-60% of time is >> spent on MatPtAP function during PCSetUP stage. >> >> >> Which means you are using a Galerkin projection to define the coarse >> operator. Do you have a direct way of defining >> this operator (rediscretization)? > > Matt, > > > Thanks for swift response. You are right, I using Galerkin projection. > > Yes, I have a way to get directly coarse operator, it is some sub > matrix of whole matrix. I taking advantage here form hierarchical > approximation. > > I could reimplement PCSetUp_MG to set the MG structure directly, but > this probably not good approach, since my implementation which will > work with current petsc version could be incompatible which future > changes in native MG data structures. The alternative option is to > hack MatPtAP itself, and until petsc MG will use this, whatever > changes you will make in MG in the future my code will work. Why not provide a shell DM to the KSP that knows how to compute the operators (and how to refine/coarsen and therefore restrict/interpolate). 
Then there's no need to use Galerkin coarse grid operators, and the KSP will just call back to your code to create the appropriate matrices. Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: OpenPGP digital signature URL: From Lukasz.Kaczmarczyk at glasgow.ac.uk Mon Mar 7 08:32:31 2016 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Mon, 7 Mar 2016 14:32:31 +0000 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: <56DD8E5B.2000108@imperial.ac.uk> References: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> <56DD8E5B.2000108@imperial.ac.uk> Message-ID: <2D37758E-47E5-47E2-A799-0F2F770A408A@glasgow.ac.uk> > On 7 Mar 2016, at 14:21, Lawrence Mitchell wrote: > > On 07/03/16 14:16, Lukasz Kaczmarczyk wrote: >> >>> On 7 Mar 2016, at 13:50, Matthew Knepley >> > wrote: >>> >>> On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk >>> >> > wrote: >>> >>> Hello, >>> >>> I run multi-grid solver, with adaptivity, works well, however It >>> is some space for improving efficiency. I using hierarchical >>> approximation basis, for which >>> construction of interpolation operators is simple, it is simple >>> injection. >>> >>> After each refinement level (increase of order of approximation >>> on some element) I rebuild multigrid pre-conditioner with >>> additional level. It is a way to add dynamically new levels >>> without need of rebuilding whole MG pre-conditioner. >>> >>> >>> That does not currently exist, however it would not be hard to add, >>> since the MG structure jsut consists of >>> arrays of pointers. >>> >>> >>> Looking at execution profile I noticed that 50%-60% of time is >>> spent on MatPtAP function during PCSetUP stage. >>> >>> >>> Which means you are using a Galerkin projection to define the coarse >>> operator. Do you have a direct way of defining >>> this operator (rediscretization)? >> >> Matt, >> >> >> Thanks for swift response. You are right, I using Galerkin projection. >> >> Yes, I have a way to get directly coarse operator, it is some sub >> matrix of whole matrix. I taking advantage here form hierarchical >> approximation. >> >> I could reimplement PCSetUp_MG to set the MG structure directly, but >> this probably not good approach, since my implementation which will >> work with current petsc version could be incompatible which future >> changes in native MG data structures. The alternative option is to >> hack MatPtAP itself, and until petsc MG will use this, whatever >> changes you will make in MG in the future my code will work. > > Why not provide a shell DM to the KSP that knows how to compute the > operators (and how to refine/coarsen and therefore > restrict/interpolate). Then there's no need to use Galerkin coarse > grid operators, and the KSP will just call back to your code to create > the appropriate matrices. Hello Lawrence, Thanks, it is good advice. I have already my DM shell, however I have not looked yet how make it in the context of MG. Now is probably time to do that. 
DM shell http://userweb.eng.gla.ac.uk/lukasz.kaczmarczyk/MoFem/html/group__dm.html Regards, Lukasz From mfadams at lbl.gov Mon Mar 7 09:55:03 2016 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 7 Mar 2016 10:55:03 -0500 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: <2D37758E-47E5-47E2-A799-0F2F770A408A@glasgow.ac.uk> References: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> <56DD8E5B.2000108@imperial.ac.uk> <2D37758E-47E5-47E2-A799-0F2F770A408A@glasgow.ac.uk> Message-ID: You can just set the coarse grid matrix/operator instead of using Galerkin. If you have a shell (matrix free) P then you will need to create and set this yourself. Our Galerkin requires a matrix P. On Mon, Mar 7, 2016 at 9:32 AM, Lukasz Kaczmarczyk < Lukasz.Kaczmarczyk at glasgow.ac.uk> wrote: > > > On 7 Mar 2016, at 14:21, Lawrence Mitchell < > lawrence.mitchell at imperial.ac.uk> wrote: > > > > On 07/03/16 14:16, Lukasz Kaczmarczyk wrote: > >> > >>> On 7 Mar 2016, at 13:50, Matthew Knepley >>> > wrote: > >>> > >>> On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk > >>> >>> > wrote: > >>> > >>> Hello, > >>> > >>> I run multi-grid solver, with adaptivity, works well, however It > >>> is some space for improving efficiency. I using hierarchical > >>> approximation basis, for which > >>> construction of interpolation operators is simple, it is simple > >>> injection. > >>> > >>> After each refinement level (increase of order of approximation > >>> on some element) I rebuild multigrid pre-conditioner with > >>> additional level. It is a way to add dynamically new levels > >>> without need of rebuilding whole MG pre-conditioner. > >>> > >>> > >>> That does not currently exist, however it would not be hard to add, > >>> since the MG structure jsut consists of > >>> arrays of pointers. > >>> > >>> > >>> Looking at execution profile I noticed that 50%-60% of time is > >>> spent on MatPtAP function during PCSetUP stage. > >>> > >>> > >>> Which means you are using a Galerkin projection to define the coarse > >>> operator. Do you have a direct way of defining > >>> this operator (rediscretization)? > >> > >> Matt, > >> > >> > >> Thanks for swift response. You are right, I using Galerkin projection. > >> > >> Yes, I have a way to get directly coarse operator, it is some sub > >> matrix of whole matrix. I taking advantage here form hierarchical > >> approximation. > >> > >> I could reimplement PCSetUp_MG to set the MG structure directly, but > >> this probably not good approach, since my implementation which will > >> work with current petsc version could be incompatible which future > >> changes in native MG data structures. The alternative option is to > >> hack MatPtAP itself, and until petsc MG will use this, whatever > >> changes you will make in MG in the future my code will work. > > > > Why not provide a shell DM to the KSP that knows how to compute the > > operators (and how to refine/coarsen and therefore > > restrict/interpolate). Then there's no need to use Galerkin coarse > > grid operators, and the KSP will just call back to your code to create > > the appropriate matrices. > > Hello Lawrence, > > Thanks, it is good advice. > I have already my DM shell, however I have not looked yet how make it in > the context of MG. Now is probably time to do that. > > DM shell > http://userweb.eng.gla.ac.uk/lukasz.kaczmarczyk/MoFem/html/group__dm.html > > > Regards, > Lukasz -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Lukasz.Kaczmarczyk at glasgow.ac.uk Mon Mar 7 10:28:20 2016 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Mon, 7 Mar 2016 16:28:20 +0000 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: References: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> <56DD8E5B.2000108@imperial.ac.uk> <2D37758E-47E5-47E2-A799-0F2F770A408A@glasgow.ac.uk> Message-ID: Many thanks all for help. I started to implement function for DM. I understand that minimal implementation is that for the DM i need to have, is to have DMCoarsen and in each level for all DMs, set operators DMKSPSetComputeOperators and DMCreateInterpolation. Matrix matrix free P from DMCreateInterpolation have to have operators for mult and mult_traspose. Is that is all? Kind regards, Lukasz On 7 Mar 2016, at 15:55, Mark Adams > wrote: You can just set the coarse grid matrix/operator instead of using Galerkin. If you have a shell (matrix free) P then you will need to create and set this yourself. Our Galerkin requires a matrix P. On Mon, Mar 7, 2016 at 9:32 AM, Lukasz Kaczmarczyk > wrote: > On 7 Mar 2016, at 14:21, Lawrence Mitchell > wrote: > > On 07/03/16 14:16, Lukasz Kaczmarczyk wrote: >> >>> On 7 Mar 2016, at 13:50, Matthew Knepley >>> >> wrote: >>> >>> On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk >>> >>> >> wrote: >>> >>> Hello, >>> >>> I run multi-grid solver, with adaptivity, works well, however It >>> is some space for improving efficiency. I using hierarchical >>> approximation basis, for which >>> construction of interpolation operators is simple, it is simple >>> injection. >>> >>> After each refinement level (increase of order of approximation >>> on some element) I rebuild multigrid pre-conditioner with >>> additional level. It is a way to add dynamically new levels >>> without need of rebuilding whole MG pre-conditioner. >>> >>> >>> That does not currently exist, however it would not be hard to add, >>> since the MG structure jsut consists of >>> arrays of pointers. >>> >>> >>> Looking at execution profile I noticed that 50%-60% of time is >>> spent on MatPtAP function during PCSetUP stage. >>> >>> >>> Which means you are using a Galerkin projection to define the coarse >>> operator. Do you have a direct way of defining >>> this operator (rediscretization)? >> >> Matt, >> >> >> Thanks for swift response. You are right, I using Galerkin projection. >> >> Yes, I have a way to get directly coarse operator, it is some sub >> matrix of whole matrix. I taking advantage here form hierarchical >> approximation. >> >> I could reimplement PCSetUp_MG to set the MG structure directly, but >> this probably not good approach, since my implementation which will >> work with current petsc version could be incompatible which future >> changes in native MG data structures. The alternative option is to >> hack MatPtAP itself, and until petsc MG will use this, whatever >> changes you will make in MG in the future my code will work. > > Why not provide a shell DM to the KSP that knows how to compute the > operators (and how to refine/coarsen and therefore > restrict/interpolate). Then there's no need to use Galerkin coarse > grid operators, and the KSP will just call back to your code to create > the appropriate matrices. Hello Lawrence, Thanks, it is good advice. I have already my DM shell, however I have not looked yet how make it in the context of MG. Now is probably time to do that. 
DM shell http://userweb.eng.gla.ac.uk/lukasz.kaczmarczyk/MoFem/html/group__dm.html Regards, Lukasz -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 7 10:42:41 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 7 Mar 2016 10:42:41 -0600 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: References: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> <56DD8E5B.2000108@imperial.ac.uk> <2D37758E-47E5-47E2-A799-0F2F770A408A@glasgow.ac.uk> Message-ID: On Mon, Mar 7, 2016 at 10:28 AM, Lukasz Kaczmarczyk < Lukasz.Kaczmarczyk at glasgow.ac.uk> wrote: > Many thanks all for help. I started to implement function for DM. > > I understand that minimal implementation is that for the DM i need to > have, is to have DMCoarsen and in each level for all DMs, set operators > DMKSPSetComputeOperators and DMCreateInterpolation. Matrix matrix free P > from DMCreateInterpolation have to have operators for mult and > mult_traspose. Is that is all? > Yes, that should be it. It would be nice to have some example that does this if you would be willing to contribute some version of your code. Thanks, Matt > Kind regards, > Lukasz > > > > On 7 Mar 2016, at 15:55, Mark Adams wrote: > > You can just set the coarse grid matrix/operator instead of using > Galerkin. If you have a shell (matrix free) P then you will need to create > and set this yourself. Our Galerkin requires a matrix P. > > On Mon, Mar 7, 2016 at 9:32 AM, Lukasz Kaczmarczyk < > Lukasz.Kaczmarczyk at glasgow.ac.uk> wrote: > >> >> > On 7 Mar 2016, at 14:21, Lawrence Mitchell < >> lawrence.mitchell at imperial.ac.uk> wrote: >> > >> > On 07/03/16 14:16, Lukasz Kaczmarczyk wrote: >> >> >> >>> On 7 Mar 2016, at 13:50, Matthew Knepley > >>> > wrote: >> >>> >> >>> On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk >> >>> > >>> > wrote: >> >>> >> >>> Hello, >> >>> >> >>> I run multi-grid solver, with adaptivity, works well, however It >> >>> is some space for improving efficiency. I using hierarchical >> >>> approximation basis, for which >> >>> construction of interpolation operators is simple, it is simple >> >>> injection. >> >>> >> >>> After each refinement level (increase of order of approximation >> >>> on some element) I rebuild multigrid pre-conditioner with >> >>> additional level. It is a way to add dynamically new levels >> >>> without need of rebuilding whole MG pre-conditioner. >> >>> >> >>> >> >>> That does not currently exist, however it would not be hard to add, >> >>> since the MG structure jsut consists of >> >>> arrays of pointers. >> >>> >> >>> >> >>> Looking at execution profile I noticed that 50%-60% of time is >> >>> spent on MatPtAP function during PCSetUP stage. >> >>> >> >>> >> >>> Which means you are using a Galerkin projection to define the coarse >> >>> operator. Do you have a direct way of defining >> >>> this operator (rediscretization)? >> >> >> >> Matt, >> >> >> >> >> >> Thanks for swift response. You are right, I using Galerkin projection. >> >> >> >> Yes, I have a way to get directly coarse operator, it is some sub >> >> matrix of whole matrix. I taking advantage here form hierarchical >> >> approximation. >> >> >> >> I could reimplement PCSetUp_MG to set the MG structure directly, but >> >> this probably not good approach, since my implementation which will >> >> work with current petsc version could be incompatible which future >> >> changes in native MG data structures. 
The alternative option is to >> >> hack MatPtAP itself, and until petsc MG will use this, whatever >> >> changes you will make in MG in the future my code will work. >> > >> > Why not provide a shell DM to the KSP that knows how to compute the >> > operators (and how to refine/coarsen and therefore >> > restrict/interpolate). Then there's no need to use Galerkin coarse >> > grid operators, and the KSP will just call back to your code to create >> > the appropriate matrices. >> >> Hello Lawrence, >> >> Thanks, it is good advice. >> I have already my DM shell, however I have not looked yet how make it in >> the context of MG. Now is probably time to do that. >> >> DM shell >> http://userweb.eng.gla.ac.uk/lukasz.kaczmarczyk/MoFem/html/group__dm.html >> >> >> Regards, >> Lukasz > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From juan at tf.uni-kiel.de Mon Mar 7 11:21:24 2016 From: juan at tf.uni-kiel.de (Julian Andrej) Date: Mon, 07 Mar 2016 18:21:24 +0100 Subject: [petsc-users] Mass matrix with PetscFE In-Reply-To: References: <56CDA67F.6000906@tf.uni-kiel.de> <56CDB469.806@tf.uni-kiel.de> <56CDB84F.8020309@tf.uni-kiel.de> <518cc2f74e6b2267660acaf3871d52f9@tf.uni-kiel.de> <56CEB565.5010203@tf.uni-kiel.de> Message-ID: <556a1fbacdac5b3b1e98e9f06955f71a@tf.uni-kiel.de> Any news about this? I've seen you merged the dmforest branch into next. On 2016-02-26 01:22, Matthew Knepley wrote: > I am sorry about the delay. I have your example working but it exposed > a bug in Plex so I need to push the fix first. I should have > everything for you early next week. > > Thanks > > Matt > > On Feb 25, 2016 2:04 AM, "Julian Andrej" wrote: > >> After a bit of rethinking the problem, the discrepancy between the >> size of matrix A and the mass matrix M arises because of the >> Dirichlet boundary conditions. So why aren't the BCs not imposed on >> the mass matrix? Do I need to handle Dirichlet BCs differently in >> this context (like zero rows and put one the diagonal?) >> >> On 24.02.2016 20 [1]:54, juan wrote: >> I attached another example which creates the correct mass matrix >> but also overwrites the DM for the SNES solve. Somehow i cannot >> manage >> to really copy the DM to dm_mass and use that. If i try to do that >> with >> DMClone(dm, &dm_mass) i get a smaller mass matrix (which is not of >> size A). >> >> Maybe this helps in the discussion. >> >> Relevant code starts at line 455. >> >> On 2016-02-24 15:03, Julian Andrej wrote: >> Thanks Matt, >> >> I attached the modified example. >> >> the corresponding code (and only changes to ex12) is starting at >> line >> 832. >> >> It also seems that the mass matrix is of size 169x169 and the >> stiffness matrix is of dimension 225x225. I'd assume that if i >> multiply test and trial function i'd get a matrix of same size (if >> the >> space/quadrature is the same for the stiffness matrix) >> >> On 24.02.2016 14 [2]:56, Matthew Knepley wrote: >> On Wed, Feb 24, 2016 at 7:47 AM, Julian Andrej > > wrote: >> >> I'm now using the petsc git master branch. 
>> >> I tried adding my code to the ex12 >> >> DM dm_mass; >> PetscDS prob_mass; >> PetscFE fe; >> Mat M; >> PetscFECreateDefault(dm, user.dim, 1, PETSC_TRUE, NULL, -1, >> &fe); >> >> DMClone(dm, &dm_mass); >> DMGetDS(dm_mass, &prob_mass); >> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) fe); >> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, NULL, >> NULL); >> DMCreateMatrix(dm_mass, &M); >> >> MatSetOptionsPrefix(M, "M_";) >> >> and receive the error on running >> ./exe -interpolate -refinement_limit 0.0125 -petscspace_order 2 >> -M_mat_view binary >> >> WARNING! There are options you set that were not used! >> WARNING! could be spelling mistake, etc! >> Option left: name:-M_mat_view value: binary >> >> I don't know if the matrix is actually there and assembled or if >> the >> option is ommitted because something is wrong. >> >> Its difficult to know when I cannot see the whole code. You can >> always >> insert >> >> MatViewFromOptions(M, NULL, "-mat_view"); >> >> Using >> MatView(M, PETSC_VIEWER_STDOUT_WORLD); >> >> gives me a reasonable output to stdout. >> >> Good. >> >> But saving the matrix and analysing it in matlab, results in an >> all >> zero matrix. >> >> PetscViewerBinaryOpen(PETSC_COMM_WORLD, "Mout",FILE_MODE_WRITE, >> &viewer); >> MatView(M, viewer); >> >> I cannot explain this, but it has to be something like you are >> viewing >> the matrix before it is >> actually assembled. Feel free to send the code. It sounds like it is >> mostly working. >> >> Matt >> >> Any hints? >> >> On 24.02.2016 13 [3] :58, Matthew Knepley >> wrote: >> >> On Wed, Feb 24, 2016 at 6:47 AM, Julian Andrej >> >> >> >> wrote: >> >> Hi, >> >> i'm trying to assemble a mass matrix with the >> PetscFE/DMPlex >> interface. I found something in the examples of TAO >> >> > https://bitbucket.org/petsc/petsc/src/da8116b0e8d067e39fd79740a8a864b0fe207998/src/tao/examples/tutorials/ex3.c?at=master&fileviewer=file-view-default >> >> but using the lines >> >> DMClone(dm, &dm_mass); >> DMSetNumFields(dm_mass, 1); >> DMPlexCopyCoordinates(dm, dm_mass); >> DMGetDS(dm_mass, &prob_mass); >> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, >> NULL, NULL); >> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) >> fe); >> DMPlexSNESComputeJacobianFEM(dm_mass, u, M, M, NULL); >> DMCreateMatrix(dm_mass, &M); >> >> leads to errors in DMPlexSNESComputeJacobianFEM (u is a >> global vector). >> >> I don't can understand the necessary commands until >> DMPlexSNESComputeJacobianFEM. What does it do and why >> is it >> necessary? (especially why does the naming involve >> SNES?) >> >> Is there another/easier/better way to create a mass >> matrix (the >> inner product of the function space and the test >> space)? >> >> 1) That example needs updating. First, look at SNES ex12 >> which >> is up to >> date. >> >> 2) I assume you are using 3.6. If you use the development >> version, you >> can remove DMPlexCopyCoordinates(). >> >> 3) You need to create the matrix BEFORE calling the assembly >> >> 4) Always always always send the entire error messge >> >> Matt >> >> Regards >> Julian Andrej >> >> -- >> What most experimenters take for granted before they begin >> their >> experiments is infinitely more interesting than any results >> to which >> their experiments lead. >> -- Norbert Wiener >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. 
>> -- Norbert Wiener > > > Links: > ------ > [1] tel:24.02.2016%2020 > [2] tel:24.02.2016%2014 > [3] tel:24.02.2016%2013 From Lukasz.Kaczmarczyk at glasgow.ac.uk Mon Mar 7 11:41:05 2016 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Mon, 7 Mar 2016 17:41:05 +0000 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: References: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> <56DD8E5B.2000108@imperial.ac.uk> <2D37758E-47E5-47E2-A799-0F2F770A408A@glasgow.ac.uk> Message-ID: <6150D526-76C6-4F22-8721-C9876313C1BC@glasgow.ac.uk> On 7 Mar 2016, at 16:42, Matthew Knepley > wrote: On Mon, Mar 7, 2016 at 10:28 AM, Lukasz Kaczmarczyk > wrote: Many thanks all for help. I started to implement function for DM. I understand that minimal implementation is that for the DM i need to have, is to have DMCoarsen and in each level for all DMs, set operators DMKSPSetComputeOperators and DMCreateInterpolation. Matrix matrix free P from DMCreateInterpolation have to have operators for mult and mult_traspose. Is that is all? Yes, that should be it. It would be nice to have some example that does this if you would be willing to contribute some version of your code. No problem, I will do that will pleasure. Kind regards, Lukasz On 7 Mar 2016, at 15:55, Mark Adams > wrote: You can just set the coarse grid matrix/operator instead of using Galerkin. If you have a shell (matrix free) P then you will need to create and set this yourself. Our Galerkin requires a matrix P. On Mon, Mar 7, 2016 at 9:32 AM, Lukasz Kaczmarczyk > wrote: > On 7 Mar 2016, at 14:21, Lawrence Mitchell > wrote: > > On 07/03/16 14:16, Lukasz Kaczmarczyk wrote: >> >>> On 7 Mar 2016, at 13:50, Matthew Knepley >>> >> wrote: >>> >>> On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk >>> >>> >> wrote: >>> >>> Hello, >>> >>> I run multi-grid solver, with adaptivity, works well, however It >>> is some space for improving efficiency. I using hierarchical >>> approximation basis, for which >>> construction of interpolation operators is simple, it is simple >>> injection. >>> >>> After each refinement level (increase of order of approximation >>> on some element) I rebuild multigrid pre-conditioner with >>> additional level. It is a way to add dynamically new levels >>> without need of rebuilding whole MG pre-conditioner. >>> >>> >>> That does not currently exist, however it would not be hard to add, >>> since the MG structure jsut consists of >>> arrays of pointers. >>> >>> >>> Looking at execution profile I noticed that 50%-60% of time is >>> spent on MatPtAP function during PCSetUP stage. >>> >>> >>> Which means you are using a Galerkin projection to define the coarse >>> operator. Do you have a direct way of defining >>> this operator (rediscretization)? >> >> Matt, >> >> >> Thanks for swift response. You are right, I using Galerkin projection. >> >> Yes, I have a way to get directly coarse operator, it is some sub >> matrix of whole matrix. I taking advantage here form hierarchical >> approximation. >> >> I could reimplement PCSetUp_MG to set the MG structure directly, but >> this probably not good approach, since my implementation which will >> work with current petsc version could be incompatible which future >> changes in native MG data structures. The alternative option is to >> hack MatPtAP itself, and until petsc MG will use this, whatever >> changes you will make in MG in the future my code will work. 
> > Why not provide a shell DM to the KSP that knows how to compute the > operators (and how to refine/coarsen and therefore > restrict/interpolate). Then there's no need to use Galerkin coarse > grid operators, and the KSP will just call back to your code to create > the appropriate matrices. Hello Lawrence, Thanks, it is good advice. I have already my DM shell, however I have not looked yet how make it in the context of MG. Now is probably time to do that. DM shell http://userweb.eng.gla.ac.uk/lukasz.kaczmarczyk/MoFem/html/group__dm.html Regards, Lukasz -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Mar 7 13:14:32 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 7 Mar 2016 13:14:32 -0600 Subject: [petsc-users] MatIsSymmetric / MatIsHermitian issues for MPI+complex In-Reply-To: <3BA41C86-F5C8-475A-AAF0-BEE8F5D76EB2@gmail.com> References: <7CFE9B1A-41F2-4FC9-B420-7D6A7B984D91@gmail.com> <3BA41C86-F5C8-475A-AAF0-BEE8F5D76EB2@gmail.com> Message-ID: <0A468523-4845-4AD3-894A-9B73759815BB@mcs.anl.gov> If you know the matrix is symmetric or Hermitian you can call MatSetOption() to tell the matrix. Then it doesn't need to try to check. Barry > On Mar 7, 2016, at 4:26 AM, Denis Davydov wrote: > > Nevermind, i should have check the error code: > > 45: [1]PETSC ERROR: No support for this operation for this object type > 45: [1]PETSC ERROR: Matrix of type does not support checking for symmetric > > >> On 7 Mar 2016, at 11:21, Denis Davydov wrote: >> >> Dear all, >> >> I call MatIsSymmetric and MatIsHermitian for MPI complex-valued matrix in PETSc. >> For a moment, i store a mass matrix (real-valued, symmetric) in a matrix. >> However, the result of both functions is NOT PETSC_TRUE. >> The same program works fine with a single MPI core. >> >> Are there any known issues of MatIsSymmetric / MatIsHermitian for COMPLEX+MPI matrices? >> I am quite certain that my matrix is indeed symmetric (i also test vmult and Tvmult on a vector and the results is the same). >> >> Kind regards, >> Denis >> > From mono at mek.dtu.dk Mon Mar 7 13:28:42 2016 From: mono at mek.dtu.dk (=?iso-8859-1?Q?Morten_Nobel-J=F8rgensen?=) Date: Mon, 7 Mar 2016 19:28:42 +0000 Subject: [petsc-users] DMPlex : Assemble global stiffness matrix problem Message-ID: <6B03D347796DED499A2696FC095CE81A013A556A@ait-pex02mbx04.win.dtu.dk> I have some problems using DMPlex on unstructured grids in 3D. After I have created the DMPlex and assigned dofs (3 dofs on each node), I run into some problems when assembling the global stiffness matrix. I have created a small example in the attached cc file. My problems are: * It seems like the matrix (created using DMCreateMatrix) contains no non-zero elements. I was under the impression that the sparsity pattern of the matrix would be created automatically when the dofs has been assigned to the default section. * (Probably as a consequence of this) when assigning values to the matrix I get an: "Argument of of range. New nonzero at (0,0) caused a malloc. Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check" * Finally, I'm reading the nodes of each element using the get-transitive-clojure (where I test if each point is inside the node range), but I have a hard time understanding if the returned values are sorted. 
And if not, how to sort the values (e.g. using the orientation, which DMPlexGetTransitiveClosure also returns).

I hope someone can guide me in the right direction :)

Kind regards,
Morten
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmplex_mat_problem.cc
Type: application/octet-stream
Size: 4608 bytes
Desc: dmplex_mat_problem.cc
URL: 

From jychang48 at gmail.com Tue Mar 8 01:05:45 2016
From: jychang48 at gmail.com (Justin Chang)
Date: Tue, 8 Mar 2016 01:05:45 -0600
Subject: [petsc-users] Tao TRON solver tolerances
Message-ID: 

Hi all,

So I am solving a convex optimization problem of the following form:

  min 1/2 x^T*H*x - x^T*f
  s.t. 0 < x < 1

I am using the TAOTRON solver, with CG/ILU for the KSP/PC. The following TAO solver tolerances are used for my specific problem: -tao_gatol 1e-12 -tao_grtol 1e-7

I noticed that the KSP tolerance largely determines the performance of this solver. Attached are three run cases with -ksp_rtol 1e-7, 1e-3, and 1e-1, run with "-ksp_converged_reason -ksp_monitor_true_residual -tao_view -tao_converged_reason -log_view". It seems that the looser the KSP relative tolerance, the faster the time-to-solution, while the number of KSP/TAO solve iterations remains roughly the same.

So my question is, is this "normal"? That is, when using TRON, may one relax the KSP tolerances because the convergence of the solver is driven primarily by the objective functional from TRON and not necessarily by the KSP solve itself? Is there a general rule of thumb for this? It would seem to me that for any TRON solve I do, I could just set a really loose KSP rtol and still get roughly the same performance.

Thanks,
Justin
-------------- next part --------------
An HTML attachment was scrubbed...
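
For reference, a minimal sketch (not taken from the attached code) of configuring the inner KSP of a TRON solve in code rather than on the command line; "tao" is an already created Tao object, ConfigureTronInnerKSP is a hypothetical helper, and the tolerance values simply mirror the options used in the attached runs.

  #include <petsctao.h>

  /* Sketch: set a loose relative tolerance on the KSP that TAOTRON uses for its
     trust-region Newton step. Call after TaoSetType() and before TaoSolve(). */
  static PetscErrorCode ConfigureTronInnerKSP(Tao tao, PetscReal inner_rtol)
  {
    KSP            ksp;
    PetscErrorCode ierr;
    ierr = TaoSetType(tao, TAOTRON);CHKERRQ(ierr);
    ierr = TaoGetKSP(tao, &ksp);CHKERRQ(ierr);
    ierr = KSPSetType(ksp, KSPCG);CHKERRQ(ierr);
    /* loose inner tolerance: each subproblem is solved only approximately;
       the outer TAO tolerances (-tao_gatol, -tao_grtol) still decide convergence */
    ierr = KSPSetTolerances(ksp, inner_rtol, 1.0e-12, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
    return 0;
  }

The same effect is available purely from the options database, which is what the attached runs do: -tao_type tron -ksp_type cg -ksp_rtol 1e-1 -ksp_atol 1e-12.
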
URL: -------------- next part -------------- ================================================== MESHID = 2 ================================================== ========================================== 1 processors: ========================================== TSTEP ANALYSIS TIME ITER FLOPS/s iter = 0, Function value: -8.83019e-08, Residual: 0.214312 0 KSP preconditioned resid norm 2.721099185886e+01 true resid norm 2.014138134175e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.869407744459e+00 true resid norm 5.822300398427e-02 ||r(i)||/||b|| 2.890715537151e-01 2 KSP preconditioned resid norm 2.242110432761e+00 true resid norm 2.437887036293e-02 ||r(i)||/||b|| 1.210387209759e-01 Linear solve converged due to CONVERGED_RTOL iterations 2 iter = 1, Function value: -3.0227, Residual: 0.106411 0 KSP preconditioned resid norm 1.186015682827e+01 true resid norm 9.951846352790e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.164408153341e+00 true resid norm 5.442552117697e-02 ||r(i)||/||b|| 5.468886802268e-01 2 KSP preconditioned resid norm 3.853466576672e+00 true resid norm 4.181722382562e-02 ||r(i)||/||b|| 4.201956334856e-01 3 KSP preconditioned resid norm 2.431361972872e+00 true resid norm 2.470649935207e-02 ||r(i)||/||b|| 2.482604581726e-01 4 KSP preconditioned resid norm 1.609420674026e+00 true resid norm 1.813070633647e-02 ||r(i)||/||b|| 1.821843474441e-01 5 KSP preconditioned resid norm 9.854058612210e-01 true resid norm 1.107435193707e-02 ||r(i)||/||b|| 1.112793701238e-01 Linear solve converged due to CONVERGED_RTOL iterations 5 iter = 2, Function value: -3.99311, Residual: 0.0561794 0 KSP preconditioned resid norm 4.745132151277e+00 true resid norm 5.157265414092e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.328460972818e+00 true resid norm 2.524613942204e-02 ||r(i)||/||b|| 4.895256961772e-01 2 KSP preconditioned resid norm 1.986092654798e+00 true resid norm 2.326548422397e-02 ||r(i)||/||b|| 4.511205523843e-01 3 KSP preconditioned resid norm 1.581250833192e+00 true resid norm 1.833592737837e-02 ||r(i)||/||b|| 3.555358490620e-01 4 KSP preconditioned resid norm 1.264530116640e+00 true resid norm 1.498893001019e-02 ||r(i)||/||b|| 2.906371653713e-01 5 KSP preconditioned resid norm 9.039045119540e-01 true resid norm 1.088737764483e-02 ||r(i)||/||b|| 2.111075690439e-01 6 KSP preconditioned resid norm 6.770469952665e-01 true resid norm 8.290769243775e-03 ||r(i)||/||b|| 1.607590181634e-01 7 KSP preconditioned resid norm 4.831932891684e-01 true resid norm 6.012308310732e-03 ||r(i)||/||b|| 1.165793851583e-01 8 KSP preconditioned resid norm 3.779260278323e-01 true resid norm 4.752323454993e-03 ||r(i)||/||b|| 9.214812644717e-02 Linear solve converged due to CONVERGED_RTOL iterations 8 iter = 3, Function value: -4.22244, Residual: 0.0325071 0 KSP preconditioned resid norm 1.940628934982e+00 true resid norm 2.873883700031e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.015778944284e+00 true resid norm 1.553407311454e-02 ||r(i)||/||b|| 5.405254608727e-01 2 KSP preconditioned resid norm 8.599452198867e-01 true resid norm 1.386942785729e-02 ||r(i)||/||b|| 4.826022659561e-01 3 KSP preconditioned resid norm 7.020844649773e-01 true resid norm 1.064542414722e-02 ||r(i)||/||b|| 3.704194483272e-01 4 KSP preconditioned resid norm 6.268936884604e-01 true resid norm 9.121337013737e-03 ||r(i)||/||b|| 3.173871306497e-01 5 KSP preconditioned resid norm 5.564638147818e-01 true resid norm 8.110633817555e-03 ||r(i)||/||b|| 
2.822185816868e-01 6 KSP preconditioned resid norm 4.912179620266e-01 true resid norm 7.198522694011e-03 ||r(i)||/||b|| 2.504806542427e-01 7 KSP preconditioned resid norm 3.956512930624e-01 true resid norm 5.694860378165e-03 ||r(i)||/||b|| 1.981590409558e-01 8 KSP preconditioned resid norm 3.006404654133e-01 true resid norm 4.340865664011e-03 ||r(i)||/||b|| 1.510452793885e-01 9 KSP preconditioned resid norm 2.289271612498e-01 true resid norm 3.323460860340e-03 ||r(i)||/||b|| 1.156435404921e-01 10 KSP preconditioned resid norm 1.804442605262e-01 true resid norm 2.511590540611e-03 ||r(i)||/||b|| 8.739360401342e-02 Linear solve converged due to CONVERGED_RTOL iterations 10 iter = 4, Function value: -4.28596, Residual: 0.0248212 0 KSP preconditioned resid norm 1.242236085862e+00 true resid norm 2.153752870715e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.957410723366e-01 true resid norm 1.052346229945e-02 ||r(i)||/||b|| 4.886104827782e-01 2 KSP preconditioned resid norm 4.938478940613e-01 true resid norm 8.455413879932e-03 ||r(i)||/||b|| 3.925897903563e-01 3 KSP preconditioned resid norm 3.996372234663e-01 true resid norm 6.708749489546e-03 ||r(i)||/||b|| 3.114911455612e-01 4 KSP preconditioned resid norm 3.533933541706e-01 true resid norm 5.541768408825e-03 ||r(i)||/||b|| 2.573075344055e-01 5 KSP preconditioned resid norm 3.011467876410e-01 true resid norm 4.649589019097e-03 ||r(i)||/||b|| 2.158831257903e-01 6 KSP preconditioned resid norm 2.820617262411e-01 true resid norm 4.230731348737e-03 ||r(i)||/||b|| 1.964353202386e-01 7 KSP preconditioned resid norm 2.474494234788e-01 true resid norm 3.760788996797e-03 ||r(i)||/||b|| 1.746156231726e-01 8 KSP preconditioned resid norm 2.007715137983e-01 true resid norm 3.102224817428e-03 ||r(i)||/||b|| 1.440381048174e-01 9 KSP preconditioned resid norm 1.593316367603e-01 true resid norm 2.462750533118e-03 ||r(i)||/||b|| 1.143469414065e-01 10 KSP preconditioned resid norm 1.196765532100e-01 true resid norm 1.858819975947e-03 ||r(i)||/||b|| 8.630609394523e-02 Linear solve converged due to CONVERGED_RTOL iterations 10 iter = 5, Function value: -4.31231, Residual: 0.0178714 0 KSP preconditioned resid norm 8.724820160650e-01 true resid norm 1.544152460821e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 4.086541001524e-01 true resid norm 7.145592176529e-03 ||r(i)||/||b|| 4.627517267777e-01 2 KSP preconditioned resid norm 3.121531275140e-01 true resid norm 5.477393530594e-03 ||r(i)||/||b|| 3.547184406702e-01 3 KSP preconditioned resid norm 2.484547325551e-01 true resid norm 4.127131397922e-03 ||r(i)||/||b|| 2.672748645382e-01 4 KSP preconditioned resid norm 2.256450612299e-01 true resid norm 3.594296333391e-03 ||r(i)||/||b|| 2.327682288238e-01 5 KSP preconditioned resid norm 1.984197481212e-01 true resid norm 3.051442614320e-03 ||r(i)||/||b|| 1.976127805863e-01 6 KSP preconditioned resid norm 1.851089001362e-01 true resid norm 2.880086620499e-03 ||r(i)||/||b|| 1.865156902297e-01 7 KSP preconditioned resid norm 1.595287233540e-01 true resid norm 2.475260836854e-03 ||r(i)||/||b|| 1.602989924673e-01 8 KSP preconditioned resid norm 1.340161905551e-01 true resid norm 2.026197933587e-03 ||r(i)||/||b|| 1.312174791672e-01 9 KSP preconditioned resid norm 1.109874347554e-01 true resid norm 1.693574409327e-03 ||r(i)||/||b|| 1.096766318286e-01 10 KSP preconditioned resid norm 8.838638640722e-02 true resid norm 1.344541648458e-03 ||r(i)||/||b|| 8.707311503057e-02 11 KSP preconditioned resid norm 7.078139183917e-02 true resid 
norm 1.102015180203e-03 ||r(i)||/||b|| 7.136699310228e-02 Linear solve converged due to CONVERGED_RTOL iterations 11 iter = 6, Function value: -4.32451, Residual: 0.012743 0 KSP preconditioned resid norm 6.585450027189e-01 true resid norm 1.106508779676e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.124040704236e-01 true resid norm 5.371194417470e-03 ||r(i)||/||b|| 4.854181472507e-01 2 KSP preconditioned resid norm 2.450872206945e-01 true resid norm 4.128857181184e-03 ||r(i)||/||b|| 3.731427402133e-01 3 KSP preconditioned resid norm 1.881012686091e-01 true resid norm 3.046303666750e-03 ||r(i)||/||b|| 2.753076814847e-01 4 KSP preconditioned resid norm 1.654668210997e-01 true resid norm 2.597468829383e-03 ||r(i)||/||b|| 2.347445295594e-01 5 KSP preconditioned resid norm 1.456341868261e-01 true resid norm 2.162923592241e-03 ||r(i)||/||b|| 1.954727908146e-01 6 KSP preconditioned resid norm 1.390883474576e-01 true resid norm 2.101080391977e-03 ||r(i)||/||b|| 1.898837524445e-01 7 KSP preconditioned resid norm 1.291359255246e-01 true resid norm 1.993398122762e-03 ||r(i)||/||b|| 1.801520384995e-01 8 KSP preconditioned resid norm 1.104861736197e-01 true resid norm 1.650746908258e-03 ||r(i)||/||b|| 1.491851613452e-01 9 KSP preconditioned resid norm 9.133820903286e-02 true resid norm 1.362255111735e-03 ||r(i)||/||b|| 1.231129058130e-01 10 KSP preconditioned resid norm 7.227228862579e-02 true resid norm 1.065877252166e-03 ||r(i)||/||b|| 9.632795254263e-02 11 KSP preconditioned resid norm 5.834541793924e-02 true resid norm 8.722602027492e-04 ||r(i)||/||b|| 7.882993960560e-02 Linear solve converged due to CONVERGED_RTOL iterations 11 iter = 7, Function value: -4.33142, Residual: 0.0100395 0 KSP preconditioned resid norm 4.844498651731e-01 true resid norm 8.783633194377e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.225839755798e-01 true resid norm 3.983362860863e-03 ||r(i)||/||b|| 4.534983158692e-01 2 KSP preconditioned resid norm 1.789285946937e-01 true resid norm 3.119887218350e-03 ||r(i)||/||b|| 3.551932496848e-01 3 KSP preconditioned resid norm 1.415228072598e-01 true resid norm 2.378930476398e-03 ||r(i)||/||b|| 2.708367282369e-01 4 KSP preconditioned resid norm 1.242740497928e-01 true resid norm 1.972701944215e-03 ||r(i)||/||b|| 2.245883793824e-01 5 KSP preconditioned resid norm 1.051804755179e-01 true resid norm 1.639604035928e-03 ||r(i)||/||b|| 1.866658135244e-01 6 KSP preconditioned resid norm 1.038457207482e-01 true resid norm 1.578293594766e-03 ||r(i)||/||b|| 1.796857359409e-01 7 KSP preconditioned resid norm 1.014306499320e-01 true resid norm 1.579982235591e-03 ||r(i)||/||b|| 1.798779844999e-01 8 KSP preconditioned resid norm 9.365403832022e-02 true resid norm 1.437831541215e-03 ||r(i)||/||b|| 1.636943972268e-01 9 KSP preconditioned resid norm 8.073642564701e-02 true resid norm 1.252875180321e-03 ||r(i)||/||b|| 1.426374659091e-01 10 KSP preconditioned resid norm 6.451791116641e-02 true resid norm 9.986451561283e-04 ||r(i)||/||b|| 1.136938592526e-01 11 KSP preconditioned resid norm 5.131501196541e-02 true resid norm 8.013561140981e-04 ||r(i)||/||b|| 9.123287554985e-02 12 KSP preconditioned resid norm 4.001747598157e-02 true resid norm 6.143371253542e-04 ||r(i)||/||b|| 6.994111795873e-02 Linear solve converged due to CONVERGED_RTOL iterations 12 iter = 8, Function value: -4.3355, Residual: 0.00804573 0 KSP preconditioned resid norm 3.927967759982e-01 true resid norm 6.996544917363e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid 
norm 1.800224761582e-01 true resid norm 3.153228793600e-03 ||r(i)||/||b|| 4.506837061497e-01 2 KSP preconditioned resid norm 1.357792561257e-01 true resid norm 2.423708844490e-03 ||r(i)||/||b|| 3.464151053294e-01 3 KSP preconditioned resid norm 1.082240584444e-01 true resid norm 1.829982585479e-03 ||r(i)||/||b|| 2.615551828929e-01 4 KSP preconditioned resid norm 9.363099041109e-02 true resid norm 1.570490319062e-03 ||r(i)||/||b|| 2.244665527930e-01 5 KSP preconditioned resid norm 8.174815908543e-02 true resid norm 1.289998130047e-03 ||r(i)||/||b|| 1.843764522752e-01 6 KSP preconditioned resid norm 7.429374880715e-02 true resid norm 1.154098942305e-03 ||r(i)||/||b|| 1.649526953570e-01 7 KSP preconditioned resid norm 7.219161853252e-02 true resid norm 1.129425436021e-03 ||r(i)||/||b|| 1.614261681103e-01 8 KSP preconditioned resid norm 7.140040463483e-02 true resid norm 1.109123697083e-03 ||r(i)||/||b|| 1.585244874696e-01 9 KSP preconditioned resid norm 6.611837149181e-02 true resid norm 1.040265174441e-03 ||r(i)||/||b|| 1.486826979212e-01 10 KSP preconditioned resid norm 5.723451778786e-02 true resid norm 8.745407982162e-04 ||r(i)||/||b|| 1.249960957223e-01 11 KSP preconditioned resid norm 4.752885127324e-02 true resid norm 7.436279136968e-04 ||r(i)||/||b|| 1.062850196032e-01 12 KSP preconditioned resid norm 3.826257032419e-02 true resid norm 5.906179588884e-04 ||r(i)||/||b|| 8.441566028150e-02 Linear solve converged due to CONVERGED_RTOL iterations 12 iter = 9, Function value: -4.33807, Residual: 0.00609756 0 KSP preconditioned resid norm 3.014204700543e-01 true resid norm 5.359088645381e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.335648892797e-01 true resid norm 2.404129486454e-03 ||r(i)||/||b|| 4.486078968904e-01 2 KSP preconditioned resid norm 1.026709162654e-01 true resid norm 1.868672524106e-03 ||r(i)||/||b|| 3.486922213381e-01 3 KSP preconditioned resid norm 8.186335101742e-02 true resid norm 1.414883222636e-03 ||r(i)||/||b|| 2.640156407668e-01 4 KSP preconditioned resid norm 7.114741869343e-02 true resid norm 1.177159623060e-03 ||r(i)||/||b|| 2.196566806325e-01 5 KSP preconditioned resid norm 6.077811101007e-02 true resid norm 9.883812187486e-04 ||r(i)||/||b|| 1.844308396728e-01 6 KSP preconditioned resid norm 5.638312083581e-02 true resid norm 9.012450232314e-04 ||r(i)||/||b|| 1.681713221908e-01 7 KSP preconditioned resid norm 5.440497120591e-02 true resid norm 8.954491957738e-04 ||r(i)||/||b|| 1.670898272126e-01 8 KSP preconditioned resid norm 5.279744965994e-02 true resid norm 8.220781002841e-04 ||r(i)||/||b|| 1.533988621354e-01 9 KSP preconditioned resid norm 5.186699669966e-02 true resid norm 8.142183035325e-04 ||r(i)||/||b|| 1.519322327751e-01 10 KSP preconditioned resid norm 4.854397247258e-02 true resid norm 7.493637905932e-04 ||r(i)||/||b|| 1.398304525601e-01 11 KSP preconditioned resid norm 4.216185475436e-02 true resid norm 6.692323036225e-04 ||r(i)||/||b|| 1.248780059272e-01 12 KSP preconditioned resid norm 3.418515268876e-02 true resid norm 5.421796129518e-04 ||r(i)||/||b|| 1.011701147021e-01 13 KSP preconditioned resid norm 2.777127852232e-02 true resid norm 4.527426091689e-04 ||r(i)||/||b|| 8.448126894842e-02 Linear solve converged due to CONVERGED_RTOL iterations 13 iter = 10, Function value: -4.33961, Residual: 0.00500503 0 KSP preconditioned resid norm 2.364109097575e-01 true resid norm 4.341645005845e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.075164754673e-01 true resid norm 1.984083890306e-03 ||r(i)||/||b|| 
4.569889725287e-01 2 KSP preconditioned resid norm 8.249562330338e-02 true resid norm 1.511522880572e-03 ||r(i)||/||b|| 3.481452026910e-01 3 KSP preconditioned resid norm 6.341806612782e-02 true resid norm 1.157742542537e-03 ||r(i)||/||b|| 2.666598814455e-01 4 KSP preconditioned resid norm 5.488390196976e-02 true resid norm 9.561094185789e-04 ||r(i)||/||b|| 2.202182392369e-01 5 KSP preconditioned resid norm 4.699055957820e-02 true resid norm 8.116723202675e-04 ||r(i)||/||b|| 1.869504114626e-01 6 KSP preconditioned resid norm 4.251478965931e-02 true resid norm 7.020592632127e-04 ||r(i)||/||b|| 1.617035161252e-01 7 KSP preconditioned resid norm 4.111471419652e-02 true resid norm 6.818479737396e-04 ||r(i)||/||b|| 1.570483014668e-01 8 KSP preconditioned resid norm 3.966182002913e-02 true resid norm 6.777014814735e-04 ||r(i)||/||b|| 1.560932504986e-01 9 KSP preconditioned resid norm 3.786696314412e-02 true resid norm 6.233329856217e-04 ||r(i)||/||b|| 1.435706937768e-01 10 KSP preconditioned resid norm 3.592246077746e-02 true resid norm 5.802390462398e-04 ||r(i)||/||b|| 1.336449768368e-01 11 KSP preconditioned resid norm 3.451852041643e-02 true resid norm 5.559778331473e-04 ||r(i)||/||b|| 1.280569536198e-01 12 KSP preconditioned resid norm 3.089067169545e-02 true resid norm 4.955418187357e-04 ||r(i)||/||b|| 1.141368808524e-01 13 KSP preconditioned resid norm 2.659663552837e-02 true resid norm 4.320291892634e-04 ||r(i)||/||b|| 9.950817919976e-02 14 KSP preconditioned resid norm 2.205795227573e-02 true resid norm 3.562783170277e-04 ||r(i)||/||b|| 8.206067436377e-02 Linear solve converged due to CONVERGED_RTOL iterations 14 iter = 11, Function value: -4.3406, Residual: 0.00390632 0 KSP preconditioned resid norm 1.747564104597e-01 true resid norm 3.395819827250e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 7.787815596856e-02 true resid norm 1.493210285798e-03 ||r(i)||/||b|| 4.397201152475e-01 2 KSP preconditioned resid norm 6.186264636865e-02 true resid norm 1.134482127872e-03 ||r(i)||/||b|| 3.340819553406e-01 3 KSP preconditioned resid norm 4.576006410759e-02 true resid norm 8.543626673755e-04 ||r(i)||/||b|| 2.515924609780e-01 4 KSP preconditioned resid norm 3.878702900095e-02 true resid norm 7.155901004428e-04 ||r(i)||/||b|| 2.107267572622e-01 5 KSP preconditioned resid norm 3.318034878195e-02 true resid norm 5.953898616426e-04 ||r(i)||/||b|| 1.753302271413e-01 6 KSP preconditioned resid norm 2.942467993480e-02 true resid norm 5.187434008216e-04 ||r(i)||/||b|| 1.527594004426e-01 7 KSP preconditioned resid norm 2.703947889515e-02 true resid norm 4.630148110579e-04 ||r(i)||/||b|| 1.363484621129e-01 8 KSP preconditioned resid norm 2.720330355809e-02 true resid norm 4.870594319460e-04 ||r(i)||/||b|| 1.434291148304e-01 9 KSP preconditioned resid norm 2.664515458323e-02 true resid norm 4.697630852266e-04 ||r(i)||/||b|| 1.383356918577e-01 10 KSP preconditioned resid norm 2.506812781995e-02 true resid norm 4.241419599700e-04 ||r(i)||/||b|| 1.249011966320e-01 11 KSP preconditioned resid norm 2.418386634738e-02 true resid norm 3.970510826002e-04 ||r(i)||/||b|| 1.169234832231e-01 12 KSP preconditioned resid norm 2.253268729861e-02 true resid norm 3.681670226612e-04 ||r(i)||/||b|| 1.084177139514e-01 13 KSP preconditioned resid norm 2.086412868143e-02 true resid norm 3.434711644824e-04 ||r(i)||/||b|| 1.011452850726e-01 14 KSP preconditioned resid norm 1.802390740608e-02 true resid norm 2.981089954033e-04 ||r(i)||/||b|| 8.778704718405e-02 15 KSP preconditioned resid norm 
1.523179504165e-02 true resid norm 2.520494847704e-04 ||r(i)||/||b|| 7.422345636476e-02 Linear solve converged due to CONVERGED_RTOL iterations 15 iter = 12, Function value: -4.34116, Residual: 0.00294104 0 KSP preconditioned resid norm 1.269253589305e-01 true resid norm 2.536068316040e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.809228196354e-02 true resid norm 1.138219826417e-03 ||r(i)||/||b|| 4.488127623448e-01 2 KSP preconditioned resid norm 4.556056028734e-02 true resid norm 8.657705419405e-04 ||r(i)||/||b|| 3.413829731891e-01 3 KSP preconditioned resid norm 3.433911748363e-02 true resid norm 6.647369933728e-04 ||r(i)||/||b|| 2.621132045886e-01 4 KSP preconditioned resid norm 2.827313089667e-02 true resid norm 5.358088790468e-04 ||r(i)||/||b|| 2.112754122821e-01 5 KSP preconditioned resid norm 2.344783293039e-02 true resid norm 4.357245597546e-04 ||r(i)||/||b|| 1.718110498044e-01 6 KSP preconditioned resid norm 2.062192561669e-02 true resid norm 3.836228029287e-04 ||r(i)||/||b|| 1.512667464447e-01 7 KSP preconditioned resid norm 1.878393491291e-02 true resid norm 3.359546817721e-04 ||r(i)||/||b|| 1.324706750395e-01 8 KSP preconditioned resid norm 1.823608390763e-02 true resid norm 3.212608041473e-04 ||r(i)||/||b|| 1.266767153375e-01 9 KSP preconditioned resid norm 1.815816434403e-02 true resid norm 3.300728123964e-04 ||r(i)||/||b|| 1.301513883947e-01 10 KSP preconditioned resid norm 1.724348490858e-02 true resid norm 3.012362342909e-04 ||r(i)||/||b|| 1.187808042810e-01 11 KSP preconditioned resid norm 1.644315792321e-02 true resid norm 2.840324848441e-04 ||r(i)||/||b|| 1.119971741485e-01 12 KSP preconditioned resid norm 1.539350847902e-02 true resid norm 2.533482074977e-04 ||r(i)||/||b|| 9.989802163266e-02 13 KSP preconditioned resid norm 1.442282393653e-02 true resid norm 2.380954915861e-04 ||r(i)||/||b|| 9.388370576620e-02 14 KSP preconditioned resid norm 1.338532412048e-02 true resid norm 2.181743171858e-04 ||r(i)||/||b|| 8.602856469044e-02 15 KSP preconditioned resid norm 1.221850738038e-02 true resid norm 2.084925892766e-04 ||r(i)||/||b|| 8.221095147867e-02 Linear solve converged due to CONVERGED_RTOL iterations 15 iter = 13, Function value: -4.34146, Residual: 0.00221753 0 KSP preconditioned resid norm 9.473751270336e-02 true resid norm 1.909704210823e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 4.098346688738e-02 true resid norm 7.993849698937e-04 ||r(i)||/||b|| 4.185909866897e-01 2 KSP preconditioned resid norm 3.147349124242e-02 true resid norm 6.253145723838e-04 ||r(i)||/||b|| 3.274405370424e-01 3 KSP preconditioned resid norm 2.378922461333e-02 true resid norm 4.511846746496e-04 ||r(i)||/||b|| 2.362589306201e-01 4 KSP preconditioned resid norm 1.985385606519e-02 true resid norm 3.836304045900e-04 ||r(i)||/||b|| 2.008847246688e-01 5 KSP preconditioned resid norm 1.665027084316e-02 true resid norm 3.062258292002e-04 ||r(i)||/||b|| 1.603524920062e-01 6 KSP preconditioned resid norm 1.429886591495e-02 true resid norm 2.772412865586e-04 ||r(i)||/||b|| 1.451749883502e-01 7 KSP preconditioned resid norm 1.254670895261e-02 true resid norm 2.243650167112e-04 ||r(i)||/||b|| 1.174867895455e-01 8 KSP preconditioned resid norm 1.225840302025e-02 true resid norm 2.271283663539e-04 ||r(i)||/||b|| 1.189337935512e-01 9 KSP preconditioned resid norm 1.272415101076e-02 true resid norm 2.402247841755e-04 ||r(i)||/||b|| 1.257916188350e-01 10 KSP preconditioned resid norm 1.292755831832e-02 true resid norm 2.381278572875e-04 ||r(i)||/||b|| 
1.246935813086e-01 11 KSP preconditioned resid norm 1.215071659815e-02 true resid norm 2.219278455830e-04 ||r(i)||/||b|| 1.162105860820e-01 12 KSP preconditioned resid norm 1.080209989258e-02 true resid norm 1.914458622085e-04 ||r(i)||/||b|| 1.002489606105e-01 13 KSP preconditioned resid norm 9.986110973588e-03 true resid norm 1.748694884789e-04 ||r(i)||/||b|| 9.156888668300e-02 14 KSP preconditioned resid norm 9.156264188533e-03 true resid norm 1.536430336021e-04 ||r(i)||/||b|| 8.045383820770e-02 Linear solve converged due to CONVERGED_RTOL iterations 14 iter = 14, Function value: -4.34162, Residual: 0.00156036 0 KSP preconditioned resid norm 6.713096290304e-02 true resid norm 1.342490878243e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.056343317500e-02 true resid norm 6.209434475362e-04 ||r(i)||/||b|| 4.625308503763e-01 2 KSP preconditioned resid norm 2.413003317910e-02 true resid norm 4.901365410386e-04 ||r(i)||/||b|| 3.650948762349e-01 3 KSP preconditioned resid norm 1.784129496231e-02 true resid norm 3.533841764796e-04 ||r(i)||/||b|| 2.632302254017e-01 4 KSP preconditioned resid norm 1.509269101819e-02 true resid norm 2.888061883045e-04 ||r(i)||/||b|| 2.151271140721e-01 5 KSP preconditioned resid norm 1.249447123177e-02 true resid norm 2.437343259740e-04 ||r(i)||/||b|| 1.815538041443e-01 6 KSP preconditioned resid norm 1.062803951770e-02 true resid norm 2.008814504101e-04 ||r(i)||/||b|| 1.496333819959e-01 7 KSP preconditioned resid norm 9.694555842257e-03 true resid norm 1.786364367197e-04 ||r(i)||/||b|| 1.330634268096e-01 8 KSP preconditioned resid norm 9.563414684320e-03 true resid norm 1.804822937697e-04 ||r(i)||/||b|| 1.344383762264e-01 9 KSP preconditioned resid norm 9.704981715287e-03 true resid norm 1.839496949251e-04 ||r(i)||/||b|| 1.370211879323e-01 10 KSP preconditioned resid norm 9.783862413149e-03 true resid norm 1.850154077014e-04 ||r(i)||/||b|| 1.378150203475e-01 11 KSP preconditioned resid norm 9.520103796076e-03 true resid norm 1.793309452756e-04 ||r(i)||/||b|| 1.335807551336e-01 12 KSP preconditioned resid norm 8.746398790501e-03 true resid norm 1.614504296672e-04 ||r(i)||/||b|| 1.202618448168e-01 13 KSP preconditioned resid norm 7.912808926215e-03 true resid norm 1.429323415866e-04 ||r(i)||/||b|| 1.064680169549e-01 14 KSP preconditioned resid norm 6.925586071034e-03 true resid norm 1.238413598242e-04 ||r(i)||/||b|| 9.224744974527e-02 15 KSP preconditioned resid norm 6.357804411682e-03 true resid norm 1.127345454971e-04 ||r(i)||/||b|| 8.397416125813e-02 Linear solve converged due to CONVERGED_RTOL iterations 15 iter = 15, Function value: -4.34171, Residual: 0.00108123 0 KSP preconditioned resid norm 4.091066173772e-02 true resid norm 9.205007375890e-04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.728655515714e-02 true resid norm 3.621621547874e-04 ||r(i)||/||b|| 3.934403743511e-01 2 KSP preconditioned resid norm 1.312952276461e-02 true resid norm 2.693997658761e-04 ||r(i)||/||b|| 2.926665399331e-01 3 KSP preconditioned resid norm 1.004715817716e-02 true resid norm 2.071999047139e-04 ||r(i)||/||b|| 2.250947731521e-01 4 KSP preconditioned resid norm 8.313225063985e-03 true resid norm 1.623374316661e-04 ||r(i)||/||b|| 1.763577420821e-01 5 KSP preconditioned resid norm 6.922329050111e-03 true resid norm 1.346166507364e-04 ||r(i)||/||b|| 1.462428493963e-01 6 KSP preconditioned resid norm 5.897122308306e-03 true resid norm 1.157989984308e-04 ||r(i)||/||b|| 1.258000061293e-01 7 KSP preconditioned resid norm 5.051886677466e-03 
true resid norm 9.686063834150e-05 ||r(i)||/||b|| 1.052260301227e-01 8 KSP preconditioned resid norm 4.990655930937e-03 true resid norm 9.587954730827e-05 ||r(i)||/||b|| 1.041602069319e-01 9 KSP preconditioned resid norm 5.256056069039e-03 true resid norm 1.039298284360e-04 ||r(i)||/||b|| 1.129057524802e-01 10 KSP preconditioned resid norm 5.388899329733e-03 true resid norm 1.014432042850e-04 ||r(i)||/||b|| 1.102043704502e-01 11 KSP preconditioned resid norm 5.568998571499e-03 true resid norm 1.039508376503e-04 ||r(i)||/||b|| 1.129285761601e-01 12 KSP preconditioned resid norm 5.314829123567e-03 true resid norm 9.585391683975e-05 ||r(i)||/||b|| 1.041323628820e-01 13 KSP preconditioned resid norm 4.921864652163e-03 true resid norm 8.926201545095e-05 ||r(i)||/||b|| 9.697115038142e-02 14 KSP preconditioned resid norm 4.230993871273e-03 true resid norm 7.629079023767e-05 ||r(i)||/||b|| 8.287966225589e-02 15 KSP preconditioned resid norm 3.724966780398e-03 true resid norm 6.689643138772e-05 ||r(i)||/||b|| 7.267395739730e-02 Linear solve converged due to CONVERGED_RTOL iterations 15 iter = 16, Function value: -4.34174, Residual: 0.000716126 0 KSP preconditioned resid norm 2.657708116285e-02 true resid norm 6.067231827511e-04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.206582807212e-02 true resid norm 2.480656122767e-04 ||r(i)||/||b|| 4.088612720415e-01 2 KSP preconditioned resid norm 8.471076282395e-03 true resid norm 1.766667490069e-04 ||r(i)||/||b|| 2.911818009093e-01 3 KSP preconditioned resid norm 6.431100488802e-03 true resid norm 1.308967705201e-04 ||r(i)||/||b|| 2.157438091067e-01 4 KSP preconditioned resid norm 5.302874374075e-03 true resid norm 1.099545540100e-04 ||r(i)||/||b|| 1.812268875427e-01 5 KSP preconditioned resid norm 4.232576702283e-03 true resid norm 8.329816074698e-05 ||r(i)||/||b|| 1.372918706836e-01 6 KSP preconditioned resid norm 3.680835249549e-03 true resid norm 7.277393297652e-05 ||r(i)||/||b|| 1.199458584169e-01 7 KSP preconditioned resid norm 3.091116068705e-03 true resid norm 5.903744039608e-05 ||r(i)||/||b|| 9.730539737806e-02 8 KSP preconditioned resid norm 2.859903952853e-03 true resid norm 5.558654845147e-05 ||r(i)||/||b|| 9.161764381480e-02 9 KSP preconditioned resid norm 2.959547218210e-03 true resid norm 6.029256120238e-05 ||r(i)||/||b|| 9.937408511241e-02 10 KSP preconditioned resid norm 2.960691089167e-03 true resid norm 5.706028380906e-05 ||r(i)||/||b|| 9.404665163829e-02 11 KSP preconditioned resid norm 2.919227733677e-03 true resid norm 5.529431686412e-05 ||r(i)||/||b|| 9.113598826633e-02 12 KSP preconditioned resid norm 2.836355055820e-03 true resid norm 5.347171431598e-05 ||r(i)||/||b|| 8.813197820053e-02 13 KSP preconditioned resid norm 2.712688618055e-03 true resid norm 5.076214586420e-05 ||r(i)||/||b|| 8.366607261325e-02 14 KSP preconditioned resid norm 2.448623059457e-03 true resid norm 4.647915303924e-05 ||r(i)||/||b|| 7.660685195593e-02 Linear solve converged due to CONVERGED_RTOL iterations 14 iter = 17, Function value: -4.34175, Residual: 0.000390283 0 KSP preconditioned resid norm 1.463210012272e-02 true resid norm 3.337911064811e-04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.953212140664e-03 true resid norm 1.254784058875e-04 ||r(i)||/||b|| 3.759189608446e-01 2 KSP preconditioned resid norm 4.378466206900e-03 true resid norm 9.143895341207e-05 ||r(i)||/||b|| 2.739406522122e-01 3 KSP preconditioned resid norm 3.284562245867e-03 true resid norm 6.623843167459e-05 ||r(i)||/||b|| 
1.984427697097e-01 4 KSP preconditioned resid norm 2.655713575004e-03 true resid norm 5.460415615247e-05 ||r(i)||/||b|| 1.635878101371e-01 5 KSP preconditioned resid norm 2.176701317881e-03 true resid norm 4.282073391863e-05 ||r(i)||/||b|| 1.282860240647e-01 6 KSP preconditioned resid norm 1.809985345342e-03 true resid norm 3.627734130945e-05 ||r(i)||/||b|| 1.086827677702e-01 7 KSP preconditioned resid norm 1.671822205568e-03 true resid norm 3.134253340574e-05 ||r(i)||/||b|| 9.389864737908e-02 8 KSP preconditioned resid norm 1.730122795767e-03 true resid norm 3.488673498476e-05 ||r(i)||/||b|| 1.045166701790e-01 9 KSP preconditioned resid norm 1.863476415530e-03 true resid norm 3.800859799482e-05 ||r(i)||/||b|| 1.138694149030e-01 10 KSP preconditioned resid norm 1.895181284617e-03 true resid norm 3.750233032315e-05 ||r(i)||/||b|| 1.123526948291e-01 11 KSP preconditioned resid norm 1.887184454742e-03 true resid norm 3.639149265995e-05 ||r(i)||/||b|| 1.090247521679e-01 12 KSP preconditioned resid norm 1.752339462009e-03 true resid norm 3.317162220026e-05 ||r(i)||/||b|| 9.937838832786e-02 13 KSP preconditioned resid norm 1.651797928389e-03 true resid norm 3.089273083278e-05 ||r(i)||/||b|| 9.255109028654e-02 14 KSP preconditioned resid norm 1.473025608447e-03 true resid norm 2.790010717354e-05 ||r(i)||/||b|| 8.358553188451e-02 15 KSP preconditioned resid norm 1.285396944788e-03 true resid norm 2.494276452832e-05 ||r(i)||/||b|| 7.472567136757e-02 Linear solve converged due to CONVERGED_RTOL iterations 15 iter = 18, Function value: -4.34176, Residual: 0.000221268 0 KSP preconditioned resid norm 7.856682822254e-03 true resid norm 1.888109931492e-04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.135270791267e-03 true resid norm 6.991202817395e-05 ||r(i)||/||b|| 3.702751995945e-01 2 KSP preconditioned resid norm 2.218933015390e-03 true resid norm 4.844277762920e-05 ||r(i)||/||b|| 2.565675696167e-01 3 KSP preconditioned resid norm 1.571522532195e-03 true resid norm 3.247549146301e-05 ||r(i)||/||b|| 1.720000033968e-01 4 KSP preconditioned resid norm 1.247112933454e-03 true resid norm 2.586716290383e-05 ||r(i)||/||b|| 1.370003010544e-01 5 KSP preconditioned resid norm 9.788243976253e-04 true resid norm 1.964760930240e-05 ||r(i)||/||b|| 1.040596682147e-01 6 KSP preconditioned resid norm 8.011021804226e-04 true resid norm 1.638368719111e-05 ||r(i)||/||b|| 8.677295171138e-02 7 KSP preconditioned resid norm 6.591982417773e-04 true resid norm 1.309563057529e-05 ||r(i)||/||b|| 6.935841158859e-02 Linear solve converged due to CONVERGED_RTOL iterations 7 iter = 19, Function value: -4.34176, Residual: 0.000102701 0 KSP preconditioned resid norm 3.429190965449e-03 true resid norm 8.715160652501e-05 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.434311663947e-03 true resid norm 3.103086400409e-05 ||r(i)||/||b|| 3.560561330007e-01 2 KSP preconditioned resid norm 1.053762592208e-03 true resid norm 2.208124684907e-05 ||r(i)||/||b|| 2.533659186504e-01 3 KSP preconditioned resid norm 8.068283526000e-04 true resid norm 1.517650946581e-05 ||r(i)||/||b|| 1.741391819490e-01 4 KSP preconditioned resid norm 7.662529384600e-04 true resid norm 1.525702671004e-05 ||r(i)||/||b|| 1.750630575658e-01 5 KSP preconditioned resid norm 7.959531067619e-04 true resid norm 1.570028905586e-05 ||r(i)||/||b|| 1.801491639900e-01 6 KSP preconditioned resid norm 7.536694004422e-04 true resid norm 1.502645711346e-05 ||r(i)||/||b|| 1.724174425763e-01 7 KSP preconditioned resid norm 6.825156968620e-04 
true resid norm 1.452568267560e-05 ||r(i)||/||b|| 1.666714275821e-01 8 KSP preconditioned resid norm 6.502855938557e-04 true resid norm 1.343621650213e-05 ||r(i)||/||b|| 1.541706118552e-01 9 KSP preconditioned resid norm 6.192398826140e-04 true resid norm 1.230171857097e-05 ||r(i)||/||b|| 1.411530901319e-01 10 KSP preconditioned resid norm 5.930844572842e-04 true resid norm 1.168092253097e-05 ||r(i)||/||b|| 1.340299163345e-01 11 KSP preconditioned resid norm 5.991549064583e-04 true resid norm 1.176759628689e-05 ||r(i)||/||b|| 1.350244333535e-01 12 KSP preconditioned resid norm 5.853895100617e-04 true resid norm 1.142250594801e-05 ||r(i)||/||b|| 1.310647778447e-01 13 KSP preconditioned resid norm 5.209546030685e-04 true resid norm 1.011803961803e-05 ||r(i)||/||b|| 1.160969948973e-01 14 KSP preconditioned resid norm 4.423272486127e-04 true resid norm 8.627376370560e-06 ||r(i)||/||b|| 9.899274051918e-02 15 KSP preconditioned resid norm 4.043775603666e-04 true resid norm 7.888183144271e-06 ||r(i)||/||b|| 9.051104688481e-02 16 KSP preconditioned resid norm 3.601606825286e-04 true resid norm 6.932677165733e-06 ||r(i)||/||b|| 7.954732496806e-02 17 KSP preconditioned resid norm 3.204466047350e-04 true resid norm 6.179505453231e-06 ||r(i)||/||b|| 7.090523857937e-02 Linear solve converged due to CONVERGED_RTOL iterations 17 iter = 20, Function value: -4.34176, Residual: 4.55416e-05 0 KSP preconditioned resid norm 1.438682776749e-03 true resid norm 3.844799940215e-05 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.089293961259e-04 true resid norm 1.340839840480e-05 ||r(i)||/||b|| 3.487411208203e-01 2 KSP preconditioned resid norm 4.622706374230e-04 true resid norm 9.841148913902e-06 ||r(i)||/||b|| 2.559599736509e-01 3 KSP preconditioned resid norm 3.355050244055e-04 true resid norm 7.017484134459e-06 ||r(i)||/||b|| 1.825188369636e-01 4 KSP preconditioned resid norm 2.845831399181e-04 true resid norm 5.732764582707e-06 ||r(i)||/||b|| 1.491043662050e-01 5 KSP preconditioned resid norm 2.286164775932e-04 true resid norm 4.835735621852e-06 ||r(i)||/||b|| 1.257734003601e-01 6 KSP preconditioned resid norm 1.817665502920e-04 true resid norm 3.772088923163e-06 ||r(i)||/||b|| 9.810884784168e-02 7 KSP preconditioned resid norm 1.442303629690e-04 true resid norm 2.978303378356e-06 ||r(i)||/||b|| 7.746315607230e-02 8 KSP preconditioned resid norm 1.194506703629e-04 true resid norm 2.373116358758e-06 ||r(i)||/||b|| 6.172275269609e-02 Linear solve converged due to CONVERGED_RTOL iterations 8 iter = 21, Function value: -4.34176, Residual: 1.6774e-05 0 KSP preconditioned resid norm 5.344946439561e-04 true resid norm 1.423016797066e-05 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.656909077159e-04 true resid norm 3.738015940921e-06 ||r(i)||/||b|| 2.626824889648e-01 2 KSP preconditioned resid norm 1.315452507202e-04 true resid norm 2.531809935108e-06 ||r(i)||/||b|| 1.779184855954e-01 3 KSP preconditioned resid norm 9.672515990631e-05 true resid norm 1.932462105630e-06 ||r(i)||/||b|| 1.358003721118e-01 4 KSP preconditioned resid norm 9.158697364759e-05 true resid norm 1.775273053083e-06 ||r(i)||/||b|| 1.247541881967e-01 5 KSP preconditioned resid norm 9.935270670260e-05 true resid norm 2.190584381305e-06 ||r(i)||/||b|| 1.539394605756e-01 6 KSP preconditioned resid norm 1.154177758923e-04 true resid norm 2.595766444746e-06 ||r(i)||/||b|| 1.824129167061e-01 7 KSP preconditioned resid norm 1.092024185062e-04 true resid norm 2.287364912714e-06 ||r(i)||/||b|| 1.607405420252e-01 8 
KSP preconditioned resid norm 9.816936429818e-05 true resid norm 2.104671520724e-06 ||r(i)||/||b|| 1.479020855596e-01 9 KSP preconditioned resid norm 9.265175504977e-05 true resid norm 1.963568913548e-06 ||r(i)||/||b|| 1.379863482705e-01 10 KSP preconditioned resid norm 8.656522105138e-05 true resid norm 1.845455093663e-06 ||r(i)||/||b|| 1.296861075335e-01 11 KSP preconditioned resid norm 8.410971308097e-05 true resid norm 1.708185075236e-06 ||r(i)||/||b|| 1.200396986710e-01 12 KSP preconditioned resid norm 8.111397603877e-05 true resid norm 1.721965942453e-06 ||r(i)||/||b|| 1.210081248516e-01 13 KSP preconditioned resid norm 7.602298282947e-05 true resid norm 1.550742188904e-06 ||r(i)||/||b|| 1.089756770335e-01 14 KSP preconditioned resid norm 6.337971365825e-05 true resid norm 1.303081343883e-06 ||r(i)||/||b|| 9.157174719017e-02 15 KSP preconditioned resid norm 5.146539909823e-05 true resid norm 1.041756105678e-06 ||r(i)||/||b|| 7.320757617376e-02 Linear solve converged due to CONVERGED_RTOL iterations 15 iter = 22, Function value: -4.34176, Residual: 2.40396e-06 0 KSP preconditioned resid norm 8.733125062330e-05 true resid norm 2.098718745258e-06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 4.062776294490e-05 true resid norm 7.264615749494e-07 ||r(i)||/||b|| 3.461452739157e-01 2 KSP preconditioned resid norm 3.419958090507e-05 true resid norm 5.958586335411e-07 ||r(i)||/||b|| 2.839154293006e-01 3 KSP preconditioned resid norm 2.736481374734e-05 true resid norm 5.749969858640e-07 ||r(i)||/||b|| 2.739752466419e-01 4 KSP preconditioned resid norm 2.654682566261e-05 true resid norm 5.319475026305e-07 ||r(i)||/||b|| 2.534629777489e-01 5 KSP preconditioned resid norm 2.446732124913e-05 true resid norm 4.706119067035e-07 ||r(i)||/||b|| 2.242377201646e-01 6 KSP preconditioned resid norm 2.213311130786e-05 true resid norm 4.559669864079e-07 ||r(i)||/||b|| 2.172596911512e-01 7 KSP preconditioned resid norm 1.923264290264e-05 true resid norm 3.613519809240e-07 ||r(i)||/||b|| 1.721774209815e-01 8 KSP preconditioned resid norm 1.870548067815e-05 true resid norm 3.877983028150e-07 ||r(i)||/||b|| 1.847785958415e-01 9 KSP preconditioned resid norm 1.605774724722e-05 true resid norm 3.312796139470e-07 ||r(i)||/||b|| 1.578485038529e-01 10 KSP preconditioned resid norm 1.194961089627e-05 true resid norm 2.214807563218e-07 ||r(i)||/||b|| 1.055314137839e-01 11 KSP preconditioned resid norm 9.997801906888e-06 true resid norm 1.920600557586e-07 ||r(i)||/||b|| 9.151300344200e-02 12 KSP preconditioned resid norm 9.025223549114e-06 true resid norm 1.688161624342e-07 ||r(i)||/||b|| 8.043772554833e-02 13 KSP preconditioned resid norm 8.794638859002e-06 true resid norm 1.714140620437e-07 ||r(i)||/||b|| 8.167557583930e-02 14 KSP preconditioned resid norm 8.471152217056e-06 true resid norm 1.642328286332e-07 ||r(i)||/||b|| 7.825385321604e-02 Linear solve converged due to CONVERGED_RTOL iterations 14 iter = 23, Function value: -4.34176, Residual: 2.45834e-07 Tao Object: 1 MPI processes type: tron Total PG its: 69, PG tolerance: 0.001 TaoLineSearch Object: 1 MPI processes type: more-thuente KSP Object: 1 MPI processes type: cg total KSP iterations: 272 Active Set subset type: subvec convergence tolerances: gatol=1e-12, steptol=1e-12, gttol=0. 
Residual in Function/Gradient:=2.45834e-07 Objective value=-4.34176 total number of iterations=23, (max: 50) total number of function/gradient evaluations=93, (max: 10000) total number of Hessian evaluations=23 Solution converged: ||g(X)||/|f(X)| <= grtol TAO solve converged due to CONVERGED_GRTOL iterations 23 it: 1 7.392125e+00 272 4.851840e+08 ========================================== Time summary: ========================================== Creating DMPlex: 0.523867 Distributing DMPlex: 1.3113e-05 Refining DMPlex: 2.10119 Setting up problem: 2.84659 Overall analysis time: 7.55529 Overall FLOPS/s: 3.5727e+08 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./main on a arch-darwin-c-opt named Justins-MacBook-Pro-2.local with 1 processor, by justin Tue Mar 8 00:55:52 2016 Using Petsc Development GIT revision: pre-tsfc-438-gf0bfc80 GIT Date: 2016-03-01 11:52:01 -0600 Max Max/Min Avg Total Time (sec): 1.308e+01 1.00000 1.308e+01 Objects: 1.415e+03 1.00000 1.415e+03 Flops: 3.658e+09 1.00000 3.658e+09 3.658e+09 Flops/sec: 2.798e+08 1.00000 2.798e+08 2.798e+08 MPI Messages: 5.500e+00 1.00000 5.500e+00 5.500e+00 MPI Message Lengths: 4.494e+06 1.00000 8.170e+05 4.494e+06 MPI Reductions: 1.000e+00 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3076e+01 100.0% 3.6584e+09 100.0% 5.500e+00 100.0% 8.170e+05 100.0% 1.000e+00 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage CreateMesh 94 1.0 2.9656e+00 1.0 3.94e+08 1.0 0.0e+00 0.0e+00 0.0e+00 23 11 0 0 0 23 11 0 0 0 133 BuildTwoSided 5 1.0 5.3954e-04 1.0 0.00e+00 0.0 5.0e-01 4.0e+00 0.0e+00 0 0 9 0 0 0 0 9 0 0 0 VecView 1 1.0 1.2989e-02 1.0 4.02e+05 1.0 1.0e+00 1.0e+06 0.0e+00 0 0 18 22 0 0 0 18 22 0 31 VecDot 508 1.0 6.7075e-02 1.0 1.27e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 3 0 0 0 1891 VecTDot 544 1.0 6.3299e-02 1.0 1.12e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 1775 VecNorm 660 1.0 6.8642e-02 1.0 1.38e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 2018 VecScale 276 1.0 1.9268e-02 1.0 3.40e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1764 VecCopy 801 1.0 7.9906e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecSet 1144 1.0 1.3466e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 752 1.0 7.7411e-02 1.0 1.64e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 2122 VecAYPX 567 1.0 5.9691e-02 1.0 8.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1467 VecWAXPY 1 1.0 2.0695e-04 1.0 1.25e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 603 VecScatterBegin 46 1.0 5.3043e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMult 684 1.0 1.7788e+00 1.0 1.82e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 50 0 0 0 14 50 0 0 0 1021 MatSolve 295 1.0 8.5658e-01 1.0 7.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 7 20 0 0 0 7 20 0 0 0 856 MatLUFactorNum 23 1.0 6.6945e-01 1.0 2.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00 5 7 0 0 0 5 7 0 0 0 377 MatILUFactorSym 23 1.0 1.5076e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatAssemblyBegin 25 1.0 1.4544e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 25 1.0 7.3532e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 23 1.0 5.9605e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 23 1.0 5.4542e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 MatGetOrdering 23 1.0 6.7198e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 1 1.0 9.6512e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexInterp 3 1.0 4.7663e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 DMPlexStratify 11 1.0 6.3193e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0 DMPlexPrealloc 1 1.0 1.0380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 DMPlexResidualFE 1 1.0 6.3734e-01 1.0 4.01e+07 1.0 0.0e+00 0.0e+00 0.0e+00 5 1 0 0 0 5 1 0 0 0 63 DMPlexJacobianFE 1 1.0 1.8459e+00 1.0 8.14e+07 1.0 0.0e+00 0.0e+00 0.0e+00 14 2 0 0 0 14 2 0 0 0 44 SFSetGraph 6 1.0 2.0802e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 9 1.0 2.4407e-03 1.0 0.00e+00 0.0 4.5e+00 7.8e+05 0.0e+00 0 0 82 78 0 0 0 82 78 0 0 SFBcastEnd 9 
1.0 6.9332e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFReduceBegin 1 1.0 3.6621e-04 1.0 0.00e+00 0.0 1.0e+00 1.0e+06 0.0e+00 0 0 18 22 0 0 0 18 22 0 0 SFReduceEnd 1 1.0 3.8004e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SNESFunctionEval 1 1.0 6.4219e-01 1.0 4.01e+07 1.0 2.0e+00 1.0e+06 0.0e+00 5 1 36 44 0 5 1 36 44 0 62 SNESJacobianEval 1 1.0 1.8474e+00 1.0 8.14e+07 1.0 2.5e+00 6.0e+05 0.0e+00 14 2 45 33 0 14 2 45 33 0 44 KSPSetUp 46 1.0 5.2261e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 23 1.0 3.4166e+00 1.0 2.83e+09 1.0 0.0e+00 0.0e+00 0.0e+00 26 77 0 0 0 26 77 0 0 0 827 PCSetUp 46 1.0 8.2769e-01 1.0 2.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00 6 7 0 0 0 6 7 0 0 0 305 PCSetUpOnBlocks 23 1.0 8.2752e-01 1.0 2.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00 6 7 0 0 0 6 7 0 0 0 305 PCApply 295 1.0 8.7405e-01 1.0 7.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 7 20 0 0 0 7 20 0 0 0 839 TaoSolve 1 1.0 4.8958e+00 1.0 3.46e+09 1.0 0.0e+00 0.0e+00 0.0e+00 37 95 0 0 0 37 95 0 0 0 707 TaoHessianEval 23 1.0 1.0967e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 TaoLineSearchApply 92 1.0 6.2689e-01 1.0 5.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 5 14 0 0 0 5 14 0 0 0 806 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 4 3 2432 0. Object 7 7 4088 0. Container 15 15 8640 0. Vector 822 822 839619600 0. Vector Scatter 46 46 30544 0. Matrix 47 47 831590120 0. Distributed Mesh 29 29 135576 0. GraphPartitioner 11 11 6732 0. Star Forest Bipartite Graph 62 62 50488 0. Discrete System 29 29 25056 0. Index Set 269 269 37207752 0. IS L to G Mapping 1 1 561392 0. Section 56 54 36288 0. SNES 1 1 1340 0. SNESLineSearch 1 1 1000 0. DMSNES 1 1 672 0. Krylov Solver 3 3 3680 0. Preconditioner 3 3 2824 0. Linear Space 2 2 1296 0. Dual Space 2 2 1328 0. FE Space 2 2 1512 0. Tao 1 1 1944 0. TaoLineSearch 1 1 888 0. 
======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 #PETSc Option Table entries: -al 1 -am 0 -at 0.001 -bcloc 0,1,0,1,0,0,0,1,0,1,1,1,0,0,0,1,0,1,1,1,0,1,0,1,0,1,0,0,0,1,0,1,1,1,0,1,0.45,0.55,0.45,0.55,0.45,0.55 -bcnum 7 -bcval 0,0,0,0,0,0,1 -dim 3 -dm_refine 1 -dt 0.001 -edges 3,3 -floc 0.25,0.75,0.25,0.75,0.25,0.75 -fnum 0 -ftime 0,99 -fval 1 -ksp_atol 1.0e-12 -ksp_converged_reason -ksp_monitor_true_residual -ksp_rtol 1.0e-1 -ksp_type cg -log_view -lower 0,0 -mat_petscspace_order 0 -mesh datafiles/cube_with_hole2_mesh.dat -mu 1 -nonneg 1 -numsteps 0 -options_left 0 -pc_type bjacobi -petscpartitioner_type parmetis -progress 0 -simplex 1 -solution_petscspace_order 1 -tao_converged_reason -tao_gatol 1e-12 -tao_grtol 1e-7 -tao_monitor -tao_type tron -tao_view -trans datafiles/cube_with_hole2_trans.dat -upper 1,1 -vtuname figures/cube_with_hole_2 -vtuprint 1 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco --download-ctetgen --download-exodusii --download-fblaslapack --download-hdf5 --download-hypre --download-metis --download-ml --download-mumps --download-netcdf --download-parmetis --download-scalapack --download-superlu_dist --download-triangle --with-debugging=0 --with-mpi-dir=/usr/local/opt/open-mpi --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 PETSC_ARCH=arch-darwin-c-opt ----------------------------------------- Libraries compiled on Tue Mar 1 13:44:59 2016 on Justins-MacBook-Pro-2.local Machine characteristics: Darwin-15.3.0-x86_64-i386-64bit Using PETSc directory: /Users/justin/Software/petsc Using PETSc arch: arch-darwin-c-opt ----------------------------------------- Using C compiler: /usr/local/opt/open-mpi/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -O2 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/local/opt/open-mpi/bin/mpif90 -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O2 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/Users/justin/Software/petsc/arch-darwin-c-opt/include -I/Users/justin/Software/petsc/include -I/Users/justin/Software/petsc/include -I/Users/justin/Software/petsc/arch-darwin-c-opt/include -I/opt/X11/include -I/usr/local/opt/open-mpi/include -I/usr/local/Cellar/open-mpi/1.10.2/include ----------------------------------------- Using C linker: /usr/local/opt/open-mpi/bin/mpicc Using Fortran linker: /usr/local/opt/open-mpi/bin/mpif90 Using libraries: -Wl,-rpath,/Users/justin/Software/petsc/arch-darwin-c-opt/lib -L/Users/justin/Software/petsc/arch-darwin-c-opt/lib -lpetsc -Wl,-rpath,/Users/justin/Software/petsc/arch-darwin-c-opt/lib -L/Users/justin/Software/petsc/arch-darwin-c-opt/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lHYPRE -Wl,-rpath,/usr/local/opt/libevent/lib -L/usr/local/opt/libevent/lib -Wl,-rpath,/usr/local/Cellar/open-mpi/1.10.2/lib -L/usr/local/Cellar/open-mpi/1.10.2/lib -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.2/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.2/lib/darwin -lclang_rt.osx -lmpi_cxx -lc++ 
-Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -lclang_rt.osx -lscalapack -lml -lclang_rt.osx -lmpi_cxx -lc++ -lclang_rt.osx -lflapack -lfblas -ltriangle -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lctetgen -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lgfortran -Wl,-rpath,/usr/local/Cellar/gcc/5.3.0/lib/gcc/5/gcc/x86_64-apple-darwin15.0.0/5.3.0 -L/usr/local/Cellar/gcc/5.3.0/lib/gcc/5/gcc/x86_64-apple-darwin15.0.0/5.3.0 -Wl,-rpath,/usr/local/Cellar/gcc/5.3.0/lib/gcc/5 -L/usr/local/Cellar/gcc/5.3.0/lib/gcc/5 -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpi_cxx -lc++ -lclang_rt.osx -Wl,-rpath,/usr/local/opt/libevent/lib -L/usr/local/opt/libevent/lib -Wl,-rpath,/usr/local/Cellar/open-mpi/1.10.2/lib -L/usr/local/Cellar/open-mpi/1.10.2/lib -ldl -lmpi -lSystem -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -lclang_rt.osx -ldl
-----------------------------------------
-------------- next part --------------
iter = 0, Function value: -8.83019e-08, Residual: 0.214312
  0 KSP preconditioned resid norm 2.721099185886e+01 true resid norm 2.014138134175e-01 ||r(i)||/||b|| 1.000000000000e+00
  ...
  18 KSP preconditioned resid norm 2.631060560074e-02 true resid norm 3.738334338599e-04 ||r(i)||/||b|| 1.856046651006e-03
Linear solve converged due to CONVERGED_RTOL iterations 18
iter = 1, Function value: -3.07167, Residual: 0.109573
  0 KSP preconditioned resid norm 1.259893577029e+01 true resid norm 1.026313269991e-01 ||r(i)||/||b|| 1.000000000000e+00
  ...
  28 KSP preconditioned resid norm 1.094577819283e-02 true resid norm 1.418412281312e-04 ||r(i)||/||b|| 1.382046128395e-03
Linear solve converged due to CONVERGED_RTOL iterations 28
iter = 2, Function value: -4.18852, Residual: 0.0427957
  0 KSP preconditioned resid norm 3.843101011618e+00 true resid norm 3.958473238638e-02 ||r(i)||/||b|| 1.000000000000e+00
  ...
  27 KSP preconditioned resid norm 3.180852206834e-03 true resid norm 4.219439827144e-05 ||r(i)||/||b|| 1.065926071183e-03
Linear solve converged due to CONVERGED_RTOL iterations 27
iter = 3, Function value: -4.29603, Residual: 0.017982
  0 KSP preconditioned resid norm 1.483993572318e+00 true resid norm 1.646264132252e-02 ||r(i)||/||b|| 1.000000000000e+00
  ...
  29 KSP preconditioned resid norm 1.437855367610e-03 true resid norm 2.078819276287e-05 ||r(i)||/||b|| 1.262749540345e-03
Linear solve converged due to CONVERGED_RTOL iterations 29
iter = 4, Function value: -4.31835, Residual: 0.0133318
  0 KSP preconditioned resid norm 9.610750892886e-01 true resid norm 1.208969118888e-02 ||r(i)||/||b|| 1.000000000000e+00
  ...
  31 KSP preconditioned resid norm 8.917490629859e-04 true resid norm 1.299521925286e-05 ||r(i)||/||b|| 1.074900843192e-03
Linear solve converged due to CONVERGED_RTOL iterations 31
iter = 5, Function value: -4.32865, Residual: 0.0102632
  0 KSP preconditioned resid norm 6.435870087995e-01 true resid norm 9.247148480210e-03 ||r(i)||/||b|| 1.000000000000e+00
  ...
  33 KSP preconditioned resid norm 5.823264724413e-04 true resid norm 9.247191580772e-06 ||r(i)||/||b|| 1.000004660957e-03
Linear solve converged due to CONVERGED_RTOL iterations 33
iter = 6, Function value: -4.33452, Residual: 0.00824
  0 KSP preconditioned resid norm 4.580429629641e-01 true resid norm 7.293449405477e-03 ||r(i)||/||b|| 1.000000000000e+00
  ...
  35 KSP preconditioned resid norm 3.952174168403e-04 true resid norm 6.392397459573e-06 ||r(i)||/||b|| 8.764573666299e-04
Linear solve converged due to CONVERGED_RTOL iterations 35
iter = 7, Function value: -4.33756, Residual: 0.0064391
  0 KSP preconditioned resid norm 3.365752074078e-01 true resid norm 5.647078421727e-03 ||r(i)||/||b|| 1.000000000000e+00
  ...
  37 KSP preconditioned resid norm 3.237513670801e-04 true resid norm 5.336678318234e-06 ||r(i)||/||b|| 9.450335057686e-04
Linear solve converged due to CONVERGED_RTOL iterations 37
iter = 8, Function value: -4.3394, Residual: 0.00516987
  0 KSP preconditioned resid norm 2.579822664129e-01 true resid norm 4.535827584469e-03 ||r(i)||/||b|| 1.000000000000e+00
  ...
  40 KSP preconditioned resid norm 2.514006833538e-04 true resid norm 4.102750449524e-06 ||r(i)||/||b|| 9.045208119400e-04
Linear solve converged due to CONVERGED_RTOL iterations 40
iter = 9, Function value: -4.34051, Residual: 0.00411428
  0 KSP preconditioned resid norm 1.833039581210e-01 true resid norm 3.554527501002e-03 ||r(i)||/||b|| 1.000000000000e+00
  ...
  43 KSP preconditioned resid norm 1.579551639226e-04 true resid norm 2.673893225665e-06 ||r(i)||/||b|| 7.522499755343e-04
Linear solve converged due to CONVERGED_RTOL iterations 43
iter = 10, Function value: -4.34113, Residual: 0.00298855
  0 KSP preconditioned resid norm 1.355014589832e-01 true resid norm 2.588223194387e-03 ||r(i)||/||b|| 1.000000000000e+00
  ...
  44 KSP preconditioned resid norm 1.333939443028e-04 true resid norm 2.342338560644e-06 ||r(i)||/||b|| 9.049986746596e-04
Linear solve converged due to CONVERGED_RTOL iterations 44
iter = 11, Function value: -4.34146, Residual: 0.00215526
  0 KSP preconditioned resid norm 9.767510809572e-02 true resid norm 1.871873166738e-03 ||r(i)||/||b|| 1.000000000000e+00
  ...
  46 KSP preconditioned resid norm 8.720480360193e-05 true resid norm 1.548023800029e-06 ||r(i)||/||b|| 8.269918216344e-04
Linear solve converged due to CONVERGED_RTOL iterations 46
iter = 12, Function value: -4.34164, Residual: 0.00146073
  0 KSP preconditioned resid norm 6.455740207494e-02 true resid norm 1.261393601962e-03 ||r(i)||/||b|| 1.000000000000e+00
  ...
  46 KSP preconditioned resid norm 5.763709063664e-05 true resid norm 1.055904052875e-06 ||r(i)||/||b|| 8.370932365858e-04
Linear solve converged due to CONVERGED_RTOL iterations 46
iter = 13, Function value: -4.34171, Residual: 0.00103052
  0 KSP preconditioned resid norm 4.305778440717e-02 true resid norm 8.882832839077e-04 ||r(i)||/||b|| 1.000000000000e+00
  ...
  46 KSP preconditioned resid norm 4.191730284728e-05 true resid norm 7.688530028500e-07 ||r(i)||/||b|| 8.655493318164e-04
Linear solve converged due to CONVERGED_RTOL iterations 46
iter = 14, Function value: -4.34174, Residual: 0.000620684
  0 KSP preconditioned resid norm 2.402035244627e-02 true resid norm 5.339168201708e-04 ||r(i)||/||b|| 1.000000000000e+00
  ...
  47 KSP preconditioned resid norm 2.333701818418e-05 true resid norm 4.413986325229e-07 ||r(i)||/||b|| 8.267179752488e-04
Linear solve converged due to CONVERGED_RTOL iterations 47
iter = 15, Function value: -4.34175, Residual: 0.000372228
  0 KSP preconditioned resid norm 1.468915558432e-02 true resid norm 3.188464339552e-04 ||r(i)||/||b|| 1.000000000000e+00
  ...
  25 KSP preconditioned resid norm 2.605101620314e-04 true resid norm
4.790474066123e-06 ||r(i)||/||b|| 1.502439279844e-02 26 KSP preconditioned resid norm 2.144849190954e-04 true resid norm 4.076417877755e-06 ||r(i)||/||b|| 1.278489405445e-02 27 KSP preconditioned resid norm 1.793307174561e-04 true resid norm 3.407174237512e-06 ||r(i)||/||b|| 1.068594117628e-02 28 KSP preconditioned resid norm 1.567420077410e-04 true resid norm 3.018617895553e-06 ||r(i)||/||b|| 9.467309570029e-03 29 KSP preconditioned resid norm 1.402555648749e-04 true resid norm 2.681541035494e-06 ||r(i)||/||b|| 8.410133374333e-03 30 KSP preconditioned resid norm 1.276877270938e-04 true resid norm 2.488061260038e-06 ||r(i)||/||b|| 7.803321583918e-03 31 KSP preconditioned resid norm 1.134078423370e-04 true resid norm 2.172074405323e-06 ||r(i)||/||b|| 6.812290099593e-03 32 KSP preconditioned resid norm 9.832752555978e-05 true resid norm 1.894547971770e-06 ||r(i)||/||b|| 5.941882266862e-03 33 KSP preconditioned resid norm 8.509918954377e-05 true resid norm 1.637123465087e-06 ||r(i)||/||b|| 5.134520228998e-03 34 KSP preconditioned resid norm 7.196261075978e-05 true resid norm 1.378858631241e-06 ||r(i)||/||b|| 4.324522667972e-03 35 KSP preconditioned resid norm 6.161040834711e-05 true resid norm 1.181774564199e-06 ||r(i)||/||b|| 3.706406716046e-03 36 KSP preconditioned resid norm 5.286467547792e-05 true resid norm 1.010441421616e-06 ||r(i)||/||b|| 3.169053544308e-03 37 KSP preconditioned resid norm 4.509528243967e-05 true resid norm 8.742939726873e-07 ||r(i)||/||b|| 2.742053476471e-03 38 KSP preconditioned resid norm 3.836510806573e-05 true resid norm 7.372425762934e-07 ||r(i)||/||b|| 2.312218352729e-03 39 KSP preconditioned resid norm 3.335350649020e-05 true resid norm 6.395009145553e-07 ||r(i)||/||b|| 2.005670587632e-03 40 KSP preconditioned resid norm 2.921686595740e-05 true resid norm 5.753503065472e-07 ||r(i)||/||b|| 1.804474647592e-03 41 KSP preconditioned resid norm 2.569134258521e-05 true resid norm 5.092718836569e-07 ||r(i)||/||b|| 1.597232490072e-03 42 KSP preconditioned resid norm 2.305837518475e-05 true resid norm 4.500567830554e-07 ||r(i)||/||b|| 1.411515811774e-03 43 KSP preconditioned resid norm 2.143806569419e-05 true resid norm 4.161453842421e-07 ||r(i)||/||b|| 1.305159286494e-03 44 KSP preconditioned resid norm 2.021907568701e-05 true resid norm 3.920545293521e-07 ||r(i)||/||b|| 1.229602992540e-03 45 KSP preconditioned resid norm 1.867451599218e-05 true resid norm 3.655614378422e-07 ||r(i)||/||b|| 1.146512549341e-03 46 KSP preconditioned resid norm 1.714365324148e-05 true resid norm 3.357138403388e-07 ||r(i)||/||b|| 1.052901348697e-03 47 KSP preconditioned resid norm 1.490044315100e-05 true resid norm 2.915059776699e-07 ||r(i)||/||b|| 9.142519615285e-04 48 KSP preconditioned resid norm 1.289209906621e-05 true resid norm 2.487088051613e-07 ||r(i)||/||b|| 7.800269304448e-04 Linear solve converged due to CONVERGED_RTOL iterations 48 iter = 16, Function value: -4.34176, Residual: 0.000189954 0 KSP preconditioned resid norm 6.919740996378e-03 true resid norm 1.630810478291e-04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.803515963097e-03 true resid norm 6.186347483212e-05 ||r(i)||/||b|| 3.793419018067e-01 2 KSP preconditioned resid norm 1.984696462777e-03 true resid norm 4.329951389104e-05 ||r(i)||/||b|| 2.655091714668e-01 3 KSP preconditioned resid norm 1.407306227400e-03 true resid norm 3.096472987678e-05 ||r(i)||/||b|| 1.898732580455e-01 4 KSP preconditioned resid norm 1.108594852034e-03 true resid norm 2.432673325965e-05 ||r(i)||/||b|| 1.491695913380e-01 
5 KSP preconditioned resid norm 9.085488141096e-04 true resid norm 1.970117667878e-05 ||r(i)||/||b|| 1.208060466930e-01 6 KSP preconditioned resid norm 7.532365821783e-04 true resid norm 1.594852487577e-05 ||r(i)||/||b|| 9.779508464090e-02 7 KSP preconditioned resid norm 6.596324747605e-04 true resid norm 1.368277164478e-05 ||r(i)||/||b|| 8.390166623848e-02 8 KSP preconditioned resid norm 5.641524188554e-04 true resid norm 1.164606292562e-05 ||r(i)||/||b|| 7.141273054501e-02 9 KSP preconditioned resid norm 5.154852992353e-04 true resid norm 1.025274700879e-05 ||r(i)||/||b|| 6.286902828545e-02 10 KSP preconditioned resid norm 4.629824482631e-04 true resid norm 9.193473737474e-06 ||r(i)||/||b|| 5.637364893012e-02 11 KSP preconditioned resid norm 4.125716405477e-04 true resid norm 8.227853791804e-06 ||r(i)||/||b|| 5.045254431052e-02 12 KSP preconditioned resid norm 3.606802772222e-04 true resid norm 7.001757374962e-06 ||r(i)||/||b|| 4.293421871007e-02 13 KSP preconditioned resid norm 3.253190971582e-04 true resid norm 6.301372551030e-06 ||r(i)||/||b|| 3.863951473769e-02 14 KSP preconditioned resid norm 2.989090722505e-04 true resid norm 5.639422961564e-06 ||r(i)||/||b|| 3.458049256266e-02 15 KSP preconditioned resid norm 2.736056759370e-04 true resid norm 5.257585999221e-06 ||r(i)||/||b|| 3.223909871324e-02 16 KSP preconditioned resid norm 2.421272203655e-04 true resid norm 4.516397565448e-06 ||r(i)||/||b|| 2.769419025429e-02 17 KSP preconditioned resid norm 2.099550925843e-04 true resid norm 3.899266373154e-06 ||r(i)||/||b|| 2.390999092206e-02 18 KSP preconditioned resid norm 1.922161002573e-04 true resid norm 3.446635682806e-06 ||r(i)||/||b|| 2.113449556946e-02 19 KSP preconditioned resid norm 1.838270219830e-04 true resid norm 3.253501555804e-06 ||r(i)||/||b|| 1.995021248093e-02 20 KSP preconditioned resid norm 1.782437673081e-04 true resid norm 3.177269557424e-06 ||r(i)||/||b|| 1.948276393682e-02 21 KSP preconditioned resid norm 1.756538097674e-04 true resid norm 3.153268094814e-06 ||r(i)||/||b|| 1.933558887921e-02 22 KSP preconditioned resid norm 1.690760491498e-04 true resid norm 3.070104762396e-06 ||r(i)||/||b|| 1.882563794668e-02 23 KSP preconditioned resid norm 1.554672356028e-04 true resid norm 2.812152600567e-06 ||r(i)||/||b|| 1.724389582972e-02 24 KSP preconditioned resid norm 1.404078018036e-04 true resid norm 2.535393217112e-06 ||r(i)||/||b|| 1.554682932727e-02 25 KSP preconditioned resid norm 1.204750023360e-04 true resid norm 2.164270477978e-06 ||r(i)||/||b|| 1.327113424146e-02 26 KSP preconditioned resid norm 1.003149824142e-04 true resid norm 1.815432802409e-06 ||r(i)||/||b|| 1.113208939098e-02 27 KSP preconditioned resid norm 8.155920077114e-05 true resid norm 1.534824880234e-06 ||r(i)||/||b|| 9.411423955543e-03 28 KSP preconditioned resid norm 6.626153053525e-05 true resid norm 1.255192674416e-06 ||r(i)||/||b|| 7.696741535114e-03 29 KSP preconditioned resid norm 5.623625524811e-05 true resid norm 1.072533466794e-06 ||r(i)||/||b|| 6.576689818170e-03 30 KSP preconditioned resid norm 4.973052168679e-05 true resid norm 9.500685144794e-07 ||r(i)||/||b|| 5.825744481818e-03 31 KSP preconditioned resid norm 4.437933230173e-05 true resid norm 8.719169684246e-07 ||r(i)||/||b|| 5.346525424207e-03 32 KSP preconditioned resid norm 3.887376092832e-05 true resid norm 7.678036519720e-07 ||r(i)||/||b|| 4.708110857717e-03 33 KSP preconditioned resid norm 3.453255523384e-05 true resid norm 6.671965364472e-07 ||r(i)||/||b|| 4.091196036135e-03 34 KSP preconditioned resid norm 
3.045065573385e-05 true resid norm 6.028807137866e-07 ||r(i)||/||b|| 3.696816532712e-03 35 KSP preconditioned resid norm 2.670729920315e-05 true resid norm 5.182932887600e-07 ||r(i)||/||b|| 3.178133177701e-03 36 KSP preconditioned resid norm 2.279843241434e-05 true resid norm 4.430091552089e-07 ||r(i)||/||b|| 2.716496865247e-03 37 KSP preconditioned resid norm 1.937658552771e-05 true resid norm 3.798128200233e-07 ||r(i)||/||b|| 2.328981969881e-03 38 KSP preconditioned resid norm 1.664886533804e-05 true resid norm 3.244124192427e-07 ||r(i)||/||b|| 1.989271123538e-03 39 KSP preconditioned resid norm 1.428012688544e-05 true resid norm 2.736645091414e-07 ||r(i)||/||b|| 1.678088979586e-03 40 KSP preconditioned resid norm 1.234392712323e-05 true resid norm 2.349020105995e-07 ||r(i)||/||b|| 1.440400424981e-03 41 KSP preconditioned resid norm 1.108807017227e-05 true resid norm 2.103602761406e-07 ||r(i)||/||b|| 1.289912463409e-03 42 KSP preconditioned resid norm 9.993602831274e-06 true resid norm 1.889642601597e-07 ||r(i)||/||b|| 1.158713797067e-03 43 KSP preconditioned resid norm 8.999627355325e-06 true resid norm 1.698567571409e-07 ||r(i)||/||b|| 1.041548109986e-03 44 KSP preconditioned resid norm 8.339819751772e-06 true resid norm 1.543197862083e-07 ||r(i)||/||b|| 9.462766413544e-04 45 KSP preconditioned resid norm 7.591352710794e-06 true resid norm 1.462123980322e-07 ||r(i)||/||b|| 8.965627826074e-04 46 KSP preconditioned resid norm 7.001351730062e-06 true resid norm 1.351755116981e-07 ||r(i)||/||b|| 8.288854744159e-04 47 KSP preconditioned resid norm 6.342580178837e-06 true resid norm 1.252934122255e-07 ||r(i)||/||b|| 7.682892273098e-04 Linear solve converged due to CONVERGED_RTOL iterations 47 iter = 17, Function value: -4.34176, Residual: 8.97185e-05 0 KSP preconditioned resid norm 3.131772628445e-03 true resid norm 7.696726079340e-05 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.163429013178e-03 true resid norm 2.620162816276e-05 ||r(i)||/||b|| 3.404256289319e-01 2 KSP preconditioned resid norm 8.088443460012e-04 true resid norm 1.752784606246e-05 ||r(i)||/||b|| 2.277311922210e-01 3 KSP preconditioned resid norm 5.745171861712e-04 true resid norm 1.190522274243e-05 ||r(i)||/||b|| 1.546790495038e-01 4 KSP preconditioned resid norm 4.291150669288e-04 true resid norm 9.211156621784e-06 ||r(i)||/||b|| 1.196762951784e-01 5 KSP preconditioned resid norm 3.451045621632e-04 true resid norm 7.188254701153e-06 ||r(i)||/||b|| 9.339366669742e-02 6 KSP preconditioned resid norm 2.822266596850e-04 true resid norm 5.934421625190e-06 ||r(i)||/||b|| 7.710319379976e-02 7 KSP preconditioned resid norm 2.371591700380e-04 true resid norm 5.030008802482e-06 ||r(i)||/||b|| 6.535257654528e-02 8 KSP preconditioned resid norm 1.993562336331e-04 true resid norm 4.136406859804e-06 ||r(i)||/||b|| 5.374242005191e-02 9 KSP preconditioned resid norm 1.770995169354e-04 true resid norm 3.683964710565e-06 ||r(i)||/||b|| 4.786404859143e-02 10 KSP preconditioned resid norm 1.549953692279e-04 true resid norm 3.132373873775e-06 ||r(i)||/||b|| 4.069748411838e-02 11 KSP preconditioned resid norm 1.434404536618e-04 true resid norm 2.892575466940e-06 ||r(i)||/||b|| 3.758189439409e-02 12 KSP preconditioned resid norm 1.255664309355e-04 true resid norm 2.502509071404e-06 ||r(i)||/||b|| 3.251394223476e-02 13 KSP preconditioned resid norm 1.121491379797e-04 true resid norm 2.201035733254e-06 ||r(i)||/||b|| 2.859703867028e-02 14 KSP preconditioned resid norm 1.013439521543e-04 true resid norm 1.944569718653e-06 
||r(i)||/||b|| 2.526489443183e-02 15 KSP preconditioned resid norm 9.420811629406e-05 true resid norm 1.804097719569e-06 ||r(i)||/||b|| 2.343980675642e-02 16 KSP preconditioned resid norm 8.609868808571e-05 true resid norm 1.627104313104e-06 ||r(i)||/||b|| 2.114021333657e-02 17 KSP preconditioned resid norm 7.389516847297e-05 true resid norm 1.395061731419e-06 ||r(i)||/||b|| 1.812539145915e-02 18 KSP preconditioned resid norm 6.566263869315e-05 true resid norm 1.190597915189e-06 ||r(i)||/||b|| 1.546888771818e-02 19 KSP preconditioned resid norm 6.010141556509e-05 true resid norm 1.086442580368e-06 ||r(i)||/||b|| 1.411564565465e-02 20 KSP preconditioned resid norm 5.746872701493e-05 true resid norm 1.016774149756e-06 ||r(i)||/||b|| 1.321047597738e-02 21 KSP preconditioned resid norm 5.504608160174e-05 true resid norm 9.784184887438e-07 ||r(i)||/||b|| 1.271213862437e-02 22 KSP preconditioned resid norm 5.485725842319e-05 true resid norm 9.827616768284e-07 ||r(i)||/||b|| 1.276856765718e-02 23 KSP preconditioned resid norm 5.267180223119e-05 true resid norm 9.574046568511e-07 ||r(i)||/||b|| 1.243911563153e-02 24 KSP preconditioned resid norm 4.915420209843e-05 true resid norm 9.171520446587e-07 ||r(i)||/||b|| 1.191613206972e-02 25 KSP preconditioned resid norm 4.292723498315e-05 true resid norm 8.125278078257e-07 ||r(i)||/||b|| 1.055679777934e-02 26 KSP preconditioned resid norm 3.623045786492e-05 true resid norm 6.829936848732e-07 ||r(i)||/||b|| 8.873820866595e-03 27 KSP preconditioned resid norm 3.015115522346e-05 true resid norm 5.649206297596e-07 ||r(i)||/||b|| 7.339752304242e-03 28 KSP preconditioned resid norm 2.480151501291e-05 true resid norm 4.755141948648e-07 ||r(i)||/||b|| 6.178135871837e-03 29 KSP preconditioned resid norm 2.070247124736e-05 true resid norm 3.967120153093e-07 ||r(i)||/||b|| 5.154295621540e-03 30 KSP preconditioned resid norm 1.761036053456e-05 true resid norm 3.451675954707e-07 ||r(i)||/||b|| 4.484602828691e-03 31 KSP preconditioned resid norm 1.517375677054e-05 true resid norm 2.980672554824e-07 ||r(i)||/||b|| 3.872649908673e-03 32 KSP preconditioned resid norm 1.311092989513e-05 true resid norm 2.523513117361e-07 ||r(i)||/||b|| 3.278683808347e-03 33 KSP preconditioned resid norm 1.149466598329e-05 true resid norm 2.275801848775e-07 ||r(i)||/||b|| 2.956844020841e-03 34 KSP preconditioned resid norm 1.030809642601e-05 true resid norm 1.934620086580e-07 ||r(i)||/||b|| 2.513562346688e-03 35 KSP preconditioned resid norm 9.287424934906e-06 true resid norm 1.759852981316e-07 ||r(i)||/||b|| 2.286495534822e-03 36 KSP preconditioned resid norm 8.093773273962e-06 true resid norm 1.523592548412e-07 ||r(i)||/||b|| 1.979533288188e-03 37 KSP preconditioned resid norm 6.925754654006e-06 true resid norm 1.315557916168e-07 ||r(i)||/||b|| 1.709243518097e-03 38 KSP preconditioned resid norm 5.824519845048e-06 true resid norm 1.116692041707e-07 ||r(i)||/||b|| 1.450866290674e-03 39 KSP preconditioned resid norm 5.073056512929e-06 true resid norm 9.719495681394e-08 ||r(i)||/||b|| 1.262809093269e-03 40 KSP preconditioned resid norm 4.433635326764e-06 true resid norm 8.543104801357e-08 ||r(i)||/||b|| 1.109966070416e-03 41 KSP preconditioned resid norm 3.909169310328e-06 true resid norm 7.642591459337e-08 ||r(i)||/||b|| 9.929665393513e-04 42 KSP preconditioned resid norm 3.412826037230e-06 true resid norm 6.627368136190e-08 ||r(i)||/||b|| 8.610632713017e-04 43 KSP preconditioned resid norm 3.032814740983e-06 true resid norm 6.028305311359e-08 ||r(i)||/||b|| 7.832298108595e-04 Linear 
solve converged due to CONVERGED_RTOL iterations 43 iter = 18, Function value: -4.34176, Residual: 4.01693e-05 0 KSP preconditioned resid norm 1.322225781123e-03 true resid norm 3.405593205026e-05 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 4.995673850832e-04 true resid norm 1.060841001212e-05 ||r(i)||/||b|| 3.114996235154e-01 2 KSP preconditioned resid norm 3.162026656298e-04 true resid norm 6.910740673871e-06 ||r(i)||/||b|| 2.029232576478e-01 3 KSP preconditioned resid norm 2.129561435015e-04 true resid norm 4.635163402885e-06 ||r(i)||/||b|| 1.361044353754e-01 4 KSP preconditioned resid norm 1.610534251657e-04 true resid norm 3.312701793243e-06 ||r(i)||/||b|| 9.727238674173e-02 5 KSP preconditioned resid norm 1.309528557120e-04 true resid norm 2.794327705930e-06 ||r(i)||/||b|| 8.205112994135e-02 6 KSP preconditioned resid norm 1.044505009707e-04 true resid norm 2.095444991044e-06 ||r(i)||/||b|| 6.152951526776e-02 7 KSP preconditioned resid norm 8.588794457239e-05 true resid norm 1.808518531288e-06 ||r(i)||/||b|| 5.310436163131e-02 8 KSP preconditioned resid norm 7.030760304464e-05 true resid norm 1.462630443304e-06 ||r(i)||/||b|| 4.294789058028e-02 9 KSP preconditioned resid norm 6.258769223182e-05 true resid norm 1.314565916865e-06 ||r(i)||/||b|| 3.860020377434e-02 10 KSP preconditioned resid norm 5.300370609825e-05 true resid norm 1.066492584710e-06 ||r(i)||/||b|| 3.131591239774e-02 11 KSP preconditioned resid norm 4.922604245698e-05 true resid norm 9.609670821258e-07 ||r(i)||/||b|| 2.821731851906e-02 12 KSP preconditioned resid norm 4.507403751059e-05 true resid norm 9.159920373701e-07 ||r(i)||/||b|| 2.689669558943e-02 13 KSP preconditioned resid norm 3.913868778349e-05 true resid norm 7.502449369458e-07 ||r(i)||/||b|| 2.202978722880e-02 14 KSP preconditioned resid norm 3.490106002883e-05 true resid norm 6.425028839991e-07 ||r(i)||/||b|| 1.886610776210e-02 15 KSP preconditioned resid norm 3.160147421659e-05 true resid norm 5.815979895599e-07 ||r(i)||/||b|| 1.707772932779e-02 16 KSP preconditioned resid norm 2.959889142849e-05 true resid norm 5.404933949910e-07 ||r(i)||/||b|| 1.587075620756e-02 17 KSP preconditioned resid norm 2.699438752944e-05 true resid norm 4.971558251790e-07 ||r(i)||/||b|| 1.459821520801e-02 18 KSP preconditioned resid norm 2.294637139104e-05 true resid norm 4.044317033311e-07 ||r(i)||/||b|| 1.187551416106e-02 19 KSP preconditioned resid norm 2.002634881316e-05 true resid norm 3.487632986598e-07 ||r(i)||/||b|| 1.024089718482e-02 20 KSP preconditioned resid norm 1.828337149709e-05 true resid norm 3.147138017230e-07 ||r(i)||/||b|| 9.241086142012e-03 21 KSP preconditioned resid norm 1.773287841403e-05 true resid norm 3.009517265025e-07 ||r(i)||/||b|| 8.836983996159e-03 22 KSP preconditioned resid norm 1.764858498352e-05 true resid norm 3.111846525294e-07 ||r(i)||/||b|| 9.137458110680e-03 23 KSP preconditioned resid norm 1.735748058077e-05 true resid norm 3.079339485030e-07 ||r(i)||/||b|| 9.042006192887e-03 24 KSP preconditioned resid norm 1.663649073026e-05 true resid norm 2.973049704471e-07 ||r(i)||/||b|| 8.729902620439e-03 25 KSP preconditioned resid norm 1.485159156222e-05 true resid norm 2.701425508723e-07 ||r(i)||/||b|| 7.932319998573e-03 26 KSP preconditioned resid norm 1.285076393590e-05 true resid norm 2.335778072460e-07 ||r(i)||/||b|| 6.858652610102e-03 27 KSP preconditioned resid norm 1.104078002446e-05 true resid norm 2.044735202517e-07 ||r(i)||/||b|| 6.004050041851e-03 28 KSP preconditioned resid norm 9.054699209809e-06 true resid 
norm 1.668979996869e-07 ||r(i)||/||b|| 4.900702745138e-03 29 KSP preconditioned resid norm 7.358742030042e-06 true resid norm 1.348383757127e-07 ||r(i)||/||b|| 3.959321257563e-03 30 KSP preconditioned resid norm 6.128829660537e-06 true resid norm 1.171988042733e-07 ||r(i)||/||b|| 3.441362406419e-03 31 KSP preconditioned resid norm 5.215485372403e-06 true resid norm 9.952403620577e-08 ||r(i)||/||b|| 2.922370060490e-03 32 KSP preconditioned resid norm 4.554934587007e-06 true resid norm 8.587417602799e-08 ||r(i)||/||b|| 2.521562936561e-03 33 KSP preconditioned resid norm 4.051318747373e-06 true resid norm 7.558466847871e-08 ||r(i)||/||b|| 2.219427392771e-03 34 KSP preconditioned resid norm 3.616897295405e-06 true resid norm 6.902005905012e-08 ||r(i)||/||b|| 2.026667746114e-03 35 KSP preconditioned resid norm 3.172663661273e-06 true resid norm 5.924641615290e-08 ||r(i)||/||b|| 1.739679773423e-03 36 KSP preconditioned resid norm 2.746270981842e-06 true resid norm 5.117596250623e-08 ||r(i)||/||b|| 1.502703330236e-03 37 KSP preconditioned resid norm 2.408251147421e-06 true resid norm 4.488283146484e-08 ||r(i)||/||b|| 1.317915228354e-03 38 KSP preconditioned resid norm 2.087743431004e-06 true resid norm 3.953602156157e-08 ||r(i)||/||b|| 1.160914389400e-03 39 KSP preconditioned resid norm 1.798364630482e-06 true resid norm 3.399967529470e-08 ||r(i)||/||b|| 9.983481070060e-04 40 KSP preconditioned resid norm 1.588946507534e-06 true resid norm 3.044228998646e-08 ||r(i)||/||b|| 8.938909656482e-04 41 KSP preconditioned resid norm 1.413610303844e-06 true resid norm 2.704334203503e-08 ||r(i)||/||b|| 7.940860932866e-04 42 KSP preconditioned resid norm 1.261029310369e-06 true resid norm 2.450891486848e-08 ||r(i)||/||b|| 7.196665424486e-04 Linear solve converged due to CONVERGED_RTOL iterations 42 iter = 19, Function value: -4.34176, Residual: 1.0966e-05 0 KSP preconditioned resid norm 2.834753235122e-04 true resid norm 9.199725805332e-06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 7.671850238398e-05 true resid norm 1.828666484869e-06 ||r(i)||/||b|| 1.987740203962e-01 2 KSP preconditioned resid norm 4.552217792547e-05 true resid norm 9.245248909205e-07 ||r(i)||/||b|| 1.004948310943e-01 3 KSP preconditioned resid norm 2.943186668756e-05 true resid norm 6.427287078450e-07 ||r(i)||/||b|| 6.986389827754e-02 4 KSP preconditioned resid norm 2.190845581341e-05 true resid norm 4.365062950096e-07 ||r(i)||/||b|| 4.744775053585e-02 5 KSP preconditioned resid norm 1.677936249249e-05 true resid norm 3.471852509163e-07 ||r(i)||/||b|| 3.773865202755e-02 6 KSP preconditioned resid norm 1.325124592687e-05 true resid norm 2.686324299984e-07 ||r(i)||/||b|| 2.920004744518e-02 7 KSP preconditioned resid norm 1.053115442053e-05 true resid norm 2.125718099648e-07 ||r(i)||/||b|| 2.310632017332e-02 8 KSP preconditioned resid norm 8.838476394611e-06 true resid norm 1.769783142370e-07 ||r(i)||/||b|| 1.923734663206e-02 9 KSP preconditioned resid norm 7.544263449240e-06 true resid norm 1.509494740757e-07 ||r(i)||/||b|| 1.640804055140e-02 10 KSP preconditioned resid norm 6.414518722724e-06 true resid norm 1.297628240184e-07 ||r(i)||/||b|| 1.410507516900e-02 11 KSP preconditioned resid norm 5.782150395405e-06 true resid norm 1.150707779723e-07 ||r(i)||/||b|| 1.250806604536e-02 12 KSP preconditioned resid norm 5.123523977312e-06 true resid norm 1.031619039872e-07 ||r(i)||/||b|| 1.121358464047e-02 13 KSP preconditioned resid norm 4.443616980016e-06 true resid norm 9.182113562299e-08 ||r(i)||/||b|| 9.980855687000e-03 
14 KSP preconditioned resid norm 3.885051189303e-06 true resid norm 7.610485010864e-08 ||r(i)||/||b|| 8.272512868212e-03 15 KSP preconditioned resid norm 3.416383814036e-06 true resid norm 6.726229292368e-08 ||r(i)||/||b|| 7.311336701436e-03 16 KSP preconditioned resid norm 3.189219080477e-06 true resid norm 6.107952452345e-08 ||r(i)||/||b|| 6.639276628011e-03 17 KSP preconditioned resid norm 2.951968041422e-06 true resid norm 5.647533479336e-08 ||r(i)||/||b|| 6.138806306665e-03 18 KSP preconditioned resid norm 2.696324595003e-06 true resid norm 5.174146472721e-08 ||r(i)||/||b|| 5.624239876499e-03 19 KSP preconditioned resid norm 2.261351223372e-06 true resid norm 4.123407995435e-08 ||r(i)||/||b|| 4.482098795863e-03 20 KSP preconditioned resid norm 1.991584144236e-06 true resid norm 3.585961065524e-08 ||r(i)||/||b|| 3.897899938980e-03 21 KSP preconditioned resid norm 1.853328343141e-06 true resid norm 3.450293486182e-08 ||r(i)||/||b|| 3.750430783689e-03 22 KSP preconditioned resid norm 1.790348845217e-06 true resid norm 3.258485242794e-08 ||r(i)||/||b|| 3.541937348725e-03 23 KSP preconditioned resid norm 1.738362162705e-06 true resid norm 3.232265380528e-08 ||r(i)||/||b|| 3.513436649008e-03 24 KSP preconditioned resid norm 1.701662498152e-06 true resid norm 3.165686953835e-08 ||r(i)||/||b|| 3.441066637008e-03 25 KSP preconditioned resid norm 1.585531732168e-06 true resid norm 2.958153732956e-08 ||r(i)||/||b|| 3.215480325774e-03 26 KSP preconditioned resid norm 1.432953673649e-06 true resid norm 2.706558432705e-08 ||r(i)||/||b|| 2.941999022554e-03 27 KSP preconditioned resid norm 1.255003533522e-06 true resid norm 2.424292985953e-08 ||r(i)||/||b|| 2.635179609970e-03 28 KSP preconditioned resid norm 1.109305579838e-06 true resid norm 2.135413064960e-08 ||r(i)||/||b|| 2.321170337188e-03 29 KSP preconditioned resid norm 9.359142577955e-07 true resid norm 1.796632287660e-08 ||r(i)||/||b|| 1.952919386596e-03 30 KSP preconditioned resid norm 7.905013967673e-07 true resid norm 1.523325710354e-08 ||r(i)||/||b|| 1.655838165819e-03 31 KSP preconditioned resid norm 6.871676730748e-07 true resid norm 1.321289950920e-08 ||r(i)||/||b|| 1.436227534253e-03 32 KSP preconditioned resid norm 5.988365139967e-07 true resid norm 1.120803757866e-08 ||r(i)||/||b|| 1.218301264170e-03 33 KSP preconditioned resid norm 5.262942321561e-07 true resid norm 9.983046569993e-09 ||r(i)||/||b|| 1.085146099051e-03 34 KSP preconditioned resid norm 4.547511237936e-07 true resid norm 8.485648381441e-09 ||r(i)||/||b|| 9.223805753562e-04 35 KSP preconditioned resid norm 3.908365301856e-07 true resid norm 7.485095789562e-09 ||r(i)||/||b|| 8.136216174207e-04 36 KSP preconditioned resid norm 3.297970210831e-07 true resid norm 6.322608973062e-09 ||r(i)||/||b|| 6.872605887229e-04 37 KSP preconditioned resid norm 2.820636651212e-07 true resid norm 5.549016352309e-09 ||r(i)||/||b|| 6.031719281343e-04 Linear solve converged due to CONVERGED_RTOL iterations 37 iter = 20, Function value: -4.34176, Residual: 2.10862e-06 0 KSP preconditioned resid norm 6.830677268940e-05 true resid norm 1.856951709240e-06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.406384948778e-05 true resid norm 4.208735026516e-07 ||r(i)||/||b|| 2.266475216116e-01 2 KSP preconditioned resid norm 6.835283181980e-06 true resid norm 1.645598208924e-07 ||r(i)||/||b|| 8.861825543100e-02 3 KSP preconditioned resid norm 3.863310923639e-06 true resid norm 8.958437165489e-08 ||r(i)||/||b|| 4.824270400202e-02 4 KSP preconditioned resid norm 2.411011793283e-06 
true resid norm 4.893091319464e-08 ||r(i)||/||b|| 2.635012690485e-02 5 KSP preconditioned resid norm 1.603763839015e-06 true resid norm 3.390054202581e-08 ||r(i)||/||b|| 1.825601702895e-02 6 KSP preconditioned resid norm 1.271001343991e-06 true resid norm 2.612804387178e-08 ||r(i)||/||b|| 1.407039490676e-02 7 KSP preconditioned resid norm 9.878244657498e-07 true resid norm 1.965266376892e-08 ||r(i)||/||b|| 1.058329286170e-02 8 KSP preconditioned resid norm 7.719522165424e-07 true resid norm 1.528157234949e-08 ||r(i)||/||b|| 8.229385973502e-03 9 KSP preconditioned resid norm 6.467074169403e-07 true resid norm 1.225663278399e-08 ||r(i)||/||b|| 6.600404697121e-03 10 KSP preconditioned resid norm 5.414522730748e-07 true resid norm 1.015543143439e-08 ||r(i)||/||b|| 5.468872121906e-03 11 KSP preconditioned resid norm 4.572854991498e-07 true resid norm 9.054897049528e-09 ||r(i)||/||b|| 4.876215684271e-03 12 KSP preconditioned resid norm 3.821672457141e-07 true resid norm 7.315935548453e-09 ||r(i)||/||b|| 3.939755413160e-03 13 KSP preconditioned resid norm 3.410203695470e-07 true resid norm 6.847446322548e-09 ||r(i)||/||b|| 3.687466016740e-03 14 KSP preconditioned resid norm 2.979121924760e-07 true resid norm 5.960620275353e-09 ||r(i)||/||b|| 3.209895144658e-03 15 KSP preconditioned resid norm 2.563777646275e-07 true resid norm 4.986383366849e-09 ||r(i)||/||b|| 2.685252040771e-03 16 KSP preconditioned resid norm 2.253899930178e-07 true resid norm 4.334537331741e-09 ||r(i)||/||b|| 2.334221891810e-03 17 KSP preconditioned resid norm 2.068360116853e-07 true resid norm 3.828015372857e-09 ||r(i)||/||b|| 2.061451223427e-03 18 KSP preconditioned resid norm 1.923736253277e-07 true resid norm 3.689957572078e-09 ||r(i)||/||b|| 1.987104755454e-03 19 KSP preconditioned resid norm 1.755342218892e-07 true resid norm 3.271054896781e-09 ||r(i)||/||b|| 1.761518557809e-03 20 KSP preconditioned resid norm 1.561672957859e-07 true resid norm 2.796085708462e-09 ||r(i)||/||b|| 1.505739591692e-03 21 KSP preconditioned resid norm 1.403465702139e-07 true resid norm 2.460272490878e-09 ||r(i)||/||b|| 1.324898476701e-03 22 KSP preconditioned resid norm 1.359217556860e-07 true resid norm 2.386682689115e-09 ||r(i)||/||b|| 1.285269119945e-03 23 KSP preconditioned resid norm 1.308433962007e-07 true resid norm 2.256087388359e-09 ||r(i)||/||b|| 1.214941334841e-03 24 KSP preconditioned resid norm 1.311836867253e-07 true resid norm 2.302005923558e-09 ||r(i)||/||b|| 1.239669245088e-03 25 KSP preconditioned resid norm 1.263097518328e-07 true resid norm 2.189126164598e-09 ||r(i)||/||b|| 1.178881579799e-03 26 KSP preconditioned resid norm 1.155902039051e-07 true resid norm 2.148430753134e-09 ||r(i)||/||b|| 1.156966410297e-03 27 KSP preconditioned resid norm 1.027944710442e-07 true resid norm 1.865927709926e-09 ||r(i)||/||b|| 1.004833728654e-03 28 KSP preconditioned resid norm 9.298615403849e-08 true resid norm 1.804679271278e-09 ||r(i)||/||b|| 9.718504053168e-04 29 KSP preconditioned resid norm 8.133107372584e-08 true resid norm 1.526532984069e-09 ||r(i)||/||b|| 8.220639106948e-04 30 KSP preconditioned resid norm 6.718433072463e-08 true resid norm 1.300909235523e-09 ||r(i)||/||b|| 7.005616942272e-04 Linear solve converged due to CONVERGED_RTOL iterations 30 iter = 21, Function value: -4.34176, Residual: 2.38301e-07 Tao Object: 1 MPI processes type: tron Total PG its: 63, PG tolerance: 0.001 TaoLineSearch Object: 1 MPI processes type: more-thuente KSP Object: 1 MPI processes type: cg total KSP iterations: 797 Active Set subset type: 
subvec convergence tolerances: gatol=1e-12, steptol=1e-12, gttol=0. Residual in Function/Gradient:=2.38301e-07 Objective value=-4.34176 total number of iterations=21, (max: 50) total number of function/gradient evaluations=85, (max: 10000) total number of Hessian evaluations=21 Solution converged: ||g(X)||/|f(X)| <= grtol TAO solve converged due to CONVERGED_GRTOL iterations 21 it: 1 1.199163e+01 797 6.802495e+08 ========================================== Time summary: ========================================== Creating DMPlex: 0.533826 Distributing DMPlex: 2.31266e-05 Refining DMPlex: 2.10193 Setting up problem: 3.03744 Overall analysis time: 12.1315 Overall FLOPS/s: 5.47522e+08 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./main on a arch-darwin-c-opt named Justins-MacBook-Pro-2.local with 1 processor, by justin Tue Mar 8 00:53:22 2016 Using Petsc Development GIT revision: pre-tsfc-438-gf0bfc80 GIT Date: 2016-03-01 11:52:01 -0600 Max Max/Min Avg Total Time (sec): 1.785e+01 1.00000 1.785e+01 Objects: 2.415e+03 1.00000 2.415e+03 Flops: 8.229e+09 1.00000 8.229e+09 8.229e+09 Flops/sec: 4.610e+08 1.00000 4.610e+08 4.610e+08 MPI Messages: 5.500e+00 1.00000 5.500e+00 5.500e+00 MPI Message Lengths: 4.494e+06 1.00000 8.170e+05 4.494e+06 MPI Reductions: 1.000e+00 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.7852e+01 100.0% 8.2291e+09 100.0% 5.500e+00 100.0% 8.170e+05 100.0% 1.000e+00 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage CreateMesh 86 1.0 2.9466e+00 1.0 3.61e+08 1.0 0.0e+00 0.0e+00 0.0e+00 17 4 0 0 0 17 4 0 0 0 122 BuildTwoSided 5 1.0 5.4789e-04 1.0 0.00e+00 0.0 5.0e-01 4.0e+00 0.0e+00 0 0 9 0 0 0 0 9 0 0 0 VecView 1 1.0 1.3087e-02 1.0 4.02e+05 1.0 1.0e+00 1.0e+06 0.0e+00 0 0 18 22 0 0 0 18 22 0 31 VecDot 464 1.0 6.1516e-02 1.0 1.16e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1883 VecTDot 1594 1.0 1.8452e-01 1.0 3.28e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 1775 VecNorm 1700 1.0 1.7410e-01 1.0 3.51e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 2018 VecScale 252 1.0 1.7918e-02 1.0 3.10e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1730 VecCopy 1280 1.0 1.1792e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecSet 2693 1.0 2.1445e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 1784 1.0 1.8100e-01 1.0 3.75e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 2072 VecAYPX 1615 1.0 1.6647e-01 1.0 2.49e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 3 0 0 0 1494 VecWAXPY 1 1.0 2.0790e-04 1.0 1.25e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 600 VecScatterBegin 42 1.0 4.9083e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMult 1722 1.0 4.3076e+00 1.0 4.35e+09 1.0 0.0e+00 0.0e+00 0.0e+00 24 53 0 0 0 24 53 0 0 0 1010 MatSolve 818 1.0 2.3631e+00 1.0 2.01e+09 1.0 0.0e+00 0.0e+00 0.0e+00 13 24 0 0 0 13 24 0 0 0 852 MatLUFactorNum 21 1.0 5.9143e-01 1.0 2.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 3 0 0 0 3 3 0 0 0 376 MatILUFactorSym 21 1.0 1.3571e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatAssemblyBegin 23 1.0 1.8835e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 23 1.0 6.8571e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 21 1.0 9.0599e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 21 1.0 5.2455e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 MatGetOrdering 21 1.0 6.6700e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 1 1.0 9.4509e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexInterp 3 1.0 4.8164e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 DMPlexStratify 11 1.0 6.3346e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 DMPlexPrealloc 1 1.0 1.0882e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 6 0 0 0 0 0 DMPlexResidualFE 1 1.0 6.3628e-01 1.0 4.01e+07 1.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 63 DMPlexJacobianFE 1 1.0 1.9361e+00 1.0 8.14e+07 1.0 0.0e+00 0.0e+00 0.0e+00 11 1 0 0 0 11 1 0 0 0 42 SFSetGraph 6 1.0 2.0554e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 9 1.0 2.5361e-03 1.0 0.00e+00 0.0 4.5e+00 7.8e+05 0.0e+00 0 0 82 78 0 0 0 82 78 0 0 
SFBcastEnd 9 1.0 6.9499e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFReduceBegin 1 1.0 3.6287e-04 1.0 0.00e+00 0.0 1.0e+00 1.0e+06 0.0e+00 0 0 18 22 0 0 0 18 22 0 0 SFReduceEnd 1 1.0 3.8218e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SNESFunctionEval 1 1.0 6.4115e-01 1.0 4.01e+07 1.0 2.0e+00 1.0e+06 0.0e+00 4 0 36 44 0 4 0 36 44 0 63 SNESJacobianEval 1 1.0 1.9377e+00 1.0 8.14e+07 1.0 2.5e+00 6.0e+05 0.0e+00 11 1 45 33 0 11 1 45 33 0 42 KSPSetUp 42 1.0 5.8956e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 21 1.0 8.0150e+00 1.0 7.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 45 91 0 0 0 45 91 0 0 0 930 PCSetUp 42 1.0 7.3451e-01 1.0 2.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 3 0 0 0 4 3 0 0 0 303 PCSetUpOnBlocks 21 1.0 7.3436e-01 1.0 2.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 3 0 0 0 4 3 0 0 0 303 PCApply 818 1.0 2.4092e+00 1.0 2.01e+09 1.0 0.0e+00 0.0e+00 0.0e+00 13 24 0 0 0 13 24 0 0 0 836 TaoSolve 1 1.0 9.4062e+00 1.0 8.03e+09 1.0 0.0e+00 0.0e+00 0.0e+00 53 98 0 0 0 53 98 0 0 0 854 TaoHessianEval 21 1.0 5.2452e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 TaoLineSearchApply 84 1.0 5.7770e-01 1.0 4.61e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 798 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 4 3 2432 0. Object 7 7 4088 0. Container 15 15 8640 0. Vector 1850 1850 1684755112 0. Vector Scatter 42 42 27888 0. Matrix 43 43 742422776 0. Distributed Mesh 29 29 135576 0. GraphPartitioner 11 11 6732 0. Star Forest Bipartite Graph 62 62 50488 0. Discrete System 29 29 25056 0. Index Set 249 249 35249872 0. IS L to G Mapping 1 1 561392 0. Section 56 54 36288 0. SNES 1 1 1340 0. SNESLineSearch 1 1 1000 0. DMSNES 1 1 672 0. Krylov Solver 3 3 3680 0. Preconditioner 3 3 2824 0. Linear Space 2 2 1296 0. Dual Space 2 2 1328 0. FE Space 2 2 1512 0. Tao 1 1 1944 0. TaoLineSearch 1 1 888 0. ======================================================================================================================== Average time to get PetscTime(): 0. 
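Aside on the single "Main Stage" in the table above: the phase summary notes that stages are set with PetscLogStagePush() and PetscLogStagePop(). A minimal sketch of how a run like this could attribute the TAO solve to its own stage in -log_view follows; the stage name "TAO solve" and the placement around the solve are illustrative assumptions, not taken from the code that produced this log.

#include <petsc.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  PetscLogStage  stage;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  /* Register a named stage so -log_view reports it separately from "Main Stage". */
  ierr = PetscLogStageRegister("TAO solve", &stage);CHKERRQ(ierr);
  ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
  /* ... set up and call TaoSolve() here; every event logged while the stage is
     pushed (KSPSolve, PCApply, MatMult, ...) gets attributed to "TAO solve". */
  ierr = PetscLogStagePop();CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

With a stage like that in place, -log_view prints a separate "TAO solve" section alongside Main Stage, so for example the 45% of runtime spent in KSPSolve above can be separated from the mesh and discretization setup cost.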
#PETSc Option Table entries:
-al 1
-am 0
-at 0.001
-bcloc 0,1,0,1,0,0,0,1,0,1,1,1,0,0,0,1,0,1,1,1,0,1,0,1,0,1,0,0,0,1,0,1,1,1,0,1,0.45,0.55,0.45,0.55,0.45,0.55
-bcnum 7
-bcval 0,0,0,0,0,0,1
-dim 3
-dm_refine 1
-dt 0.001
-edges 3,3
-floc 0.25,0.75,0.25,0.75,0.25,0.75
-fnum 0
-ftime 0,99
-fval 1
-ksp_atol 1.0e-12
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_rtol 1.0e-3
-ksp_type cg
-log_view
-lower 0,0
-mat_petscspace_order 0
-mesh datafiles/cube_with_hole2_mesh.dat
-mu 1
-nonneg 1
-numsteps 0
-options_left 0
-pc_type bjacobi
-petscpartitioner_type parmetis
-progress 0
-simplex 1
-solution_petscspace_order 1
-tao_converged_reason
-tao_gatol 1e-12
-tao_grtol 1e-7
-tao_monitor
-tao_type tron
-tao_view
-trans datafiles/cube_with_hole2_trans.dat
-upper 1,1
-vtuname figures/cube_with_hole_2
-vtuprint 1
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --download-chaco --download-ctetgen --download-exodusii --download-fblaslapack --download-hdf5 --download-hypre --download-metis --download-ml --download-mumps --download-netcdf --download-parmetis --download-scalapack --download-superlu_dist --download-triangle --with-debugging=0 --with-mpi-dir=/usr/local/opt/open-mpi --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 PETSC_ARCH=arch-darwin-c-opt
-----------------------------------------
Libraries compiled on Tue Mar 1 13:44:59 2016 on Justins-MacBook-Pro-2.local
Machine characteristics: Darwin-15.3.0-x86_64-i386-64bit
Using PETSc directory: /Users/justin/Software/petsc
Using PETSc arch: arch-darwin-c-opt
-----------------------------------------
Using C compiler: /usr/local/opt/open-mpi/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -O2 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /usr/local/opt/open-mpi/bin/mpif90 -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O2 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/Users/justin/Software/petsc/arch-darwin-c-opt/include -I/Users/justin/Software/petsc/include -I/Users/justin/Software/petsc/include -I/Users/justin/Software/petsc/arch-darwin-c-opt/include -I/opt/X11/include -I/usr/local/opt/open-mpi/include -I/usr/local/Cellar/open-mpi/1.10.2/include
-----------------------------------------
Using C linker: /usr/local/opt/open-mpi/bin/mpicc
Using Fortran linker: /usr/local/opt/open-mpi/bin/mpif90
Using libraries: -Wl,-rpath,/Users/justin/Software/petsc/arch-darwin-c-opt/lib -L/Users/justin/Software/petsc/arch-darwin-c-opt/lib -lpetsc -Wl,-rpath,/Users/justin/Software/petsc/arch-darwin-c-opt/lib -L/Users/justin/Software/petsc/arch-darwin-c-opt/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lHYPRE -Wl,-rpath,/usr/local/opt/libevent/lib -L/usr/local/opt/libevent/lib -Wl,-rpath,/usr/local/Cellar/open-mpi/1.10.2/lib -L/usr/local/Cellar/open-mpi/1.10.2/lib -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.2/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.2/lib/darwin -lclang_rt.osx -lmpi_cxx -lc++ -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin
-L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -lclang_rt.osx -lscalapack -lml -lclang_rt.osx -lmpi_cxx -lc++ -lclang_rt.osx -lflapack -lfblas -ltriangle -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lctetgen -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lgfortran -Wl,-rpath,/usr/local/Cellar/gcc/5.3.0/lib/gcc/5/gcc/x86_64-apple-darwin15.0.0/5.3.0 -L/usr/local/Cellar/gcc/5.3.0/lib/gcc/5/gcc/x86_64-apple-darwin15.0.0/5.3.0 -Wl,-rpath,/usr/local/Cellar/gcc/5.3.0/lib/gcc/5 -L/usr/local/Cellar/gcc/5.3.0/lib/gcc/5 -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpi_cxx -lc++ -lclang_rt.osx -Wl,-rpath,/usr/local/opt/libevent/lib -L/usr/local/opt/libevent/lib -Wl,-rpath,/usr/local/Cellar/open-mpi/1.10.2/lib -L/usr/local/Cellar/open-mpi/1.10.2/lib -ldl -lmpi -lSystem -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -lclang_rt.osx -ldl ----------------------------------------- -------------- next part -------------- TSTEP ANALYSIS TIME ITER FLOPS/s iter = 0, Function value: -8.83019e-08, Residual: 0.214312 0 KSP preconditioned resid norm 2.721099185886e+01 true resid norm 2.014138134175e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.869407744459e+00 true resid norm 5.822300398427e-02 ||r(i)||/||b|| 2.890715537151e-01 2 KSP preconditioned resid norm 2.242110432761e+00 true resid norm 2.437887036293e-02 ||r(i)||/||b|| 1.210387209759e-01 3 KSP preconditioned resid norm 1.159188546876e+00 true resid norm 1.299762353990e-02 ||r(i)||/||b|| 6.453193710683e-02 4 KSP preconditioned resid norm 7.177070910360e-01 true resid norm 8.470337152644e-03 ||r(i)||/||b|| 4.205440038558e-02 5 KSP preconditioned resid norm 5.036877443493e-01 true resid norm 6.119771032345e-03 ||r(i)||/||b|| 3.038406814562e-02 6 KSP preconditioned resid norm 3.607350191643e-01 true resid norm 4.645502549467e-03 ||r(i)||/||b|| 2.306446847237e-02 7 KSP preconditioned resid norm 2.882809204229e-01 true resid norm 3.925653521457e-03 ||r(i)||/||b|| 1.949048803977e-02 8 KSP preconditioned resid norm 2.316382761182e-01 true resid norm 3.297365650418e-03 ||r(i)||/||b|| 1.637109984896e-02 9 KSP preconditioned resid norm 1.988660674034e-01 true resid norm 2.872013000564e-03 ||r(i)||/||b|| 1.425926529980e-02 10 KSP preconditioned resid norm 1.647703203546e-01 true resid norm 2.407596118385e-03 ||r(i)||/||b|| 1.195348063539e-02 11 KSP preconditioned resid norm 1.392203719999e-01 true resid norm 2.011397078071e-03 ||r(i)||/||b|| 9.986390922959e-03 12 KSP preconditioned resid norm 1.194926809320e-01 true resid norm 1.735371271678e-03 ||r(i)||/||b|| 8.615949632416e-03 13 KSP preconditioned resid norm 1.021184952736e-01 true resid norm 1.495155793005e-03 ||r(i)||/||b|| 7.423303137139e-03 14 KSP preconditioned resid norm 8.644061423831e-02 true resid norm 1.263475286700e-03 ||r(i)||/||b|| 6.273031949807e-03 15 KSP preconditioned resid norm 6.765317790725e-02 true resid norm 9.953412375312e-04 ||r(i)||/||b|| 4.941772466559e-03 16 KSP preconditioned resid norm 4.917452989312e-02 true resid norm 7.233557843802e-04 ||r(i)||/||b|| 3.591391137016e-03 17 KSP preconditioned resid norm 3.423977253678e-02 true resid norm 4.971749871193e-04 ||r(i)||/||b|| 
2.468425470347e-03 18 KSP preconditioned resid norm 2.631060560074e-02 true resid norm 3.738334338599e-04 ||r(i)||/||b|| 1.856046651006e-03 19 KSP preconditioned resid norm 2.150018400123e-02 true resid norm 3.066976855480e-04 ||r(i)||/||b|| 1.522724188297e-03 20 KSP preconditioned resid norm 1.862939289182e-02 true resid norm 2.683025907357e-04 ||r(i)||/||b|| 1.332096275738e-03 21 KSP preconditioned resid norm 1.658811781560e-02 true resid norm 2.364991565965e-04 ||r(i)||/||b|| 1.174195317509e-03 22 KSP preconditioned resid norm 1.519476952113e-02 true resid norm 2.152496728220e-04 ||r(i)||/||b|| 1.068693696673e-03 23 KSP preconditioned resid norm 1.324677805794e-02 true resid norm 1.908271209032e-04 ||r(i)||/||b|| 9.474381010186e-04 24 KSP preconditioned resid norm 1.100601436745e-02 true resid norm 1.594847078641e-04 ||r(i)||/||b|| 7.918260677264e-04 25 KSP preconditioned resid norm 8.984870418657e-03 true resid norm 1.346950418144e-04 ||r(i)||/||b|| 6.687477861072e-04 26 KSP preconditioned resid norm 7.748887814752e-03 true resid norm 1.165325740675e-04 ||r(i)||/||b|| 5.785728996947e-04 27 KSP preconditioned resid norm 6.576319020263e-03 true resid norm 9.791184587025e-05 ||r(i)||/||b|| 4.861227947028e-04 28 KSP preconditioned resid norm 5.389117734804e-03 true resid norm 7.955305562044e-05 ||r(i)||/||b|| 3.949731861516e-04 29 KSP preconditioned resid norm 4.519456717206e-03 true resid norm 6.698616740437e-05 ||r(i)||/||b|| 3.325798080468e-04 30 KSP preconditioned resid norm 3.932240847963e-03 true resid norm 5.810472455106e-05 ||r(i)||/||b|| 2.884843078296e-04 31 KSP preconditioned resid norm 3.507422820066e-03 true resid norm 5.184611521829e-05 ||r(i)||/||b|| 2.574109210217e-04 32 KSP preconditioned resid norm 3.163649589191e-03 true resid norm 4.617289029310e-05 ||r(i)||/||b|| 2.292439108801e-04 33 KSP preconditioned resid norm 2.899170705334e-03 true resid norm 4.222325544314e-05 ||r(i)||/||b|| 2.096343578760e-04 34 KSP preconditioned resid norm 2.630786210763e-03 true resid norm 3.849978293468e-05 ||r(i)||/||b|| 1.911476789076e-04 35 KSP preconditioned resid norm 2.335319526879e-03 true resid norm 3.486349939426e-05 ||r(i)||/||b|| 1.730938846880e-04 36 KSP preconditioned resid norm 2.004563524318e-03 true resid norm 3.004754039312e-05 ||r(i)||/||b|| 1.491831165067e-04 37 KSP preconditioned resid norm 1.763532111737e-03 true resid norm 2.598370699528e-05 ||r(i)||/||b|| 1.290065788160e-04 38 KSP preconditioned resid norm 1.525341167053e-03 true resid norm 2.271971812894e-05 ||r(i)||/||b|| 1.128011914548e-04 39 KSP preconditioned resid norm 1.350753831797e-03 true resid norm 2.038984224621e-05 ||r(i)||/||b|| 1.012335842326e-04 40 KSP preconditioned resid norm 1.190975949540e-03 true resid norm 1.852980047591e-05 ||r(i)||/||b|| 9.199865769637e-05 41 KSP preconditioned resid norm 1.101875226346e-03 true resid norm 1.698851660538e-05 ||r(i)||/||b|| 8.434633313935e-05 42 KSP preconditioned resid norm 1.015609984078e-03 true resid norm 1.600315260378e-05 ||r(i)||/||b|| 7.945409667912e-05 43 KSP preconditioned resid norm 9.099467858429e-04 true resid norm 1.445158769249e-05 ||r(i)||/||b|| 7.175072775440e-05 44 KSP preconditioned resid norm 7.874717640691e-04 true resid norm 1.269696798266e-05 ||r(i)||/||b|| 6.303921149811e-05 45 KSP preconditioned resid norm 6.950984299784e-04 true resid norm 1.119490884077e-05 ||r(i)||/||b|| 5.558163390493e-05 46 KSP preconditioned resid norm 6.280641234357e-04 true resid norm 1.023401417248e-05 ||r(i)||/||b|| 5.081088530542e-05 47 KSP preconditioned 
resid norm 5.648091928453e-04 true resid norm 9.147793382121e-06 ||r(i)||/||b|| 4.541790469535e-05
  [KSP iterations 48-92 elided]
 93 KSP preconditioned resid norm 2.370740074494e-06 true resid norm 4.098097607653e-08 ||r(i)||/||b|| 2.034665616086e-07
Linear solve converged due to CONVERGED_RTOL iterations 93
iter = 1, Function value: -3.07176, Residual: 0.109561
  0 KSP preconditioned resid norm 1.208291666398e+01 true resid norm 8.128226097570e-02 ||r(i)||/||b|| 1.000000000000e+00
  [KSP iterations 1-66 elided]
 67 KSP preconditioned resid norm 1.077850813732e-06 true resid norm 1.579257148683e-08 ||r(i)||/||b|| 1.942929649994e-07
Linear solve converged due to CONVERGED_RTOL iterations 67
iter = 2, Function value: -4.2216, Residual: 0.0358744
  0 KSP preconditioned resid norm 3.078712726327e+00 true resid norm 2.511830325565e-02 ||r(i)||/||b|| 1.000000000000e+00
  [KSP iterations 1-67 elided]
 68 KSP preconditioned resid norm 3.002175112928e-07 true resid norm 4.343848652894e-09 ||r(i)||/||b|| 1.729355923719e-07
Linear solve converged due to CONVERGED_RTOL iterations 68
iter = 3, Function value: -4.30152, Residual: 0.0171005
  0 KSP preconditioned resid norm 1.319159362845e+00 true resid norm 1.183488520040e-02 ||r(i)||/||b|| 1.000000000000e+00
  [KSP iterations 1-72 elided]
 73 KSP preconditioned resid norm 1.174690852278e-07 true resid norm 1.732300323811e-09 ||r(i)||/||b|| 1.463723808451e-07
Linear solve converged due to CONVERGED_RTOL iterations 73
iter = 4, Function value: -4.32312, Residual: 0.0125018
  0 KSP preconditioned resid norm 7.709622512815e-01 true resid norm 7.928781623310e-03 ||r(i)||/||b|| 1.000000000000e+00
  [KSP iterations 1-76 elided]
 77 KSP preconditioned resid norm 6.403069302509e-08 true resid norm 1.012303943116e-09 ||r(i)||/||b|| 1.276745900202e-07
Linear solve converged due to CONVERGED_RTOL iterations 77
iter = 5, Function value: -4.33214, Residual: 0.00947882
  0 KSP preconditioned resid norm 5.329136712372e-01 true resid norm 5.927453203187e-03 ||r(i)||/||b|| 1.000000000000e+00
  [KSP iterations 1-83 elided]
 84 KSP preconditioned resid norm 4.792502348292e-08 true resid norm 7.515752811570e-10 ||r(i)||/||b|| 1.267956499012e-07
Linear solve converged due to CONVERGED_RTOL iterations 84
iter = 6, Function value: -4.33719, Residual: 0.0066053
  0 KSP preconditioned resid norm 3.474242904644e-01 true resid norm 3.961934578937e-03 ||r(i)||/||b|| 1.000000000000e+00
  [KSP iterations 1-93 elided]
 94 KSP preconditioned resid norm 3.280005901845e-08 true resid norm 5.349512202283e-10 ||r(i)||/||b|| 1.350227293182e-07
Linear solve converged due to CONVERGED_RTOL iterations 94
iter = 7, Function value: -4.33959, Residual: 0.00499036
  0 KSP preconditioned resid norm 2.227135824297e-01 true resid norm 2.885649999271e-03 ||r(i)||/||b|| 1.000000000000e+00
  [KSP iterations 1-62 elided]
 63 KSP preconditioned resid norm 1.199732665707e-05 true resid norm 2.068542038897e-07 ||r(i)||/||b|| 7.168374679602e-05
 64 KSP preconditioned resid norm 1.052713729472e-05 true resid norm 1.806385724770e-07 ||r(i)||/||b||
6.259891966199e-05 65 KSP preconditioned resid norm 9.195699405680e-06 true resid norm 1.577418743274e-07 ||r(i)||/||b|| 5.466424353864e-05 66 KSP preconditioned resid norm 7.941874399883e-06 true resid norm 1.343367535892e-07 ||r(i)||/||b|| 4.655337744464e-05 67 KSP preconditioned resid norm 6.760037690776e-06 true resid norm 1.150430592683e-07 ||r(i)||/||b|| 3.986729482001e-05 68 KSP preconditioned resid norm 5.783154898670e-06 true resid norm 9.994491239218e-08 ||r(i)||/||b|| 3.463514716526e-05 69 KSP preconditioned resid norm 4.904178405653e-06 true resid norm 8.412520188205e-08 ||r(i)||/||b|| 2.915294713610e-05 70 KSP preconditioned resid norm 4.174275408684e-06 true resid norm 7.058544912990e-08 ||r(i)||/||b|| 2.446084908001e-05 71 KSP preconditioned resid norm 3.506802725391e-06 true resid norm 6.017240561483e-08 ||r(i)||/||b|| 2.085228826435e-05 72 KSP preconditioned resid norm 2.986707695018e-06 true resid norm 5.110420521590e-08 ||r(i)||/||b|| 1.770977257422e-05 73 KSP preconditioned resid norm 2.495813341764e-06 true resid norm 4.249367177950e-08 ||r(i)||/||b|| 1.472585787959e-05 74 KSP preconditioned resid norm 2.108541319177e-06 true resid norm 3.612860716922e-08 ||r(i)||/||b|| 1.252009328170e-05 75 KSP preconditioned resid norm 1.758008764090e-06 true resid norm 3.006388307729e-08 ||r(i)||/||b|| 1.041840939992e-05 76 KSP preconditioned resid norm 1.474087087893e-06 true resid norm 2.517284760668e-08 ||r(i)||/||b|| 8.723458358788e-06 77 KSP preconditioned resid norm 1.256008159579e-06 true resid norm 2.153029469865e-08 ||r(i)||/||b|| 7.461159428237e-06 78 KSP preconditioned resid norm 1.109890008983e-06 true resid norm 1.904836843952e-08 ||r(i)||/||b|| 6.601066811404e-06 79 KSP preconditioned resid norm 1.004373742540e-06 true resid norm 1.728591642542e-08 ||r(i)||/||b|| 5.990302507161e-06 80 KSP preconditioned resid norm 8.939016234631e-07 true resid norm 1.554312330527e-08 ||r(i)||/||b|| 5.386350842685e-06 81 KSP preconditioned resid norm 7.837137763429e-07 true resid norm 1.373315671199e-08 ||r(i)||/||b|| 4.759120723393e-06 82 KSP preconditioned resid norm 6.598813923488e-07 true resid norm 1.163193336350e-08 ||r(i)||/||b|| 4.030957796835e-06 83 KSP preconditioned resid norm 5.499520915549e-07 true resid norm 9.772711918176e-09 ||r(i)||/||b|| 3.386658784206e-06 84 KSP preconditioned resid norm 4.549642134890e-07 true resid norm 8.147467633825e-09 ||r(i)||/||b|| 2.823442772298e-06 85 KSP preconditioned resid norm 3.656026410605e-07 true resid norm 6.447061640346e-09 ||r(i)||/||b|| 2.234180043309e-06 86 KSP preconditioned resid norm 2.888743586297e-07 true resid norm 5.094527991959e-09 ||r(i)||/||b|| 1.765469822482e-06 87 KSP preconditioned resid norm 2.342917985524e-07 true resid norm 4.151118296604e-09 ||r(i)||/||b|| 1.438538387418e-06 88 KSP preconditioned resid norm 1.928230563493e-07 true resid norm 3.351130063095e-09 ||r(i)||/||b|| 1.161308566161e-06 89 KSP preconditioned resid norm 1.588626070574e-07 true resid norm 2.721254426539e-09 ||r(i)||/||b|| 9.430299680233e-07 90 KSP preconditioned resid norm 1.334758351805e-07 true resid norm 2.222272016972e-09 ||r(i)||/||b|| 7.701114194492e-07 91 KSP preconditioned resid norm 1.128806917646e-07 true resid norm 1.890132315663e-09 ||r(i)||/||b|| 6.550109390051e-07 92 KSP preconditioned resid norm 9.818618430104e-08 true resid norm 1.660605073594e-09 ||r(i)||/||b|| 5.754700237429e-07 93 KSP preconditioned resid norm 8.545787727719e-08 true resid norm 1.457979708551e-09 ||r(i)||/||b|| 5.052517487981e-07 94 KSP preconditioned 
resid norm 7.344124358177e-08 true resid norm 1.236325833164e-09 ||r(i)||/||b|| 4.284392887136e-07 95 KSP preconditioned resid norm 6.199539193848e-08 true resid norm 1.039001967351e-09 ||r(i)||/||b|| 3.600582078956e-07 96 KSP preconditioned resid norm 5.269750043691e-08 true resid norm 8.926079644187e-10 ||r(i)||/||b|| 3.093264826449e-07 97 KSP preconditioned resid norm 4.579889093423e-08 true resid norm 7.600648357004e-10 ||r(i)||/||b|| 2.633946722203e-07 98 KSP preconditioned resid norm 4.120038839958e-08 true resid norm 6.868713709193e-10 ||r(i)||/||b|| 2.380300352062e-07 99 KSP preconditioned resid norm 3.732660316701e-08 true resid norm 6.213163136750e-10 ||r(i)||/||b|| 2.153124300702e-07 100 KSP preconditioned resid norm 3.327169042246e-08 true resid norm 5.546240223426e-10 ||r(i)||/||b|| 1.922007251339e-07 101 KSP preconditioned resid norm 2.968967801300e-08 true resid norm 4.802412088211e-10 ||r(i)||/||b|| 1.664239283844e-07 102 KSP preconditioned resid norm 2.627260749907e-08 true resid norm 4.286604038728e-10 ||r(i)||/||b|| 1.485489938077e-07 103 KSP preconditioned resid norm 2.296927279665e-08 true resid norm 3.757642817289e-10 ||r(i)||/||b|| 1.302182460880e-07 104 KSP preconditioned resid norm 1.972466663752e-08 true resid norm 3.239010639550e-10 ||r(i)||/||b|| 1.122454435004e-07 Linear solve converged due to CONVERGED_RTOL iterations 104 iter = 8, Function value: -4.34067, Residual: 0.00380523 0 KSP preconditioned resid norm 1.588507251562e-01 true resid norm 2.111728178921e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 7.953321562513e-02 true resid norm 1.509280088011e-03 ||r(i)||/||b|| 7.147132396474e-01 2 KSP preconditioned resid norm 6.263787688589e-02 true resid norm 1.161509246828e-03 ||r(i)||/||b|| 5.500278200682e-01 3 KSP preconditioned resid norm 4.797517613008e-02 true resid norm 9.091076526194e-04 ||r(i)||/||b|| 4.305041063969e-01 4 KSP preconditioned resid norm 4.078133877680e-02 true resid norm 7.396301283272e-04 ||r(i)||/||b|| 3.502487373660e-01 5 KSP preconditioned resid norm 3.569317547805e-02 true resid norm 6.635452203600e-04 ||r(i)||/||b|| 3.142190491104e-01 6 KSP preconditioned resid norm 3.213182113700e-02 true resid norm 5.633779989808e-04 ||r(i)||/||b|| 2.667852825967e-01 7 KSP preconditioned resid norm 2.893656657812e-02 true resid norm 5.128687192381e-04 ||r(i)||/||b|| 2.428668255496e-01 8 KSP preconditioned resid norm 2.576657132492e-02 true resid norm 4.425822360097e-04 ||r(i)||/||b|| 2.095829569485e-01 9 KSP preconditioned resid norm 2.410856748761e-02 true resid norm 4.019544016551e-04 ||r(i)||/||b|| 1.903438168167e-01 10 KSP preconditioned resid norm 2.214366495393e-02 true resid norm 3.600785860315e-04 ||r(i)||/||b|| 1.705137004023e-01 11 KSP preconditioned resid norm 2.162531124984e-02 true resid norm 3.447059031305e-04 ||r(i)||/||b|| 1.632340310516e-01 12 KSP preconditioned resid norm 2.044832510799e-02 true resid norm 3.308571495015e-04 ||r(i)||/||b|| 1.566760120001e-01 13 KSP preconditioned resid norm 1.922530650362e-02 true resid norm 3.098773544575e-04 ||r(i)||/||b|| 1.467411182702e-01 14 KSP preconditioned resid norm 1.677824479507e-02 true resid norm 2.760250196613e-04 ||r(i)||/||b|| 1.307104874655e-01 15 KSP preconditioned resid norm 1.469648964620e-02 true resid norm 2.394716682259e-04 ||r(i)||/||b|| 1.134008015882e-01 16 KSP preconditioned resid norm 1.244840630131e-02 true resid norm 2.046752016176e-04 ||r(i)||/||b|| 9.692308113359e-02 17 KSP preconditioned resid norm 1.045343398778e-02 true resid norm 
1.703556071147e-04 ||r(i)||/||b|| 8.067118145941e-02 18 KSP preconditioned resid norm 8.828707796998e-03 true resid norm 1.449101380420e-04 ||r(i)||/||b|| 6.862158656996e-02 19 KSP preconditioned resid norm 7.479543899877e-03 true resid norm 1.223421323421e-04 ||r(i)||/||b|| 5.793460236183e-02 20 KSP preconditioned resid norm 6.391026564485e-03 true resid norm 1.046063960570e-04 ||r(i)||/||b|| 4.953591901702e-02 21 KSP preconditioned resid norm 5.349115329151e-03 true resid norm 8.462600840691e-05 ||r(i)||/||b|| 4.007429045634e-02 22 KSP preconditioned resid norm 4.644050055069e-03 true resid norm 7.367557141609e-05 ||r(i)||/||b|| 3.488875706235e-02 23 KSP preconditioned resid norm 4.022144285696e-03 true resid norm 6.646220009596e-05 ||r(i)||/||b|| 3.147289540358e-02 24 KSP preconditioned resid norm 3.506284618016e-03 true resid norm 5.850752376116e-05 ||r(i)||/||b|| 2.770599187205e-02 25 KSP preconditioned resid norm 2.988937562339e-03 true resid norm 4.973734144972e-05 ||r(i)||/||b|| 2.355290891422e-02 26 KSP preconditioned resid norm 2.569704370228e-03 true resid norm 4.304630498516e-05 ||r(i)||/||b|| 2.038439673005e-02 27 KSP preconditioned resid norm 2.145010783349e-03 true resid norm 3.677690988038e-05 ||r(i)||/||b|| 1.741555103895e-02 28 KSP preconditioned resid norm 1.785528086714e-03 true resid norm 3.023375067673e-05 ||r(i)||/||b|| 1.431706550991e-02 29 KSP preconditioned resid norm 1.512325184162e-03 true resid norm 2.572166934430e-05 ||r(i)||/||b|| 1.218038836677e-02 30 KSP preconditioned resid norm 1.248958171594e-03 true resid norm 2.173514928823e-05 ||r(i)||/||b|| 1.029258855623e-02 31 KSP preconditioned resid norm 1.069824444697e-03 true resid norm 1.857668170137e-05 ||r(i)||/||b|| 8.796909510800e-03 32 KSP preconditioned resid norm 9.502019917694e-04 true resid norm 1.616879145518e-05 ||r(i)||/||b|| 7.656663209108e-03 33 KSP preconditioned resid norm 8.621479612558e-04 true resid norm 1.461998630183e-05 ||r(i)||/||b|| 6.923233040962e-03 34 KSP preconditioned resid norm 7.709955045496e-04 true resid norm 1.306213109817e-05 ||r(i)||/||b|| 6.185517259539e-03 35 KSP preconditioned resid norm 7.021836605070e-04 true resid norm 1.200828372480e-05 ||r(i)||/||b|| 5.686472266967e-03 36 KSP preconditioned resid norm 6.153077107959e-04 true resid norm 1.065053857757e-05 ||r(i)||/||b|| 5.043517761367e-03 37 KSP preconditioned resid norm 5.484280564042e-04 true resid norm 9.311901688920e-06 ||r(i)||/||b|| 4.409611891280e-03 38 KSP preconditioned resid norm 4.921351418618e-04 true resid norm 8.177974779226e-06 ||r(i)||/||b|| 3.872645570987e-03 39 KSP preconditioned resid norm 4.338523874473e-04 true resid norm 7.293428348227e-06 ||r(i)||/||b|| 3.453772327817e-03 40 KSP preconditioned resid norm 3.697803076066e-04 true resid norm 6.253974170283e-06 ||r(i)||/||b|| 2.961543172417e-03 41 KSP preconditioned resid norm 2.997145770065e-04 true resid norm 5.139219246110e-06 ||r(i)||/||b|| 2.433655665256e-03 42 KSP preconditioned resid norm 2.429069180563e-04 true resid norm 4.131388551469e-06 ||r(i)||/||b|| 1.956401677407e-03 43 KSP preconditioned resid norm 2.047801571591e-04 true resid norm 3.505361341977e-06 ||r(i)||/||b|| 1.659949124593e-03 44 KSP preconditioned resid norm 1.771704774526e-04 true resid norm 3.016093770584e-06 ||r(i)||/||b|| 1.428258523370e-03 45 KSP preconditioned resid norm 1.512055860859e-04 true resid norm 2.624005665840e-06 ||r(i)||/||b|| 1.242586849971e-03 46 KSP preconditioned resid norm 1.295883175578e-04 true resid norm 2.265946239562e-06 ||r(i)||/||b|| 
1.073029314180e-03 47 KSP preconditioned resid norm 1.112141885213e-04 true resid norm 1.892173706291e-06 ||r(i)||/||b|| 8.960309026413e-04 48 KSP preconditioned resid norm 9.649490813502e-05 true resid norm 1.651517058477e-06 ||r(i)||/||b|| 7.820689589514e-04 49 KSP preconditioned resid norm 8.381924122370e-05 true resid norm 1.461684078350e-06 ||r(i)||/||b|| 6.921743493983e-04 50 KSP preconditioned resid norm 7.449214730418e-05 true resid norm 1.281845950237e-06 ||r(i)||/||b|| 6.070127599908e-04 51 KSP preconditioned resid norm 6.690662664963e-05 true resid norm 1.151132125641e-06 ||r(i)||/||b|| 5.451137779621e-04 52 KSP preconditioned resid norm 6.123200510976e-05 true resid norm 1.060727747638e-06 ||r(i)||/||b|| 5.023031648799e-04 53 KSP preconditioned resid norm 5.514153429767e-05 true resid norm 9.627845603338e-07 ||r(i)||/||b|| 4.559225803512e-04 54 KSP preconditioned resid norm 4.771748901183e-05 true resid norm 8.409450418229e-07 ||r(i)||/||b|| 3.982259886558e-04 55 KSP preconditioned resid norm 4.050283573750e-05 true resid norm 7.125730455587e-07 ||r(i)||/||b|| 3.374359695872e-04 56 KSP preconditioned resid norm 3.453616493048e-05 true resid norm 5.972918513561e-07 ||r(i)||/||b|| 2.828450447924e-04 57 KSP preconditioned resid norm 3.063399957005e-05 true resid norm 5.334441359243e-07 ||r(i)||/||b|| 2.526102276084e-04 58 KSP preconditioned resid norm 2.671501219222e-05 true resid norm 4.714875800874e-07 ||r(i)||/||b|| 2.232709610990e-04 59 KSP preconditioned resid norm 2.379297875336e-05 true resid norm 4.174954334313e-07 ||r(i)||/||b|| 1.977032070693e-04 60 KSP preconditioned resid norm 2.076644500548e-05 true resid norm 3.654820187065e-07 ||r(i)||/||b|| 1.730724732259e-04 61 KSP preconditioned resid norm 1.800737091319e-05 true resid norm 3.228282302102e-07 ||r(i)||/||b|| 1.528739510287e-04 62 KSP preconditioned resid norm 1.551100225282e-05 true resid norm 2.798303126243e-07 ||r(i)||/||b|| 1.325124679481e-04 63 KSP preconditioned resid norm 1.345636889946e-05 true resid norm 2.363045607918e-07 ||r(i)||/||b|| 1.119010311794e-04 64 KSP preconditioned resid norm 1.169909963508e-05 true resid norm 2.050181818912e-07 ||r(i)||/||b|| 9.708549800001e-05 65 KSP preconditioned resid norm 1.034217686115e-05 true resid norm 1.824494061657e-07 ||r(i)||/||b|| 8.639814914956e-05 66 KSP preconditioned resid norm 9.021773257524e-06 true resid norm 1.583818804439e-07 ||r(i)||/||b|| 7.500107354009e-05 67 KSP preconditioned resid norm 7.734927322885e-06 true resid norm 1.341006807559e-07 ||r(i)||/||b|| 6.350281352236e-05 68 KSP preconditioned resid norm 6.633012470652e-06 true resid norm 1.162919778510e-07 ||r(i)||/||b|| 5.506957714152e-05 69 KSP preconditioned resid norm 5.513539642110e-06 true resid norm 9.566376091826e-08 ||r(i)||/||b|| 4.530117174794e-05 70 KSP preconditioned resid norm 4.596598866565e-06 true resid norm 8.005912699390e-08 ||r(i)||/||b|| 3.791166296546e-05 71 KSP preconditioned resid norm 3.870779671132e-06 true resid norm 6.711982173833e-08 ||r(i)||/||b|| 3.178430936724e-05 72 KSP preconditioned resid norm 3.445215213827e-06 true resid norm 5.949393991083e-08 ||r(i)||/||b|| 2.817310509217e-05 73 KSP preconditioned resid norm 3.127640301560e-06 true resid norm 5.365042394003e-08 ||r(i)||/||b|| 2.540593267428e-05 74 KSP preconditioned resid norm 2.850002434094e-06 true resid norm 4.905466565130e-08 ||r(i)||/||b|| 2.322963066031e-05 75 KSP preconditioned resid norm 2.529735201581e-06 true resid norm 4.429824170832e-08 ||r(i)||/||b|| 2.097724610132e-05 76 KSP preconditioned 
resid norm 2.269085820293e-06 true resid norm 3.984449123100e-08 ||r(i)||/||b|| 1.886819128936e-05 77 KSP preconditioned resid norm 1.980780237612e-06 true resid norm 3.499711230373e-08 ||r(i)||/||b|| 1.657273537999e-05 78 KSP preconditioned resid norm 1.675477193048e-06 true resid norm 2.981088416611e-08 ||r(i)||/||b|| 1.411681885182e-05 79 KSP preconditioned resid norm 1.425922466607e-06 true resid norm 2.508378870433e-08 ||r(i)||/||b|| 1.187832267179e-05 80 KSP preconditioned resid norm 1.258667641268e-06 true resid norm 2.183907172141e-08 ||r(i)||/||b|| 1.034180058750e-05 81 KSP preconditioned resid norm 1.130369036064e-06 true resid norm 1.979509163279e-08 ||r(i)||/||b|| 9.373882410808e-06 82 KSP preconditioned resid norm 1.005496104308e-06 true resid norm 1.765174889043e-08 ||r(i)||/||b|| 8.358911467219e-06 83 KSP preconditioned resid norm 8.838704225493e-07 true resid norm 1.538145008911e-08 ||r(i)||/||b|| 7.283821015718e-06 84 KSP preconditioned resid norm 7.684695151055e-07 true resid norm 1.333409523241e-08 ||r(i)||/||b|| 6.314304731790e-06 85 KSP preconditioned resid norm 6.777464803531e-07 true resid norm 1.175894469434e-08 ||r(i)||/||b|| 5.568398817478e-06 86 KSP preconditioned resid norm 5.658383068941e-07 true resid norm 9.874887575618e-09 ||r(i)||/||b|| 4.676211490754e-06 87 KSP preconditioned resid norm 4.569313887945e-07 true resid norm 7.949727247625e-09 ||r(i)||/||b|| 3.764559912104e-06 88 KSP preconditioned resid norm 3.501576211721e-07 true resid norm 6.122302489929e-09 ||r(i)||/||b|| 2.899190601822e-06 89 KSP preconditioned resid norm 2.715916520243e-07 true resid norm 4.698283006594e-09 ||r(i)||/||b|| 2.224852163026e-06 90 KSP preconditioned resid norm 2.181564811465e-07 true resid norm 3.746364473294e-09 ||r(i)||/||b|| 1.774075144088e-06 91 KSP preconditioned resid norm 1.790542677206e-07 true resid norm 3.052307341591e-09 ||r(i)||/||b|| 1.445407307654e-06 92 KSP preconditioned resid norm 1.507096650456e-07 true resid norm 2.529245611205e-09 ||r(i)||/||b|| 1.197713624534e-06 93 KSP preconditioned resid norm 1.314504687332e-07 true resid norm 2.207539157863e-09 ||r(i)||/||b|| 1.045370886224e-06 94 KSP preconditioned resid norm 1.193297434357e-07 true resid norm 2.004281819945e-09 ||r(i)||/||b|| 9.491192284839e-07 95 KSP preconditioned resid norm 1.070377720016e-07 true resid norm 1.773980996663e-09 ||r(i)||/||b|| 8.400612419587e-07 96 KSP preconditioned resid norm 9.767336936506e-08 true resid norm 1.608407816132e-09 ||r(i)||/||b|| 7.616547584995e-07 97 KSP preconditioned resid norm 8.582518964132e-08 true resid norm 1.404431974568e-09 ||r(i)||/||b|| 6.650628563785e-07 98 KSP preconditioned resid norm 7.507408194135e-08 true resid norm 1.212440197703e-09 ||r(i)||/||b|| 5.741459577067e-07 99 KSP preconditioned resid norm 6.561206695933e-08 true resid norm 1.071605228353e-09 ||r(i)||/||b|| 5.074541501363e-07 100 KSP preconditioned resid norm 5.866626124681e-08 true resid norm 9.762712731695e-10 ||r(i)||/||b|| 4.623091565073e-07 101 KSP preconditioned resid norm 5.168477931396e-08 true resid norm 8.833901788827e-10 ||r(i)||/||b|| 4.183257048425e-07 102 KSP preconditioned resid norm 4.512424946802e-08 true resid norm 7.900268191801e-10 ||r(i)||/||b|| 3.741138784177e-07 103 KSP preconditioned resid norm 3.929560567886e-08 true resid norm 6.791739204818e-10 ||r(i)||/||b|| 3.216199543395e-07 104 KSP preconditioned resid norm 3.389650408012e-08 true resid norm 5.901070254444e-10 ||r(i)||/||b|| 2.794427006917e-07 105 KSP preconditioned resid norm 2.984888775028e-08 true 
resid norm 5.242238362569e-10 ||r(i)||/||b|| 2.482439934693e-07 106 KSP preconditioned resid norm 2.549604905992e-08 true resid norm 4.580403697316e-10 ||r(i)||/||b|| 2.169030911761e-07 107 KSP preconditioned resid norm 2.160174543398e-08 true resid norm 3.811938720994e-10 ||r(i)||/||b|| 1.805127553369e-07 108 KSP preconditioned resid norm 1.844141666379e-08 true resid norm 3.255264385025e-10 ||r(i)||/||b|| 1.541516762204e-07 109 KSP preconditioned resid norm 1.591859631783e-08 true resid norm 2.797982293814e-10 ||r(i)||/||b|| 1.324972750633e-07 110 KSP preconditioned resid norm 1.357228851155e-08 true resid norm 2.415993129844e-10 ||r(i)||/||b|| 1.144083388175e-07 Linear solve converged due to CONVERGED_RTOL iterations 110 iter = 9, Function value: -4.34125, Residual: 0.00267839 0 KSP preconditioned resid norm 1.081543834783e-01 true resid norm 1.477972613198e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.158676419371e-02 true resid norm 1.041365232540e-03 ||r(i)||/||b|| 7.045903443955e-01 2 KSP preconditioned resid norm 4.048407937564e-02 true resid norm 7.632550635179e-04 ||r(i)||/||b|| 5.164203021775e-01 3 KSP preconditioned resid norm 3.150503631214e-02 true resid norm 6.236548295545e-04 ||r(i)||/||b|| 4.219664315734e-01 4 KSP preconditioned resid norm 2.635487697911e-02 true resid norm 4.955439162996e-04 ||r(i)||/||b|| 3.352862643559e-01 5 KSP preconditioned resid norm 2.268943095048e-02 true resid norm 4.296969003616e-04 ||r(i)||/||b|| 2.907340071964e-01 6 KSP preconditioned resid norm 2.007928121645e-02 true resid norm 3.696014510484e-04 ||r(i)||/||b|| 2.500732745302e-01 7 KSP preconditioned resid norm 1.832652876598e-02 true resid norm 3.429545737554e-04 ||r(i)||/||b|| 2.320439301059e-01 8 KSP preconditioned resid norm 1.628271400443e-02 true resid norm 2.934187394538e-04 ||r(i)||/||b|| 1.985278596055e-01 9 KSP preconditioned resid norm 1.483044705568e-02 true resid norm 2.716223074898e-04 ||r(i)||/||b|| 1.837803387317e-01 10 KSP preconditioned resid norm 1.305239383933e-02 true resid norm 2.230237428877e-04 ||r(i)||/||b|| 1.508984272754e-01 11 KSP preconditioned resid norm 1.231274809272e-02 true resid norm 2.106550806509e-04 ||r(i)||/||b|| 1.425297591916e-01 12 KSP preconditioned resid norm 1.170068323625e-02 true resid norm 1.931431760451e-04 ||r(i)||/||b|| 1.306811603411e-01 13 KSP preconditioned resid norm 1.136763209043e-02 true resid norm 1.865757376545e-04 ||r(i)||/||b|| 1.262376149520e-01 14 KSP preconditioned resid norm 1.088192976918e-02 true resid norm 1.742464978768e-04 ||r(i)||/||b|| 1.178956202035e-01 15 KSP preconditioned resid norm 9.994767525053e-03 true resid norm 1.636374241989e-04 ||r(i)||/||b|| 1.107174941793e-01 16 KSP preconditioned resid norm 8.857575088935e-03 true resid norm 1.439452517996e-04 ||r(i)||/||b|| 9.739372063740e-02 17 KSP preconditioned resid norm 7.794800913321e-03 true resid norm 1.269564929985e-04 ||r(i)||/||b|| 8.589908355861e-02 18 KSP preconditioned resid norm 6.734215620401e-03 true resid norm 1.086981502019e-04 ||r(i)||/||b|| 7.354544274451e-02 19 KSP preconditioned resid norm 5.574236175983e-03 true resid norm 9.216488479704e-05 ||r(i)||/||b|| 6.235899364711e-02 20 KSP preconditioned resid norm 4.666091899591e-03 true resid norm 7.788668729950e-05 ||r(i)||/||b|| 5.269832918687e-02 21 KSP preconditioned resid norm 3.797273912040e-03 true resid norm 6.501504679952e-05 ||r(i)||/||b|| 4.398934474085e-02 22 KSP preconditioned resid norm 3.189234958478e-03 true resid norm 5.395905419811e-05 ||r(i)||/||b|| 
3.650883224510e-02 23 KSP preconditioned resid norm 2.736646935143e-03 true resid norm 4.723967969475e-05 ||r(i)||/||b|| 3.196248649866e-02 24 KSP preconditioned resid norm 2.438173046572e-03 true resid norm 4.309696086602e-05 ||r(i)||/||b|| 2.915951248431e-02 25 KSP preconditioned resid norm 2.139061276136e-03 true resid norm 3.781291280434e-05 ||r(i)||/||b|| 2.558431223060e-02 26 KSP preconditioned resid norm 1.870156209729e-03 true resid norm 3.269555309291e-05 ||r(i)||/||b|| 2.212189373534e-02 27 KSP preconditioned resid norm 1.598877552064e-03 true resid norm 2.789183834255e-05 ||r(i)||/||b|| 1.887168821227e-02 28 KSP preconditioned resid norm 1.364921502857e-03 true resid norm 2.376398925051e-05 ||r(i)||/||b|| 1.607877509928e-02 29 KSP preconditioned resid norm 1.183403555243e-03 true resid norm 2.043221156512e-05 ||r(i)||/||b|| 1.382448590905e-02 30 KSP preconditioned resid norm 1.004543574347e-03 true resid norm 1.747397295440e-05 ||r(i)||/||b|| 1.182293419943e-02 31 KSP preconditioned resid norm 8.404405674624e-04 true resid norm 1.488346633083e-05 ||r(i)||/||b|| 1.007019088034e-02 32 KSP preconditioned resid norm 7.029647994169e-04 true resid norm 1.216739995201e-05 ||r(i)||/||b|| 8.232493513987e-03 33 KSP preconditioned resid norm 6.145684797228e-04 true resid norm 1.076821570552e-05 ||r(i)||/||b|| 7.285801921743e-03 34 KSP preconditioned resid norm 5.440586340289e-04 true resid norm 9.548572028889e-06 ||r(i)||/||b|| 6.460587932158e-03 35 KSP preconditioned resid norm 5.017104780518e-04 true resid norm 8.926449214853e-06 ||r(i)||/||b|| 6.039658066150e-03 36 KSP preconditioned resid norm 4.644988680483e-04 true resid norm 8.299453328532e-06 ||r(i)||/||b|| 5.615431067137e-03 37 KSP preconditioned resid norm 4.236634606615e-04 true resid norm 7.747007850385e-06 ||r(i)||/||b|| 5.241645062435e-03 38 KSP preconditioned resid norm 3.792999202756e-04 true resid norm 6.823766852937e-06 ||r(i)||/||b|| 4.616977873609e-03 39 KSP preconditioned resid norm 3.334578769227e-04 true resid norm 6.085541623008e-06 ||r(i)||/||b|| 4.117492820005e-03 40 KSP preconditioned resid norm 2.886588518905e-04 true resid norm 5.258660355506e-06 ||r(i)||/||b|| 3.558022867641e-03 41 KSP preconditioned resid norm 2.480218972111e-04 true resid norm 4.496628933851e-06 ||r(i)||/||b|| 3.042430484635e-03 42 KSP preconditioned resid norm 2.096378851115e-04 true resid norm 3.747274646876e-06 ||r(i)||/||b|| 2.535415482948e-03 43 KSP preconditioned resid norm 1.714965196343e-04 true resid norm 3.058118088322e-06 ||r(i)||/||b|| 2.069130416229e-03 44 KSP preconditioned resid norm 1.448132438149e-04 true resid norm 2.549682779469e-06 ||r(i)||/||b|| 1.725121803138e-03 45 KSP preconditioned resid norm 1.238080426885e-04 true resid norm 2.192113918259e-06 ||r(i)||/||b|| 1.483189809259e-03 46 KSP preconditioned resid norm 1.063858194887e-04 true resid norm 1.882557958529e-06 ||r(i)||/||b|| 1.273743465690e-03 47 KSP preconditioned resid norm 9.201549772619e-05 true resid norm 1.618940446703e-06 ||r(i)||/||b|| 1.095379191905e-03 48 KSP preconditioned resid norm 8.023716994284e-05 true resid norm 1.402171502794e-06 ||r(i)||/||b|| 9.487127774034e-04 49 KSP preconditioned resid norm 6.823284586995e-05 true resid norm 1.176551430607e-06 ||r(i)||/||b|| 7.960576671718e-04 50 KSP preconditioned resid norm 5.784494765483e-05 true resid norm 1.003818897603e-06 ||r(i)||/||b|| 6.791863994223e-04 51 KSP preconditioned resid norm 4.901558190373e-05 true resid norm 8.633031151298e-07 ||r(i)||/||b|| 5.841130663860e-04 52 KSP preconditioned 
resid norm 4.336664174647e-05 true resid norm 7.631570916703e-07 ||r(i)||/||b|| 5.163540141782e-04 53 KSP preconditioned resid norm 3.913644406353e-05 true resid norm 6.884579008835e-07 ||r(i)||/||b|| 4.658123531760e-04 54 KSP preconditioned resid norm 3.567370300610e-05 true resid norm 6.250236885752e-07 ||r(i)||/||b|| 4.228926050413e-04 55 KSP preconditioned resid norm 3.264510131963e-05 true resid norm 5.691089952861e-07 ||r(i)||/||b|| 3.850605824519e-04 56 KSP preconditioned resid norm 2.951667615693e-05 true resid norm 5.135990688242e-07 ||r(i)||/||b|| 3.475024261192e-04 57 KSP preconditioned resid norm 2.662820877241e-05 true resid norm 4.634279580650e-07 ||r(i)||/||b|| 3.135565259644e-04 58 KSP preconditioned resid norm 2.375834847224e-05 true resid norm 4.222177180146e-07 ||r(i)||/||b|| 2.856735735454e-04 59 KSP preconditioned resid norm 2.079068048340e-05 true resid norm 3.746720424867e-07 ||r(i)||/||b|| 2.535040494938e-04 60 KSP preconditioned resid norm 1.796300059473e-05 true resid norm 3.192204594145e-07 ||r(i)||/||b|| 2.159853684458e-04 61 KSP preconditioned resid norm 1.581549509158e-05 true resid norm 2.827100230472e-07 ||r(i)||/||b|| 1.912823150595e-04 62 KSP preconditioned resid norm 1.341762130452e-05 true resid norm 2.418104843897e-07 ||r(i)||/||b|| 1.636095839871e-04 63 KSP preconditioned resid norm 1.118479118751e-05 true resid norm 2.037474255958e-07 ||r(i)||/||b|| 1.378560223487e-04 64 KSP preconditioned resid norm 9.587749298800e-06 true resid norm 1.734559424939e-07 ||r(i)||/||b|| 1.173607284363e-04 65 KSP preconditioned resid norm 8.433310696123e-06 true resid norm 1.523527933733e-07 ||r(i)||/||b|| 1.030822844840e-04 66 KSP preconditioned resid norm 7.342661121942e-06 true resid norm 1.351702431508e-07 ||r(i)||/||b|| 9.145652764043e-05 67 KSP preconditioned resid norm 6.359867858935e-06 true resid norm 1.165389319496e-07 ||r(i)||/||b|| 7.885053546250e-05 68 KSP preconditioned resid norm 5.601874423412e-06 true resid norm 1.027838036651e-07 ||r(i)||/||b|| 6.954378095187e-05 69 KSP preconditioned resid norm 4.923277660163e-06 true resid norm 8.954876250645e-08 ||r(i)||/||b|| 6.058891870308e-05 70 KSP preconditioned resid norm 4.250408582634e-06 true resid norm 7.591520685666e-08 ||r(i)||/||b|| 5.136442054388e-05 71 KSP preconditioned resid norm 3.546974381149e-06 true resid norm 6.284379761388e-08 ||r(i)||/||b|| 4.252027206234e-05 72 KSP preconditioned resid norm 2.913555663022e-06 true resid norm 5.251641796314e-08 ||r(i)||/||b|| 3.553274092778e-05 73 KSP preconditioned resid norm 2.381023743041e-06 true resid norm 4.194224620436e-08 ||r(i)||/||b|| 2.837822963011e-05 74 KSP preconditioned resid norm 1.950803671217e-06 true resid norm 3.424426686208e-08 ||r(i)||/||b|| 2.316975738000e-05 75 KSP preconditioned resid norm 1.668138493955e-06 true resid norm 2.991617170512e-08 ||r(i)||/||b|| 2.024135727412e-05 76 KSP preconditioned resid norm 1.473361105516e-06 true resid norm 2.645237105088e-08 ||r(i)||/||b|| 1.789774100999e-05 77 KSP preconditioned resid norm 1.340493255510e-06 true resid norm 2.357237673706e-08 ||r(i)||/||b|| 1.594912958911e-05 78 KSP preconditioned resid norm 1.249479759684e-06 true resid norm 2.241891611866e-08 ||r(i)||/||b|| 1.516869522375e-05 79 KSP preconditioned resid norm 1.169991337332e-06 true resid norm 2.120260652385e-08 ||r(i)||/||b|| 1.434573708235e-05 80 KSP preconditioned resid norm 1.067245812041e-06 true resid norm 1.944819511683e-08 ||r(i)||/||b|| 1.315869789681e-05 81 KSP preconditioned resid norm 9.792861372120e-07 true resid norm 
1.782976182628e-08 ||r(i)||/||b|| 1.206366184803e-05 82 KSP preconditioned resid norm 8.661918751873e-07 true resid norm 1.568240775251e-08 ||r(i)||/||b|| 1.061075666251e-05 83 KSP preconditioned resid norm 7.599883189520e-07 true resid norm 1.379557876073e-08 ||r(i)||/||b|| 9.334123404948e-06 84 KSP preconditioned resid norm 6.570580774931e-07 true resid norm 1.200661191232e-08 ||r(i)||/||b|| 8.123703920559e-06 85 KSP preconditioned resid norm 5.773343245647e-07 true resid norm 1.052441444440e-08 ||r(i)||/||b|| 7.120845373197e-06 86 KSP preconditioned resid norm 5.046170061675e-07 true resid norm 9.256657786196e-09 ||r(i)||/||b|| 6.263078018858e-06 87 KSP preconditioned resid norm 4.486370048462e-07 true resid norm 8.158953390153e-09 ||r(i)||/||b|| 5.520368454255e-06 88 KSP preconditioned resid norm 3.911602305186e-07 true resid norm 7.128137784774e-09 ||r(i)||/||b|| 4.822916014221e-06 89 KSP preconditioned resid norm 3.356311296245e-07 true resid norm 6.059808469765e-09 ||r(i)||/||b|| 4.100081703579e-06 90 KSP preconditioned resid norm 2.818751123447e-07 true resid norm 5.102233707991e-09 ||r(i)||/||b|| 3.452184203163e-06 91 KSP preconditioned resid norm 2.310282350805e-07 true resid norm 4.200697644267e-09 ||r(i)||/||b|| 2.842202627271e-06 92 KSP preconditioned resid norm 1.860945172934e-07 true resid norm 3.346207897968e-09 ||r(i)||/||b|| 2.264052708478e-06 93 KSP preconditioned resid norm 1.510293674315e-07 true resid norm 2.707775495329e-09 ||r(i)||/||b|| 1.832087733662e-06 94 KSP preconditioned resid norm 1.284508105354e-07 true resid norm 2.303378932188e-09 ||r(i)||/||b|| 1.558471998479e-06 95 KSP preconditioned resid norm 1.090744029590e-07 true resid norm 1.937769834589e-09 ||r(i)||/||b|| 1.311099960367e-06 96 KSP preconditioned resid norm 9.328826987918e-08 true resid norm 1.681401042116e-09 ||r(i)||/||b|| 1.137640188391e-06 97 KSP preconditioned resid norm 8.119952416938e-08 true resid norm 1.435109635651e-09 ||r(i)||/||b|| 9.709988012193e-07 98 KSP preconditioned resid norm 7.361210991420e-08 true resid norm 1.326186480808e-09 ||r(i)||/||b|| 8.973011197670e-07 99 KSP preconditioned resid norm 6.715599053986e-08 true resid norm 1.222831297136e-09 ||r(i)||/||b|| 8.273707416605e-07 100 KSP preconditioned resid norm 5.985184721629e-08 true resid norm 1.080505108603e-09 ||r(i)||/||b|| 7.310724833154e-07 101 KSP preconditioned resid norm 5.368410437041e-08 true resid norm 9.563525961387e-10 ||r(i)||/||b|| 6.470705800625e-07 102 KSP preconditioned resid norm 4.742437366901e-08 true resid norm 8.499391888744e-10 ||r(i)||/||b|| 5.750710001555e-07 103 KSP preconditioned resid norm 4.145550886245e-08 true resid norm 7.596357303662e-10 ||r(i)||/||b|| 5.139714522330e-07 104 KSP preconditioned resid norm 3.669613473790e-08 true resid norm 6.621694173198e-10 ||r(i)||/||b|| 4.480254988534e-07 105 KSP preconditioned resid norm 3.280067707230e-08 true resid norm 5.970256353207e-10 ||r(i)||/||b|| 4.039490515516e-07 106 KSP preconditioned resid norm 2.851444743678e-08 true resid norm 5.299340186452e-10 ||r(i)||/||b|| 3.585546943921e-07 107 KSP preconditioned resid norm 2.416182353798e-08 true resid norm 4.549650485466e-10 ||r(i)||/||b|| 3.078305000267e-07 108 KSP preconditioned resid norm 2.042910554018e-08 true resid norm 3.784468209220e-10 ||r(i)||/||b|| 2.560580741094e-07 109 KSP preconditioned resid norm 1.791052663672e-08 true resid norm 3.277752869640e-10 ||r(i)||/||b|| 2.217735863554e-07 110 KSP preconditioned resid norm 1.586172489975e-08 true resid norm 2.915951650748e-10 ||r(i)||/||b|| 
1.972940245786e-07 111 KSP preconditioned resid norm 1.394920090918e-08 true resid norm 2.563432756195e-10 ||r(i)||/||b|| 1.734425072091e-07 112 KSP preconditioned resid norm 1.219105526920e-08 true resid norm 2.256543661631e-10 ||r(i)||/||b|| 1.526783136223e-07 113 KSP preconditioned resid norm 1.082357865624e-08 true resid norm 1.996271642138e-10 ||r(i)||/||b|| 1.350682430994e-07 114 KSP preconditioned resid norm 9.695789377441e-09 true resid norm 1.763494226474e-10 ||r(i)||/||b|| 1.193184644104e-07 Linear solve converged due to CONVERGED_RTOL iterations 114 iter = 10, Function value: -4.34153, Residual: 0.00197318 0 KSP preconditioned resid norm 7.874946444818e-02 true resid norm 1.082620300776e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.836353131299e-02 true resid norm 7.708347834842e-04 ||r(i)||/||b|| 7.120084326256e-01 2 KSP preconditioned resid norm 2.925512260523e-02 true resid norm 5.462276926541e-04 ||r(i)||/||b|| 5.045422594261e-01 3 KSP preconditioned resid norm 2.216266797424e-02 true resid norm 4.384191751652e-04 ||r(i)||/||b|| 4.049611621461e-01 4 KSP preconditioned resid norm 1.828595571705e-02 true resid norm 3.545445624006e-04 ||r(i)||/||b|| 3.274874507216e-01 5 KSP preconditioned resid norm 1.606075745459e-02 true resid norm 3.154605412604e-04 ||r(i)||/||b|| 2.913861314389e-01 6 KSP preconditioned resid norm 1.394247281936e-02 true resid norm 2.702632926002e-04 ||r(i)||/||b|| 2.496381163428e-01 7 KSP preconditioned resid norm 1.238790928160e-02 true resid norm 2.421611195195e-04 ||r(i)||/||b|| 2.236805640408e-01 8 KSP preconditioned resid norm 1.120028770745e-02 true resid norm 2.133434753622e-04 ||r(i)||/||b|| 1.970621419248e-01 9 KSP preconditioned resid norm 1.037248624705e-02 true resid norm 1.963182715206e-04 ||r(i)||/||b|| 1.813362185984e-01 10 KSP preconditioned resid norm 9.077324835431e-03 true resid norm 1.701233851393e-04 ||r(i)||/||b|| 1.571403981778e-01 11 KSP preconditioned resid norm 8.194327444024e-03 true resid norm 1.509038749059e-04 ||r(i)||/||b|| 1.393876272205e-01 12 KSP preconditioned resid norm 7.556248789941e-03 true resid norm 1.307276199214e-04 ||r(i)||/||b|| 1.207511256049e-01 13 KSP preconditioned resid norm 7.198978398180e-03 true resid norm 1.274934851720e-04 ||r(i)||/||b|| 1.177638042448e-01 14 KSP preconditioned resid norm 6.824533850872e-03 true resid norm 1.184239920364e-04 ||r(i)||/||b|| 1.093864505880e-01 15 KSP preconditioned resid norm 6.600080133600e-03 true resid norm 1.130973415655e-04 ||r(i)||/||b|| 1.044663040998e-01 16 KSP preconditioned resid norm 6.194840453839e-03 true resid norm 1.077747735316e-04 ||r(i)||/||b|| 9.954992849697e-02 17 KSP preconditioned resid norm 5.626075339249e-03 true resid norm 9.656765780304e-05 ||r(i)||/||b|| 8.919808517707e-02 18 KSP preconditioned resid norm 4.962412342328e-03 true resid norm 8.657122589981e-05 ||r(i)||/||b|| 7.996453219818e-02 19 KSP preconditioned resid norm 4.154643084069e-03 true resid norm 7.036031170831e-05 ||r(i)||/||b|| 6.499075590756e-02 20 KSP preconditioned resid norm 3.512093674440e-03 true resid norm 6.086383217609e-05 ||r(i)||/||b|| 5.621900137330e-02 21 KSP preconditioned resid norm 2.887445360588e-03 true resid norm 5.072587664814e-05 ||r(i)||/||b|| 4.685472516244e-02 22 KSP preconditioned resid norm 2.388742850091e-03 true resid norm 4.209626936694e-05 ||r(i)||/||b|| 3.888368741725e-02 23 KSP preconditioned resid norm 1.988224576340e-03 true resid norm 3.385733942714e-05 ||r(i)||/||b|| 3.127351242432e-02 24 KSP preconditioned resid norm 
1.754759810591e-03 true resid norm 3.172075237042e-05 ||r(i)||/||b|| 2.929997927037e-02 25 KSP preconditioned resid norm 1.570412327127e-03 true resid norm 2.785991345847e-05 ||r(i)||/||b|| 2.573378075258e-02 26 KSP preconditioned resid norm 1.403919698211e-03 true resid norm 2.565259599576e-05 ||r(i)||/||b|| 2.369491499224e-02 27 KSP preconditioned resid norm 1.249092796661e-03 true resid norm 2.231536105669e-05 ||r(i)||/||b|| 2.061236154606e-02 28 KSP preconditioned resid norm 1.090184495624e-03 true resid norm 2.009518478067e-05 ||r(i)||/||b|| 1.856161829431e-02 29 KSP preconditioned resid norm 9.335830400615e-04 true resid norm 1.639390327873e-05 ||r(i)||/||b|| 1.514280054325e-02 30 KSP preconditioned resid norm 7.901454629134e-04 true resid norm 1.417038488214e-05 ||r(i)||/||b|| 1.308897022528e-02 31 KSP preconditioned resid norm 6.822899097573e-04 true resid norm 1.205980907217e-05 ||r(i)||/||b|| 1.113946326660e-02 32 KSP preconditioned resid norm 5.864194111593e-04 true resid norm 1.048352636813e-05 ||r(i)||/||b|| 9.683474770069e-03 33 KSP preconditioned resid norm 5.151378472270e-04 true resid norm 9.112272821563e-06 ||r(i)||/||b|| 8.416868605763e-03 34 KSP preconditioned resid norm 4.386078665161e-04 true resid norm 7.813187123103e-06 ||r(i)||/||b|| 7.216922791401e-03 35 KSP preconditioned resid norm 3.819367626579e-04 true resid norm 6.980233635333e-06 ||r(i)||/||b|| 6.447536251010e-03 36 KSP preconditioned resid norm 3.523176469009e-04 true resid norm 6.455919088468e-06 ||r(i)||/||b|| 5.963234832970e-03 37 KSP preconditioned resid norm 3.379649462213e-04 true resid norm 6.300410520924e-06 ||r(i)||/||b|| 5.819593920794e-03 38 KSP preconditioned resid norm 3.123205896593e-04 true resid norm 5.814859678071e-06 ||r(i)||/||b|| 5.371097949949e-03 39 KSP preconditioned resid norm 2.845622568239e-04 true resid norm 5.253424287634e-06 ||r(i)||/||b|| 4.852508570057e-03 40 KSP preconditioned resid norm 2.521829904781e-04 true resid norm 4.707978370119e-06 ||r(i)||/||b|| 4.348688424505e-03 41 KSP preconditioned resid norm 2.190577359521e-04 true resid norm 4.120917707668e-06 ||r(i)||/||b|| 3.806429368370e-03 42 KSP preconditioned resid norm 1.844924405192e-04 true resid norm 3.412388768950e-06 ||r(i)||/||b|| 3.151971902341e-03 43 KSP preconditioned resid norm 1.524062972821e-04 true resid norm 2.795242781318e-06 ||r(i)||/||b|| 2.581923486299e-03 44 KSP preconditioned resid norm 1.268616103452e-04 true resid norm 2.262934924141e-06 ||r(i)||/||b|| 2.090238768402e-03 45 KSP preconditioned resid norm 1.046262147687e-04 true resid norm 1.931036194211e-06 ||r(i)||/||b|| 1.783668930674e-03 46 KSP preconditioned resid norm 9.017884447022e-05 true resid norm 1.620843293909e-06 ||r(i)||/||b|| 1.497148439529e-03 47 KSP preconditioned resid norm 7.675255209044e-05 true resid norm 1.402710706435e-06 ||r(i)||/||b|| 1.295662667169e-03 48 KSP preconditioned resid norm 6.607843393014e-05 true resid norm 1.206346489630e-06 ||r(i)||/||b|| 1.114284009606e-03 49 KSP preconditioned resid norm 5.655391949210e-05 true resid norm 1.034472543089e-06 ||r(i)||/||b|| 9.555266443351e-04 50 KSP preconditioned resid norm 4.858819604751e-05 true resid norm 8.828327652702e-07 ||r(i)||/||b|| 8.154592747221e-04 51 KSP preconditioned resid norm 4.215166220025e-05 true resid norm 7.639608052401e-07 ||r(i)||/||b|| 7.056590428726e-04 52 KSP preconditioned resid norm 3.698153566707e-05 true resid norm 6.840243611038e-07 ||r(i)||/||b|| 6.318229582555e-04 53 KSP preconditioned resid norm 3.240388180779e-05 true resid norm 
6.109241289003e-07 ||r(i)||/||b|| 5.643013792207e-04 54 KSP preconditioned resid norm 2.863162350542e-05 true resid norm 5.297406271420e-07 ||r(i)||/||b|| 4.893134063368e-04 55 KSP preconditioned resid norm 2.589788007691e-05 true resid norm 4.759211400795e-07 ||r(i)||/||b|| 4.396011600174e-04 56 KSP preconditioned resid norm 2.395588812796e-05 true resid norm 4.386072480715e-07 ||r(i)||/||b|| 4.051348822456e-04 57 KSP preconditioned resid norm 2.248889680593e-05 true resid norm 4.184215944520e-07 ||r(i)||/||b|| 3.864896992529e-04 58 KSP preconditioned resid norm 2.040252470747e-05 true resid norm 3.743778860796e-07 ||r(i)||/||b|| 3.458071918763e-04 59 KSP preconditioned resid norm 1.851945700472e-05 true resid norm 3.437365387587e-07 ||r(i)||/||b|| 3.175042427270e-04 60 KSP preconditioned resid norm 1.686023286533e-05 true resid norm 3.109039389114e-07 ||r(i)||/||b|| 2.871772667560e-04 61 KSP preconditioned resid norm 1.524456296999e-05 true resid norm 2.847569788214e-07 ||r(i)||/||b|| 2.630257151259e-04 62 KSP preconditioned resid norm 1.339277841691e-05 true resid norm 2.489208234734e-07 ||r(i)||/||b|| 2.299244003600e-04 63 KSP preconditioned resid norm 1.140161671824e-05 true resid norm 2.139760177170e-07 ||r(i)||/||b|| 1.976464117324e-04 64 KSP preconditioned resid norm 9.550513341507e-06 true resid norm 1.787851326623e-07 ||r(i)||/||b|| 1.651411233783e-04 65 KSP preconditioned resid norm 8.033762776762e-06 true resid norm 1.497991432779e-07 ||r(i)||/||b|| 1.383672033219e-04 66 KSP preconditioned resid norm 6.770973784587e-06 true resid norm 1.265124951315e-07 ||r(i)||/||b|| 1.168576785793e-04 67 KSP preconditioned resid norm 5.675071155464e-06 true resid norm 1.051746391652e-07 ||r(i)||/||b|| 9.714822370299e-05 68 KSP preconditioned resid norm 4.825907750340e-06 true resid norm 9.027136005182e-08 ||r(i)||/||b|| 8.338229015946e-05 69 KSP preconditioned resid norm 4.141889105746e-06 true resid norm 7.734649795559e-08 ||r(i)||/||b|| 7.144379049622e-05 70 KSP preconditioned resid norm 3.609246290733e-06 true resid norm 6.787992807774e-08 ||r(i)||/||b|| 6.269966305738e-05 71 KSP preconditioned resid norm 3.088128956390e-06 true resid norm 5.783350333058e-08 ||r(i)||/||b|| 5.341993244457e-05 72 KSP preconditioned resid norm 2.610628201375e-06 true resid norm 4.884499495613e-08 ||r(i)||/||b|| 4.511738318699e-05 73 KSP preconditioned resid norm 2.222821218514e-06 true resid norm 4.163981244326e-08 ||r(i)||/||b|| 3.846206505957e-05 74 KSP preconditioned resid norm 1.908097661100e-06 true resid norm 3.534683915296e-08 ||r(i)||/||b|| 3.264934079624e-05 75 KSP preconditioned resid norm 1.607160927854e-06 true resid norm 2.898993354749e-08 ||r(i)||/||b|| 2.677756322019e-05 76 KSP preconditioned resid norm 1.354284638600e-06 true resid norm 2.438171527172e-08 ||r(i)||/||b|| 2.252102168622e-05 77 KSP preconditioned resid norm 1.149066858939e-06 true resid norm 2.093498218956e-08 ||r(i)||/||b|| 1.933732646114e-05 78 KSP preconditioned resid norm 9.944646929550e-07 true resid norm 1.783142751715e-08 ||r(i)||/||b|| 1.647061994346e-05 79 KSP preconditioned resid norm 8.905035576619e-07 true resid norm 1.612554125842e-08 ||r(i)||/||b|| 1.489491860338e-05 80 KSP preconditioned resid norm 8.220035270642e-07 true resid norm 1.491905449795e-08 ||r(i)||/||b|| 1.378050502772e-05 81 KSP preconditioned resid norm 7.540905258504e-07 true resid norm 1.376003070362e-08 ||r(i)||/||b|| 1.270993227613e-05 82 KSP preconditioned resid norm 6.966198580725e-07 true resid norm 1.257441188587e-08 ||r(i)||/||b|| 
1.161479410358e-05 83 KSP preconditioned resid norm 6.511707808680e-07 true resid norm 1.178653469672e-08 ||r(i)||/||b|| 1.088704385857e-05 84 KSP preconditioned resid norm 5.848708787756e-07 true resid norm 1.056803486322e-08 ||r(i)||/||b|| 9.761533989008e-06 85 KSP preconditioned resid norm 5.196676227443e-07 true resid norm 9.445339945527e-09 ||r(i)||/||b|| 8.724517671391e-06 86 KSP preconditioned resid norm 4.574229694637e-07 true resid norm 8.412999432063e-09 ||r(i)||/||b|| 7.770960350580e-06 87 KSP preconditioned resid norm 4.061948061931e-07 true resid norm 7.354161087167e-09 ||r(i)||/||b|| 6.792927383586e-06 88 KSP preconditioned resid norm 3.564985985913e-07 true resid norm 6.350291727630e-09 ||r(i)||/||b|| 5.865668437104e-06 89 KSP preconditioned resid norm 3.140282171475e-07 true resid norm 5.656760105149e-09 ||r(i)||/||b|| 5.225063765287e-06 90 KSP preconditioned resid norm 2.773920979455e-07 true resid norm 4.942272777220e-09 ||r(i)||/||b|| 4.565102625249e-06 91 KSP preconditioned resid norm 2.366867302448e-07 true resid norm 4.193605757317e-09 ||r(i)||/||b|| 3.873570220613e-06 92 KSP preconditioned resid norm 1.956188982003e-07 true resid norm 3.487847666677e-09 ||r(i)||/||b|| 3.221672145051e-06 93 KSP preconditioned resid norm 1.591484577701e-07 true resid norm 2.788502623340e-09 ||r(i)||/||b|| 2.575697704303e-06 94 KSP preconditioned resid norm 1.333988119803e-07 true resid norm 2.356149842725e-09 ||r(i)||/||b|| 2.176339979064e-06 95 KSP preconditioned resid norm 1.154240184317e-07 true resid norm 2.036406775312e-09 ||r(i)||/||b|| 1.880998142980e-06 96 KSP preconditioned resid norm 1.019856999259e-07 true resid norm 1.803027192070e-09 ||r(i)||/||b|| 1.665428951201e-06 97 KSP preconditioned resid norm 9.150506659395e-08 true resid norm 1.640918922009e-09 ||r(i)||/||b|| 1.515691993613e-06 98 KSP preconditioned resid norm 8.185775908050e-08 true resid norm 1.458367471407e-09 ||r(i)||/||b|| 1.347071979310e-06 99 KSP preconditioned resid norm 7.209124970124e-08 true resid norm 1.273353750835e-09 ||r(i)||/||b|| 1.176177603470e-06 100 KSP preconditioned resid norm 6.458672913912e-08 true resid norm 1.176408664631e-09 ||r(i)||/||b|| 1.086630893387e-06 101 KSP preconditioned resid norm 5.733811833658e-08 true resid norm 1.044717390376e-09 ||r(i)||/||b|| 9.649896548465e-07 102 KSP preconditioned resid norm 4.933092886016e-08 true resid norm 8.869629250831e-10 ||r(i)||/||b|| 8.192742408833e-07 103 KSP preconditioned resid norm 4.243789287130e-08 true resid norm 7.579756261584e-10 ||r(i)||/||b|| 7.001306234652e-07 104 KSP preconditioned resid norm 3.609241101259e-08 true resid norm 6.519165378332e-10 ||r(i)||/||b|| 6.021654474481e-07 105 KSP preconditioned resid norm 3.108390895460e-08 true resid norm 5.626021086075e-10 ||r(i)||/||b|| 5.196670598216e-07 106 KSP preconditioned resid norm 2.693411804730e-08 true resid norm 4.816894925112e-10 ||r(i)||/||b|| 4.449292999272e-07 107 KSP preconditioned resid norm 2.383174442310e-08 true resid norm 4.271528495019e-10 ||r(i)||/||b|| 3.945546275049e-07 108 KSP preconditioned resid norm 2.151098027337e-08 true resid norm 3.906392383820e-10 ||r(i)||/||b|| 3.608275570872e-07 109 KSP preconditioned resid norm 1.902601106534e-08 true resid norm 3.434772069152e-10 ||r(i)||/||b|| 3.172647018248e-07 110 KSP preconditioned resid norm 1.656752388443e-08 true resid norm 2.954328393839e-10 ||r(i)||/||b|| 2.728868460827e-07 111 KSP preconditioned resid norm 1.456915915131e-08 true resid norm 2.603052158281e-10 ||r(i)||/||b|| 2.404399914186e-07 112 KSP 
preconditioned resid norm 1.286214115008e-08 true resid norm 2.268469220832e-10 ||r(i)||/||b|| 2.095350714563e-07 113 KSP preconditioned resid norm 1.126092630172e-08 true resid norm 1.995862567038e-10 ||r(i)||/||b|| 1.843548070923e-07 114 KSP preconditioned resid norm 9.918459604519e-09 true resid norm 1.770231640754e-10 ||r(i)||/||b|| 1.635136196397e-07 115 KSP preconditioned resid norm 8.870628641383e-09 true resid norm 1.584590304720e-10 ||r(i)||/||b|| 1.463662101647e-07 116 KSP preconditioned resid norm 7.864654208334e-09 true resid norm 1.404388194054e-10 ||r(i)||/||b|| 1.297212137116e-07 Linear solve converged due to CONVERGED_RTOL iterations 116 iter = 11, Function value: -4.34167, Residual: 0.00126025 0 KSP preconditioned resid norm 5.066591745231e-02 true resid norm 7.002088467130e-04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.342475175611e-02 true resid norm 4.859083359895e-04 ||r(i)||/||b|| 6.939477246975e-01 2 KSP preconditioned resid norm 1.775043591174e-02 true resid norm 3.520087809380e-04 ||r(i)||/||b|| 5.027196994018e-01 3 KSP preconditioned resid norm 1.379320515986e-02 true resid norm 2.811630471387e-04 ||r(i)||/||b|| 4.015416949651e-01 4 KSP preconditioned resid norm 1.152920881799e-02 true resid norm 2.278825054309e-04 ||r(i)||/||b|| 3.254493377235e-01 5 KSP preconditioned resid norm 9.502871942875e-03 true resid norm 1.971507440013e-04 ||r(i)||/||b|| 2.815599159119e-01 6 KSP preconditioned resid norm 8.435049715766e-03 true resid norm 1.667876899749e-04 ||r(i)||/||b|| 2.381970618593e-01 7 KSP preconditioned resid norm 7.471961032076e-03 true resid norm 1.484464441511e-04 ||r(i)||/||b|| 2.120030971445e-01 8 KSP preconditioned resid norm 6.629116689134e-03 true resid norm 1.288277088997e-04 ||r(i)||/||b|| 1.839846918593e-01 9 KSP preconditioned resid norm 6.148557051308e-03 true resid norm 1.173566243133e-04 ||r(i)||/||b|| 1.676023158865e-01 10 KSP preconditioned resid norm 5.554693253429e-03 true resid norm 1.042913104594e-04 ||r(i)||/||b|| 1.489431488176e-01 11 KSP preconditioned resid norm 5.048263839301e-03 true resid norm 9.660703478803e-05 ||r(i)||/||b|| 1.379688863423e-01 12 KSP preconditioned resid norm 4.465204147332e-03 true resid norm 8.154440985046e-05 ||r(i)||/||b|| 1.164572687610e-01 13 KSP preconditioned resid norm 4.078324311849e-03 true resid norm 7.289158504071e-05 ||r(i)||/||b|| 1.040997773491e-01 14 KSP preconditioned resid norm 3.750460614491e-03 true resid norm 6.458316681740e-05 ||r(i)||/||b|| 9.223414859806e-02 15 KSP preconditioned resid norm 3.624882012500e-03 true resid norm 6.332671092862e-05 ||r(i)||/||b|| 9.043974697821e-02 16 KSP preconditioned resid norm 3.568245152338e-03 true resid norm 6.016269490561e-05 ||r(i)||/||b|| 8.592107224586e-02 17 KSP preconditioned resid norm 3.432818643009e-03 true resid norm 5.957113427425e-05 ||r(i)||/||b|| 8.507623768808e-02 18 KSP preconditioned resid norm 3.199612487326e-03 true resid norm 5.517898361788e-05 ||r(i)||/||b|| 7.880360820477e-02 19 KSP preconditioned resid norm 2.822236424045e-03 true resid norm 4.873037634834e-05 ||r(i)||/||b|| 6.959405979673e-02 20 KSP preconditioned resid norm 2.457039292679e-03 true resid norm 4.330452544436e-05 ||r(i)||/||b|| 6.184515612399e-02 21 KSP preconditioned resid norm 2.066432397744e-03 true resid norm 3.565224617556e-05 ||r(i)||/||b|| 5.091658916183e-02 22 KSP preconditioned resid norm 1.725368783235e-03 true resid norm 3.082481966758e-05 ||r(i)||/||b|| 4.402232250033e-02 23 KSP preconditioned resid norm 1.423216218369e-03 true 
resid norm 2.488484488699e-05 ||r(i)||/||b|| 3.553917521009e-02
  [... KSP monitor output, iterations 24-114 ...]
115 KSP preconditioned resid norm 4.822644293253e-09 true resid norm 8.717821451151e-11 ||r(i)||/||b|| 1.245031606224e-07
Linear solve converged due to CONVERGED_RTOL iterations 115
iter = 12, Function value: -4.34173, Residual: 0.000838293
0 KSP preconditioned resid norm 2.920028076044e-02 true resid norm 4.426564165080e-04 ||r(i)||/||b|| 1.000000000000e+00
  [... KSP monitor output, iterations 1-117 ...]
118 KSP preconditioned resid norm 2.618943928923e-09 true resid norm 4.837467850025e-11 ||r(i)||/||b|| 1.092826777072e-07
Linear solve converged due to CONVERGED_RTOL iterations 118
iter = 13, Function value: -4.34175, Residual: 0.000473351
0 KSP preconditioned resid norm 1.665406546128e-02 true resid norm 2.542389909048e-04 ||r(i)||/||b|| 1.000000000000e+00
  [... KSP monitor output, iterations 1-118 ...]
119 KSP preconditioned resid norm 1.545780195325e-09 true resid norm 2.988823067721e-11 ||r(i)||/||b|| 1.175595866348e-07
Linear solve converged due to CONVERGED_RTOL iterations 119
iter = 14, Function value: -4.34176, Residual: 0.000272484
0 KSP preconditioned resid norm 9.512781256679e-03 true resid norm 1.451643280572e-04 ||r(i)||/||b|| 1.000000000000e+00
  [... KSP monitor output, iterations 1-118 ...]
119 KSP preconditioned resid norm 9.310945737255e-10 true resid norm 1.751690022046e-11 ||r(i)||/||b|| 1.206694540931e-07
Linear solve converged due to CONVERGED_RTOL iterations 119
iter = 15, Function value: -4.34176, Residual: 0.000132906
0 KSP preconditioned resid norm 4.184974371826e-03 true resid norm 6.803884023334e-05 ||r(i)||/||b|| 1.000000000000e+00
  [... KSP monitor output, iterations 1-118 ...]
119 KSP preconditioned resid norm 4.030796277093e-10 true resid norm 7.596402318624e-12 ||r(i)||/||b|| 1.116480276938e-07
Linear solve converged due to CONVERGED_RTOL iterations 119
iter = 16, Function value: -4.34176, Residual: 5.34513e-05
0 KSP preconditioned resid norm 1.561324840663e-03 true resid norm 2.660200780355e-05 ||r(i)||/||b|| 1.000000000000e+00
  [... KSP monitor output, iterations 1-8 ...]
9 KSP preconditioned resid norm 9.959050251124e-05 true resid norm 1.983855992617e-06 ||r(i)||/||b||
7.457542330139e-02 10 KSP preconditioned resid norm 8.422927864444e-05 true resid norm 1.677204235283e-06 ||r(i)||/||b|| 6.304803185040e-02 11 KSP preconditioned resid norm 7.693460290776e-05 true resid norm 1.534842973563e-06 ||r(i)||/||b|| 5.769650865820e-02 12 KSP preconditioned resid norm 6.860440772165e-05 true resid norm 1.362647096948e-06 ||r(i)||/||b|| 5.122346805588e-02 13 KSP preconditioned resid norm 6.088013236106e-05 true resid norm 1.180112977271e-06 ||r(i)||/||b|| 4.436180103341e-02 14 KSP preconditioned resid norm 5.364467284846e-05 true resid norm 9.989759867468e-07 ||r(i)||/||b|| 3.755265369907e-02 15 KSP preconditioned resid norm 4.945953066028e-05 true resid norm 9.318887690987e-07 ||r(i)||/||b|| 3.503076820293e-02 16 KSP preconditioned resid norm 4.567053678537e-05 true resid norm 8.791485915380e-07 ||r(i)||/||b|| 3.304820440736e-02 17 KSP preconditioned resid norm 3.992094719685e-05 true resid norm 7.451586872595e-07 ||r(i)||/||b|| 2.801137014778e-02 18 KSP preconditioned resid norm 3.459848303995e-05 true resid norm 6.272385796372e-07 ||r(i)||/||b|| 2.357861798512e-02 19 KSP preconditioned resid norm 3.101898135172e-05 true resid norm 5.580990563862e-07 ||r(i)||/||b|| 2.097958396628e-02 20 KSP preconditioned resid norm 2.915580777117e-05 true resid norm 5.216283144212e-07 ||r(i)||/||b|| 1.960860692445e-02 21 KSP preconditioned resid norm 2.808749492082e-05 true resid norm 4.981110962931e-07 ||r(i)||/||b|| 1.872456770825e-02 22 KSP preconditioned resid norm 2.786680137401e-05 true resid norm 4.974295517257e-07 ||r(i)||/||b|| 1.869894766587e-02 23 KSP preconditioned resid norm 2.710629487120e-05 true resid norm 4.825047168003e-07 ||r(i)||/||b|| 1.813790599429e-02 24 KSP preconditioned resid norm 2.500581881107e-05 true resid norm 4.577422789854e-07 ||r(i)||/||b|| 1.720705754113e-02 25 KSP preconditioned resid norm 2.260061487826e-05 true resid norm 4.188617987637e-07 ||r(i)||/||b|| 1.574549567299e-02 26 KSP preconditioned resid norm 1.929612483226e-05 true resid norm 3.660432712260e-07 ||r(i)||/||b|| 1.375998661188e-02 27 KSP preconditioned resid norm 1.623688448295e-05 true resid norm 3.021271958851e-07 ||r(i)||/||b|| 1.135730799405e-02 28 KSP preconditioned resid norm 1.331585368613e-05 true resid norm 2.455544224915e-07 ||r(i)||/||b|| 9.230672523098e-03 29 KSP preconditioned resid norm 1.114790248562e-05 true resid norm 2.098967231939e-07 ||r(i)||/||b|| 7.890258688136e-03 30 KSP preconditioned resid norm 9.274970065062e-06 true resid norm 1.804862041474e-07 ||r(i)||/||b|| 6.784683527658e-03 31 KSP preconditioned resid norm 7.968722635920e-06 true resid norm 1.560865358038e-07 ||r(i)||/||b|| 5.867471995214e-03 32 KSP preconditioned resid norm 6.885370550083e-06 true resid norm 1.307595730573e-07 ||r(i)||/||b|| 4.915402402063e-03 33 KSP preconditioned resid norm 6.086771528549e-06 true resid norm 1.191308381011e-07 ||r(i)||/||b|| 4.478264910711e-03 34 KSP preconditioned resid norm 5.376234236609e-06 true resid norm 1.031073484130e-07 ||r(i)||/||b|| 3.875923545864e-03 35 KSP preconditioned resid norm 4.774331381604e-06 true resid norm 9.187204916560e-08 ||r(i)||/||b|| 3.453575754283e-03 36 KSP preconditioned resid norm 4.251652431386e-06 true resid norm 8.219615368990e-08 ||r(i)||/||b|| 3.089847739948e-03 37 KSP preconditioned resid norm 3.754096032677e-06 true resid norm 7.327548509574e-08 ||r(i)||/||b|| 2.754509570738e-03 38 KSP preconditioned resid norm 3.282845775393e-06 true resid norm 6.335411108754e-08 ||r(i)||/||b|| 2.381553736672e-03 39 KSP preconditioned 
resid norm 2.901002406965e-06 true resid norm 5.510372185205e-08 ||r(i)||/||b|| 2.071412137722e-03 40 KSP preconditioned resid norm 2.617593071896e-06 true resid norm 5.061361632971e-08 ||r(i)||/||b|| 1.902623918595e-03 41 KSP preconditioned resid norm 2.320427489884e-06 true resid norm 4.538140258101e-08 ||r(i)||/||b|| 1.705938999647e-03 42 KSP preconditioned resid norm 2.037595475665e-06 true resid norm 3.990817038247e-08 ||r(i)||/||b|| 1.500193920593e-03 43 KSP preconditioned resid norm 1.817221564221e-06 true resid norm 3.529441746721e-08 ||r(i)||/||b|| 1.326757654078e-03 44 KSP preconditioned resid norm 1.676309784598e-06 true resid norm 3.250249719393e-08 ||r(i)||/||b|| 1.221806167186e-03 45 KSP preconditioned resid norm 1.529076531087e-06 true resid norm 2.877556305180e-08 ||r(i)||/||b|| 1.081706436006e-03 46 KSP preconditioned resid norm 1.348931626786e-06 true resid norm 2.619489739647e-08 ||r(i)||/||b|| 9.846962526253e-04 47 KSP preconditioned resid norm 1.207929026888e-06 true resid norm 2.319748597365e-08 ||r(i)||/||b|| 8.720201176148e-04 48 KSP preconditioned resid norm 1.095664618689e-06 true resid norm 2.131510440594e-08 ||r(i)||/||b|| 8.012592343911e-04 49 KSP preconditioned resid norm 1.008059783584e-06 true resid norm 1.960225181266e-08 ||r(i)||/||b|| 7.368711398560e-04 50 KSP preconditioned resid norm 9.153990043497e-07 true resid norm 1.761810128985e-08 ||r(i)||/||b|| 6.622846448264e-04 51 KSP preconditioned resid norm 8.149696059901e-07 true resid norm 1.585760838986e-08 ||r(i)||/||b|| 5.961056964934e-04 52 KSP preconditioned resid norm 6.940691883657e-07 true resid norm 1.354810327414e-08 ||r(i)||/||b|| 5.092887489615e-04 53 KSP preconditioned resid norm 5.773135403959e-07 true resid norm 1.138448834636e-08 ||r(i)||/||b|| 4.279559810084e-04 54 KSP preconditioned resid norm 4.820444531341e-07 true resid norm 9.403999421223e-09 ||r(i)||/||b|| 3.535071296373e-04 55 KSP preconditioned resid norm 4.086017702491e-07 true resid norm 8.032113070734e-09 ||r(i)||/||b|| 3.019363474384e-04 56 KSP preconditioned resid norm 3.407046139996e-07 true resid norm 6.611662152781e-09 ||r(i)||/||b|| 2.485399674192e-04 57 KSP preconditioned resid norm 2.906362091136e-07 true resid norm 5.720606082563e-09 ||r(i)||/||b|| 2.150441472241e-04 58 KSP preconditioned resid norm 2.400667370506e-07 true resid norm 4.713077728873e-09 ||r(i)||/||b|| 1.771700002375e-04 59 KSP preconditioned resid norm 1.964778230108e-07 true resid norm 3.852392636516e-09 ||r(i)||/||b|| 1.448158599518e-04 60 KSP preconditioned resid norm 1.606375450174e-07 true resid norm 3.138680713902e-09 ||r(i)||/||b|| 1.179866097732e-04 61 KSP preconditioned resid norm 1.335533303153e-07 true resid norm 2.585155854801e-09 ||r(i)||/||b|| 9.717897513193e-05 62 KSP preconditioned resid norm 1.112340633920e-07 true resid norm 2.129307715445e-09 ||r(i)||/||b|| 8.004312047305e-05 63 KSP preconditioned resid norm 9.506736723151e-08 true resid norm 1.821407434001e-09 ||r(i)||/||b|| 6.846879556805e-05 64 KSP preconditioned resid norm 8.319583504788e-08 true resid norm 1.592684642946e-09 ||r(i)||/||b|| 5.987084338549e-05 65 KSP preconditioned resid norm 7.465068579271e-08 true resid norm 1.448430692019e-09 ||r(i)||/||b|| 5.444817183405e-05 66 KSP preconditioned resid norm 6.887948059819e-08 true resid norm 1.328161228112e-09 ||r(i)||/||b|| 4.992710467269e-05 67 KSP preconditioned resid norm 6.423940855092e-08 true resid norm 1.248004241221e-09 ||r(i)||/||b|| 4.691391155272e-05 68 KSP preconditioned resid norm 5.992744126375e-08 true resid norm 
1.167343987062e-09 ||r(i)||/||b|| 4.388180003865e-05 69 KSP preconditioned resid norm 5.501602840286e-08 true resid norm 1.064833594175e-09 ||r(i)||/||b|| 4.002831673604e-05 70 KSP preconditioned resid norm 5.010711019057e-08 true resid norm 9.794548550182e-10 ||r(i)||/||b|| 3.681883195626e-05 71 KSP preconditioned resid norm 4.776041396265e-08 true resid norm 9.227166682458e-10 ||r(i)||/||b|| 3.468597840658e-05 72 KSP preconditioned resid norm 4.543991698804e-08 true resid norm 8.763428120478e-10 ||r(i)||/||b|| 3.294273193660e-05 73 KSP preconditioned resid norm 4.120871041596e-08 true resid norm 8.028250538297e-10 ||r(i)||/||b|| 3.017911504118e-05 74 KSP preconditioned resid norm 3.659843232363e-08 true resid norm 6.983227758128e-10 ||r(i)||/||b|| 2.625075449078e-05 75 KSP preconditioned resid norm 3.167510868808e-08 true resid norm 6.136642776656e-10 ||r(i)||/||b|| 2.306834439706e-05 76 KSP preconditioned resid norm 2.747465018866e-08 true resid norm 5.202686931948e-10 ||r(i)||/||b|| 1.955749720235e-05 77 KSP preconditioned resid norm 2.386564352409e-08 true resid norm 4.503774477338e-10 ||r(i)||/||b|| 1.693020508300e-05 78 KSP preconditioned resid norm 2.064672894856e-08 true resid norm 3.836468907700e-10 ||r(i)||/||b|| 1.442172687126e-05 79 KSP preconditioned resid norm 1.827648663750e-08 true resid norm 3.398821677040e-10 ||r(i)||/||b|| 1.277656070977e-05 80 KSP preconditioned resid norm 1.581634393217e-08 true resid norm 2.921916446312e-10 ||r(i)||/||b|| 1.098381922105e-05 81 KSP preconditioned resid norm 1.354945335223e-08 true resid norm 2.596243816388e-10 ||r(i)||/||b|| 9.759578433178e-06 82 KSP preconditioned resid norm 1.140214070690e-08 true resid norm 2.191406318159e-10 ||r(i)||/||b|| 8.237747820924e-06 83 KSP preconditioned resid norm 9.543613931863e-09 true resid norm 1.855671427701e-10 ||r(i)||/||b|| 6.975681841026e-06 84 KSP preconditioned resid norm 8.128878742046e-09 true resid norm 1.588745627828e-10 ||r(i)||/||b|| 5.972277128706e-06 85 KSP preconditioned resid norm 7.014051844116e-09 true resid norm 1.381778737440e-10 ||r(i)||/||b|| 5.194264837618e-06 86 KSP preconditioned resid norm 6.059328895169e-09 true resid norm 1.177300547679e-10 ||r(i)||/||b|| 4.425607857773e-06 87 KSP preconditioned resid norm 5.166508285590e-09 true resid norm 1.005845420639e-10 ||r(i)||/||b|| 3.781088360197e-06 88 KSP preconditioned resid norm 4.387690775020e-09 true resid norm 8.599449196959e-11 ||r(i)||/||b|| 3.232631634598e-06 89 KSP preconditioned resid norm 3.655258457250e-09 true resid norm 7.253554846849e-11 ||r(i)||/||b|| 2.726694503819e-06 90 KSP preconditioned resid norm 3.099633193826e-09 true resid norm 6.034227115756e-11 ||r(i)||/||b|| 2.268335217521e-06 91 KSP preconditioned resid norm 2.610773865691e-09 true resid norm 5.110436626858e-11 ||r(i)||/||b|| 1.921071771949e-06 92 KSP preconditioned resid norm 2.267634720276e-09 true resid norm 4.387203588802e-11 ||r(i)||/||b|| 1.649200173611e-06 93 KSP preconditioned resid norm 1.956127287416e-09 true resid norm 3.785527103025e-11 ||r(i)||/||b|| 1.423023078175e-06 94 KSP preconditioned resid norm 1.715093351412e-09 true resid norm 3.284424691407e-11 ||r(i)||/||b|| 1.234652931335e-06 95 KSP preconditioned resid norm 1.531192450182e-09 true resid norm 2.975862820569e-11 ||r(i)||/||b|| 1.118660983240e-06 96 KSP preconditioned resid norm 1.391641029732e-09 true resid norm 2.715114324938e-11 ||r(i)||/||b|| 1.020642631559e-06 97 KSP preconditioned resid norm 1.272686827099e-09 true resid norm 2.472459620436e-11 ||r(i)||/||b|| 
9.294259435958e-07 98 KSP preconditioned resid norm 1.167740077625e-09 true resid norm 2.255914301058e-11 ||r(i)||/||b|| 8.480240731139e-07 99 KSP preconditioned resid norm 1.098615998731e-09 true resid norm 2.105037633797e-11 ||r(i)||/||b|| 7.913078025319e-07 100 KSP preconditioned resid norm 1.022252639693e-09 true resid norm 1.949118784482e-11 ||r(i)||/||b|| 7.326961178554e-07 101 KSP preconditioned resid norm 9.330835672187e-10 true resid norm 1.797775372303e-11 ||r(i)||/||b|| 6.758043925025e-07 102 KSP preconditioned resid norm 8.405271305455e-10 true resid norm 1.602150742382e-11 ||r(i)||/||b|| 6.022668492594e-07 103 KSP preconditioned resid norm 7.693015606366e-10 true resid norm 1.481260570458e-11 ||r(i)||/||b|| 5.568228463793e-07 104 KSP preconditioned resid norm 7.089135349703e-10 true resid norm 1.360308068205e-11 ||r(i)||/||b|| 5.113554128133e-07 105 KSP preconditioned resid norm 6.369665039887e-10 true resid norm 1.231864451401e-11 ||r(i)||/||b|| 4.630719833245e-07 106 KSP preconditioned resid norm 5.625724507742e-10 true resid norm 1.075145623094e-11 ||r(i)||/||b|| 4.041595773649e-07 107 KSP preconditioned resid norm 4.966400051561e-10 true resid norm 9.565011941533e-12 ||r(i)||/||b|| 3.595597750428e-07 108 KSP preconditioned resid norm 4.253188430285e-10 true resid norm 8.344268178404e-12 ||r(i)||/||b|| 3.136706161438e-07 109 KSP preconditioned resid norm 3.629681099996e-10 true resid norm 7.179404213826e-12 ||r(i)||/||b|| 2.698820429963e-07 110 KSP preconditioned resid norm 3.129702631230e-10 true resid norm 6.200262401090e-12 ||r(i)||/||b|| 2.330749786586e-07 111 KSP preconditioned resid norm 2.691833359028e-10 true resid norm 5.301396390392e-12 ||r(i)||/||b|| 1.992855738387e-07 112 KSP preconditioned resid norm 2.307988274778e-10 true resid norm 4.508688022795e-12 ||r(i)||/||b|| 1.694867566422e-07 113 KSP preconditioned resid norm 1.980226754059e-10 true resid norm 3.881960091115e-12 ||r(i)||/||b|| 1.459273344998e-07 114 KSP preconditioned resid norm 1.749971597022e-10 true resid norm 3.411191729673e-12 ||r(i)||/||b|| 1.282306115712e-07 115 KSP preconditioned resid norm 1.561332641725e-10 true resid norm 3.067329156349e-12 ||r(i)||/||b|| 1.153044228466e-07 116 KSP preconditioned resid norm 1.412021310536e-10 true resid norm 2.774734164732e-12 ||r(i)||/||b|| 1.043054413495e-07 Linear solve converged due to CONVERGED_RTOL iterations 116 iter = 17, Function value: -4.34176, Residual: 1.42828e-05 0 KSP preconditioned resid norm 3.506421354743e-04 true resid norm 6.866903802372e-06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.148453471671e-04 true resid norm 2.974227659087e-06 ||r(i)||/||b|| 4.331249926727e-01 2 KSP preconditioned resid norm 7.740279764372e-05 true resid norm 1.569091650477e-06 ||r(i)||/||b|| 2.285006016737e-01 3 KSP preconditioned resid norm 5.017379175425e-05 true resid norm 1.190764878678e-06 ||r(i)||/||b|| 1.734063725002e-01 4 KSP preconditioned resid norm 3.946631334191e-05 true resid norm 7.901088194635e-07 ||r(i)||/||b|| 1.150604176500e-01 5 KSP preconditioned resid norm 2.996754265342e-05 true resid norm 6.218835246865e-07 ||r(i)||/||b|| 9.056243433491e-02 6 KSP preconditioned resid norm 2.402583892477e-05 true resid norm 5.028879805662e-07 ||r(i)||/||b|| 7.323358460220e-02 7 KSP preconditioned resid norm 2.010542319539e-05 true resid norm 4.106621001309e-07 ||r(i)||/||b|| 5.980309495365e-02 8 KSP preconditioned resid norm 1.607092351882e-05 true resid norm 3.346994281441e-07 ||r(i)||/||b|| 4.874095193070e-02 9 KSP preconditioned 
resid norm 1.394459839076e-05 true resid norm 2.994883166534e-07 ||r(i)||/||b|| 4.361329724030e-02 10 KSP preconditioned resid norm 1.189789758909e-05 true resid norm 2.423808523983e-07 ||r(i)||/||b|| 3.529696343125e-02 11 KSP preconditioned resid norm 1.070270093520e-05 true resid norm 2.146177857799e-07 ||r(i)||/||b|| 3.125393801290e-02 12 KSP preconditioned resid norm 9.679823126824e-06 true resid norm 1.968577010113e-07 ||r(i)||/||b|| 2.866760721815e-02 13 KSP preconditioned resid norm 8.619730953303e-06 true resid norm 1.751690265659e-07 ||r(i)||/||b|| 2.550917147047e-02 14 KSP preconditioned resid norm 7.544257093518e-06 true resid norm 1.420542616779e-07 ||r(i)||/||b|| 2.068679943191e-02 15 KSP preconditioned resid norm 6.725951974399e-06 true resid norm 1.311804393651e-07 ||r(i)||/||b|| 1.910328776119e-02 16 KSP preconditioned resid norm 6.216434849730e-06 true resid norm 1.145203305314e-07 ||r(i)||/||b|| 1.667714210469e-02 17 KSP preconditioned resid norm 5.624323370535e-06 true resid norm 1.068332658698e-07 ||r(i)||/||b|| 1.555770532754e-02 18 KSP preconditioned resid norm 4.900988330811e-06 true resid norm 9.253629477040e-08 ||r(i)||/||b|| 1.347569405857e-02 19 KSP preconditioned resid norm 4.273589856202e-06 true resid norm 7.388110947532e-08 ||r(i)||/||b|| 1.075901331977e-02 20 KSP preconditioned resid norm 3.886002205972e-06 true resid norm 6.864580353189e-08 ||r(i)||/||b|| 9.996616452990e-03 21 KSP preconditioned resid norm 3.679075699028e-06 true resid norm 6.453103535462e-08 ||r(i)||/||b|| 9.397399062489e-03 22 KSP preconditioned resid norm 3.617178136626e-06 true resid norm 6.416721242276e-08 ||r(i)||/||b|| 9.344416970075e-03 23 KSP preconditioned resid norm 3.496519610076e-06 true resid norm 6.291510898136e-08 ||r(i)||/||b|| 9.162078105655e-03 24 KSP preconditioned resid norm 3.345281133953e-06 true resid norm 6.188722171003e-08 ||r(i)||/||b|| 9.012390953934e-03 25 KSP preconditioned resid norm 3.046253382855e-06 true resid norm 5.439128857245e-08 ||r(i)||/||b|| 7.920787903518e-03 26 KSP preconditioned resid norm 2.702334306139e-06 true resid norm 4.957499609292e-08 ||r(i)||/||b|| 7.219410307712e-03 27 KSP preconditioned resid norm 2.361537902551e-06 true resid norm 4.468527846082e-08 ||r(i)||/||b|| 6.507340097787e-03 28 KSP preconditioned resid norm 2.003438894347e-06 true resid norm 3.688486181287e-08 ||r(i)||/||b|| 5.371396319856e-03 29 KSP preconditioned resid norm 1.661301349571e-06 true resid norm 3.146571174066e-08 ||r(i)||/||b|| 4.582226960831e-03 30 KSP preconditioned resid norm 1.379526632617e-06 true resid norm 2.664843800839e-08 ||r(i)||/||b|| 3.880706469076e-03 31 KSP preconditioned resid norm 1.186806233711e-06 true resid norm 2.274055234170e-08 ||r(i)||/||b|| 3.311616559103e-03 32 KSP preconditioned resid norm 1.022810145659e-06 true resid norm 1.954931692773e-08 ||r(i)||/||b|| 2.846889586683e-03 33 KSP preconditioned resid norm 8.922936707517e-07 true resid norm 1.730978699270e-08 ||r(i)||/||b|| 2.520755713327e-03 34 KSP preconditioned resid norm 7.797724917831e-07 true resid norm 1.479670859890e-08 ||r(i)||/||b|| 2.154786061484e-03 35 KSP preconditioned resid norm 6.825303559638e-07 true resid norm 1.310713790278e-08 ||r(i)||/||b|| 1.908740573627e-03 36 KSP preconditioned resid norm 5.851469468431e-07 true resid norm 1.112949840216e-08 ||r(i)||/||b|| 1.620744766850e-03 37 KSP preconditioned resid norm 5.122135125008e-07 true resid norm 9.820958852439e-09 ||r(i)||/||b|| 1.430187335528e-03 38 KSP preconditioned resid norm 4.478862947906e-07 true resid norm 
8.663843341091e-09 ||r(i)||/||b|| 1.261681187102e-03 39 KSP preconditioned resid norm 3.930900484540e-07 true resid norm 7.599691612469e-09 ||r(i)||/||b|| 1.106712986112e-03 40 KSP preconditioned resid norm 3.467001015409e-07 true resid norm 6.629359149650e-09 ||r(i)||/||b|| 9.654073131708e-04 41 KSP preconditioned resid norm 3.101302395094e-07 true resid norm 6.013234781554e-09 ||r(i)||/||b|| 8.756835619973e-04 42 KSP preconditioned resid norm 2.825085809115e-07 true resid norm 5.559474403821e-09 ||r(i)||/||b|| 8.096042355945e-04 43 KSP preconditioned resid norm 2.522834462687e-07 true resid norm 4.948410891306e-09 ||r(i)||/||b|| 7.206174767727e-04 44 KSP preconditioned resid norm 2.274739360159e-07 true resid norm 4.497807586345e-09 ||r(i)||/||b|| 6.549979023721e-04 45 KSP preconditioned resid norm 2.063492720153e-07 true resid norm 3.935715351826e-09 ||r(i)||/||b|| 5.731426367828e-04 46 KSP preconditioned resid norm 1.849616593113e-07 true resid norm 3.589843093222e-09 ||r(i)||/||b|| 5.227746298094e-04 47 KSP preconditioned resid norm 1.620298549053e-07 true resid norm 3.107284428181e-09 ||r(i)||/||b|| 4.525015229000e-04 48 KSP preconditioned resid norm 1.432631476886e-07 true resid norm 2.782026323769e-09 ||r(i)||/||b|| 4.051354735461e-04 49 KSP preconditioned resid norm 1.268737891052e-07 true resid norm 2.419065043674e-09 ||r(i)||/||b|| 3.522788600647e-04 50 KSP preconditioned resid norm 1.125302459259e-07 true resid norm 2.206231992322e-09 ||r(i)||/||b|| 3.212848258570e-04 51 KSP preconditioned resid norm 1.016708106543e-07 true resid norm 1.985632946068e-09 ||r(i)||/||b|| 2.891598605739e-04 52 KSP preconditioned resid norm 9.072835164836e-08 true resid norm 1.813919638193e-09 ||r(i)||/||b|| 2.641539317278e-04 53 KSP preconditioned resid norm 7.839459727391e-08 true resid norm 1.529073921421e-09 ||r(i)||/||b|| 2.226729783069e-04 54 KSP preconditioned resid norm 6.601381028591e-08 true resid norm 1.287552859854e-09 ||r(i)||/||b|| 1.875012228086e-04 55 KSP preconditioned resid norm 5.685480685634e-08 true resid norm 1.114093807824e-09 ||r(i)||/||b|| 1.622410681564e-04 56 KSP preconditioned resid norm 4.800842533085e-08 true resid norm 9.507812100719e-10 ||r(i)||/||b|| 1.384585014491e-04 57 KSP preconditioned resid norm 4.190158928552e-08 true resid norm 8.131929793017e-10 ||r(i)||/||b|| 1.184220724078e-04 58 KSP preconditioned resid norm 3.560756796435e-08 true resid norm 7.009747940059e-10 ||r(i)||/||b|| 1.020801825946e-04 59 KSP preconditioned resid norm 2.984048555207e-08 true resid norm 5.953346332861e-10 ||r(i)||/||b|| 8.669622444404e-05 60 KSP preconditioned resid norm 2.494120959879e-08 true resid norm 4.903954213055e-10 ||r(i)||/||b|| 7.141434268179e-05 61 KSP preconditioned resid norm 2.134788043904e-08 true resid norm 4.176570704622e-10 ||r(i)||/||b|| 6.082174477498e-05 62 KSP preconditioned resid norm 1.851222016394e-08 true resid norm 3.616113215178e-10 ||r(i)||/||b|| 5.266002436104e-05 63 KSP preconditioned resid norm 1.609839710515e-08 true resid norm 3.135835757558e-10 ||r(i)||/||b|| 4.566593399015e-05 64 KSP preconditioned resid norm 1.447252133793e-08 true resid norm 2.761923213472e-10 ||r(i)||/||b|| 4.022079372246e-05 65 KSP preconditioned resid norm 1.306766247453e-08 true resid norm 2.472413677732e-10 ||r(i)||/||b|| 3.600478103215e-05 66 KSP preconditioned resid norm 1.181313051589e-08 true resid norm 2.203404703898e-10 ||r(i)||/||b|| 3.208730990432e-05 67 KSP preconditioned resid norm 1.078588279393e-08 true resid norm 2.037242539109e-10 ||r(i)||/||b|| 
2.966755611758e-05 68 KSP preconditioned resid norm 1.016884166240e-08 true resid norm 1.924729805649e-10 ||r(i)||/||b|| 2.802907774802e-05 69 KSP preconditioned resid norm 9.109991617990e-09 true resid norm 1.745364221068e-10 ||r(i)||/||b|| 2.541704778892e-05 70 KSP preconditioned resid norm 8.114223072990e-09 true resid norm 1.518144931714e-10 ||r(i)||/||b|| 2.210814328270e-05 71 KSP preconditioned resid norm 7.281620157406e-09 true resid norm 1.377442037481e-10 ||r(i)||/||b|| 2.005914276832e-05 72 KSP preconditioned resid norm 6.653215203048e-09 true resid norm 1.270370139789e-10 ||r(i)||/||b|| 1.849989713486e-05 73 KSP preconditioned resid norm 5.874545495878e-09 true resid norm 1.119615699193e-10 ||r(i)||/||b|| 1.630451993235e-05 74 KSP preconditioned resid norm 4.977741499543e-09 true resid norm 9.548882079341e-11 ||r(i)||/||b|| 1.390565872794e-05 75 KSP preconditioned resid norm 4.279143491153e-09 true resid norm 8.090536469081e-11 ||r(i)||/||b|| 1.178192778277e-05 76 KSP preconditioned resid norm 3.677721856599e-09 true resid norm 7.037162927698e-11 ||r(i)||/||b|| 1.024794162002e-05 77 KSP preconditioned resid norm 3.217314721593e-09 true resid norm 6.070725407907e-11 ||r(i)||/||b|| 8.840556941849e-06 78 KSP preconditioned resid norm 2.801036133387e-09 true resid norm 5.215425647071e-11 ||r(i)||/||b|| 7.595017779730e-06 79 KSP preconditioned resid norm 2.555512332058e-09 true resid norm 4.802360533696e-11 ||r(i)||/||b|| 6.993487417192e-06 80 KSP preconditioned resid norm 2.317538043006e-09 true resid norm 4.386210071516e-11 ||r(i)||/||b|| 6.387463983405e-06 81 KSP preconditioned resid norm 2.105665248868e-09 true resid norm 4.035362896160e-11 ||r(i)||/||b|| 5.876539139468e-06 82 KSP preconditioned resid norm 1.886996607916e-09 true resid norm 3.625883378171e-11 ||r(i)||/||b|| 5.280230337461e-06 83 KSP preconditioned resid norm 1.701604764324e-09 true resid norm 3.263979795870e-11 ||r(i)||/||b|| 4.753204486048e-06 84 KSP preconditioned resid norm 1.531004109565e-09 true resid norm 2.891821875368e-11 ||r(i)||/||b|| 4.211245648103e-06 85 KSP preconditioned resid norm 1.402921200739e-09 true resid norm 2.640928112158e-11 ||r(i)||/||b|| 3.845878998983e-06 86 KSP preconditioned resid norm 1.300540578957e-09 true resid norm 2.457010439027e-11 ||r(i)||/||b|| 3.578046976831e-06 87 KSP preconditioned resid norm 1.169216480698e-09 true resid norm 2.160125853189e-11 ||r(i)||/||b|| 3.145705714477e-06 88 KSP preconditioned resid norm 1.025253545515e-09 true resid norm 1.915616025289e-11 ||r(i)||/||b|| 2.789635737473e-06 89 KSP preconditioned resid norm 8.703002774169e-10 true resid norm 1.624103875352e-11 ||r(i)||/||b|| 2.365118140713e-06 90 KSP preconditioned resid norm 7.414365312721e-10 true resid norm 1.377979625389e-11 ||r(i)||/||b|| 2.006697144808e-06 91 KSP preconditioned resid norm 6.067907628716e-10 true resid norm 1.128640105671e-11 ||r(i)||/||b|| 1.643593878920e-06 92 KSP preconditioned resid norm 5.265640442250e-10 true resid norm 9.804106873462e-12 ||r(i)||/||b|| 1.427733248582e-06 93 KSP preconditioned resid norm 4.493098902254e-10 true resid norm 8.475253995624e-12 ||r(i)||/||b|| 1.234217667750e-06 94 KSP preconditioned resid norm 3.859635925960e-10 true resid norm 7.312556023375e-12 ||r(i)||/||b|| 1.064898567656e-06 95 KSP preconditioned resid norm 3.295352644910e-10 true resid norm 6.140175306245e-12 ||r(i)||/||b|| 8.941694077794e-07 96 KSP preconditioned resid norm 2.800984335336e-10 true resid norm 5.312392798861e-12 ||r(i)||/||b|| 7.736227201881e-07 97 KSP preconditioned 
resid norm 2.376762502526e-10 true resid norm 4.600938310438e-12 ||r(i)||/||b|| 6.700164212070e-07 98 KSP preconditioned resid norm 2.001722319026e-10 true resid norm 3.909730209143e-12 ||r(i)||/||b|| 5.693585233847e-07 99 KSP preconditioned resid norm 1.681704481239e-10 true resid norm 3.266901496994e-12 ||r(i)||/||b|| 4.757459243663e-07 100 KSP preconditioned resid norm 1.412854480766e-10 true resid norm 2.721515130117e-12 ||r(i)||/||b|| 3.963234681076e-07 101 KSP preconditioned resid norm 1.195995080034e-10 true resid norm 2.365139805283e-12 ||r(i)||/||b|| 3.444259412031e-07 102 KSP preconditioned resid norm 1.025386072253e-10 true resid norm 2.000363767666e-12 ||r(i)||/||b|| 2.913050517724e-07 103 KSP preconditioned resid norm 8.879174942592e-11 true resid norm 1.741125432604e-12 ||r(i)||/||b|| 2.535531999156e-07 104 KSP preconditioned resid norm 7.713887817627e-11 true resid norm 1.479997884682e-12 ||r(i)||/||b|| 2.155262294734e-07 105 KSP preconditioned resid norm 6.847651519322e-11 true resid norm 1.317894935459e-12 ||r(i)||/||b|| 1.919198190898e-07 106 KSP preconditioned resid norm 6.203905531932e-11 true resid norm 1.194271139214e-12 ||r(i)||/||b|| 1.739169753334e-07 107 KSP preconditioned resid norm 5.594839599801e-11 true resid norm 1.090203856383e-12 ||r(i)||/||b|| 1.587620691594e-07 108 KSP preconditioned resid norm 5.037334613468e-11 true resid norm 9.690174925613e-13 ||r(i)||/||b|| 1.411141790317e-07 109 KSP preconditioned resid norm 4.503126411310e-11 true resid norm 8.834186391084e-13 ||r(i)||/||b|| 1.286487570720e-07 110 KSP preconditioned resid norm 4.028080915007e-11 true resid norm 7.848357552337e-13 ||r(i)||/||b|| 1.142925221936e-07 111 KSP preconditioned resid norm 3.799844646046e-11 true resid norm 7.301397670085e-13 ||r(i)||/||b|| 1.063273620866e-07 112 KSP preconditioned resid norm 3.624871136275e-11 true resid norm 7.087010471957e-13 ||r(i)||/||b|| 1.032053262419e-07 113 KSP preconditioned resid norm 3.389723660711e-11 true resid norm 6.520789767051e-13 ||r(i)||/||b|| 9.495967840409e-08 Linear solve converged due to CONVERGED_RTOL iterations 113 iter = 18, Function value: -4.34176, Residual: 1.93571e-06 0 KSP preconditioned resid norm 4.508502261780e-05 true resid norm 1.071143234017e-06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.069076165093e-05 true resid norm 3.089684572391e-07 ||r(i)||/||b|| 2.884473779294e-01 2 KSP preconditioned resid norm 5.497023079267e-06 true resid norm 1.449735850853e-07 ||r(i)||/||b|| 1.353447237318e-01 3 KSP preconditioned resid norm 3.205788707464e-06 true resid norm 7.443224954375e-08 ||r(i)||/||b|| 6.948860542637e-02 4 KSP preconditioned resid norm 2.012004018387e-06 true resid norm 4.147254560650e-08 ||r(i)||/||b|| 3.871802041914e-02 5 KSP preconditioned resid norm 1.370146586025e-06 true resid norm 2.847571217488e-08 ||r(i)||/||b|| 2.658441118849e-02 6 KSP preconditioned resid norm 1.103918072549e-06 true resid norm 2.283957626182e-08 ||r(i)||/||b|| 2.132261637518e-02 7 KSP preconditioned resid norm 8.423418496349e-07 true resid norm 1.642838763832e-08 ||r(i)||/||b|| 1.533724633325e-02 8 KSP preconditioned resid norm 6.460765607550e-07 true resid norm 1.371129397637e-08 ||r(i)||/||b|| 1.280061670646e-02 9 KSP preconditioned resid norm 5.104032388305e-07 true resid norm 1.026992142474e-08 ||r(i)||/||b|| 9.587813374151e-03 10 KSP preconditioned resid norm 4.364583451738e-07 true resid norm 9.171420557624e-09 ||r(i)||/||b|| 8.562272781417e-03 11 KSP preconditioned resid norm 3.639022967259e-07 true resid 
norm 7.624383911178e-09 ||r(i)||/||b|| 7.117987276626e-03 12 KSP preconditioned resid norm 3.054320429023e-07 true resid norm 6.108076795150e-09 ||r(i)||/||b|| 5.702390307077e-03 13 KSP preconditioned resid norm 2.720334214801e-07 true resid norm 5.461969871945e-09 ||r(i)||/||b|| 5.099196539253e-03 14 KSP preconditioned resid norm 2.294694535830e-07 true resid norm 4.591959682475e-09 ||r(i)||/||b|| 4.286970721229e-03 15 KSP preconditioned resid norm 2.003072158320e-07 true resid norm 3.827623819504e-09 ||r(i)||/||b|| 3.573400548075e-03 16 KSP preconditioned resid norm 1.764850308967e-07 true resid norm 3.345473181437e-09 ||r(i)||/||b|| 3.123273410307e-03 17 KSP preconditioned resid norm 1.624102863325e-07 true resid norm 2.986247918685e-09 ||r(i)||/||b|| 2.787907185377e-03 18 KSP preconditioned resid norm 1.512289664610e-07 true resid norm 2.926414942528e-09 ||r(i)||/||b|| 2.732048198216e-03 19 KSP preconditioned resid norm 1.304990287691e-07 true resid norm 2.401335782675e-09 ||r(i)||/||b|| 2.241843766934e-03 20 KSP preconditioned resid norm 1.123327840984e-07 true resid norm 2.046673736526e-09 ||r(i)||/||b|| 1.910737678704e-03 21 KSP preconditioned resid norm 9.871204336004e-08 true resid norm 1.746520697062e-09 ||r(i)||/||b|| 1.630520215781e-03 22 KSP preconditioned resid norm 9.050642850270e-08 true resid norm 1.615821848890e-09 ||r(i)||/||b|| 1.508502128917e-03 23 KSP preconditioned resid norm 8.253529795981e-08 true resid norm 1.474775485537e-09 ||r(i)||/||b|| 1.376823788548e-03 24 KSP preconditioned resid norm 7.766901010769e-08 true resid norm 1.438482277688e-09 ||r(i)||/||b|| 1.342941104425e-03 25 KSP preconditioned resid norm 7.383790753378e-08 true resid norm 1.350419162407e-09 ||r(i)||/||b|| 1.260726968645e-03 26 KSP preconditioned resid norm 6.792424170787e-08 true resid norm 1.275048201039e-09 ||r(i)||/||b|| 1.190361998793e-03 27 KSP preconditioned resid norm 6.283855135356e-08 true resid norm 1.164017041623e-09 ||r(i)||/||b|| 1.086705311350e-03 28 KSP preconditioned resid norm 5.792741282609e-08 true resid norm 1.103086858381e-09 ||r(i)||/||b|| 1.029821991448e-03 29 KSP preconditioned resid norm 5.156397921132e-08 true resid norm 9.664289780117e-10 ||r(i)||/||b|| 9.022406596242e-04 30 KSP preconditioned resid norm 4.527404622723e-08 true resid norm 8.331839991615e-10 ||r(i)||/||b|| 7.778455510910e-04 31 KSP preconditioned resid norm 3.859475127142e-08 true resid norm 7.301223842779e-10 ||r(i)||/||b|| 6.816290866532e-04 32 KSP preconditioned resid norm 3.446748010393e-08 true resid norm 6.390767742056e-10 ||r(i)||/||b|| 5.966305475401e-04 33 KSP preconditioned resid norm 2.990657825706e-08 true resid norm 5.845742031679e-10 ||r(i)||/||b|| 5.457479304383e-04 34 KSP preconditioned resid norm 2.627687512902e-08 true resid norm 4.983627876452e-10 ||r(i)||/||b|| 4.652625081485e-04 35 KSP preconditioned resid norm 2.297563962610e-08 true resid norm 4.475415161696e-10 ||r(i)||/||b|| 4.178166859077e-04 36 KSP preconditioned resid norm 2.023593248562e-08 true resid norm 3.927293520099e-10 ||r(i)||/||b|| 3.666450382522e-04 37 KSP preconditioned resid norm 1.763442730162e-08 true resid norm 3.428839297666e-10 ||r(i)||/||b|| 3.201102512506e-04 38 KSP preconditioned resid norm 1.556013127375e-08 true resid norm 2.962767563452e-10 ||r(i)||/||b|| 2.765986349314e-04 39 KSP preconditioned resid norm 1.357598956918e-08 true resid norm 2.594680253075e-10 ||r(i)||/||b|| 2.422346676592e-04 40 KSP preconditioned resid norm 1.180073324344e-08 true resid norm 2.229475554566e-10 ||r(i)||/||b|| 
2.081398158307e-04 41 KSP preconditioned resid norm 1.030135397927e-08 true resid norm 1.927210980411e-10 ||r(i)||/||b|| 1.799209404687e-04 42 KSP preconditioned resid norm 9.252148038335e-09 true resid norm 1.747396606405e-10 ||r(i)||/||b|| 1.631337948942e-04 43 KSP preconditioned resid norm 8.223041863942e-09 true resid norm 1.565452449710e-10 ||r(i)||/||b|| 1.461478166500e-04 44 KSP preconditioned resid norm 7.380942032235e-09 true resid norm 1.416503542115e-10 ||r(i)||/||b|| 1.322422153387e-04 45 KSP preconditioned resid norm 6.453831650211e-09 true resid norm 1.241068825442e-10 ||r(i)||/||b|| 1.158639466720e-04 46 KSP preconditioned resid norm 5.688126650324e-09 true resid norm 1.106855151342e-10 ||r(i)||/||b|| 1.033340001776e-04 47 KSP preconditioned resid norm 5.057302464313e-09 true resid norm 9.820289452367e-11 ||r(i)||/||b|| 9.168045076041e-05 48 KSP preconditioned resid norm 4.500799098525e-09 true resid norm 8.852961473294e-11 ||r(i)||/||b|| 8.264965125245e-05 49 KSP preconditioned resid norm 3.983769407394e-09 true resid norm 7.789147315022e-11 ||r(i)||/||b|| 7.271807418145e-05 50 KSP preconditioned resid norm 3.420304235712e-09 true resid norm 6.615770981665e-11 ||r(i)||/||b|| 6.176364440873e-05 51 KSP preconditioned resid norm 2.969395787357e-09 true resid norm 5.775608928511e-11 ||r(i)||/||b|| 5.392004304457e-05 52 KSP preconditioned resid norm 2.583265801170e-09 true resid norm 4.936215282441e-11 ||r(i)||/||b|| 4.608361539034e-05 53 KSP preconditioned resid norm 2.303172156518e-09 true resid norm 4.555830632203e-11 ||r(i)||/||b|| 4.253241291659e-05 54 KSP preconditioned resid norm 2.004038570012e-09 true resid norm 3.883153249403e-11 ||r(i)||/||b|| 3.625241822086e-05 55 KSP preconditioned resid norm 1.758746167534e-09 true resid norm 3.394179710909e-11 ||r(i)||/||b|| 3.168744947563e-05 56 KSP preconditioned resid norm 1.527568393189e-09 true resid norm 3.011417594376e-11 ||r(i)||/||b|| 2.811405140545e-05 57 KSP preconditioned resid norm 1.351270602655e-09 true resid norm 2.646768367869e-11 ||r(i)||/||b|| 2.470975200900e-05 58 KSP preconditioned resid norm 1.188580433509e-09 true resid norm 2.310607643585e-11 ||r(i)||/||b|| 2.157141613003e-05 59 KSP preconditioned resid norm 1.049772725898e-09 true resid norm 2.055122733712e-11 ||r(i)||/||b|| 1.918625510059e-05 60 KSP preconditioned resid norm 9.178135338494e-10 true resid norm 1.789537479201e-11 ||r(i)||/||b|| 1.670679907569e-05 61 KSP preconditioned resid norm 8.172466131650e-10 true resid norm 1.584599056440e-11 ||r(i)||/||b|| 1.479353093141e-05 62 KSP preconditioned resid norm 7.373368729936e-10 true resid norm 1.425470170182e-11 ||r(i)||/||b|| 1.330793235593e-05 63 KSP preconditioned resid norm 6.449901730940e-10 true resid norm 1.230744306740e-11 ||r(i)||/||b|| 1.149000682313e-05 64 KSP preconditioned resid norm 5.598483514662e-10 true resid norm 1.086956128883e-11 ||r(i)||/||b|| 1.014762633384e-05 65 KSP preconditioned resid norm 4.963094200575e-10 true resid norm 9.397813144522e-12 ||r(i)||/||b|| 8.773628816453e-06 66 KSP preconditioned resid norm 4.366969072145e-10 true resid norm 8.480411965936e-12 ||r(i)||/||b|| 7.917159625917e-06 67 KSP preconditioned resid norm 3.778573793786e-10 true resid norm 7.258854529176e-12 ||r(i)||/||b|| 6.776735639689e-06 68 KSP preconditioned resid norm 3.370904464953e-10 true resid norm 6.503542250737e-12 ||r(i)||/||b|| 6.071589722269e-06 69 KSP preconditioned resid norm 2.982699444561e-10 true resid norm 5.688313388577e-12 ||r(i)||/||b|| 5.310506763174e-06 70 KSP preconditioned 
resid norm 2.675651763666e-10 true resid norm 5.136077412575e-12 ||r(i)||/||b|| 4.794949218241e-06 71 KSP preconditioned resid norm 2.393278506019e-10 true resid norm 4.498351645872e-12 ||r(i)||/||b|| 4.199579946933e-06 72 KSP preconditioned resid norm 2.159108701621e-10 true resid norm 4.102354584836e-12 ||r(i)||/||b|| 3.829884234483e-06 73 KSP preconditioned resid norm 1.976346514770e-10 true resid norm 3.794817346865e-12 ||r(i)||/||b|| 3.542773017044e-06 74 KSP preconditioned resid norm 1.730082030043e-10 true resid norm 3.329374283677e-12 ||r(i)||/||b|| 3.108243769781e-06 75 KSP preconditioned resid norm 1.475368383202e-10 true resid norm 2.822421740838e-12 ||r(i)||/||b|| 2.634962021142e-06 76 KSP preconditioned resid norm 1.241796283012e-10 true resid norm 2.356348835514e-12 ||r(i)||/||b|| 2.199844764622e-06 77 KSP preconditioned resid norm 1.051910287558e-10 true resid norm 2.018639880815e-12 ||r(i)||/||b|| 1.884565776740e-06 78 KSP preconditioned resid norm 8.962988346101e-11 true resid norm 1.708872881925e-12 ||r(i)||/||b|| 1.595372894730e-06 79 KSP preconditioned resid norm 7.570084976402e-11 true resid norm 1.429761782062e-12 ||r(i)||/||b|| 1.334799807025e-06 80 KSP preconditioned resid norm 6.683347346922e-11 true resid norm 1.274890468348e-12 ||r(i)||/||b|| 1.190214742399e-06 81 KSP preconditioned resid norm 5.997149607838e-11 true resid norm 1.154991045391e-12 ||r(i)||/||b|| 1.078278804096e-06 82 KSP preconditioned resid norm 5.423985443710e-11 true resid norm 1.040041627308e-12 ||r(i)||/||b|| 9.709641010446e-07 83 KSP preconditioned resid norm 4.894792603260e-11 true resid norm 9.522002485135e-13 ||r(i)||/||b|| 8.889569744491e-07 84 KSP preconditioned resid norm 4.492690279298e-11 true resid norm 8.792462658391e-13 ||r(i)||/||b|| 8.208484523041e-07 85 KSP preconditioned resid norm 4.061630539729e-11 true resid norm 8.018573862948e-13 ||r(i)||/||b|| 7.485995904465e-07 86 KSP preconditioned resid norm 3.767320471887e-11 true resid norm 7.271396081286e-13 ||r(i)||/||b|| 6.788444206491e-07 87 KSP preconditioned resid norm 3.530363419777e-11 true resid norm 6.782807966740e-13 ||r(i)||/||b|| 6.332307156814e-07 88 KSP preconditioned resid norm 3.335188852670e-11 true resid norm 6.399306643496e-13 ||r(i)||/||b|| 5.974277239745e-07 89 KSP preconditioned resid norm 3.101008041056e-11 true resid norm 6.004937283362e-13 ||r(i)||/||b|| 5.606101119494e-07 90 KSP preconditioned resid norm 2.826503485799e-11 true resid norm 5.505976680085e-13 ||r(i)||/||b|| 5.140280501456e-07 91 KSP preconditioned resid norm 2.451903803277e-11 true resid norm 4.785428397528e-13 ||r(i)||/||b|| 4.467589623454e-07 92 KSP preconditioned resid norm 2.097479072432e-11 true resid norm 4.041277788586e-13 ||r(i)||/||b|| 3.772864039321e-07 93 KSP preconditioned resid norm 1.767596735270e-11 true resid norm 3.470813024880e-13 ||r(i)||/||b|| 3.240288427033e-07 94 KSP preconditioned resid norm 1.530390760261e-11 true resid norm 2.971278260534e-13 ||r(i)||/||b|| 2.773931782578e-07 95 KSP preconditioned resid norm 1.305001317805e-11 true resid norm 2.540140788993e-13 ||r(i)||/||b|| 2.371429616809e-07 96 KSP preconditioned resid norm 1.085438720889e-11 true resid norm 2.119323936237e-13 ||r(i)||/||b|| 1.978562594555e-07 97 KSP preconditioned resid norm 9.141824794601e-12 true resid norm 1.787874371738e-13 ||r(i)||/||b|| 1.669127260444e-07 98 KSP preconditioned resid norm 7.641790221189e-12 true resid norm 1.506813342331e-13 ||r(i)||/||b|| 1.406733753692e-07 99 KSP preconditioned resid norm 6.453763919751e-12 true resid norm 
1.252091856570e-13 ||r(i)||/||b|| 1.168930369726e-07 100 KSP preconditioned resid norm 5.352698890327e-12 true resid norm 1.034643039009e-13 ||r(i)||/||b|| 9.659240763987e-08 101 KSP preconditioned resid norm 4.468634575064e-12 true resid norm 8.698708890092e-14 ||r(i)||/||b|| 8.120957696263e-08 Linear solve converged due to CONVERGED_RTOL iterations 101 iter = 19, Function value: -4.34176, Residual: 1.15202e-07 Tao Object: 1 MPI processes type: tron Total PG its: 57, PG tolerance: 0.001 TaoLineSearch Object: 1 MPI processes type: more-thuente KSP Object: 1 MPI processes type: cg total KSP iterations: 1920 Active Set subset type: subvec convergence tolerances: gatol=1e-12, steptol=1e-12, gttol=0. Residual in Function/Gradient:=1.15202e-07 Objective value=-4.34176 total number of iterations=19, (max: 50) total number of function/gradient evaluations=78, (max: 10000) total number of Hessian evaluations=19 Solution converged: ||g(X)||/|f(X)| <= grtol TAO solve converged due to CONVERGED_GRTOL iterations 19 it: 1 2.245113e+01 1920 8.441264e+08 ========================================== Time summary: ========================================== Creating DMPlex: 0.520237 Distributing DMPlex: 1.97887e-05 Refining DMPlex: 2.10413 Setting up problem: 2.86735 Overall analysis time: 22.5795 Overall FLOPS/s: 7.5135e+08 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./main on a arch-darwin-c-opt named Justins-MacBook-Pro-2.local with 1 processor, by justin Tue Mar 8 00:51:47 2016 Using Petsc Development GIT revision: pre-tsfc-438-gf0bfc80 GIT Date: 2016-03-01 11:52:01 -0600 Max Max/Min Avg Total Time (sec): 2.811e+01 1.00000 2.811e+01 Objects: 4.611e+03 1.00000 4.611e+03 Flops: 1.902e+10 1.00000 1.902e+10 1.902e+10 Flops/sec: 6.766e+08 1.00000 6.766e+08 6.766e+08 MPI Messages: 5.500e+00 1.00000 5.500e+00 5.500e+00 MPI Message Lengths: 4.494e+06 1.00000 8.170e+05 4.494e+06 MPI Reductions: 1.000e+00 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.8115e+01 100.0% 1.9023e+10 100.0% 5.500e+00 100.0% 8.170e+05 100.0% 1.000e+00 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage CreateMesh 79 1.0 2.9209e+00 1.0 3.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 10 2 0 0 0 10 2 0 0 0 113 BuildTwoSided 5 1.0 5.2476e-04 1.0 0.00e+00 0.0 5.0e-01 4.0e+00 0.0e+00 0 0 9 0 0 0 0 9 0 0 0 VecView 1 1.0 1.2794e-02 1.0 4.02e+05 1.0 1.0e+00 1.0e+06 0.0e+00 0 0 18 22 0 0 0 18 22 0 31 VecDot 424 1.0 5.8249e-02 1.0 1.06e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1817 VecTDot 3840 1.0 4.5666e-01 1.0 8.13e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 4 0 0 0 2 4 0 0 0 1779 VecNorm 3936 1.0 4.1154e-01 1.0 8.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 2027 VecScale 228 1.0 1.6403e-02 1.0 2.81e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1713 VecCopy 2358 1.0 2.1371e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecSet 6036 1.0 3.6855e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 4014 1.0 4.1987e-01 1.0 8.56e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 2039 VecAYPX 3859 1.0 4.1066e-01 1.0 6.12e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 3 0 0 0 1491 VecWAXPY 1 1.0 2.1100e-04 1.0 1.25e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 VecScatterBegin 38 1.0 4.5440e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMult 3957 1.0 1.0226e+01 1.0 1.03e+10 1.0 0.0e+00 0.0e+00 0.0e+00 36 54 0 0 0 36 54 0 0 0 1011 MatSolve 1939 1.0 5.8253e+00 1.0 5.02e+09 1.0 0.0e+00 0.0e+00 0.0e+00 21 26 0 0 0 21 26 0 0 0 863 MatLUFactorNum 19 1.0 5.6000e-01 1.0 2.13e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 380 MatILUFactorSym 19 1.0 1.2467e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 21 1.0 9.0599e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 21 1.0 6.1831e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 19 1.0 5.9605e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 19 1.0 4.8048e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatGetOrdering 19 1.0 5.6238e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 1 1.0 9.4700e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexInterp 3 1.0 4.7024e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 DMPlexStratify 11 1.0 6.3541e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 DMPlexPrealloc 1 1.0 1.0536e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 DMPlexResidualFE 1 1.0 6.2948e-01 1.0 4.01e+07 1.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 64 DMPlexJacobianFE 1 1.0 1.8430e+00 1.0 8.14e+07 1.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 7 0 0 0 0 44 SFSetGraph 6 1.0 2.0661e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 9 1.0 2.5048e-03 1.0 0.00e+00 0.0 4.5e+00 7.8e+05 0.0e+00 0 0 82 78 0 0 0 82 78 0 0 SFBcastEnd 
9 1.0 7.1669e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFReduceBegin 1 1.0 3.6120e-04 1.0 0.00e+00 0.0 1.0e+00 1.0e+06 0.0e+00 0 0 18 22 0 0 0 18 22 0 0 SFReduceEnd 1 1.0 3.8004e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SNESFunctionEval 1 1.0 6.3624e-01 1.0 4.01e+07 1.0 2.0e+00 1.0e+06 0.0e+00 2 0 36 44 0 2 0 36 44 0 63 SNESJacobianEval 1 1.0 1.8445e+00 1.0 8.14e+07 1.0 2.5e+00 6.0e+05 0.0e+00 7 0 45 33 0 7 0 45 33 0 44 KSPSetUp 38 1.0 4.6651e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 19 1.0 1.8689e+01 1.0 1.83e+10 1.0 0.0e+00 0.0e+00 0.0e+00 66 96 0 0 0 66 96 0 0 0 979 PCSetUp 38 1.0 6.9092e-01 1.0 2.13e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 308 PCSetUpOnBlocks 19 1.0 6.9078e-01 1.0 2.13e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 308 PCApply 1939 1.0 5.9339e+00 1.0 5.02e+09 1.0 0.0e+00 0.0e+00 0.0e+00 21 26 0 0 0 21 26 0 0 0 847 TaoSolve 1 1.0 1.9964e+01 1.0 1.88e+10 1.0 0.0e+00 0.0e+00 0.0e+00 71 99 0 0 0 71 99 0 0 0 943 TaoHessianEval 19 1.0 4.5300e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 TaoLineSearchApply 76 1.0 5.3617e-01 1.0 4.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 787 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 4 3 2432 0. Object 7 7 4088 0. Container 15 15 8640 0. Vector 4074 4074 3615142824 0. Vector Scatter 38 38 25232 0. Matrix 39 39 699796976 0. Distributed Mesh 29 29 135576 0. GraphPartitioner 11 11 6732 0. Star Forest Bipartite Graph 62 62 50488 0. Discrete System 29 29 25056 0. Index Set 229 229 33961616 0. IS L to G Mapping 1 1 561392 0. Section 56 54 36288 0. SNES 1 1 1340 0. SNESLineSearch 1 1 1000 0. DMSNES 1 1 672 0. Krylov Solver 3 3 3680 0. Preconditioner 3 3 2824 0. Linear Space 2 2 1296 0. Dual Space 2 2 1328 0. FE Space 2 2 1512 0. Tao 1 1 1944 0. TaoLineSearch 1 1 888 0. 
======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 #PETSc Option Table entries: -al 1 -am 0 -at 0.001 -bcloc 0,1,0,1,0,0,0,1,0,1,1,1,0,0,0,1,0,1,1,1,0,1,0,1,0,1,0,0,0,1,0,1,1,1,0,1,0.45,0.55,0.45,0.55,0.45,0.55 -bcnum 7 -bcval 0,0,0,0,0,0,1 -dim 3 -dm_refine 1 -dt 0.001 -edges 3,3 -floc 0.25,0.75,0.25,0.75,0.25,0.75 -fnum 0 -ftime 0,99 -fval 1 -ksp_atol 1.0e-12 -ksp_converged_reason -ksp_monitor_true_residual -ksp_rtol 1.0e-7 -ksp_type cg -log_view -lower 0,0 -mat_petscspace_order 0 -mesh datafiles/cube_with_hole2_mesh.dat -mu 1 -nonneg 1 -numsteps 0 -options_left 0 -pc_type bjacobi -petscpartitioner_type parmetis -progress 0 -simplex 1 -solution_petscspace_order 1 -tao_converged_reason -tao_gatol 1e-12 -tao_grtol 1e-7 -tao_monitor -tao_type tron -tao_view -trans datafiles/cube_with_hole2_trans.dat -upper 1,1 -vtuname figures/cube_with_hole_2 -vtuprint 1 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-chaco --download-ctetgen --download-exodusii --download-fblaslapack --download-hdf5 --download-hypre --download-metis --download-ml --download-mumps --download-netcdf --download-parmetis --download-scalapack --download-superlu_dist --download-triangle --with-debugging=0 --with-mpi-dir=/usr/local/opt/open-mpi --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 PETSC_ARCH=arch-darwin-c-opt ----------------------------------------- Libraries compiled on Tue Mar 1 13:44:59 2016 on Justins-MacBook-Pro-2.local Machine characteristics: Darwin-15.3.0-x86_64-i386-64bit Using PETSc directory: /Users/justin/Software/petsc Using PETSc arch: arch-darwin-c-opt ----------------------------------------- Using C compiler: /usr/local/opt/open-mpi/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -O2 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/local/opt/open-mpi/bin/mpif90 -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O2 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/Users/justin/Software/petsc/arch-darwin-c-opt/include -I/Users/justin/Software/petsc/include -I/Users/justin/Software/petsc/include -I/Users/justin/Software/petsc/arch-darwin-c-opt/include -I/opt/X11/include -I/usr/local/opt/open-mpi/include -I/usr/local/Cellar/open-mpi/1.10.2/include ----------------------------------------- Using C linker: /usr/local/opt/open-mpi/bin/mpicc Using Fortran linker: /usr/local/opt/open-mpi/bin/mpif90 Using libraries: -Wl,-rpath,/Users/justin/Software/petsc/arch-darwin-c-opt/lib -L/Users/justin/Software/petsc/arch-darwin-c-opt/lib -lpetsc -Wl,-rpath,/Users/justin/Software/petsc/arch-darwin-c-opt/lib -L/Users/justin/Software/petsc/arch-darwin-c-opt/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lHYPRE -Wl,-rpath,/usr/local/opt/libevent/lib -L/usr/local/opt/libevent/lib -Wl,-rpath,/usr/local/Cellar/open-mpi/1.10.2/lib -L/usr/local/Cellar/open-mpi/1.10.2/lib -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.2/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.2/lib/darwin -lclang_rt.osx -lmpi_cxx -lc++ 
-Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -lclang_rt.osx -lscalapack -lml -lclang_rt.osx -lmpi_cxx -lc++ -lclang_rt.osx -lflapack -lfblas -ltriangle -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lctetgen -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lgfortran -Wl,-rpath,/usr/local/Cellar/gcc/5.3.0/lib/gcc/5/gcc/x86_64-apple-darwin15.0.0/5.3.0 -L/usr/local/Cellar/gcc/5.3.0/lib/gcc/5/gcc/x86_64-apple-darwin15.0.0/5.3.0 -Wl,-rpath,/usr/local/Cellar/gcc/5.3.0/lib/gcc/5 -L/usr/local/Cellar/gcc/5.3.0/lib/gcc/5 -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpi_cxx -lc++ -lclang_rt.osx -Wl,-rpath,/usr/local/opt/libevent/lib -L/usr/local/opt/libevent/lib -Wl,-rpath,/usr/local/Cellar/open-mpi/1.10.2/lib -L/usr/local/Cellar/open-mpi/1.10.2/lib -ldl -lmpi -lSystem -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/lib/darwin -lclang_rt.osx -ldl ----------------------------------------- From davydden at gmail.com Tue Mar 8 04:28:14 2016 From: davydden at gmail.com (Denis Davydov) Date: Tue, 8 Mar 2016 11:28:14 +0100 Subject: [petsc-users] [SLEPc] non-deterministic behaviour in GHEP with Krylov-Schur Message-ID: <5E107D05-7969-457A-A02E-D00B9E0569F6@gmail.com> Dear all, I have some issues with Krylov-Schur applied to GHEP, namely, that different runs on the same machine with the same number of MPI cores gives different eigenvectors results. Here is an example: mass.InfNorm =15.625 stiff.InfNorm=726.549 eigenfunction[0].linf=0.459089 eigenfunction[1].linf=0.318075 eigenfunction[2].linf=0.326199 eigenfunction[3].linf=0.312521 eigenfunction[4].linf=0.271712 eigenfunction[5].linf=0.280744 eigenfunction[6].linf=0.315654 eigenfunction[7].linf=0.192715 eigenfunction[8].linf=0.194826 vs mass.InfNorm =15.625 stiff.InfNorm=726.549 eigenfunction[0].linf=0.459089 eigenfunction[1].linf=0.329682 eigenfunction[2].linf=0.326199 eigenfunction[3].linf=0.325289 eigenfunction[4].linf=0.284252 eigenfunction[5].linf=0.263418 eigenfunction[6].linf=0.315756 eigenfunction[7].linf=0.194826 eigenfunction[8].linf=0.193074 Eigensolver tolerance is absolute and 1e-20. So it?s a bit surprising that there is a quite a big variation in L-inf norm of eigenvectors (0.318075 vs 0.329682). In either case, the biggest issue is non-deterministic behaviour. Is there anything I am missing in Krylov-Schur to make its behaviour as deterministic as possible? p.s. shift-and-invert is done with LU from MUMPS. Shift value is lower than the exact lowest eigenvalue. Kind regards, Denis From jroman at dsic.upv.es Tue Mar 8 04:39:41 2016 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Tue, 8 Mar 2016 11:39:41 +0100 Subject: [petsc-users] [SLEPc] non-deterministic behaviour in GHEP with Krylov-Schur In-Reply-To: <5E107D05-7969-457A-A02E-D00B9E0569F6@gmail.com> References: <5E107D05-7969-457A-A02E-D00B9E0569F6@gmail.com> Message-ID: <818D0818-068B-4472-B42D-7564C68127EE@dsic.upv.es> > El 8 mar 2016, a las 11:28, Denis Davydov escribi?: > > Dear all, > > I have some issues with Krylov-Schur applied to GHEP, namely, that different runs on the same machine with the > same number of MPI cores gives different eigenvectors results. > Here is an example: > > mass.InfNorm =15.625 > stiff.InfNorm=726.549 > eigenfunction[0].linf=0.459089 > eigenfunction[1].linf=0.318075 > eigenfunction[2].linf=0.326199 > eigenfunction[3].linf=0.312521 > eigenfunction[4].linf=0.271712 > eigenfunction[5].linf=0.280744 > eigenfunction[6].linf=0.315654 > eigenfunction[7].linf=0.192715 > eigenfunction[8].linf=0.194826 > > vs > > mass.InfNorm =15.625 > stiff.InfNorm=726.549 > eigenfunction[0].linf=0.459089 > eigenfunction[1].linf=0.329682 > eigenfunction[2].linf=0.326199 > eigenfunction[3].linf=0.325289 > eigenfunction[4].linf=0.284252 > eigenfunction[5].linf=0.263418 > eigenfunction[6].linf=0.315756 > eigenfunction[7].linf=0.194826 > eigenfunction[8].linf=0.193074 > > Eigensolver tolerance is absolute and 1e-20. So it?s a bit surprising that there is a quite a big variation in L-inf norm of eigenvectors (0.318075 vs 0.329682). > In either case, the biggest issue is non-deterministic behaviour. > Is there anything I am missing in Krylov-Schur to make its behaviour as deterministic as possible? > > p.s. shift-and-invert is done with LU from MUMPS. Shift value is lower than the exact lowest eigenvalue. > > Kind regards, > Denis > Which are the eigenvalues? From davydden at gmail.com Tue Mar 8 05:29:15 2016 From: davydden at gmail.com (Denis Davydov) Date: Tue, 8 Mar 2016 12:29:15 +0100 Subject: [petsc-users] [SLEPc] non-deterministic behaviour in GHEP with Krylov-Schur In-Reply-To: <818D0818-068B-4472-B42D-7564C68127EE@dsic.upv.es> References: <5E107D05-7969-457A-A02E-D00B9E0569F6@gmail.com> <818D0818-068B-4472-B42D-7564C68127EE@dsic.upv.es> Message-ID: <0256F98A-419B-4A6B-88BE-CAD8BDA7E945@gmail.com> > On 8 Mar 2016, at 11:39, Jose E. Roman wrote: > > >> El 8 mar 2016, a las 11:28, Denis Davydov escribi?: >> >> Dear all, >> >> I have some issues with Krylov-Schur applied to GHEP, namely, that different runs on the same machine with the >> same number of MPI cores gives different eigenvectors results. >> Here is an example: >> >> mass.InfNorm =15.625 >> stiff.InfNorm=726.549 >> eigenfunction[0].linf=0.459089 >> eigenfunction[1].linf=0.318075 >> eigenfunction[2].linf=0.326199 >> eigenfunction[3].linf=0.312521 >> eigenfunction[4].linf=0.271712 >> eigenfunction[5].linf=0.280744 >> eigenfunction[6].linf=0.315654 >> eigenfunction[7].linf=0.192715 >> eigenfunction[8].linf=0.194826 >> >> vs >> >> mass.InfNorm =15.625 >> stiff.InfNorm=726.549 >> eigenfunction[0].linf=0.459089 >> eigenfunction[1].linf=0.329682 >> eigenfunction[2].linf=0.326199 >> eigenfunction[3].linf=0.325289 >> eigenfunction[4].linf=0.284252 >> eigenfunction[5].linf=0.263418 >> eigenfunction[6].linf=0.315756 >> eigenfunction[7].linf=0.194826 >> eigenfunction[8].linf=0.193074 >> >> Eigensolver tolerance is absolute and 1e-20. So it?s a bit surprising that there is a quite a big variation in L-inf norm of eigenvectors (0.318075 vs 0.329682). >> In either case, the biggest issue is non-deterministic behaviour. 
>> Is there anything I am missing in Krylov-Schur to make its behaviour as deterministic as possible? >> >> p.s. shift-and-invert is done with LU from MUMPS. Shift value is lower than the exact lowest eigenvalue. >> >> Kind regards, >> Denis >> > > Which are the eigenvalues? > The eigenvalues are the same for the two runs up to the output precision: Eigenvalue 0 : 1.65635 Eigenvalue 1 : 4.71385 Eigenvalue 2 : 4.71385 Eigenvalue 3 : 4.71385 Eigenvalue 4 : 5.1974 Eigenvalue 5 : 5.1974 Eigenvalue 6 : 5.1974 Eigenvalue 7 : 7.77136 Eigenvalue 8 : 7.77136 From jroman at dsic.upv.es Tue Mar 8 05:38:44 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 8 Mar 2016 12:38:44 +0100 Subject: [petsc-users] [SLEPc] non-deterministic behaviour in GHEP with Krylov-Schur In-Reply-To: <0256F98A-419B-4A6B-88BE-CAD8BDA7E945@gmail.com> References: <5E107D05-7969-457A-A02E-D00B9E0569F6@gmail.com> <818D0818-068B-4472-B42D-7564C68127EE@dsic.upv.es> <0256F98A-419B-4A6B-88BE-CAD8BDA7E945@gmail.com> Message-ID: <1A813ACF-731E-46BC-9C8A-4379B8CD9707@dsic.upv.es> > El 8 mar 2016, a las 12:29, Denis Davydov escribi?: > > >> On 8 Mar 2016, at 11:39, Jose E. Roman wrote: >> >> >>> El 8 mar 2016, a las 11:28, Denis Davydov escribi?: >>> >>> Dear all, >>> >>> I have some issues with Krylov-Schur applied to GHEP, namely, that different runs on the same machine with the >>> same number of MPI cores gives different eigenvectors results. >>> Here is an example: >>> >>> mass.InfNorm =15.625 >>> stiff.InfNorm=726.549 >>> eigenfunction[0].linf=0.459089 >>> eigenfunction[1].linf=0.318075 >>> eigenfunction[2].linf=0.326199 >>> eigenfunction[3].linf=0.312521 >>> eigenfunction[4].linf=0.271712 >>> eigenfunction[5].linf=0.280744 >>> eigenfunction[6].linf=0.315654 >>> eigenfunction[7].linf=0.192715 >>> eigenfunction[8].linf=0.194826 >>> >>> vs >>> >>> mass.InfNorm =15.625 >>> stiff.InfNorm=726.549 >>> eigenfunction[0].linf=0.459089 >>> eigenfunction[1].linf=0.329682 >>> eigenfunction[2].linf=0.326199 >>> eigenfunction[3].linf=0.325289 >>> eigenfunction[4].linf=0.284252 >>> eigenfunction[5].linf=0.263418 >>> eigenfunction[6].linf=0.315756 >>> eigenfunction[7].linf=0.194826 >>> eigenfunction[8].linf=0.193074 >>> >>> Eigensolver tolerance is absolute and 1e-20. So it?s a bit surprising that there is a quite a big variation in L-inf norm of eigenvectors (0.318075 vs 0.329682). >>> In either case, the biggest issue is non-deterministic behaviour. >>> Is there anything I am missing in Krylov-Schur to make its behaviour as deterministic as possible? >>> >>> p.s. shift-and-invert is done with LU from MUMPS. Shift value is lower than the exact lowest eigenvalue. >>> >>> Kind regards, >>> Denis >>> >> >> Which are the eigenvalues? >> > > The eigenvalues are the same for the two runs up to the output precision: > > Eigenvalue 0 : 1.65635 > Eigenvalue 1 : 4.71385 > Eigenvalue 2 : 4.71385 > Eigenvalue 3 : 4.71385 > Eigenvalue 4 : 5.1974 > Eigenvalue 5 : 5.1974 > Eigenvalue 6 : 5.1974 > Eigenvalue 7 : 7.77136 > Eigenvalue 8 : 7.77136 As you can see, the eigenvector for the first eigenvalue (which is simple) is the same in both runs. The rest are multiple eigenvalues, so the corresponding eigenvectors are not uniquely determined simply by normalization. This means that, for instance, any linear combination of v1,v2,v3 is an eigenvector corresponding to 4.71385. Parallel computation implies slightly different numerical error in different runs, and that is why you are getting different eigenvectors. 
But in terms of linear algebra, both runs are correct. Jose From davydden at gmail.com Tue Mar 8 06:13:41 2016 From: davydden at gmail.com (Denis Davydov) Date: Tue, 8 Mar 2016 13:13:41 +0100 Subject: [petsc-users] [SLEPc] non-deterministic behaviour in GHEP with Krylov-Schur In-Reply-To: <1A813ACF-731E-46BC-9C8A-4379B8CD9707@dsic.upv.es> References: <5E107D05-7969-457A-A02E-D00B9E0569F6@gmail.com> <818D0818-068B-4472-B42D-7564C68127EE@dsic.upv.es> <0256F98A-419B-4A6B-88BE-CAD8BDA7E945@gmail.com> <1A813ACF-731E-46BC-9C8A-4379B8CD9707@dsic.upv.es> Message-ID: <5832988F-1309-49B2-860F-DEDEC0172161@gmail.com> > On 8 Mar 2016, at 12:38, Jose E. Roman wrote: > > As you can see, the eigenvector for the first eigenvalue (which is simple) is the same in both runs. The rest are multiple eigenvalues, so the corresponding eigenvectors are not uniquely determined simply by normalization. This means that, for instance, any linear combination of v1,v2,v3 is an eigenvector corresponding to 4.71385. Parallel There is no questions about this. > computation implies slightly different numerical error in different runs, and that is why you are getting different eigenvectors. But in terms of linear algebra, both runs are > correct. Frankly, i don?t see a reason for that. Assuming that the partition of degrees-of-freedom is the same and the number of MPI cores is the same, the result should be deterministic on the same machine. Unless some random number seed is changing and thus influence which eigenvectors are obtained for the degenerate eigenvalue... Regards, Denis From knepley at gmail.com Tue Mar 8 07:08:06 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 8 Mar 2016 07:08:06 -0600 Subject: [petsc-users] [SLEPc] non-deterministic behaviour in GHEP with Krylov-Schur In-Reply-To: <5832988F-1309-49B2-860F-DEDEC0172161@gmail.com> References: <5E107D05-7969-457A-A02E-D00B9E0569F6@gmail.com> <818D0818-068B-4472-B42D-7564C68127EE@dsic.upv.es> <0256F98A-419B-4A6B-88BE-CAD8BDA7E945@gmail.com> <1A813ACF-731E-46BC-9C8A-4379B8CD9707@dsic.upv.es> <5832988F-1309-49B2-860F-DEDEC0172161@gmail.com> Message-ID: On Tue, Mar 8, 2016 at 6:13 AM, Denis Davydov wrote: > > > On 8 Mar 2016, at 12:38, Jose E. Roman wrote: > > > > As you can see, the eigenvector for the first eigenvalue (which is > simple) is the same in both runs. The rest are multiple eigenvalues, so the > corresponding eigenvectors are not uniquely determined simply by > normalization. This means that, for instance, any linear combination of > v1,v2,v3 is an eigenvector corresponding to 4.71385. Parallel > > There is no questions about this. > > > computation implies slightly different numerical error in different > runs, and that is why you are getting different eigenvectors. But in terms > of linear algebra, both runs are > > correct. > > Frankly, i don?t see a reason for that. > Assuming that the partition of degrees-of-freedom is the same and the > number of MPI cores is the same, > the result should be deterministic on the same machine. > This is not true. If you have reductions, the order of operations is not prescribed by the standard, and can change. I believe there is an MPI implementation for climate that allows no reordering, so you get "bit-level reproducibility". However, that seems to me to defeat the whole purpose of efficient numerics. We should only care about things to the given tolerance, and since you have a multiple eigenvalue to that tolerance you should not care which basis vectors are chosen. 
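To make this concrete (a minimal sketch, assuming the GHEP is posed as K v = lambda M v with K the stiffness matrix and M the mass matrix -- the thread does not write the problem out explicitly, so take that notation as an assumption):

\[ K v_i = \lambda\, M v_i \quad (i = 1, 2, 3) \;\Longrightarrow\; K\Big(\sum_{i=1}^{3} c_i v_i\Big) = \lambda\, M\Big(\sum_{i=1}^{3} c_i v_i\Big) \quad \text{for any scalars } c_1, c_2, c_3. \]

Every M-normalized vector in the three-dimensional eigenspace of lambda = 4.71385 is therefore an equally valid eigenvector, so per-vector quantities such as the L-infinity norms reported above are not uniquely determined; only the eigenvalues and the eigenspaces are.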
Thanks, Matt > Unless some random number seed is changing and thus influence which > eigenvectors are obtained for the degenerate eigenvalue... > > > Regards, > Denis > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 8 07:10:41 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 8 Mar 2016 07:10:41 -0600 Subject: [petsc-users] DMPlex : Assemble global stiffness matrix problem In-Reply-To: <6B03D347796DED499A2696FC095CE81A013A556A@ait-pex02mbx04.win.dtu.dk> References: <6B03D347796DED499A2696FC095CE81A013A556A@ait-pex02mbx04.win.dtu.dk> Message-ID: On Mon, Mar 7, 2016 at 1:28 PM, Morten Nobel-J?rgensen wrote: > I have some problems using DMPlex on unstructured grids in 3D. > > After I have created the DMPlex and assigned dofs (3 dofs on each node), I > run into some problems when assembling the global stiffness matrix. I have > created a small example in the attached cc file. My problems are: > > - It seems like the matrix (created using DMCreateMatrix) contains no > non-zero elements. I was under the impression that the sparsity pattern of > the matrix would be created automatically when the dofs has been assigned > to the default section. > - (Probably as a consequence of this) when assigning values to the > matrix I get an: "Argument of of range. New nonzero at (0,0) caused a > malloc. Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, > PETSC_FALSE) to turn off this check" > - Finally, I'm reading the nodes of each element using the > get-transitive-clojure (where I test if each point is inside the node > range), but I have a hard time understanding if the returned values are > sorted. And if not, how to sort the values (e.g. using orientation which > the get-transitive-clojure function also returns). > > I hope someone can guide me in the right direction :) > I will take a look today or tomorrow. The first thing to do is to look at the nonzero pattern of the Jacobian. I use -mat_view draw -draw_pause -1 Thanks, Matt > Kind regards, > Morten > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 8 11:47:21 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 8 Mar 2016 11:47:21 -0600 Subject: [petsc-users] DMPlex : Assemble global stiffness matrix problem In-Reply-To: References: <6B03D347796DED499A2696FC095CE81A013A556A@ait-pex02mbx04.win.dtu.dk> Message-ID: On Tue, Mar 8, 2016 at 7:10 AM, Matthew Knepley wrote: > On Mon, Mar 7, 2016 at 1:28 PM, Morten Nobel-J?rgensen > wrote: > >> I have some problems using DMPlex on unstructured grids in 3D. >> >> After I have created the DMPlex and assigned dofs (3 dofs on each node), >> I run into some problems when assembling the global stiffness matrix. I >> have created a small example in the attached cc file. My problems are: >> >> - It seems like the matrix (created using DMCreateMatrix) contains no >> non-zero elements. I was under the impression that the sparsity pattern of >> the matrix would be created automatically when the dofs has been assigned >> to the default section. 
>> - (Probably as a consequence of this) when assigning values to the >> matrix I get an: "Argument of of range. New nonzero at (0,0) caused a >> malloc. Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, >> PETSC_FALSE) to turn off this check" >> - Finally, I'm reading the nodes of each element using the >> get-transitive-clojure (where I test if each point is inside the node >> range), but I have a hard time understanding if the returned values are >> sorted. And if not, how to sort the values (e.g. using orientation which >> the get-transitive-clojure function also returns). >> >> I hope someone can guide me in the right direction :) >> > > I will take a look today or tomorrow. The first thing to do is to look at > the nonzero pattern of the Jacobian. I use -mat_view draw -draw_pause -1 > I admit that this problem is counter-intuitive, and I will think about a good error check. The problem is that I allow "inconsistent" sections, meaning that the dof for each field do not add up to the total dof. In your code, when you call ierr = PetscSectionSetDof(s, v, 3);CHKERRQ(ierr); you should also call ierr = PetscSectionSetFieldDof(s, v, 0, 3);CHKERRQ(ierr); Then everything works. I am attaching my slight rewrite of your code. Thanks, Matt > Thanks, > > Matt > > >> Kind regards, >> Morten >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex18.c Type: text/x-csrc Size: 4252 bytes Desc: not available URL: From kalan019 at umn.edu Tue Mar 8 14:07:28 2016 From: kalan019 at umn.edu (Vasileios Kalantzis) Date: Tue, 8 Mar 2016 14:07:28 -0600 Subject: [petsc-users] Using multithreaded MKL in PETSc Message-ID: Hi everyone, Assuming that I want to use MKL's Pardiso in the context of PETSc (I need it as a local solver for block Jacobi type preconditioning), how should I form my ./configure so that I can take advantage of multi-threading when calling MKL's Pardiso? For example, do I simply have to configure with --with-openmp=1 --with-pthread=1 and/or similar? I have tried different combinations and thought of asking, Thank you! ps. right now I use flat MPI and works fine, I want to use threading to add a second level of parallelism, -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 8 14:22:46 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 8 Mar 2016 14:22:46 -0600 Subject: [petsc-users] Using multithreaded MKL in PETSc In-Reply-To: References: Message-ID: <83BD7A86-0E34-4A23-945C-BAEAD6FA5B7C@mcs.anl.gov> > On Mar 8, 2016, at 2:07 PM, Vasileios Kalantzis wrote: > > Hi everyone, > > Assuming that I want to use MKL's Pardiso in the > context of PETSc (I need it as a local solver for > block Jacobi type preconditioning), how should I form > my ./configure so that I can take advantage of > multi-threading when calling MKL's Pardiso? > > For example, do I simply have to configure with > --with-openmp=1 --with-pthread=1 and/or similar? > I have tried different combinations and thought > of asking, > > Thank you! > > ps. 
right now I use flat MPI and works fine, I want > to use threading to add a second level of parallelism, If you do this then the only part of the code that takes advantage of that "second level of parallelism" will be within the MKL solver. The rest of the code will not use those "extra" threads. You must make sure you run MPI on less cores so there are enough "free" cores to run the threads for the the MKL solver. To utilize MKL's Pardiso you don't need to configure PETSc with any mention of openmp or pthreads you just need to set appropriate MKL environmental variables so that MKL's Pardiso will use multiple threads. Or you can use MatMkl_PardisoSetCntl() see the manual page for MATSOLVERMKL_PARDISO etc Barry > > From kalan019 at umn.edu Tue Mar 8 15:15:13 2016 From: kalan019 at umn.edu (Vasileios Kalantzis) Date: Tue, 8 Mar 2016 15:15:13 -0600 Subject: [petsc-users] Using multithreaded MKL in PETSc In-Reply-To: <83BD7A86-0E34-4A23-945C-BAEAD6FA5B7C@mcs.anl.gov> References: <83BD7A86-0E34-4A23-945C-BAEAD6FA5B7C@mcs.anl.gov> Message-ID: Hi Barry, thanks for your note, I am aware of that too. Having only the MKL solver being able to see the second level of parallelism is fine with me. Thanks for the clarification too! As a sidenote, I set all environmental variables correctly but Pardiso prints that is running on 1 OpenMP thread only. I can see in the compile line (when I make) that ${CLINKER} links with flags -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm. Could the -lmkl_sequential flag be the reason that MKL sees only a single thread? Can I edit ${CLINKER} or somehow remove the sequential flag? Thanks again! On Tue, Mar 8, 2016 at 2:22 PM, Barry Smith wrote: > > > On Mar 8, 2016, at 2:07 PM, Vasileios Kalantzis > wrote: > > > > Hi everyone, > > > > Assuming that I want to use MKL's Pardiso in the > > context of PETSc (I need it as a local solver for > > block Jacobi type preconditioning), how should I form > > my ./configure so that I can take advantage of > > multi-threading when calling MKL's Pardiso? > > > > For example, do I simply have to configure with > > --with-openmp=1 --with-pthread=1 and/or similar? > > I have tried different combinations and thought > > of asking, > > > > Thank you! > > > > ps. right now I use flat MPI and works fine, I want > > to use threading to add a second level of parallelism, > > If you do this then the only part of the code that takes advantage of > that "second level of parallelism" will be within the MKL solver. The rest > of the code will not use those "extra" threads. You must make sure you run > MPI on less cores so there are enough "free" cores to run the threads for > the the MKL solver. > > To utilize MKL's Pardiso you don't need to configure PETSc with any > mention of openmp or pthreads you just need to set appropriate MKL > environmental variables so that MKL's Pardiso will use multiple threads. Or > you can use MatMkl_PardisoSetCntl() see the manual page for > MATSOLVERMKL_PARDISO etc > > > > Barry > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Mar 8 15:33:32 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 8 Mar 2016 15:33:32 -0600 Subject: [petsc-users] Using multithreaded MKL in PETSc In-Reply-To: References: <83BD7A86-0E34-4A23-945C-BAEAD6FA5B7C@mcs.anl.gov> Message-ID: Ideally petsc configure is smart and automatically determines the correct mkl library list. 
[but this is not easy - hence the existance of MKL advisor] You could use intel mkl advisor - and get the *desired* MKL libary list for your specific setup [compilers, mkl version, etc..] - and then specify this library list to PETSc configure using --with-blas-lapack-lib option. Alternatively you can edit PETSC_ARCH/lib/conf/petscvariables and change -lmkl_sequential to -lmkl_intel_thread [or something suitable for your setup] and give it a try.. Satish On Tue, 8 Mar 2016, Vasileios Kalantzis wrote: > Hi Barry, > > thanks for your note, I am aware of that too. Having only the MKL solver > being able to see the second level of parallelism is fine with me. > > Thanks for the clarification too! As a sidenote, I set all environmental > variables correctly but Pardiso prints that is running on 1 OpenMP > thread only. I can see in the compile line (when I make) that ${CLINKER} > links with flags -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread > -lm. > Could the -lmkl_sequential flag be the reason that MKL sees only a > single thread? Can I edit ${CLINKER} or somehow remove the sequential > flag? > > Thanks again! > > On Tue, Mar 8, 2016 at 2:22 PM, Barry Smith wrote: > > > > > > On Mar 8, 2016, at 2:07 PM, Vasileios Kalantzis > > wrote: > > > > > > Hi everyone, > > > > > > Assuming that I want to use MKL's Pardiso in the > > > context of PETSc (I need it as a local solver for > > > block Jacobi type preconditioning), how should I form > > > my ./configure so that I can take advantage of > > > multi-threading when calling MKL's Pardiso? > > > > > > For example, do I simply have to configure with > > > --with-openmp=1 --with-pthread=1 and/or similar? > > > I have tried different combinations and thought > > > of asking, > > > > > > Thank you! > > > > > > ps. right now I use flat MPI and works fine, I want > > > to use threading to add a second level of parallelism, > > > > If you do this then the only part of the code that takes advantage of > > that "second level of parallelism" will be within the MKL solver. The rest > > of the code will not use those "extra" threads. You must make sure you run > > MPI on less cores so there are enough "free" cores to run the threads for > > the the MKL solver. > > > > To utilize MKL's Pardiso you don't need to configure PETSc with any > > mention of openmp or pthreads you just need to set appropriate MKL > > environmental variables so that MKL's Pardiso will use multiple threads. Or > > you can use MatMkl_PardisoSetCntl() see the manual page for > > MATSOLVERMKL_PARDISO etc > > > > > > > > Barry > > > > > > > > > > > > > > > > > From jychang48 at gmail.com Tue Mar 8 15:50:41 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 8 Mar 2016 15:50:41 -0600 Subject: [petsc-users] Strange GAMG performance for mixed FE formulation In-Reply-To: References: <56D80581.8030903@imperial.ac.uk> Message-ID: Mark, So I had the following options: -solver_fieldsplit_1_pc_type gamg -solver_fieldsplit_1_pc_gamg_square_graph 10 -solver_fieldsplit_1_mg_levels_ksp_type richardson -solver_fieldsplit_1_pc_gamg_threshold <-0.02/0.00/0.02> -info -log_summary where I varied the gamg threshold from -0.02, 0.00, and 0.02. Richardson for the mg_levels_ksp_type option did improve the performance, but it seems using a non-zero threshold worsens the overall solver. 
for "-solver_fieldsplit_1_pc_gamg_threshold 0.02": >> grep 'GAMG' positive_threshold [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold 0.02, 8.863 nnz ave. (N=162000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 min=8.260696e-03 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 active pes [0] PC*GAMG*FilterGraph(): 2.20542% nnz after filtering, with threshold 0.02, 704.128 nnz ave. (N=22085) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 2 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 798 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 min=1.484874e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=798, n data cols=1, nnz/row (ave)=798, 1 active pes [0] PC*GAMG*FilterGraph(): 1.75564% nnz after filtering, with threshold 0.02, 798. nnz ave. (N=798) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 3 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 44 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.668188e+00 min=1.402892e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=44, n data cols=1, nnz/row (ave)=44, 1 active pes [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.2755 For "-solver_fieldsplit_1_pc_gamg_threshold 0.00": [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with threshold 0., 8.863 nnz ave. (N=162000) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 min=8.260696e-03 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 active pes [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with threshold 0., 704.128 nnz ave. (N=22085) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 2 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 545 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 min=1.484874e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=545, n data cols=1, nnz/row (ave)=545, 1 active pes [0] PC*GAMG*FilterGraph(): 7.55997% nnz after filtering, with threshold 0., 545. nnz ave. 
(N=545) [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 3 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 11 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.368729e+00 min=1.563750e-01 PC=jacobi [0] PCSetUp_*GAMG*(): 3) N=11, n data cols=1, nnz/row (ave)=11, 1 active pes [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0376 and for "-solver_fieldsplit_1_pc_gamg_threshold -0.02": [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, nnz/row (ave)=9, np=1 [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 10406 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 min=8.260696e-03 PC=jacobi [0] PCSetUp_*GAMG*(): 1) N=10406, n data cols=1, nnz/row (ave)=945, 1 active pes [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 2 of 10 to square [0] PC*GAMG*Prolongator_AGG(): New grid 1 nodes [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.248316e+00 min=9.020787e-02 PC=jacobi [0] PCSetUp_*GAMG*(): 2) N=1, n data cols=1, nnz/row (ave)=1, 1 active pes [0] PCSetUp_*GAMG*(): HARD stop of coarsening on level 1. Grid too small: 1 block nodes [0] PCSetUp_*GAMG*(): 3 levels, grid complexity = 7.85162 Attached are the log summaries for the three respective cases (positive, zero, and negative). 1) From the log summaries, it seems positive negative threshold is extremely slow. Though not as slow as if I had no "square_graph 10" and zero threshold has the "best" performance. 2) I tried mg_levels_pc_type sor but it wasn't as good as richardson 3) I tried increasing the square_graph number to 100 and 1000 but saw no improvement 4) I could not find any way to identify the ML information from -info, so I simply attached the log_summary using ML for reference 5) Is there anything else I could try? Thanks again for all your help Thanks, Justin On Fri, Mar 4, 2016 at 4:36 PM, Mark Adams wrote: > And it looks like you have a well behaved Laplacian here (M-matrix) so I > would guess 'richardson' would be faster as the smoother, instead of > 'chebyshev'. > > On Fri, Mar 4, 2016 at 5:04 PM, Mark Adams wrote: > >> You seem to have 3 of one type of solve that is give 'square_graph 1': >> >> 0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >> >> This has 9 nnz-row and 44% are zero: >> >> [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with >> threshold 0., 8.79533 nnz ave. >> >> So you want to use a _negative_ threshold. This will keep zeros in the >> graph and help it to coarsen faster. And you could try adding more levels >> to square the graph. >> >> The second type of solve has the 'square_graph 10'. It looks like the >> first solve. It should also use a negative threshold also. >> >> ML has a default of zero for the threshold, but it seems to keep zeros >> whereas GAMG does not. >> >> Mark >> >> >> On Fri, Mar 4, 2016 at 10:38 AM, Justin Chang >> wrote: >> >>> Time to solution went from 100 seconds to 30 seconds once i used 10 >>> graphs. Using 20 graphs started to increase in time slightly >>> >>> On Fri, Mar 4, 2016 at 8:35 AM, Justin Chang >>> wrote: >>> >>>> You're right. This is what I have: >>>> >>>> [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, >>>> nnz/row (ave)=9, np=1 >>>> >>>> [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with >>>> threshold 0., 8.79533 nnz ave. 
(N=48000) >>>> >>>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 >>>> min=1.040410e-02 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with >>>> threshold 0., 623.135 nnz ave. (N=6672) >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 >>>> min=2.474586e-02 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with >>>> threshold 0., 724. nnz ave. (N=724) >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 >>>> min=2.759552e-01 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 >>>> active pes >>>> >>>> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 >>>> >>>> [0] PCSetUp_*GAMG*(): level 0) N=48000, n data rows=1, n data cols=1, >>>> nnz/row (ave)=9, np=1 >>>> >>>> [0] PC*GAMG*FilterGraph(): 55.7114% nnz after filtering, with >>>> threshold 0., 8.79533 nnz ave. (N=48000) >>>> >>>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 6672 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.954700e+00 >>>> min=1.040410e-02 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 1) N=6672, n data cols=1, nnz/row (ave)=623, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 3.40099% nnz after filtering, with >>>> threshold 0., 623.135 nnz ave. (N=6672) >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 724 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.313339e+00 >>>> min=2.474586e-02 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 2) N=724, n data cols=1, nnz/row (ave)=724, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 9.82914% nnz after filtering, with >>>> threshold 0., 724. nnz ave. (N=724) >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 37 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.011784e+00 >>>> min=2.759552e-01 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 3) N=37, n data cols=1, nnz/row (ave)=37, 1 >>>> active pes >>>> >>>> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0928 >>>> >>>> [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, >>>> nnz/row (ave)=9, np=1 >>>> >>>> [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with >>>> threshold 0., 8.863 nnz ave. (N=162000) >>>> >>>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 1 to square >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 >>>> min=8.260696e-03 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with >>>> threshold 0., 704.128 nnz ave. 
(N=22085) >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 2283 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 >>>> min=1.484874e-02 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 2) N=2283, n data cols=1, nnz/row (ave)=2283, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 3.64497% nnz after filtering, with >>>> threshold 0., 2283. nnz ave. (N=2283) >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 97 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=2.043254e+00 >>>> min=1.321528e-01 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 3) N=97, n data cols=1, nnz/row (ave)=97, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 66.8403% nnz after filtering, with >>>> threshold 0., 97. nnz ave. (N=97) >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 5 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.653762e+00 >>>> min=4.460582e-01 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 4) N=5, n data cols=1, nnz/row (ave)=5, 1 active >>>> pes >>>> >>>> [0] PCSetUp_*GAMG*(): 5 levels, grid complexity = 15.4673 >>>> >>>> [0] PCSetUp_*GAMG*(): level 0) N=162000, n data rows=1, n data cols=1, >>>> nnz/row (ave)=9, np=1 >>>> >>>> [0] PC*GAMG*FilterGraph(): 55.6621% nnz after filtering, with >>>> threshold 0., 8.863 nnz ave. (N=162000) >>>> >>>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 1 of 10 to square >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 22085 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.955376e+00 >>>> min=8.260696e-03 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 1) N=22085, n data cols=1, nnz/row (ave)=704, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 3.1314% nnz after filtering, with >>>> threshold 0., 704.128 nnz ave. (N=22085) >>>> >>>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 2 of 10 to square >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 545 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.311291e+00 >>>> min=1.484874e-02 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 2) N=545, n data cols=1, nnz/row (ave)=545, 1 >>>> active pes >>>> >>>> [0] PC*GAMG*FilterGraph(): 7.55997% nnz after filtering, with >>>> threshold 0., 545. nnz ave. (N=545) >>>> >>>> [0] PC*GAMG*Coarsen_AGG(): Square Graph on level 3 of 10 to square >>>> >>>> [0] PC*GAMG*Prolongator_AGG(): New grid 11 nodes >>>> >>>> [0] PC*GAMG*OptProlongator_AGG(): Smooth P0: max eigen=1.368729e+00 >>>> min=1.563750e-01 PC=jacobi >>>> >>>> [0] PCSetUp_*GAMG*(): 3) N=11, n data cols=1, nnz/row (ave)=11, 1 >>>> active pes >>>> >>>> [0] PCSetUp_*GAMG*(): 4 levels, grid complexity = 12.0376 >>>> >>>> On Fri, Mar 4, 2016 at 8:31 AM, Lawrence Mitchell < >>>> lawrence.mitchell at imperial.ac.uk> wrote: >>>> >>>>> >>>>> > On 4 Mar 2016, at 15:24, Justin Chang wrote: >>>>> > >>>>> > So with -pc_gamg_square_graph 10 I get the following: >>>>> >>>>> Because you're using gamg inside the fieldsplit, I think you need: >>>>> >>>>> -fieldsplit_1_pc_gamg_square_graph 10 >>>>> >>>>> >>>>> >>>>> > [0] PCSetUp_GAMG(): level 0) N=48000, n data rows=1, n data cols=1, >>>>> nnz/row (ave)=9, np=1 >>>>> > [0] PCGAMGFilterGraph(): 55.7114% nnz after filtering, with >>>>> threshold 0., 8.79533 nnz ave. (N=48000) >>>>> > [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square >>>>> ^^^^^ >>>>> >>>>> Cheers, >>>>> >>>>> Lawrence >>>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: negative_threshold Type: application/octet-stream Size: 17987 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: positive_threshold Type: application/octet-stream Size: 17977 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: zero_threshold Type: application/octet-stream Size: 17976 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ml_summary Type: application/octet-stream Size: 14850 bytes Desc: not available URL: From kalan019 at umn.edu Tue Mar 8 16:14:22 2016 From: kalan019 at umn.edu (Vasileios Kalantzis) Date: Tue, 8 Mar 2016 16:14:22 -0600 Subject: [petsc-users] Using multithreaded MKL in PETSc In-Reply-To: References: <83BD7A86-0E34-4A23-945C-BAEAD6FA5B7C@mcs.anl.gov> Message-ID: Satish and Barry, thank you very much for your help, problem solved when I edited PETSC_ARCH/conf/petscvariables and removed the -lmkl_sequential flag from the linker, On Tue, Mar 8, 2016 at 3:33 PM, Satish Balay wrote: > Ideally petsc configure is smart and automatically determines the correct > mkl library list. [but this is not easy - hence the existance of MKL > advisor] > > You could use intel mkl advisor - and get the *desired* MKL libary > list for your specific setup [compilers, mkl version, etc..] - and > then specify this library list to PETSc configure using > --with-blas-lapack-lib option. > > Alternatively you can edit PETSC_ARCH/lib/conf/petscvariables and change > > -lmkl_sequential to -lmkl_intel_thread [or something suitable for your > setup] > > and give it a try.. > > Satish > > On Tue, 8 Mar 2016, Vasileios Kalantzis wrote: > > > Hi Barry, > > > > thanks for your note, I am aware of that too. Having only the MKL solver > > being able to see the second level of parallelism is fine with me. > > > > Thanks for the clarification too! As a sidenote, I set all environmental > > variables correctly but Pardiso prints that is running on 1 OpenMP > > thread only. I can see in the compile line (when I make) that ${CLINKER} > > links with flags -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread > > -lm. > > Could the -lmkl_sequential flag be the reason that MKL sees only a > > single thread? Can I edit ${CLINKER} or somehow remove the sequential > > flag? > > > > Thanks again! > > > > On Tue, Mar 8, 2016 at 2:22 PM, Barry Smith wrote: > > > > > > > > > On Mar 8, 2016, at 2:07 PM, Vasileios Kalantzis > > > wrote: > > > > > > > > Hi everyone, > > > > > > > > Assuming that I want to use MKL's Pardiso in the > > > > context of PETSc (I need it as a local solver for > > > > block Jacobi type preconditioning), how should I form > > > > my ./configure so that I can take advantage of > > > > multi-threading when calling MKL's Pardiso? > > > > > > > > For example, do I simply have to configure with > > > > --with-openmp=1 --with-pthread=1 and/or similar? > > > > I have tried different combinations and thought > > > > of asking, > > > > > > > > Thank you! > > > > > > > > ps. right now I use flat MPI and works fine, I want > > > > to use threading to add a second level of parallelism, > > > > > > If you do this then the only part of the code that takes advantage of > > > that "second level of parallelism" will be within the MKL solver. The > rest > > > of the code will not use those "extra" threads. 
You must make sure you > run > > > MPI on less cores so there are enough "free" cores to run the threads > for > > > the the MKL solver. > > > > > > To utilize MKL's Pardiso you don't need to configure PETSc with any > > > mention of openmp or pthreads you just need to set appropriate MKL > > > environmental variables so that MKL's Pardiso will use multiple > threads. Or > > > you can use MatMkl_PardisoSetCntl() see the manual page for > > > MATSOLVERMKL_PARDISO etc > > > > > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 8 19:46:39 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 8 Mar 2016 19:46:39 -0600 Subject: [petsc-users] Using multithreaded MKL in PETSc In-Reply-To: References: <83BD7A86-0E34-4A23-945C-BAEAD6FA5B7C@mcs.anl.gov> Message-ID: <83C9FE51-52A5-40B7-8CEA-3BC2A9EAD2FF@mcs.anl.gov> BTW: There is also MATSOLVERMKL_CPARDISO which is apparently an MPI parallel direct solver. Barry > On Mar 8, 2016, at 4:15 PM, Vasileios Kalantzis wrote: > > (I accidentally replied to the list only) > > Satish and Barry, > > thank you very much for your help, problem solved when I edited > PETSC_ARCH/conf/petscvariables and removed the -lmkl_sequential > flag from the linker, > > On Tue, Mar 8, 2016 at 3:33 PM, Satish Balay wrote: > Ideally petsc configure is smart and automatically determines the correct > mkl library list. [but this is not easy - hence the existance of MKL advisor] > > You could use intel mkl advisor - and get the *desired* MKL libary > list for your specific setup [compilers, mkl version, etc..] - and > then specify this library list to PETSc configure using > --with-blas-lapack-lib option. > > Alternatively you can edit PETSC_ARCH/lib/conf/petscvariables and change > > -lmkl_sequential to -lmkl_intel_thread [or something suitable for your setup] > > and give it a try.. > > Satish > > On Tue, 8 Mar 2016, Vasileios Kalantzis wrote: > > > Hi Barry, > > > > thanks for your note, I am aware of that too. Having only the MKL solver > > being able to see the second level of parallelism is fine with me. > > > > Thanks for the clarification too! As a sidenote, I set all environmental > > variables correctly but Pardiso prints that is running on 1 OpenMP > > thread only. I can see in the compile line (when I make) that ${CLINKER} > > links with flags -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread > > -lm. > > Could the -lmkl_sequential flag be the reason that MKL sees only a > > single thread? Can I edit ${CLINKER} or somehow remove the sequential > > flag? > > > > Thanks again! > > > > On Tue, Mar 8, 2016 at 2:22 PM, Barry Smith wrote: > > > > > > > > > On Mar 8, 2016, at 2:07 PM, Vasileios Kalantzis > > > wrote: > > > > > > > > Hi everyone, > > > > > > > > Assuming that I want to use MKL's Pardiso in the > > > > context of PETSc (I need it as a local solver for > > > > block Jacobi type preconditioning), how should I form > > > > my ./configure so that I can take advantage of > > > > multi-threading when calling MKL's Pardiso? > > > > > > > > For example, do I simply have to configure with > > > > --with-openmp=1 --with-pthread=1 and/or similar? > > > > I have tried different combinations and thought > > > > of asking, > > > > > > > > Thank you! > > > > > > > > ps. 
right now I use flat MPI and works fine, I want > > > > to use threading to add a second level of parallelism, > > > > > > If you do this then the only part of the code that takes advantage of > > > that "second level of parallelism" will be within the MKL solver. The rest > > > of the code will not use those "extra" threads. You must make sure you run > > > MPI on less cores so there are enough "free" cores to run the threads for > > > the the MKL solver. > > > > > > To utilize MKL's Pardiso you don't need to configure PETSc with any > > > mention of openmp or pthreads you just need to set appropriate MKL > > > environmental variables so that MKL's Pardiso will use multiple threads. Or > > > you can use MatMkl_PardisoSetCntl() see the manual page for > > > MATSOLVERMKL_PARDISO etc > > > > > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > > > > > > From bhatiamanav at gmail.com Wed Mar 9 08:43:39 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Wed, 9 Mar 2016 08:43:39 -0600 Subject: [petsc-users] AMG preconditioners Message-ID: <498AF73F-6389-4CAD-B510-410C42536B51@gmail.com> Hi, Out of GAMG, ML and HYPRE, which are expected to be compatible with SEQBAIJ and MPIBAIJ matrices? Regards, Manav From bsmith at mcs.anl.gov Wed Mar 9 08:59:39 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 9 Mar 2016 08:59:39 -0600 Subject: [petsc-users] AMG preconditioners In-Reply-To: <498AF73F-6389-4CAD-B510-410C42536B51@gmail.com> References: <498AF73F-6389-4CAD-B510-410C42536B51@gmail.com> Message-ID: <0DA57323-CE47-44F4-83DE-7EC096DE49AB@mcs.anl.gov> > On Mar 9, 2016, at 8:43 AM, Manav Bhatia wrote: > > Hi, > > Out of GAMG, ML and HYPRE, which are expected to be compatible with SEQBAIJ and MPIBAIJ matrices? Not necessarily. Note that you can create matrices with MatCreate() and control at run time the matrix version. You don't need to hardwire BAIJ or AIJ (in fact we recommend not hardwiring it). Barry > > Regards, > Manav > From bhatiamanav at gmail.com Wed Mar 9 10:15:42 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Wed, 9 Mar 2016 10:15:42 -0600 Subject: [petsc-users] AMG preconditioners In-Reply-To: <0DA57323-CE47-44F4-83DE-7EC096DE49AB@mcs.anl.gov> References: <498AF73F-6389-4CAD-B510-410C42536B51@gmail.com> <0DA57323-CE47-44F4-83DE-7EC096DE49AB@mcs.anl.gov> Message-ID: <6D0870D8-A9BF-4B1A-8F24-C46F36CF0EE9@gmail.com> Thanks! I am intrigued by this. I am solving a system of equations where the natural block size is 2x2. This is obtained by rewriting a complex linear system of equations in real numbers. I use MatSetValuesBlocked to set the values of each 2x2 block in my matrix. I am assuming that this method works only for blocked matrix. So, to make the code generic, I will need to use the explicit row/column numberings along with a MatSetValues method. -Manav > On Mar 9, 2016, at 8:59 AM, Barry Smith wrote: > > >> On Mar 9, 2016, at 8:43 AM, Manav Bhatia wrote: >> >> Hi, >> >> Out of GAMG, ML and HYPRE, which are expected to be compatible with SEQBAIJ and MPIBAIJ matrices? > > Not necessarily. Note that you can create matrices with MatCreate() and control at run time the matrix version. You don't need to hardwire BAIJ or AIJ (in fact we recommend not hardwiring it). 
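A minimal sketch of that run-time pattern (the function name, local sizes, block size of 2, and values below are illustrative assumptions, not taken from any code in this thread):

#include <petscmat.h>

/* Create a matrix whose type is chosen at run time (-mat_type aij, baij, ...)
   instead of hardwiring MATAIJ or MATBAIJ in the source.  All numbers here
   are placeholders for illustration only. */
PetscErrorCode BuildExampleMat(MPI_Comm comm,Mat *A)
{
  PetscErrorCode ierr;
  PetscInt       nb      = 4;                  /* number of 2x2 blocks owned locally */
  PetscInt       row     = 0, col = 0;         /* block row/column indices */
  PetscScalar    vals[4] = {1.0,-1.0,1.0,1.0}; /* one 2x2 block, row-oriented */

  PetscFunctionBeginUser;
  ierr = MatCreate(comm,A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A,2*nb,2*nb,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
  ierr = MatSetFromOptions(*A);CHKERRQ(ierr);  /* honors -mat_type given at run time */
  ierr = MatSetBlockSize(*A,2);CHKERRQ(ierr);  /* record the natural block structure */
  ierr = MatSetUp(*A);CHKERRQ(ierr);
  /* blocked insertion still works, whichever type was selected */
  ierr = MatSetValuesBlocked(*A,1,&row,1,&col,vals,ADD_VALUES);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With this setup the same executable can be run with -mat_type aij or -mat_type baij and the assembly code does not change.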
> > Barry > > >> >> Regards, >> Manav >> > From jed at jedbrown.org Wed Mar 9 10:19:14 2016 From: jed at jedbrown.org (Jed Brown) Date: Wed, 09 Mar 2016 16:19:14 +0000 Subject: [petsc-users] AMG preconditioners In-Reply-To: <6D0870D8-A9BF-4B1A-8F24-C46F36CF0EE9@gmail.com> References: <498AF73F-6389-4CAD-B510-410C42536B51@gmail.com> <0DA57323-CE47-44F4-83DE-7EC096DE49AB@mcs.anl.gov> <6D0870D8-A9BF-4B1A-8F24-C46F36CF0EE9@gmail.com> Message-ID: <871t7jwrnh.fsf@jedbrown.org> Manav Bhatia writes: > Thanks! > > I am intrigued by this. > > I am solving a system of equations where the natural block size is > 2x2. This is obtained by rewriting a complex linear system of > equations in real numbers. What equations are you solving? Is this system a leading cost in your application? (Using complex scalars is another option. Yes, it's lame that PETSc can't currently use both in the same library.) > I use MatSetValuesBlocked to set the values of each 2x2 block in my > matrix. I am assuming that this method works only for blocked matrix. No, just call MatSetBlockSize and MatSetValuesBlocked (and variants) will work for any matrix type. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bhatiamanav at gmail.com Wed Mar 9 10:32:40 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Wed, 9 Mar 2016 10:32:40 -0600 Subject: [petsc-users] AMG preconditioners In-Reply-To: <871t7jwrnh.fsf@jedbrown.org> References: <498AF73F-6389-4CAD-B510-410C42536B51@gmail.com> <0DA57323-CE47-44F4-83DE-7EC096DE49AB@mcs.anl.gov> <6D0870D8-A9BF-4B1A-8F24-C46F36CF0EE9@gmail.com> <871t7jwrnh.fsf@jedbrown.org> Message-ID: <050FF9E0-67C0-477D-A71E-6E19419C9030@gmail.com> > On Mar 9, 2016, at 10:19 AM, Jed Brown wrote: > > Manav Bhatia writes: > >> Thanks! >> >> I am intrigued by this. >> >> I am solving a system of equations where the natural block size is >> 2x2. This is obtained by rewriting a complex linear system of >> equations in real numbers. > > What equations are you solving? Is this system a leading cost in your > application? (Using complex scalars is another option. Yes, it's lame > that PETSc can't currently use both in the same library.) > The application is that of small-disturbance frequency domain analysis about a steady-state. This needs a mix of linear solves for system of equations in real numbers and complex numbers. I was able to setup the complex solves using real numbers with a relatively small piece of code. It is working very well for the most part, except that I got some errors while trying to use the AMG preconditioners with it. Making the code generic for BAIJ or AIJ, I think, will be a very small fix in comparison with recompiling PETSC with complex scalars, since a lot of my application code assumes that the underlying libraries (libMesh, PETSC, SLEPC) are compiled for real scalars. -Manav >> I use MatSetValuesBlocked to set the values of each 2x2 block in my >> matrix. I am assuming that this method works only for blocked matrix. > > No, just call MatSetBlockSize and MatSetValuesBlocked (and variants) > will work for any matrix type. 
From liangzh.cug at gmail.com Wed Mar 9 15:23:02 2016 From: liangzh.cug at gmail.com (Liang ZHENG) Date: Wed, 9 Mar 2016 22:23:02 +0100 Subject: [petsc-users] how to use schur fieldsplit preconditioner twice ~ Message-ID: Hi, all, I wrote a code as following: void petsc_solve_thm2d(type_matrix_petsc *matrix) { IS isg[4],iss,isf; MSGS("[petsc] Set blocked preconditioner index"); PetscScalar np_Sv=matrix->np_col/3; PetscScalar np_Sp=matrix->np_col/6; PetscScalar np_Fv=matrix->np_col/3; PetscScalar np_Fp=matrix->np_col/6; ISCreateStride(PETSC_COMM_SELF,np_Sv,0,1,&isg[0]); ISCreateStride(PETSC_COMM_SELF,np_Sp,np_Sv,1,&isg[1]); ISCreateStride(PETSC_COMM_SELF,np_Fv,np_Sv+np_Sp,1,&isg[2]); ISCreateStride(PETSC_COMM_SELF,np_Fp,np_Sv+np_Sp+np_Fv,1,&isg[3]); IS is_solid[]={isg[0],isg[1]}; IS is_fluid[]={isg[2],isg[3]}; ierr = ISConcatenate(PETSC_COMM_SELF,2,is_solid,&iss);CHKERRV(ierr); ierr = ISConcatenate(PETSC_COMM_SELF,2,is_fluid,&isf);CHKERRV(ierr); MSGS("[petsc] Create solver"); ierr = KSPCreate(PETSC_COMM_SELF,&matrix->ksp);CHKERRV(ierr); ierr = KSPSetOptionsPrefix(matrix->ksp,"thm_");CHKERRV(ierr); ierr = KSPSetOperators(matrix->ksp,matrix->LHS,matrix->LHS);CHKERRV(ierr); ierr = KSPSetFromOptions(matrix->ksp);CHKERRV(ierr); ierr = KSPGetPC(matrix->ksp,&matrix->pc);CHKERRV(ierr); // ierr = PCFieldSplitSetIS(matrix->pc,"Sv",isg[0]);CHKERRV(ierr); // ierr = PCFieldSplitSetIS(matrix->pc,"Sp",isg[1]);CHKERRV(ierr); // ierr = PCFieldSplitSetIS(matrix->pc,"Fv",isg[2]);CHKERRV(ierr); // ierr = PCFieldSplitSetIS(matrix->pc,"Fp",isg[3]);CHKERRV(ierr); ierr = PCFieldSplitSetIS(matrix->pc,"S",iss);CHKERRV(ierr); ierr = PCFieldSplitSetIS(matrix->pc,"F",isf);CHKERRV(ierr); //ierr = PCFieldSplitSetBlockSize(matrix->pc,4);CHKERRV(ierr); MSGS("[petsc] Solving"); ierr = KSPSolve(matrix->ksp,matrix->RHS,matrix->SOL);CHKERRV(ierr); MSGS("[petsc] Destroy solver"); ierr = KSPDestroy(&matrix->ksp);CHKERRV(ierr); ierr = ISDestroy(&isg);CHKERRV(ierr); } then I can call petsc in the command line as: ./thm2d_nonuniform -thm_ksp_monitor -thm_ksp_view -thm_ksp_type fgmres -thm_ksp_rtol 1e-10 -thm_pc_type fieldsplit -thm_pc_fieldsplit_type schur -thm_pc_fieldsplit_S_ksp_type fgmres -thm_fieldsplit_S_pc_type lu -thm_pc_fieldsplit_F_ksp_type fgmres -thm_fieldsplit_F_pc_type lu -thm_fieldsplit_S_pc_factor_mat_solver_package umfpack this command is working. but I also want to set -thm_fieldsplit_S_pc_type using fieldsplit schur preconditioner. Is there a solution to do something like: -thm_fieldsplit_S_pc_type fieldsplit -thm_fieldsplit_S_pc_field_type schur -thm_fieldsplit_S_fieldsplit_Sv_ksp_type xxx -thm_fieldsplit_S_fieldsplit_Sp_ksp_type xxx -thm_fieldsplit_S_fieldsplit_Sv_pc_type xxx - thm_fieldsplit_S_fieldsplit_Sp_pc_type xxx Thanks a lot ~ Cheers, Larry -------------- next part -------------- An HTML attachment was scrubbed... URL: From rupp at iue.tuwien.ac.at Thu Mar 10 04:29:02 2016 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Thu, 10 Mar 2016 11:29:02 +0100 Subject: [petsc-users] Parallel FEM code using PETSc In-Reply-To: <1649257156.2608323.1455182305418.JavaMail.yahoo@mail.yahoo.com> References: <1649257156.2608323.1455182305418.JavaMail.yahoo.ref@mail.yahoo.com> <1649257156.2608323.1455182305418.JavaMail.yahoo@mail.yahoo.com> Message-ID: <56E14C6E.9000104@iue.tuwien.ac.at> Hi Farshid, > Can somebody point me to a simple, linear, parallel FEM code based on > PETSc, implementing efficient application of non-homogeneous Dirichlet > BC? Domain decomposition code will is also welcome. 
sorry, I just noticed that this may not have been answered to your full satisfaction yet. Please have a look at SNES, ex62, located in $PETSC_DIR/src/snes/examples/tutorials/ex62.c Short description: Stokes Problem in 2d and 3d with simplicial finite elements. We solve the Stokes problem in a rectangular domain, using a parallel unstructured mesh (DMPLEX) to discretize it. Best regards, Karli From bhatiamanav at gmail.com Thu Mar 10 11:46:21 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Thu, 10 Mar 2016 11:46:21 -0600 Subject: [petsc-users] TS support for second order systems Message-ID: Hi, Is there explicit support for second-order systems arising from structural dynamics in the TS library? It can certainly be written as an equivalent first-order system, but some structural dynamics simulations prefer to tune the two-parameter Newmark solvers to provide extra damping for high-frequency modes. Something equivalent might be possible for equivalent first-order representation, but I wanted to check if there is any support along those lines. Thanks, Manav From jychang48 at gmail.com Thu Mar 10 12:03:34 2016 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 10 Mar 2016 12:03:34 -0600 Subject: [petsc-users] Tao TRON solver tolerances In-Reply-To: References: Message-ID: Hi again, I was reading through the TAO manual and the impression I am getting is that the KSP solver computes the gradient/projection, not necessarily the solution itself. Meaning it matters not how accurate this projection is, so long as the actual objective tolerance is met. Is this a correct assessment of why one can get away with a less stringent KSP tolerance and still attain an accurate solution? Thanks, Justin On Tuesday, March 8, 2016, Justin Chang > wrote: > Hi all, > > So I am solving a convex optimization problem of the following form: > > min 1/2 x^T*H*x - x^T*f > s.t. 0 < x < 1 > > Using the TAOTRON solver, I also have CG/ILU for KSP/PC. The following TAO > solver tolerances are used for my specific problem: > > -tao_gatol 1e-12 > -tao_grtol 1e-7 > > I noticed that the KSP tolerance truly defines the performance of this > solver. Attached are three run cases with -ksp_rtol 1e-7, 1e-3, and 1e-1 > with "-ksp_converged_reason -ksp_monitor_true_residual -tao_view > -tao_converged_reason -log_view". It seems that the lower the KSP > tolerance, the faster the time-to-solution where the number of KSP/TAO > solve iterations remains roughly the same. > > So my question is, is this "normal"? That is, if using TRON, one may relax > the KSP tolerances because the convergence of the solver is primarily due > to the objective functional from TRON and not necessarily the KSP solve > itself? Is there a general rule of thumb for this, because it would seem to > me that for any TRON solve I do, i could just set a really low KSP rtol and > still get roughly the same performance. > > Thanks, > Justin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 10 12:11:09 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 10 Mar 2016 12:11:09 -0600 Subject: [petsc-users] TS support for second order systems In-Reply-To: References: Message-ID: On Thu, Mar 10, 2016 at 11:46 AM, Manav Bhatia wrote: > Hi, > > Is there explicit support for second-order systems arising from > structural dynamics in the TS library? 
> > It can certainly be written as an equivalent first-order system, but > some structural dynamics simulations prefer to tune the two-parameter > Newmark solvers to provide extra damping for high-frequency modes. > > Something equivalent might be possible for equivalent first-order > representation, but I wanted to check if there is any support along those > lines. > No, we only have support for first order systems. Thanks, Matt > Thanks, > Manav > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Thu Mar 10 12:29:37 2016 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 10 Mar 2016 12:29:37 -0600 Subject: [petsc-users] CPU vs GPU for PETSc applications Message-ID: Hi all, When would I ever use GPU computing for a finite element simulation where the limiting factor of performance is the memory bandwidth bound? Say I want to run problems similar to SNES ex12 and 62. I understand that there is an additional bandwidth associated with offloading data from the CPU to GPU but is there more to it? I recall reading through some email threads about GPU's potentially giving you a speed up of 3x that on a CPU but the gain in performance may not be worth the increase in time moving data around. Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 10 14:50:44 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 10 Mar 2016 14:50:44 -0600 Subject: [petsc-users] CPU vs GPU for PETSc applications In-Reply-To: References: Message-ID: On Thu, Mar 10, 2016 at 12:29 PM, Justin Chang wrote: > Hi all, > > When would I ever use GPU computing for a finite element simulation where > the limiting factor of performance is the memory bandwidth bound? Say I > want to run problems similar to SNES ex12 and 62. I understand that there > is an additional bandwidth associated with offloading data from the CPU to > GPU but is there more to it? I recall reading through some email threads > about GPU's potentially giving you a speed up of 3x that on a CPU but the > gain in performance may not be worth the increase in time moving data > around. The main use case is if you are being forced to use a machine which has GPUs. Then you can indeed get some benefit from the larger bandwidth. You need a problem where you are doing a bunch of iterations to make sending the initial data down worth it. It would certainly be better if you are computing the action of your operator directly on the GPU, but that is much more disruptive to the code right now. Matt > Thanks, > Justin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From chih-hao.chen2 at mail.mcgill.ca Thu Mar 10 15:23:52 2016 From: chih-hao.chen2 at mail.mcgill.ca (Chih-Hao Chen) Date: Thu, 10 Mar 2016 21:23:52 +0000 Subject: [petsc-users] Questions about ASM in petsc4py Message-ID: Hello PETSc members, Sorry for asking for help about the ASM in petsc4py. Currently I am using your ASM as my preconditioned in my solver. I know how to setup the PCASM based on the ex8.c in the following link. 
http://www.mcs.anl.gov/petsc/petsc-3.4/src/ksp/ksp/examples/tutorials/ex8.c But when using the function ?getASMSubKSP? in petsc4py, I have tried several methods, but still cannot get the subksp from each mpi. Here is the snippet of the code of the function. def getASMSubKSP(self): cdef PetscInt i = 0, n = 0 cdef PetscKSP *p = NULL CHKERR( PCASMGetSubKSP(self.pc, &n, NULL, &p) ) return [ref_KSP(p[i]) for i from 0 <= i From knepley at gmail.com Thu Mar 10 15:28:23 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 10 Mar 2016 15:28:23 -0600 Subject: [petsc-users] Questions about ASM in petsc4py In-Reply-To: References: Message-ID: On Thu, Mar 10, 2016 at 3:23 PM, Chih-Hao Chen < chih-hao.chen2 at mail.mcgill.ca> wrote: > Hello PETSc members, > > > Sorry for asking for help about the ASM in petsc4py. > Currently I am using your ASM as my preconditioned in my solver. > I know how to setup the PCASM based on the ex8.c in the following link. > http://www.mcs.anl.gov/petsc/petsc-3.4/src/ksp/ksp/examples/tutorials/ex8.c > > But when using the function ?getASMSubKSP? in petsc4py, > I have tried several methods, but still cannot get the subksp from each > mpi. > The subKSPs do not have to do with MPI. On each rank, you get the local KSPs. > Here is the snippet of the code of the function. > > def getASMSubKSP(self): cdef PetscInt i = 0, n = 0 cdef PetscKSP *p = NULL CHKERR( PCASMGetSubKSP(self.pc, &n, NULL, &p) ) return [ref_KSP(p[i]) for i from 0 <= i > > In ex8.c, we could use a ?FOR? loop to access the indiivdual subksp. > But in python, could I use ?FOR? loop to get all the subksps with: > > subksp[i] = pc.getASMSubKSP[i] > I do not understand this, but I think the answer is no. > Another question is in ex8.c, it seems I don?t need to do any setup to > decompose the RHS vector. > But do I need to decompose the RHS vector with any settings if I don?t use > PETSc solvers but with your preconditioners? > I do not understand what you mean by "decompose the RHS vector". Thanks, Matt > Thanks very much. > > Best, > Chih-Hao > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Thu Mar 10 16:48:57 2016 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 10 Mar 2016 16:48:57 -0600 Subject: [petsc-users] CPU vs GPU for PETSc applications In-Reply-To: References: Message-ID: Matt, So what's an example of "doing a bunch of iterations to make sending the initial datadown worth it"? Is there a correlation between that and arithmetic intensity, where an application is likely to be more compute-bound and memory-bandwidth bound? Thanks, Justin On Thu, Mar 10, 2016 at 2:50 PM, Matthew Knepley wrote: > On Thu, Mar 10, 2016 at 12:29 PM, Justin Chang > wrote: > >> Hi all, >> >> When would I ever use GPU computing for a finite element simulation where >> the limiting factor of performance is the memory bandwidth bound? Say I >> want to run problems similar to SNES ex12 and 62. I understand that there >> is an additional bandwidth associated with offloading data from the CPU to >> GPU but is there more to it? I recall reading through some email threads >> about GPU's potentially giving you a speed up of 3x that on a CPU but the >> gain in performance may not be worth the increase in time moving data >> around. 
> > > The main use case is if you are being forced to use a machine which has > GPUs. Then you can indeed get some benefit > from the larger bandwidth. You need a problem where you are doing a bunch > of iterations to make sending the initial data > down worth it. > > It would certainly be better if you are computing the action of your > operator directly on the GPU, but that is much more > disruptive to the code right now. > > Matt > > >> Thanks, >> Justin >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lvella at gmail.com Thu Mar 10 16:49:02 2016 From: lvella at gmail.com (Lucas Clemente Vella) Date: Thu, 10 Mar 2016 19:49:02 -0300 Subject: [petsc-users] Solving hollow matrices Message-ID: What PETSc setting of KSP and PC can be used to solve a linear system with matrix with null elements on the main diagonal? Such matrices arise on monolithic methods of solving Navier-Stokes equation. I understand it doesn't matter for the KSP method, right? I know that for lots of PC methods it won't work, like Jacobi or SOR. Is there any PC that can handle such matrices? For PC method that can't handle hollow matrices, are there any kind of treatment provided by PETSc that can be done to the linear system to ensure the matrix will have only non-null elements on the main diagonal? Maybe some kind of matrix reordering? -- Lucas Clemente Vella lvella at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Mar 10 16:58:57 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 10 Mar 2016 16:58:57 -0600 Subject: [petsc-users] Solving hollow matrices In-Reply-To: References: Message-ID: > On Mar 10, 2016, at 4:49 PM, Lucas Clemente Vella wrote: > > What PETSc setting of KSP and PC can be used to solve a linear system with matrix with null elements on the main diagonal? Such matrices arise on monolithic methods of solving Navier-Stokes equation. > > I understand it doesn't matter for the KSP method, right? Correct > > I know that for lots of PC methods it won't work, like Jacobi or SOR. Jacobi just uses a 1 on those entries > Is there any PC that can handle such matrices? > > For PC method that can't handle hollow matrices, are there any kind of treatment provided by PETSc that can be done to the linear system to ensure the matrix will have only non-null elements on the main diagonal? Maybe some kind of matrix reordering? If the nonzeros "naturally" belong on the diagonal, such as with a Stokes problem then PCFIELDSPLIT is exactly the right preconditioner to use. Take a look at its docs especially using the Schur complement style preconditioners which are often very good for Stokes problems. Barry > > -- > Lucas Clemente Vella > lvella at gmail.com From bhatiamanav at gmail.com Thu Mar 10 18:00:18 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Thu, 10 Mar 2016 18:00:18 -0600 Subject: [petsc-users] computations on split mpi comm Message-ID: <14DAE4DB-B08E-487F-9587-0398E05605DF@gmail.com> Hi, My interest is in running two separate KSP contexts on two subsets of the global mpi communicator context. Is there an example that demonstrates this? My intention is for the two subsets to have an overlap. 
For example, on a 4 processor global communicator (0, 1, 2, 3), one subset could be {0} and the second {0, 1, 2, 3}, or perhaps {0, 1} and {1, 2, 3}. Any guidance would be appreciated. Thanks, Manav From bsmith at mcs.anl.gov Thu Mar 10 19:10:16 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 10 Mar 2016 19:10:16 -0600 Subject: [petsc-users] computations on split mpi comm In-Reply-To: <14DAE4DB-B08E-487F-9587-0398E05605DF@gmail.com> References: <14DAE4DB-B08E-487F-9587-0398E05605DF@gmail.com> Message-ID: <80C15D34-25C2-4569-8ECB-1EBD9F71CED7@mcs.anl.gov> > On Mar 10, 2016, at 6:00 PM, Manav Bhatia wrote: > > Hi, > > My interest is in running two separate KSP contexts on two subsets of the global mpi communicator context. Is there an example that demonstrates this? No, but it is very simple. Create the two communicators with MPI_Comm_split() or some other mechanism and then create the matrix, vector, and solver objects for each communicator based on the sub communicator. > > My intention is for the two subsets to have an overlap. For example, on a 4 processor global communicator (0, 1, 2, 3), one subset could be {0} and the second {0, 1, 2, 3}, or perhaps {0, 1} and {1, 2, 3}. This can only work if the two solves are not run at the same time since KSPSolve() etc are blocking. You could not start the second solve on process 0 until the first one is complete. If you truly want to run them "at the same time" you need to use multiple threads on each process that shares the communicator (that is have two threads and each run with its sub communicator). Trying to do this is IMHO completely insane, better to use additional MPI processes and have no overlapping communicators. Barry > > Any guidance would be appreciated. > > Thanks, > Manav > > From petertututk at gmail.com Fri Mar 11 00:48:53 2016 From: petertututk at gmail.com (peter tutuk) Date: Fri, 11 Mar 2016 07:48:53 +0100 Subject: [petsc-users] asynchronous solve Message-ID: I am developing my own nonlinear solver and I would like to achieve asynchronous solve for subdomains. Is there any example around how to use PETSc in such a case? Or in general, is there a possibility to achieve desired behavior, while using PETSc ? best, peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 11 06:38:27 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 Mar 2016 06:38:27 -0600 Subject: [petsc-users] asynchronous solve In-Reply-To: References: Message-ID: On Fri, Mar 11, 2016 at 12:48 AM, peter tutuk wrote: > I am developing my own nonlinear solver and I would like to achieve > asynchronous solve for subdomains. > > Is there any example around how to use PETSc in such a case? Or in > general, is there a possibility to achieve desired behavior, while using > PETSc ? > If you can achieve what you want using MPI, then we can definitely do it. You will probably need to explain to us how you want the parallel updates to go. Thanks, Matt > best, > peter > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Mar 11 06:53:04 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 Mar 2016 06:53:04 -0600 Subject: [petsc-users] CPU vs GPU for PETSc applications In-Reply-To: References: Message-ID: On Thu, Mar 10, 2016 at 4:48 PM, Justin Chang wrote: > Matt, > > So what's an example of "doing a bunch of iterations to make sending the > initial datadown worth it"? Is there a correlation between that and > arithmetic intensity, where an application is likely to be more > compute-bound and memory-bandwidth bound? > Say 1) Send rhs and assume 0 initial guess 2) Perform k matvecs for some Krylov solver 3) Get back output vector the elephant in the room here is that I have not specified a preconditioner. It is here that GPUs have the most trouble, but if you can find a GPU PC that works, and you use a bunch of iterations before communicating back, you can realize the 2x speed benefit over a modern CPU. Thanks, Matt > Thanks, > Justin > > On Thu, Mar 10, 2016 at 2:50 PM, Matthew Knepley > wrote: > >> On Thu, Mar 10, 2016 at 12:29 PM, Justin Chang >> wrote: >> >>> Hi all, >>> >>> When would I ever use GPU computing for a finite element simulation >>> where the limiting factor of performance is the memory bandwidth bound? Say >>> I want to run problems similar to SNES ex12 and 62. I understand that there >>> is an additional bandwidth associated with offloading data from the CPU to >>> GPU but is there more to it? I recall reading through some email threads >>> about GPU's potentially giving you a speed up of 3x that on a CPU but the >>> gain in performance may not be worth the increase in time moving data >>> around. >> >> >> The main use case is if you are being forced to use a machine which has >> GPUs. Then you can indeed get some benefit >> from the larger bandwidth. You need a problem where you are doing a bunch >> of iterations to make sending the initial data >> down worth it. >> >> It would certainly be better if you are computing the action of your >> operator directly on the GPU, but that is much more >> disruptive to the code right now. >> >> Matt >> >> >>> Thanks, >>> Justin >>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Fri Mar 11 11:11:47 2016 From: popov at uni-mainz.de (anton) Date: Fri, 11 Mar 2016 18:11:47 +0100 Subject: [petsc-users] DMShellSetCreateRestriction Message-ID: <56E2FC53.1040709@uni-mainz.de> Hi team, I'm implementing staggered grid in a PETSc-canonical way, trying to build a custom DM object, attach it to SNES, that should later transfered it further to KSP and PC. Yet, the Galerking coarsening for staggered grid is non-symmetric. The question is how possible is it that DMShellSetCreateRestriction can be implemented and included in 3.7 release? Please, please. 
Thanks, Anton From dave.mayhem23 at gmail.com Fri Mar 11 12:26:45 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 11 Mar 2016 19:26:45 +0100 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: <56E2FC53.1040709@uni-mainz.de> References: <56E2FC53.1040709@uni-mainz.de> Message-ID: On 11 March 2016 at 18:11, anton wrote: > Hi team, > > I'm implementing staggered grid in a PETSc-canonical way, trying to build > a custom DM object, attach it to SNES, that should later transfered it > further to KSP and PC. > > Yet, the Galerking coarsening for staggered grid is non-symmetric. The > question is how possible is it that DMShellSetCreateRestriction can be > implemented and included in 3.7 release? > It's a little more work than just adding a new method within the DM and a new APIs for DMCreateRestriction() and DMShellSetCreateRestriction(). PCMG needs to be modified to call DMCreateRestriction(). > > Please, please. > > Thanks, > Anton > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 11 14:53:28 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 Mar 2016 14:53:28 -0600 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: References: <56E2FC53.1040709@uni-mainz.de> Message-ID: On Fri, Mar 11, 2016 at 12:26 PM, Dave May wrote: > On 11 March 2016 at 18:11, anton wrote: > >> Hi team, >> >> I'm implementing staggered grid in a PETSc-canonical way, trying to build >> a custom DM object, attach it to SNES, that should later transfered it >> further to KSP and PC. >> >> Yet, the Galerking coarsening for staggered grid is non-symmetric. The >> question is how possible is it that DMShellSetCreateRestriction can be >> implemented and included in 3.7 release? >> > > It's a little more work than just adding a new method within the DM and a > new APIs for DMCreateRestriction() and DMShellSetCreateRestriction(). > PCMG needs to be modified to call DMCreateRestriction(). > Dave is correct. Currently, PCMG only calls DMCreateInterpolation(). We would need to add a DMCreateRestriction() call. Thanks, Matt > Please, please. >> >> Thanks, >> Anton >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kaus at uni-mainz.de Fri Mar 11 15:39:04 2016 From: kaus at uni-mainz.de (Boris Kaus) Date: Fri, 11 Mar 2016 21:39:04 +0000 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: References: <56E2FC53.1040709@uni-mainz.de> Message-ID: <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> > On Mar 11, 2016, at 8:53 PM, Matthew Knepley wrote: > > On Fri, Mar 11, 2016 at 12:26 PM, Dave May > wrote: > On 11 March 2016 at 18:11, anton > wrote: > Hi team, > > I'm implementing staggered grid in a PETSc-canonical way, trying to build a custom DM object, attach it to SNES, that should later transfered it further to KSP and PC. > > Yet, the Galerking coarsening for staggered grid is non-symmetric. The question is how possible is it that DMShellSetCreateRestriction can be implemented and included in 3.7 release? > > It's a little more work than just adding a new method within the DM and a new APIs for DMCreateRestriction() and DMShellSetCreateRestriction(). > PCMG needs to be modified to call DMCreateRestriction(). > > Dave is correct. Currently, PCMG only calls DMCreateInterpolation(). 
We would need to add a DMCreateRestriction() call. The PCMG object already uses a restriction operator that is different from the interpolation parameter if it is specified with PCMGSetRestriction . For consistency, one would expect a similar DMCreateRestriction object, not? I realize that this is not relevant for FEM codes, but for staggered FD it makes quite some difference. Other suggestions on how to best integrate staggered finite differences within the current PETSc framework are ofcourse also highly welcome. Our current thinking was to pack it into a DMSHELL (which has the problem of not having a restriction interface). thanks, Boris > Thanks, > > Matt > > Please, please. > > Thanks, > Anton > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Mar 11 16:25:49 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 11 Mar 2016 16:25:49 -0600 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> References: <56E2FC53.1040709@uni-mainz.de> <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> Message-ID: Boris, We will add this support to the DMShell and its usage from PCMG within a few days. Barry > On Mar 11, 2016, at 3:39 PM, Boris Kaus wrote: > > >> On Mar 11, 2016, at 8:53 PM, Matthew Knepley wrote: >> >> On Fri, Mar 11, 2016 at 12:26 PM, Dave May wrote: >> On 11 March 2016 at 18:11, anton wrote: >> Hi team, >> >> I'm implementing staggered grid in a PETSc-canonical way, trying to build a custom DM object, attach it to SNES, that should later transfered it further to KSP and PC. >> >> Yet, the Galerking coarsening for staggered grid is non-symmetric. The question is how possible is it that DMShellSetCreateRestriction can be implemented and included in 3.7 release? >> >> It's a little more work than just adding a new method within the DM and a new APIs for DMCreateRestriction() and DMShellSetCreateRestriction(). >> PCMG needs to be modified to call DMCreateRestriction(). >> >> Dave is correct. Currently, PCMG only calls DMCreateInterpolation(). We would need to add a DMCreateRestriction() call. > The PCMG object already uses a restriction operator that is different from the interpolation parameter if it is specified with PCMGSetRestriction. > For consistency, one would expect a similar DMCreateRestriction object, not? I realize that this is not relevant for FEM codes, but for staggered FD it makes quite some difference. > > Other suggestions on how to best integrate staggered finite differences within the current PETSc framework are ofcourse also highly welcome. > Our current thinking was to pack it into a DMSHELL (which has the problem of not having a restriction interface). > > thanks, > Boris > > > > >> Thanks, >> >> Matt >> >> Please, please. >> >> Thanks, >> Anton >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener > From jed at jedbrown.org Fri Mar 11 07:10:50 2016 From: jed at jedbrown.org (Jed Brown) Date: Fri, 11 Mar 2016 13:10:50 +0000 Subject: [petsc-users] CPU vs GPU for PETSc applications In-Reply-To: References: Message-ID: <871t7htb1h.fsf@jedbrown.org> Justin Chang writes: > Matt, > > So what's an example of "doing a bunch of iterations to make sending the > initial datadown worth it"? CG/Jacobi for a high resolution problem. You pretty much have to have thrown in the towel on finding a good preconditioner, otherwise you'd be at risk of solving the problem too quickly. Some groups have shown acceptable multigrid performance, though it's a tough sell if you're paying for the coprocessor. One problem with the 3x bandwidth difference is that GPU algorithms often require temporaries or multiple passes over the date where a CPU would be able to do a single pass with little or no temporaries. In finite element computations, and also some sparse matrix operations, those intermediate quantities can more than squander the apparent bandwidth advantage. > Is there a correlation between that and arithmetic intensity, where an > application is likely to be more compute-bound and memory-bandwidth > bound? Not really because each iteration accesses the entire sparse matrix. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From kaus at uni-mainz.de Fri Mar 11 17:32:42 2016 From: kaus at uni-mainz.de (Boris Kaus) Date: Fri, 11 Mar 2016 23:32:42 +0000 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: References: <56E2FC53.1040709@uni-mainz.de> <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> Message-ID: <2B9FAF28-C3C7-4ECD-9036-51B1DA2A250B@uni-mainz.de> Thanks - that would be great! > On Mar 11, 2016, at 10:25 PM, Barry Smith wrote: > > > Boris, > > We will add this support to the DMShell and its usage from PCMG within a few days. > > Barry > >> On Mar 11, 2016, at 3:39 PM, Boris Kaus wrote: >> >> >>> On Mar 11, 2016, at 8:53 PM, Matthew Knepley wrote: >>> >>> On Fri, Mar 11, 2016 at 12:26 PM, Dave May wrote: >>> On 11 March 2016 at 18:11, anton wrote: >>> Hi team, >>> >>> I'm implementing staggered grid in a PETSc-canonical way, trying to build a custom DM object, attach it to SNES, that should later transfered it further to KSP and PC. >>> >>> Yet, the Galerking coarsening for staggered grid is non-symmetric. The question is how possible is it that DMShellSetCreateRestriction can be implemented and included in 3.7 release? >>> >>> It's a little more work than just adding a new method within the DM and a new APIs for DMCreateRestriction() and DMShellSetCreateRestriction(). >>> PCMG needs to be modified to call DMCreateRestriction(). >>> >>> Dave is correct. Currently, PCMG only calls DMCreateInterpolation(). We would need to add a DMCreateRestriction() call. >> The PCMG object already uses a restriction operator that is different from the interpolation parameter if it is specified with PCMGSetRestriction. >> For consistency, one would expect a similar DMCreateRestriction object, not? I realize that this is not relevant for FEM codes, but for staggered FD it makes quite some difference. >> >> Other suggestions on how to best integrate staggered finite differences within the current PETSc framework are ofcourse also highly welcome. >> Our current thinking was to pack it into a DMSHELL (which has the problem of not having a restriction interface). 
>> >> thanks, >> Boris >> >> >> >> >>> Thanks, >>> >>> Matt >>> >>> Please, please. >>> >>> Thanks, >>> Anton >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> > From dave.mayhem23 at gmail.com Sat Mar 12 00:24:08 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Sat, 12 Mar 2016 07:24:08 +0100 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> References: <56E2FC53.1040709@uni-mainz.de> <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> Message-ID: > Other suggestions on how to best integrate staggered finite differences > within the current PETSc framework are ofcourse also highly welcome. > Our current thinking was to pack it into a DMSHELL (which has the problem > of not having a restriction interface). > > Using DMShell is the cleanest approach. An alternative is to have you user code simply take control of all of the configuration of the PCMG object. E.g. you call your user code which creates the restriction operator, you pull out the PC and call PCMGSetRestriction() on etc. This can be done easily performed in the context of linear problems. For non-linear problems, you could jam this setup code inside your ComputeJacobian function. This is all possible, albeit clunky and kinda ugly. It works though if you need something before Barry adds the required support in PCMG. Cheers Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Sat Mar 12 03:09:46 2016 From: popov at uni-mainz.de (anton) Date: Sat, 12 Mar 2016 10:09:46 +0100 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: References: <56E2FC53.1040709@uni-mainz.de> <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> Message-ID: <56E3DCDA.5010407@uni-mainz.de> On 03/11/2016 11:25 PM, Barry Smith wrote: > Boris, > > We will add this support to the DMShell and its usage from PCMG within a few days. > > Barry > Tanks Barry. This is super-fast and very helpful. Cheers, Anton >> On Mar 11, 2016, at 3:39 PM, Boris Kaus wrote: >> >> >>> On Mar 11, 2016, at 8:53 PM, Matthew Knepley wrote: >>> >>> On Fri, Mar 11, 2016 at 12:26 PM, Dave May wrote: >>> On 11 March 2016 at 18:11, anton wrote: >>> Hi team, >>> >>> I'm implementing staggered grid in a PETSc-canonical way, trying to build a custom DM object, attach it to SNES, that should later transfered it further to KSP and PC. >>> >>> Yet, the Galerking coarsening for staggered grid is non-symmetric. The question is how possible is it that DMShellSetCreateRestriction can be implemented and included in 3.7 release? >>> >>> It's a little more work than just adding a new method within the DM and a new APIs for DMCreateRestriction() and DMShellSetCreateRestriction(). >>> PCMG needs to be modified to call DMCreateRestriction(). >>> >>> Dave is correct. Currently, PCMG only calls DMCreateInterpolation(). We would need to add a DMCreateRestriction() call. >> The PCMG object already uses a restriction operator that is different from the interpolation parameter if it is specified with PCMGSetRestriction. >> For consistency, one would expect a similar DMCreateRestriction object, not? I realize that this is not relevant for FEM codes, but for staggered FD it makes quite some difference. 
>> >> Other suggestions on how to best integrate staggered finite differences within the current PETSc framework are ofcourse also highly welcome. >> Our current thinking was to pack it into a DMSHELL (which has the problem of not having a restriction interface). >> >> thanks, >> Boris >> >> >> >> >>> Thanks, >>> >>> Matt >>> >>> Please, please. >>> >>> Thanks, >>> Anton >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener From kaus at uni-mainz.de Sat Mar 12 03:51:40 2016 From: kaus at uni-mainz.de (Boris Kaus) Date: Sat, 12 Mar 2016 09:51:40 +0000 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: References: <56E2FC53.1040709@uni-mainz.de> <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> Message-ID: <56E00AC9-41F7-4F13-8264-E7F792E24CB5@uni-mainz.de> > Other suggestions on how to best integrate staggered finite differences within the current PETSc framework are ofcourse also highly welcome. > Our current thinking was to pack it into a DMSHELL (which has the problem of not having a restriction interface). > > > Using DMShell is the cleanest approach. > > An alternative is to have you user code simply take control of all of the configuration of the PCMG object. E.g. you call your user code which creates the restriction operator, you pull out the PC and call PCMGSetRestriction() on etc. This can be done easily performed in the context of linear problems. For non-linear problems, you could jam this setup code inside your ComputeJacobian function. > > This is all possible, albeit clunky and kinda ugly. It works though if you need something before Barry adds the required support in PCMG. > that is indeed what we currently do, which does work. Yet, as you say, it is clunky and does not allow setting up things like FAS in an easy manner, or add new physics to the code in a straightforward manner. Having a cleaner interface would be really nice. Boris > Cheers > Dave > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlcch at dtu.dk Sat Mar 12 05:41:33 2016 From: mlcch at dtu.dk (Max la Cour Christensen) Date: Sat, 12 Mar 2016 11:41:33 +0000 Subject: [petsc-users] Using TS Message-ID: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> Hi guys, We are making preparations to implement adjoint based optimisation in our in-house oil and gas reservoir simulator. Currently our code uses PETSc's DMPlex, Vec, Mat, KSP and PC. We are still not using SNES and TS, but instead we have our own backward Euler and Newton-Raphson implementation. Due to the upcoming implementation of adjoints, we are considering changing the code and begin using TS and SNES. After examining the PETSc manual and examples, we are still not completely clear on how to apply TS to our system of PDEs. In a simplified formulation, it can be written as: \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) S_o + S_w = 1, where p is the pressure, \phi( p ) is a porosity function depending on pressure, \rho_x( p ) is a density function depending on pressure, S_o is the saturation of oil, S_g is the saturation of gas, t is time, F_x(p,S) is a function containing fluxes and source terms. The primary variables are p, S_o and S_w. We are using a lowest order Finite Volume discretisation. 
Now for implementing this in TS (with the prospect of later using TSAdjoint), we are not sure if we need all of the functions: TSSetIFunction, TSSetRHSFunction, TSSetIJacobian and TSSetRHSJacobian and what parts of the equations go where. Especially we are unsure of how to use the concept of a shifted jacobian (TSSetIJacobian). Any advice you could provide will be highly appreciated. Many thanks, Max la Cour Christensen PhD student, Technical University of Denmark From knepley at gmail.com Sat Mar 12 12:04:19 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 12 Mar 2016 12:04:19 -0600 Subject: [petsc-users] Using TS In-Reply-To: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> Message-ID: On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen wrote: > > Hi guys, > > We are making preparations to implement adjoint based optimisation in our > in-house oil and gas reservoir simulator. Currently our code uses PETSc's > DMPlex, Vec, Mat, KSP and PC. We are still not using SNES and TS, but > instead we have our own backward Euler and Newton-Raphson implementation. > Due to the upcoming implementation of adjoints, we are considering changing > the code and begin using TS and SNES. > > After examining the PETSc manual and examples, we are still not completely > clear on how to apply TS to our system of PDEs. In a simplified > formulation, it can be written as: > > \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) > \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) > S_o + S_w = 1, > > where p is the pressure, > \phi( p ) is a porosity function depending on pressure, > \rho_x( p ) is a density function depending on pressure, > S_o is the saturation of oil, > S_g is the saturation of gas, > t is time, > F_x(p,S) is a function containing fluxes and source terms. The primary > variables are p, S_o and S_w. > > We are using a lowest order Finite Volume discretisation. > > Now for implementing this in TS (with the prospect of later using > TSAdjoint), we are not sure if we need all of the functions: > TSSetIFunction, TSSetRHSFunction, TSSetIJacobian and TSSetRHSJacobian and > what parts of the equations go where. Especially we are unsure of how to > use the concept of a shifted jacobian (TSSetIJacobian). > > Any advice you could provide will be highly appreciated. > Barry and Emil, I am also interested in this, since I don't know how to do it. Thanks, Matt > Many thanks, > Max la Cour Christensen > PhD student, Technical University of Denmark > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Mar 12 15:19:19 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 12 Mar 2016 15:19:19 -0600 Subject: [petsc-users] Using TS In-Reply-To: References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> Message-ID: This is only a starting point, Jed and Emil can fix my mistakes and provide additional details. In your case you will not provide a TSSetRHSFunction and TSSetRHSJacobian since everything should be treated implicitly as a DAE. First move everything in the three equations to the left side and then differentiate through the \partial/\partial t so that it only applies to the S_o, S_w, and p. 
For example for the first equation using the product rule twice you get something like \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o \partial \rho_o( p ) \partial t + \rho_o( p ) S_o \partial \phi( p ) \partial t - F_o = 0 \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p ) \partial p \partial t - F_o = 0 The two vector arguments to your IFunction are exactly the S_o, S_w, and p and \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t so it is immediate to code up your IFunction once you have the analytic form above For the IJacobian and the "shift business" just remember that dF/dU means take the derivative of the IFunction with respect to S_o, S_w, and p treating the \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t as if they were independent of S_o, S_w, and p. For the dF/dU_t that means taking the derivate with respect to the \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t treating the S_o, S_w, and p as independent of \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t. Then you just need to form the sum of the two parts with the a "shift" scaling dF/dU + a*dF/dU_t For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = 1 dF/dp = 0 dF/d (\partial S_o)/\partial t = 0 (\partial S_w)/\partial t = 0 (\partial p)/\partial t = 0 Computations for the first two equations are messy but straightforward. For example for the first equation dF/dS_o = phi( p ) \rho_o'(p) \partial p \partial t + \rho_o( p ) \phi'( p ) \partial p + dF_o/dS_o and dF/d (\partial S_o)/\partial t) = \phi( p ) \rho_o( p ) Barry > On Mar 12, 2016, at 12:04 PM, Matthew Knepley wrote: > > On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen wrote: > > Hi guys, > > We are making preparations to implement adjoint based optimisation in our in-house oil and gas reservoir simulator. Currently our code uses PETSc's DMPlex, Vec, Mat, KSP and PC. We are still not using SNES and TS, but instead we have our own backward Euler and Newton-Raphson implementation. Due to the upcoming implementation of adjoints, we are considering changing the code and begin using TS and SNES. > > After examining the PETSc manual and examples, we are still not completely clear on how to apply TS to our system of PDEs. In a simplified formulation, it can be written as: > > \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) > \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) > S_o + S_w = 1, > > where p is the pressure, > \phi( p ) is a porosity function depending on pressure, > \rho_x( p ) is a density function depending on pressure, > S_o is the saturation of oil, > S_g is the saturation of gas, > t is time, > F_x(p,S) is a function containing fluxes and source terms. The primary variables are p, S_o and S_w. > > We are using a lowest order Finite Volume discretisation. > > Now for implementing this in TS (with the prospect of later using TSAdjoint), we are not sure if we need all of the functions: TSSetIFunction, TSSetRHSFunction, TSSetIJacobian and TSSetRHSJacobian and what parts of the equations go where. Especially we are unsure of how to use the concept of a shifted jacobian (TSSetIJacobian). > > Any advice you could provide will be highly appreciated. > > Barry and Emil, > > I am also interested in this, since I don't know how to do it. 
> > Thanks, > > Matt > > Many thanks, > Max la Cour Christensen > PhD student, Technical University of Denmark > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From knepley at gmail.com Sat Mar 12 15:38:43 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 12 Mar 2016 15:38:43 -0600 Subject: [petsc-users] Using TS In-Reply-To: References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> Message-ID: On Sat, Mar 12, 2016 at 3:19 PM, Barry Smith wrote: > > This is only a starting point, Jed and Emil can fix my mistakes and > provide additional details. > > In your case you will not provide a TSSetRHSFunction and > TSSetRHSJacobian since everything should be treated implicitly as a DAE. > > First move everything in the three equations to the left side and then > differentiate through the \partial/\partial t so that it only applies to > the S_o, S_w, and p. For example for the first equation using the product > rule twice you get something like > > \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o > \partial \rho_o( p ) \partial t + \rho_o( p ) S_o \partial \phi( p ) > \partial t - F_o = 0 > > \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o > \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p ) \partial p > \partial t - F_o = 0 > > The two vector arguments to your IFunction are exactly the S_o, S_w, and p > and \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ > \partial t so it is immediate to code up your IFunction once you have the > analytic form above > > For the IJacobian and the "shift business" just remember that dF/dU means > take the derivative of the IFunction with respect to S_o, S_w, and p > treating the \partial S_o/ \partial t , \partial S_w/ \partial t, and > \partial p/ \partial t as if they were independent of S_o, S_w, and p. For > the dF/dU_t that means taking the derivate with respect to the \partial > S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t > treating the S_o, S_w, and p as independent of \partial S_o/ \partial t , > \partial S_w/ \partial t, and \partial p/ \partial t. Then you just need > to form the sum of the two parts with the a "shift" scaling dF/dU + > a*dF/dU_t > > For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = 1 dF/dp = > 0 dF/d (\partial S_o)/\partial t = 0 (\partial S_w)/\partial t = 0 > (\partial p)/\partial t = 0 > > Computations for the first two equations are messy but straightforward. > For example for the first equation dF/dS_o = phi( p ) \rho_o'(p) \partial > p \partial t + \rho_o( p ) \phi'( p ) \partial p + dF_o/dS_o and dF/d > (\partial S_o)/\partial t) = \phi( p ) \rho_o( p ) Max and Stefan, You can see me trying to do this in a more generic sense here: https://bitbucket.org/petsc/petsc/src/f0b116324093eeda71fbbb2872e1bdb3483ad365/src/snes/utils/dmplexsnes.c?at=master&fileviewer=file-view-default#dmplexsnes.c-1678 This is intended to work with both FEM and FVM, which makes it look a little messier than I would like right now. Thanks, Matt > > Barry > > > On Mar 12, 2016, at 12:04 PM, Matthew Knepley wrote: > > > > On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen > wrote: > > > > Hi guys, > > > > We are making preparations to implement adjoint based optimisation in > our in-house oil and gas reservoir simulator. Currently our code uses > PETSc's DMPlex, Vec, Mat, KSP and PC. 
We are still not using SNES and TS, > but instead we have our own backward Euler and Newton-Raphson > implementation. Due to the upcoming implementation of adjoints, we are > considering changing the code and begin using TS and SNES. > > > > After examining the PETSc manual and examples, we are still not > completely clear on how to apply TS to our system of PDEs. In a simplified > formulation, it can be written as: > > > > \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) > > \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) > > S_o + S_w = 1, > > > > where p is the pressure, > > \phi( p ) is a porosity function depending on pressure, > > \rho_x( p ) is a density function depending on pressure, > > S_o is the saturation of oil, > > S_g is the saturation of gas, > > t is time, > > F_x(p,S) is a function containing fluxes and source terms. The primary > variables are p, S_o and S_w. > > > > We are using a lowest order Finite Volume discretisation. > > > > Now for implementing this in TS (with the prospect of later using > TSAdjoint), we are not sure if we need all of the functions: > TSSetIFunction, TSSetRHSFunction, TSSetIJacobian and TSSetRHSJacobian and > what parts of the equations go where. Especially we are unsure of how to > use the concept of a shifted jacobian (TSSetIJacobian). > > > > Any advice you could provide will be highly appreciated. > > > > Barry and Emil, > > > > I am also interested in this, since I don't know how to do it. > > > > Thanks, > > > > Matt > > > > Many thanks, > > Max la Cour Christensen > > PhD student, Technical University of Denmark > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From emconsta at mcs.anl.gov Sat Mar 12 20:34:25 2016 From: emconsta at mcs.anl.gov (Emil Constantinescu) Date: Sat, 12 Mar 2016 20:34:25 -0600 Subject: [petsc-users] Using TS In-Reply-To: References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> Message-ID: <56E4D1B1.4040901@mcs.anl.gov> I also find it useful to go through one of the simple examples available for TS: http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/index.html (ex8 may be a good start). As Barry suggested, you need to implement IFunction and IJacobian. The argument "u" is S_o, S_w, and p stacked together and "u_t" their corresponding time derivatives. The rest is calculus, but following an example usually helps a lot in the beginning. Out of curiosity, what is the application? Emil On 3/12/16 3:19 PM, Barry Smith wrote: > > This is only a starting point, Jed and Emil can fix my mistakes and provide additional details. > > > In your case you will not provide a TSSetRHSFunction and TSSetRHSJacobian since everything should be treated implicitly as a DAE. > > First move everything in the three equations to the left side and then differentiate through the \partial/\partial t so that it only applies to the S_o, S_w, and p. 
For example for the first equation using the product rule twice you get something like > > \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o \partial \rho_o( p ) \partial t + \rho_o( p ) S_o \partial \phi( p ) \partial t - F_o = 0 > > \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p ) \partial p \partial t - F_o = 0 > > The two vector arguments to your IFunction are exactly the S_o, S_w, and p and \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t so it is immediate to code up your IFunction once you have the analytic form above > > For the IJacobian and the "shift business" just remember that dF/dU means take the derivative of the IFunction with respect to S_o, S_w, and p treating the \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t as if they were independent of S_o, S_w, and p. For the dF/dU_t that means taking the derivate with respect to the \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t treating the S_o, S_w, and p as independent of \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t. Then you just need to form the sum of the two parts with the a "shift" scaling dF/dU + a*dF/dU_t > > For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = 1 dF/dp = 0 dF/d (\partial S_o)/\partial t = 0 (\partial S_w)/\partial t = 0 (\partial p)/\partial t = 0 > > Computations for the first two equations are messy but straightforward. For example for the first equation dF/dS_o = phi( p ) \rho_o'(p) \partial p \partial t + \rho_o( p ) \phi'( p ) \partial p + dF_o/dS_o and dF/d (\partial S_o)/\partial t) = \phi( p ) \rho_o( p ) > > > Barry > >> On Mar 12, 2016, at 12:04 PM, Matthew Knepley wrote: >> >> On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen wrote: >> >> Hi guys, >> >> We are making preparations to implement adjoint based optimisation in our in-house oil and gas reservoir simulator. Currently our code uses PETSc's DMPlex, Vec, Mat, KSP and PC. We are still not using SNES and TS, but instead we have our own backward Euler and Newton-Raphson implementation. Due to the upcoming implementation of adjoints, we are considering changing the code and begin using TS and SNES. >> >> After examining the PETSc manual and examples, we are still not completely clear on how to apply TS to our system of PDEs. In a simplified formulation, it can be written as: >> >> \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) >> \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) >> S_o + S_w = 1, >> >> where p is the pressure, >> \phi( p ) is a porosity function depending on pressure, >> \rho_x( p ) is a density function depending on pressure, >> S_o is the saturation of oil, >> S_g is the saturation of gas, >> t is time, >> F_x(p,S) is a function containing fluxes and source terms. The primary variables are p, S_o and S_w. >> >> We are using a lowest order Finite Volume discretisation. >> >> Now for implementing this in TS (with the prospect of later using TSAdjoint), we are not sure if we need all of the functions: TSSetIFunction, TSSetRHSFunction, TSSetIJacobian and TSSetRHSJacobian and what parts of the equations go where. Especially we are unsure of how to use the concept of a shifted jacobian (TSSetIJacobian). >> >> Any advice you could provide will be highly appreciated. >> >> Barry and Emil, >> >> I am also interested in this, since I don't know how to do it. 
>> >> Thanks, >> >> Matt >> >> Many thanks, >> Max la Cour Christensen >> PhD student, Technical University of Denmark >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From knepley at gmail.com Sat Mar 12 20:37:41 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 12 Mar 2016 20:37:41 -0600 Subject: [petsc-users] Using TS In-Reply-To: <56E4D1B1.4040901@mcs.anl.gov> References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> <56E4D1B1.4040901@mcs.anl.gov> Message-ID: On Sat, Mar 12, 2016 at 8:34 PM, Emil Constantinescu wrote: > I also find it useful to go through one of the simple examples available > for TS: > http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/index.html > (ex8 may be a good start). > > As Barry suggested, you need to implement IFunction and IJacobian. The > argument "u" is S_o, S_w, and p stacked together and "u_t" their > corresponding time derivatives. The rest is calculus, but following an > example usually helps a lot in the beginning. > Are you guys saying that IFunction and IJacobian are enough to do the adjoint system as well? Matt > Out of curiosity, what is the application? > > Emil > > > On 3/12/16 3:19 PM, Barry Smith wrote: > >> >> This is only a starting point, Jed and Emil can fix my mistakes and >> provide additional details. >> >> >> In your case you will not provide a TSSetRHSFunction and >> TSSetRHSJacobian since everything should be treated implicitly as a DAE. >> >> First move everything in the three equations to the left side and >> then differentiate through the \partial/\partial t so that it only applies >> to the S_o, S_w, and p. For example for the first equation using the >> product rule twice you get something like >> >> \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o >> \partial \rho_o( p ) \partial t + \rho_o( p ) S_o \partial \phi( p ) >> \partial t - F_o = 0 >> >> \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o >> \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p ) \partial p >> \partial t - F_o = 0 >> >> The two vector arguments to your IFunction are exactly the S_o, S_w, and >> p and \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial >> p/ \partial t so it is immediate to code up your IFunction once you have >> the analytic form above >> >> For the IJacobian and the "shift business" just remember that dF/dU means >> take the derivative of the IFunction with respect to S_o, S_w, and p >> treating the \partial S_o/ \partial t , \partial S_w/ \partial t, and >> \partial p/ \partial t as if they were independent of S_o, S_w, and p. For >> the dF/dU_t that means taking the derivate with respect to the \partial >> S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t >> treating the S_o, S_w, and p as independent of \partial S_o/ \partial t , >> \partial S_w/ \partial t, and \partial p/ \partial t. Then you just need >> to form the sum of the two parts with the a "shift" scaling dF/dU + >> a*dF/dU_t >> >> For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = 1 dF/dp >> = 0 dF/d (\partial S_o)/\partial t = 0 (\partial S_w)/\partial t = 0 >> (\partial p)/\partial t = 0 >> >> Computations for the first two equations are messy but straightforward. 
>> For example for the first equation dF/dS_o = phi( p ) \rho_o'(p) \partial >> p \partial t + \rho_o( p ) \phi'( p ) \partial p + dF_o/dS_o and dF/d >> (\partial S_o)/\partial t) = \phi( p ) \rho_o( p ) >> >> >> Barry >> >> On Mar 12, 2016, at 12:04 PM, Matthew Knepley wrote: >>> >>> On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen >>> wrote: >>> >>> Hi guys, >>> >>> We are making preparations to implement adjoint based optimisation in >>> our in-house oil and gas reservoir simulator. Currently our code uses >>> PETSc's DMPlex, Vec, Mat, KSP and PC. We are still not using SNES and TS, >>> but instead we have our own backward Euler and Newton-Raphson >>> implementation. Due to the upcoming implementation of adjoints, we are >>> considering changing the code and begin using TS and SNES. >>> >>> After examining the PETSc manual and examples, we are still not >>> completely clear on how to apply TS to our system of PDEs. In a simplified >>> formulation, it can be written as: >>> >>> \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) >>> \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) >>> S_o + S_w = 1, >>> >>> where p is the pressure, >>> \phi( p ) is a porosity function depending on pressure, >>> \rho_x( p ) is a density function depending on pressure, >>> S_o is the saturation of oil, >>> S_g is the saturation of gas, >>> t is time, >>> F_x(p,S) is a function containing fluxes and source terms. The primary >>> variables are p, S_o and S_w. >>> >>> We are using a lowest order Finite Volume discretisation. >>> >>> Now for implementing this in TS (with the prospect of later using >>> TSAdjoint), we are not sure if we need all of the functions: >>> TSSetIFunction, TSSetRHSFunction, TSSetIJacobian and TSSetRHSJacobian and >>> what parts of the equations go where. Especially we are unsure of how to >>> use the concept of a shifted jacobian (TSSetIJacobian). >>> >>> Any advice you could provide will be highly appreciated. >>> >>> Barry and Emil, >>> >>> I am also interested in this, since I don't know how to do it. >>> >>> Thanks, >>> >>> Matt >>> >>> Many thanks, >>> Max la Cour Christensen >>> PhD student, Technical University of Denmark >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From emconsta at mcs.anl.gov Sat Mar 12 21:02:54 2016 From: emconsta at mcs.anl.gov (Emil Constantinescu) Date: Sat, 12 Mar 2016 21:02:54 -0600 Subject: [petsc-users] Using TS In-Reply-To: References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> <56E4D1B1.4040901@mcs.anl.gov> Message-ID: <56E4D85E.6060102@mcs.anl.gov> On 3/12/16 8:37 PM, Matthew Knepley wrote: > On Sat, Mar 12, 2016 at 8:34 PM, Emil Constantinescu > > wrote: > > I also find it useful to go through one of the simple examples > available for TS: > http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/index.html > (ex8 may be a good start). > > As Barry suggested, you need to implement IFunction and IJacobian. > The argument "u" is S_o, S_w, and p stacked together and "u_t" > their corresponding time derivatives. 
The rest is calculus, but > following an example usually helps a lot in the beginning. > > > Are you guys saying that IFunction and IJacobian are enough to do the > adjoint system as well? > Pretty much yes, but it depends on the cost function. This is the beauty of discrete adjoints - if you have the Jacobian (transpose, done internally through KSP) you're done. You need IJacobian for sure to do the backward propagation. If you have that, the rest is usually trivial. Mr. Hong Zhang (my postdoc) set up quite a few simple examples. Emil > Matt > > Out of curiosity, what is the application? > > Emil > > > On 3/12/16 3:19 PM, Barry Smith wrote: > > > This is only a starting point, Jed and Emil can fix my > mistakes and provide additional details. > > > In your case you will not provide a TSSetRHSFunction and > TSSetRHSJacobian since everything should be treated implicitly > as a DAE. > > First move everything in the three equations to the left > side and then differentiate through the \partial/\partial t so > that it only applies to the S_o, S_w, and p. For example for the > first equation using the product rule twice you get something like > > \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) > S_o \partial \rho_o( p ) \partial t + \rho_o( p ) S_o > \partial \phi( p ) \partial t - F_o = 0 > > \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) > S_o \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p > ) \partial p \partial t - F_o = 0 > > The two vector arguments to your IFunction are exactly the S_o, > S_w, and p and \partial S_o/ \partial t , \partial S_w/ > \partial t, and \partial p/ \partial t so it is immediate to > code up your IFunction once you have the analytic form above > > For the IJacobian and the "shift business" just remember that > dF/dU means take the derivative of the IFunction with respect to > S_o, S_w, and p treating the \partial S_o/ \partial t , > \partial S_w/ \partial t, and \partial p/ \partial t as if they > were independent of S_o, S_w, and p. For the dF/dU_t that means > taking the derivate with respect to the \partial S_o/ \partial t > , \partial S_w/ \partial t, and \partial p/ \partial t > treating the S_o, S_w, and p as independent of \partial S_o/ > \partial t , \partial S_w/ \partial t, and \partial p/ > \partial t. Then you just need to form the sum of the two > parts with the a "shift" scaling dF/dU + a*dF/dU_t > > For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = > 1 dF/dp = 0 dF/d (\partial S_o)/\partial t = 0 (\partial > S_w)/\partial t = 0 (\partial p)/\partial t = 0 > > Computations for the first two equations are messy but > straightforward. For example for the first equation dF/dS_o = > phi( p ) \rho_o'(p) \partial p \partial t + \rho_o( p ) \phi'( > p ) \partial p + dF_o/dS_o and dF/d (\partial S_o)/\partial t) > = \phi( p ) \rho_o( p ) > > > Barry > > On Mar 12, 2016, at 12:04 PM, Matthew Knepley > > wrote: > > On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen > > wrote: > > Hi guys, > > We are making preparations to implement adjoint based > optimisation in our in-house oil and gas reservoir > simulator. Currently our code uses PETSc's DMPlex, Vec, Mat, > KSP and PC. We are still not using SNES and TS, but instead > we have our own backward Euler and Newton-Raphson > implementation. Due to the upcoming implementation of > adjoints, we are considering changing the code and begin > using TS and SNES. 
> > After examining the PETSc manual and examples, we are still > not completely clear on how to apply TS to our system of > PDEs. In a simplified formulation, it can be written as: > > \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) > \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) > S_o + S_w = 1, > > where p is the pressure, > \phi( p ) is a porosity function depending on pressure, > \rho_x( p ) is a density function depending on pressure, > S_o is the saturation of oil, > S_g is the saturation of gas, > t is time, > F_x(p,S) is a function containing fluxes and source terms. > The primary variables are p, S_o and S_w. > > We are using a lowest order Finite Volume discretisation. > > Now for implementing this in TS (with the prospect of later > using TSAdjoint), we are not sure if we need all of the > functions: TSSetIFunction, TSSetRHSFunction, TSSetIJacobian > and TSSetRHSJacobian and what parts of the equations go > where. Especially we are unsure of how to use the concept of > a shifted jacobian (TSSetIJacobian). > > Any advice you could provide will be highly appreciated. > > Barry and Emil, > > I am also interested in this, since I don't know how to do it. > > Thanks, > > Matt > > Many thanks, > Max la Cour Christensen > PhD student, Technical University of Denmark > > > > -- > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > results to which their experiments lead. > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From davydden at gmail.com Mon Mar 14 10:55:59 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 14 Mar 2016 16:55:59 +0100 Subject: [petsc-users] build with Scalapack/Blacs from MKL (failed to compile a test) Message-ID: <8CFAFB21-445B-44DB-ACE5-9961D9A3ECC8@gmail.com> Dear all, What is the correct way to build PETSc with Scalapack/Blacs from MKL? I tried: ?with-scalapack-lib=/path/to/mkl/lib/intel64/libmkl_scalapack_lp64.so /path/to/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so ?with-scalapack-include=/path/to/mkl/include but PETSc config fails to compile a test program and gives a lot of ?undefined reference to ?ompi_mpi_real?, ?MPI_Allgather?, ?MPI_Cart_create?? etc from libmkl_blacs_openmpi_lp64.so. Strangely enough I do see ?-lmpi_f90 -lmpi_f77 -lmpi? in the compiler line, so i don?t know why these symbols are missing. Also MPI wrapper is used (/usr/bin/mpicc). I already built MUMPS without a problem and even tests run ok, so I am quite puzzled whether it?s something related to PETSc or MKL. p.s. I use OpenMPI provided by Ubuntu. Kind regards, Denis From balay at mcs.anl.gov Mon Mar 14 10:59:38 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 14 Mar 2016 10:59:38 -0500 Subject: [petsc-users] build with Scalapack/Blacs from MKL (failed to compile a test) In-Reply-To: <8CFAFB21-445B-44DB-ACE5-9961D9A3ECC8@gmail.com> References: <8CFAFB21-445B-44DB-ACE5-9961D9A3ECC8@gmail.com> Message-ID: please send corresponding configure.log Satish On Mon, 14 Mar 2016, Denis Davydov wrote: > Dear all, > > What is the correct way to build PETSc with Scalapack/Blacs from MKL? 
> I tried: > > ?with-scalapack-lib=/path/to/mkl/lib/intel64/libmkl_scalapack_lp64.so /path/to/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so > ?with-scalapack-include=/path/to/mkl/include > > but PETSc config fails to compile a test program and gives a lot of > ?undefined reference to ?ompi_mpi_real?, ?MPI_Allgather?, ?MPI_Cart_create?? etc from libmkl_blacs_openmpi_lp64.so. > Strangely enough I do see ?-lmpi_f90 -lmpi_f77 -lmpi? in the compiler line, so i don?t know why these symbols are missing. > Also MPI wrapper is used (/usr/bin/mpicc). > > I already built MUMPS without a problem and even tests run ok, so I am quite puzzled whether it?s something related to PETSc or MKL. > > p.s. I use OpenMPI provided by Ubuntu. > > Kind regards, > Denis > > From davydden at gmail.com Mon Mar 14 11:03:25 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 14 Mar 2016 17:03:25 +0100 Subject: [petsc-users] build with Scalapack/Blacs from MKL (failed to compile a test) In-Reply-To: References: <8CFAFB21-445B-44DB-ACE5-9961D9A3ECC8@gmail.com> Message-ID: Hi Satish, Please find the log attached. -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log.zip Type: application/zip Size: 242459 bytes Desc: not available URL: -------------- next part -------------- Kind regards, Denis > On 14 Mar 2016, at 16:59, Satish Balay wrote: > > please send corresponding configure.log > > Satish > > On Mon, 14 Mar 2016, Denis Davydov wrote: > >> Dear all, >> >> What is the correct way to build PETSc with Scalapack/Blacs from MKL? >> I tried: >> >> ?with-scalapack-lib=/path/to/mkl/lib/intel64/libmkl_scalapack_lp64.so /path/to/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so >> ?with-scalapack-include=/path/to/mkl/include >> >> but PETSc config fails to compile a test program and gives a lot of >> ?undefined reference to ?ompi_mpi_real?, ?MPI_Allgather?, ?MPI_Cart_create?? etc from libmkl_blacs_openmpi_lp64.so. >> Strangely enough I do see ?-lmpi_f90 -lmpi_f77 -lmpi? in the compiler line, so i don?t know why these symbols are missing. >> Also MPI wrapper is used (/usr/bin/mpicc). >> >> I already built MUMPS without a problem and even tests run ok, so I am quite puzzled whether it?s something related to PETSc or MKL. >> >> p.s. I use OpenMPI provided by Ubuntu. >> >> Kind regards, >> Denis >> >> From balay at mcs.anl.gov Mon Mar 14 11:22:08 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 14 Mar 2016 11:22:08 -0500 Subject: [petsc-users] build with Scalapack/Blacs from MKL (failed to compile a test) In-Reply-To: References: <8CFAFB21-445B-44DB-ACE5-9961D9A3ECC8@gmail.com> Message-ID: I do not know why this is hapenning. I just tried to reproduce - but it ran fine for me. balay at es^/scratch/balay/petsc(master=) $ ./configure --with-cc=mpicc.openmpi --with-cxx=mpicxx.openmpi --with-fc=mpif90.openmpi --with-blas-lapack-dir=$MKL_HOME --with-scalapack-include=$MKL_HOME/include --with-scalapack-lib="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64" Is this issue reproduceable if you just configure with scalapack [as above?] - and not any other package? What if you try using --download-openmpi - instead of ubuntu openmpi? Sorry - I don't have a good answer for this issue.. Satish On Mon, 14 Mar 2016, Denis Davydov wrote: > Hi Satish, > > Please find the log attached. 
> > > Kind regards, > Denis > > > On 14 Mar 2016, at 16:59, Satish Balay wrote: > > > > please send corresponding configure.log > > > > Satish > > > > On Mon, 14 Mar 2016, Denis Davydov wrote: > > > >> Dear all, > >> > >> What is the correct way to build PETSc with Scalapack/Blacs from MKL? > >> I tried: > >> > >> ?with-scalapack-lib=/path/to/mkl/lib/intel64/libmkl_scalapack_lp64.so /path/to/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so > >> ?with-scalapack-include=/path/to/mkl/include > >> > >> but PETSc config fails to compile a test program and gives a lot of > >>?????? ?undefined reference to ?ompi_mpi_real?, ?MPI_Allgather?, ?MPI_Cart_create?? etc from libmkl_blacs_openmpi_lp64.so. > >> Strangely enough I do see ?-lmpi_f90 -lmpi_f77 -lmpi? in the compiler line, so i don?t know why these symbols are missing. > >> Also MPI wrapper is used (/usr/bin/mpicc). > >> > >> I already built MUMPS without a problem and even tests run ok, so I am quite puzzled whether it?s something related to PETSc or MKL. > >> > >> p.s. I use OpenMPI provided by Ubuntu. > >> > >> Kind regards, > >> Denis > >> > >> > > > From davydden at gmail.com Mon Mar 14 11:52:23 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 14 Mar 2016 17:52:23 +0100 Subject: [petsc-users] build with Scalapack/Blacs from MKL (failed to compile a test) In-Reply-To: References: <8CFAFB21-445B-44DB-ACE5-9961D9A3ECC8@gmail.com> Message-ID: > On 14 Mar 2016, at 17:22, Satish Balay wrote: > > I do not know why this is hapenning. I just tried to reproduce - but it ran fine for me. > > balay at es^/scratch/balay/petsc(master=) $ ./configure --with-cc=mpicc.openmpi --with-cxx=mpicxx.openmpi --with-fc=mpif90.openmpi --with-blas-lapack-dir=$MKL_HOME --with-scalapack-include=$MKL_HOME/include --with-scalapack-lib="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64" > > > Is this issue reproduceable if you just configure with scalapack [as above?] - and not any other package? I tried with the same simple setup, but no luck. In Ubuntu OpenMPI is 1.6.5. > > What if you try using --download-openmpi - instead of ubuntu openmpi? did that, 1.8.5 was fetched. But this did not change the result. Very strange... > > Sorry - I don't have a good answer for this issue.. Could this be related to the usage of GNU compilers? But since MUMPS linked and tests run, i don?t see how this is possible. Regards, Denis. > > Satish > > > On Mon, 14 Mar 2016, Denis Davydov wrote: > >> Hi Satish, >> >> Please find the log attached. >> >> >> Kind regards, >> Denis >> >>> On 14 Mar 2016, at 16:59, Satish Balay wrote: >>> >>> please send corresponding configure.log >>> >>> Satish >>> >>> On Mon, 14 Mar 2016, Denis Davydov wrote: >>> >>>> Dear all, >>>> >>>> What is the correct way to build PETSc with Scalapack/Blacs from MKL? >>>> I tried: >>>> >>>> ?with-scalapack-lib=/path/to/mkl/lib/intel64/libmkl_scalapack_lp64.so /path/to/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so >>>> ?with-scalapack-include=/path/to/mkl/include >>>> >>>> but PETSc config fails to compile a test program and gives a lot of >>>> ?undefined reference to ?ompi_mpi_real?, ?MPI_Allgather?, ?MPI_Cart_create?? etc from libmkl_blacs_openmpi_lp64.so. >>>> Strangely enough I do see ?-lmpi_f90 -lmpi_f77 -lmpi? in the compiler line, so i don?t know why these symbols are missing. >>>> Also MPI wrapper is used (/usr/bin/mpicc). >>>> >>>> I already built MUMPS without a problem and even tests run ok, so I am quite puzzled whether it?s something related to PETSc or MKL. >>>> >>>> p.s. 
I use OpenMPI provided by Ubuntu. >>>> >>>> Kind regards, >>>> Denis >>>> >>>> >> >> >> From balay at mcs.anl.gov Mon Mar 14 11:55:07 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 14 Mar 2016 11:55:07 -0500 Subject: [petsc-users] build with Scalapack/Blacs from MKL (failed to compile a test) In-Reply-To: References: <8CFAFB21-445B-44DB-ACE5-9961D9A3ECC8@gmail.com> Message-ID: My trial was with default openmpi on ubuntu 12.04 [with gcc] I'll recommend using --download-scalapack. Not sure how much improvement intel scalapack will show for mumps.. Satish On Mon, 14 Mar 2016, Denis Davydov wrote: > > > On 14 Mar 2016, at 17:22, Satish Balay wrote: > > > > I do not know why this is hapenning. I just tried to reproduce - but it ran fine for me. > > > > balay at es^/scratch/balay/petsc(master=) $ ./configure --with-cc=mpicc.openmpi --with-cxx=mpicxx.openmpi --with-fc=mpif90.openmpi --with-blas-lapack-dir=$MKL_HOME --with-scalapack-include=$MKL_HOME/include --with-scalapack-lib="-L$MKL_HOME/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64" > > > > > > Is this issue reproduceable if you just configure with scalapack [as above?] - and not any other package? > > I tried with the same simple setup, but no luck. In Ubuntu OpenMPI is 1.6.5. > > > > > What if you try using --download-openmpi - instead of ubuntu openmpi? > > did that, 1.8.5 was fetched. But this did not change the result. > Very strange... > > > > > Sorry - I don't have a good answer for this issue.. > > Could this be related to the usage of GNU compilers? > But since MUMPS linked and tests run, i don?t see how this is possible. > > Regards, > Denis. > > > > > > Satish > > > > > > On Mon, 14 Mar 2016, Denis Davydov wrote: > > > >> Hi Satish, > >> > >> Please find the log attached. > >> > >> > >> Kind regards, > >> Denis > >> > >>> On 14 Mar 2016, at 16:59, Satish Balay wrote: > >>> > >>> please send corresponding configure.log > >>> > >>> Satish > >>> > >>> On Mon, 14 Mar 2016, Denis Davydov wrote: > >>> > >>>> Dear all, > >>>> > >>>> What is the correct way to build PETSc with Scalapack/Blacs from MKL? > >>>> I tried: > >>>> > >>>> ?with-scalapack-lib=/path/to/mkl/lib/intel64/libmkl_scalapack_lp64.so /path/to/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so > >>>> ?with-scalapack-include=/path/to/mkl/include > >>>> > >>>> but PETSc config fails to compile a test program and gives a lot of > >>>> ?undefined reference to ?ompi_mpi_real?, ?MPI_Allgather?, ?MPI_Cart_create?? etc from libmkl_blacs_openmpi_lp64.so. > >>>> Strangely enough I do see ?-lmpi_f90 -lmpi_f77 -lmpi? in the compiler line, so i don?t know why these symbols are missing. > >>>> Also MPI wrapper is used (/usr/bin/mpicc). > >>>> > >>>> I already built MUMPS without a problem and even tests run ok, so I am quite puzzled whether it?s something related to PETSc or MKL. > >>>> > >>>> p.s. I use OpenMPI provided by Ubuntu. > >>>> > >>>> Kind regards, > >>>> Denis > >>>> > >>>> > >> > >> > >> > > From chih-hao.chen2 at mail.mcgill.ca Mon Mar 14 12:58:34 2016 From: chih-hao.chen2 at mail.mcgill.ca (Chih-Hao Chen) Date: Mon, 14 Mar 2016 17:58:34 +0000 Subject: [petsc-users] Questions about ASM in petsc4py In-Reply-To: References: Message-ID: Hell Matt, Thanks for the information. I am still trying it now. But I observed an interesting thing about the performance difference when running ex8.c about the ASM. When using Mvapich2 for mph, its convergence speed is much faster than OpenMPI. Is it becasue the ASM function has been optimized based on Mvapich2? 
Thanks very much. Best, Chih-Hao On Mar 10, 2016, at 4:28 PM, Matthew Knepley > wrote: On Thu, Mar 10, 2016 at 3:23 PM, Chih-Hao Chen > wrote: Hello PETSc members, Sorry for asking for help about the ASM in petsc4py. Currently I am using your ASM as my preconditioned in my solver. I know how to setup the PCASM based on the ex8.c in the following link. http://www.mcs.anl.gov/petsc/petsc-3.4/src/ksp/ksp/examples/tutorials/ex8.c But when using the function ?getASMSubKSP? in petsc4py, I have tried several methods, but still cannot get the subksp from each mpi. The subKSPs do not have to do with MPI. On each rank, you get the local KSPs. Here is the snippet of the code of the function. def getASMSubKSP(self): cdef PetscInt i = 0, n = 0 cdef PetscKSP *p = NULL CHKERR( PCASMGetSubKSP(self.pc, &n, NULL, &p) ) return [ref_KSP(p[i]) for i from 0 <= i From knepley at gmail.com Mon Mar 14 14:02:14 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 14 Mar 2016 14:02:14 -0500 Subject: [petsc-users] Questions about ASM in petsc4py In-Reply-To: References: Message-ID: On Mon, Mar 14, 2016 at 12:58 PM, Chih-Hao Chen < chih-hao.chen2 at mail.mcgill.ca> wrote: > Hell Matt, > > Thanks for the information. > I am still trying it now. > But I observed an interesting thing about the performance difference when > running ex8.c about the ASM. > When using Mvapich2 for mph, its convergence speed is much faster than > OpenMPI. > Is it becasue the ASM function has been optimized based on Mvapich2? > No, it is probably because you have a different partition on the two cases. For performance questions, we need to see the output of -log_summary for each case. Thanks, Matt > Thanks very much. > > > Best, > Chih-Hao > > On Mar 10, 2016, at 4:28 PM, Matthew Knepley wrote: > > On Thu, Mar 10, 2016 at 3:23 PM, Chih-Hao Chen < > chih-hao.chen2 at mail.mcgill.ca> wrote: > >> Hello PETSc members, >> >> >> Sorry for asking for help about the ASM in petsc4py. >> Currently I am using your ASM as my preconditioned in my solver. >> I know how to setup the PCASM based on the ex8.c in the following link. >> >> http://www.mcs.anl.gov/petsc/petsc-3.4/src/ksp/ksp/examples/tutorials/ex8.c >> >> But when using the function ?getASMSubKSP? in petsc4py, >> I have tried several methods, but still cannot get the subksp from each >> mpi. >> > > The subKSPs do not have to do with MPI. On each rank, you get the local > KSPs. > > >> Here is the snippet of the code of the function. >> >> def getASMSubKSP(self): cdef PetscInt i = 0, n = 0 cdef PetscKSP *p = NULL CHKERR( PCASMGetSubKSP(self.pc, &n, NULL, &p) ) return [ref_KSP(p[i]) for i from 0 <= i > >> >> In ex8.c, we could use a ?FOR? loop to access the indiivdual subksp. >> But in python, could I use ?FOR? loop to get all the subksps with: >> >> subksp[i] = pc.getASMSubKSP[i] >> > > I do not understand this, but I think the answer is no. > > >> Another question is in ex8.c, it seems I don?t need to do any setup to >> decompose the RHS vector. >> But do I need to decompose the RHS vector with any settings if I don?t >> use PETSc solvers but with your preconditioners? >> > > I do not understand what you mean by "decompose the RHS vector". > > Thanks, > > Matt > > >> Thanks very much. >> >> Best, >> Chih-Hao >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
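For reference, the C-level pattern from ex8.c that the petsc4py getASMSubKSP method shown above wraps could be sketched as below; the solver choices are illustrative and it assumes the KSP already has its operators set. In petsc4py the analogue is simply to iterate over the list that pc.getASMSubKSP() returns.

#include <petscksp.h>

/* Sketch: configure the rank-local ASM sub-solvers, following ex8.c.
   Each MPI rank only sees its own nlocal subdomain KSPs. */
static PetscErrorCode CustomizeASMSubSolvers(KSP ksp)
{
  PC        pc, subpc;
  KSP      *subksp;            /* array of local sub-KSPs, owned by the PC   */
  PetscInt  i, nlocal, first;  /* number of local blocks, index of the first */

  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCASM);
  KSPSetUp(ksp);               /* the sub-KSPs exist only after setup        */
  PCASMGetSubKSP(pc, &nlocal, &first, &subksp);
  for (i = 0; i < nlocal; i++) {
    KSPSetType(subksp[i], KSPGMRES);
    KSPGetPC(subksp[i], &subpc);
    PCSetType(subpc, PCILU);
  }
  return 0;
}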
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.tadeu at gmail.com Mon Mar 14 15:22:35 2016 From: e.tadeu at gmail.com (E. Tadeu) Date: Mon, 14 Mar 2016 17:22:35 -0300 Subject: [petsc-users] Build PETSc as shared library on Windows ("traditional" build or CMake) Message-ID: Hi, What is the current status of building PETSc as a shared library on Windows? It seems non-trivial: `--with-shared-libraries=1` won't work, since Cygwin's `ld` fails, and `win32fe` also fails. Also, what is the status of building PETSc with CMake on Windows? Perhaps through using CMake it would be easier to generate the .DLL's, but I couldn't find documentation on how to do this. Thanks! Edson -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Mar 14 16:31:51 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 14 Mar 2016 16:31:51 -0500 Subject: [petsc-users] Build PETSc as shared library on Windows ("traditional" build or CMake) In-Reply-To: References: Message-ID: you could try precompiled petsc from http://www.msic.ch/Downloads/Software [its old 3.5 version though] We don't have any changes wrt .dlls or cmake on windows.. Satish On Mon, 14 Mar 2016, E. Tadeu wrote: > Hi, > > What is the current status of building PETSc as a shared library on > Windows? It seems non-trivial: `--with-shared-libraries=1` won't work, > since Cygwin's `ld` fails, and `win32fe` also fails. > > Also, what is the status of building PETSc with CMake on Windows? Perhaps > through using CMake it would be easier to generate the .DLL's, but I > couldn't find documentation on how to do this. > > > Thanks! > Edson > From mrosso at uci.edu Mon Mar 14 20:10:42 2016 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 14 Mar 2016 18:10:42 -0700 Subject: [petsc-users] PETSc without debugging Message-ID: <56E76112.1090900@uci.edu> Hi, I compiled the development branch of PETSc with "--with-debugging=0", but log_view warns me that I am running in debugging mode: ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## I attached the full output of log_view: it seems that I configured PETSc correctly. Can I safely ignore this or I am missing something? Thanks, Michele -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log.txt URL: From knepley at gmail.com Mon Mar 14 20:26:26 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 14 Mar 2016 20:26:26 -0500 Subject: [petsc-users] PETSc without debugging In-Reply-To: <56E76112.1090900@uci.edu> References: <56E76112.1090900@uci.edu> Message-ID: On Mon, Mar 14, 2016 at 8:10 PM, Michele Rosso wrote: > Hi, > > I compiled the development branch of PETSc with "--with-debugging=0", but > log_view warns me that I am running in debugging mode: > > ########################################################## > # # > # WARNING!!! 
# > # # > # This code was compiled with a debugging option, # > # To get timing results run ./configure # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. # > # # > ########################################################## > > I attached the full output of log_view: it seems that I configured PETSc > correctly. > Can I safely ignore this or I am missing something? > >From the log you sent: Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-has-attribute-aligned=1 --prefix=/u/sciteam/mrosso/libs/petsc/gnu/4.9/opt --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib=/opt/acml/5.3.1/gfortran64/lib/libacml.a --COPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC" --FOPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC" --CXXOPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC" --with-x="0 " --with-debugging=O --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --download-hypre=1 --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " PETSC_ARCH=gnu-opt-32idx It looks like you have set --with-debugging=O not --with-debugging=0 Thanks, Matt > Thanks, > Michele > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrosso at uci.edu Mon Mar 14 20:37:41 2016 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 14 Mar 2016 18:37:41 -0700 Subject: [petsc-users] PETSc without debugging In-Reply-To: References: <56E76112.1090900@uci.edu> Message-ID: <56E76765.4090701@uci.edu> Matt, thank you very much. I apologize for this idiotic mistake. Michele On 03/14/2016 06:26 PM, Matthew Knepley wrote: > On Mon, Mar 14, 2016 at 8:10 PM, Michele Rosso > wrote: > > Hi, > > I compiled the development branch of PETSc with > "--with-debugging=0", but log_view warns me that I am running in > debugging mode: > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option, # > # To get timing results run ./configure # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. # > # # > ########################################################## > > I attached the full output of log_view: it seems that I configured > PETSc correctly. > Can I safely ignore this or I am missing something? 
> > > From the log you sent: > > Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-has-attribute-aligned=1 --prefix=/u/sciteam/mrosso/libs/petsc/gnu/4.9/opt --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib=/opt/acml/5.3.1/gfortran64/lib/libacml.a --COPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC" --FOPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC" --CXXOPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC" --with-x="0 " --with-debugging=O --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --download-hypre=1 --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " PETSC_ARCH=gnu-opt-32idx > It looks like you have set > > --with-debugging=O > > not > > --with-debugging=0 > > Thanks, > > Matt > > Thanks, > Michele > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From steena.hpc at gmail.com Mon Mar 14 22:05:38 2016 From: steena.hpc at gmail.com (Steena Monteiro) Date: Mon, 14 Mar 2016 20:05:38 -0700 Subject: [petsc-users] MatSetSizes with blocked matrix Message-ID: Hello, I am having difficulty getting MatSetSize to work prior to using MatMult. For matrix A with rows=cols=1,139,905 and block size = 2, rank 0 gets 400000 rows and rank 1 739905 rows, like so: /*Matrix setup*/ ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); ierr = MatCreate(PETSC_COMM_WORLD,&A); ierr = MatSetFromOptions(A); ierr = MatSetType(A,MATBAIJ); ierr = MatSetBlockSize(A,2); /*Unequal row assignment*/ if (!rank) { ierr = MatSetSizes(A, 400000, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); } else { ierr = MatSetSizes(A, 739905, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); } MatMult (A,x,y); /************************************/ Error message: 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this object type Cannot change/reset row sizes to 400000 local 1139906 global after previously setting them to 400000 local 1139905 global [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 local 1139906 global after previously setting them to 739905 local 1139905 global -Without messing with row assignment, MatMult works fine on this matrix for block size = 2, presumably because an extra padded row is automatically added to facilitate blocking. -The above code snippet works well for block size = 1. Is it possible to do unequal row distribution *while using blocking*? Thank you for any advice. -Steena -------------- next part -------------- An HTML attachment was scrubbed... 
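As the replies that follow point out, the local and global sizes handed to MatSetSizes have to be divisible by the block size, so one way to keep an uneven split with MATBAIJ is to pad the global dimension to a multiple of bs and hand out local row counts that are themselves multiples of bs. A sketch, assuming exactly two ranks and the sizes discussed in this thread (whether the padded layout is still compatible with loading the original binary file is a separate question):

#include <petscmat.h>

/* Sketch: uneven row distribution for a blocked matrix with bs = 2.
   The global size 1139905 is padded to 1139906 so it is a multiple of bs,
   and both local row counts (400000 and 739906) are multiples of bs too. */
static PetscErrorCode CreateUnevenBAIJ(Mat *A)
{
  PetscMPIInt rank;
  PetscInt    bs = 2, N = 1139906, nlocal;

  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
  nlocal = rank ? 739906 : 400000;        /* assumes exactly 2 MPI ranks          */
  MatCreate(PETSC_COMM_WORLD, A);
  MatSetType(*A, MATBAIJ);
  MatSetSizes(*A, nlocal, nlocal, N, N);  /* local columns set explicitly so they
                                             are also a multiple of bs            */
  MatSetBlockSize(*A, bs);
  MatSetFromOptions(*A);
  MatSetUp(*A);
  return 0;
}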
URL: From knepley at gmail.com Mon Mar 14 22:46:57 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 14 Mar 2016 22:46:57 -0500 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro wrote: > Hello, > > I am having difficulty getting MatSetSize to work prior to using MatMult. > > For matrix A with rows=cols=1,139,905 and block size = 2, > It is inconsistent to have a row/col size that is not divisible by the block size. Matt > rank 0 gets 400000 rows and rank 1 739905 rows, like so: > > /*Matrix setup*/ > > ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); > ierr = MatCreate(PETSC_COMM_WORLD,&A); > ierr = MatSetFromOptions(A); > ierr = MatSetType(A,MATBAIJ); > ierr = MatSetBlockSize(A,2); > > /*Unequal row assignment*/ > > if (!rank) { > ierr = MatSetSizes(A, 400000, PETSC_DECIDE, > 1139905,1139905);CHKERRQ(ierr); > } > else { > ierr = MatSetSizes(A, 739905, PETSC_DECIDE, > 1139905,1139905);CHKERRQ(ierr); > } > > MatMult (A,x,y); > > /************************************/ > > Error message: > > 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this > object type > Cannot change/reset row sizes to 400000 local 1139906 global after > previously setting them to 400000 local 1139905 global > > [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 > local 1139906 global after previously setting them to 739905 local 1139905 > global > > -Without messing with row assignment, MatMult works fine on this matrix > for block size = 2, presumably because an extra padded row is automatically > added to facilitate blocking. > > -The above code snippet works well for block size = 1. > > Is it possible to do unequal row distribution *while using blocking*? > > Thank you for any advice. > > -Steena > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Tue Mar 15 02:12:41 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 15 Mar 2016 08:12:41 +0100 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: On 15 March 2016 at 04:46, Matthew Knepley wrote: > On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro > wrote: > >> Hello, >> >> I am having difficulty getting MatSetSize to work prior to using MatMult. >> >> For matrix A with rows=cols=1,139,905 and block size = 2, >> > > It is inconsistent to have a row/col size that is not divisible by the > block size. > To be honest, I don't think the error message being thrown clearly indicates what the actual problem is (hence the email from Steena). What about "Cannot change/reset row sizes to 400000 local 1139906 global after previously setting them to 400000 local 1139905 global. 
Local and global sizes must be divisible by the block size" > > Matt > > >> rank 0 gets 400000 rows and rank 1 739905 rows, like so: >> >> /*Matrix setup*/ >> >> ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >> ierr = MatCreate(PETSC_COMM_WORLD,&A); >> ierr = MatSetFromOptions(A); >> ierr = MatSetType(A,MATBAIJ); >> ierr = MatSetBlockSize(A,2); >> >> /*Unequal row assignment*/ >> >> if (!rank) { >> ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >> 1139905,1139905);CHKERRQ(ierr); >> } >> else { >> ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >> 1139905,1139905);CHKERRQ(ierr); >> } >> >> MatMult (A,x,y); >> >> /************************************/ >> >> Error message: >> >> 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this >> object type >> Cannot change/reset row sizes to 400000 local 1139906 global after >> previously setting them to 400000 local 1139905 global >> >> [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 >> local 1139906 global after previously setting them to 739905 local 1139905 >> global >> >> -Without messing with row assignment, MatMult works fine on this matrix >> for block size = 2, presumably because an extra padded row is automatically >> added to facilitate blocking. >> >> -The above code snippet works well for block size = 1. >> >> Is it possible to do unequal row distribution *while using blocking*? >> >> Thank you for any advice. >> >> -Steena >> >> >> >> >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Lukasz.Kaczmarczyk at glasgow.ac.uk Tue Mar 15 04:43:16 2016 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Tue, 15 Mar 2016 09:43:16 +0000 Subject: [petsc-users] multigrid preconditioning and adaptivity In-Reply-To: <6150D526-76C6-4F22-8721-C9876313C1BC@glasgow.ac.uk> References: <2FF531C4-2425-46FE-B031-20DA09074C2D@glasgow.ac.uk> <56DD8E5B.2000108@imperial.ac.uk> <2D37758E-47E5-47E2-A799-0F2F770A408A@glasgow.ac.uk> <6150D526-76C6-4F22-8721-C9876313C1BC@glasgow.ac.uk> Message-ID: <77227A6C-5B61-441B-B7BA-0A451D2B84D6@glasgow.ac.uk> Hello Matt, On 7 Mar 2016, at 16:42, Matthew Knepley > wrote: Yes, that should be it. It would be nice to have some example that does this if you would be willing to contribute some version of your code. 
Following discussion from last week, I had time to implement MG via DM, see following code, http://cdash.eng.gla.ac.uk/mofem/_p_c_m_g_set_up_via_approx_orders_8hpp_source.html Minimal setup for DM and MG looks like that 132 PetscErrorCode DMCreate_MGViaApproxOrders(DM dm) { 133 PetscErrorCode ierr; 134 PetscValidHeaderSpecific(dm,DM_CLASSID,1); 135 PetscFunctionBegin; 136 if(!dm->data) { 137 dm->data = new DMMGViaApproxOrdersCtx(); 138 } else { 139 ((DMCtx*)(dm->data))->referenceNumber++; 140 } 141 ierr = DMSetOperators_MoFEM(dm); CHKERRQ(ierr); 142 dm->ops->creatematrix = DMCreateMatrix_MGViaApproxOrders; 143 dm->ops->createglobalvector = DMCreateGlobalVector_MGViaApproxOrders; 144 dm->ops->coarsen = DMCoarsen_MGViaApproxOrders; 145 dm->ops->createinterpolation = DMCreateInterpolation_MGViaApproxOrders; 146 ierr = DMKSPSetComputeOperators(dm,ksp_set_operators,NULL); CHKERRQ(ierr); 147 PetscInfo1(dm,"Create DMMGViaApproxOrders reference = %d\n",((DMCtx*)(dm->data))->referenceNumber); 148 PetscFunctionReturn(0); 149 } It is not yet perfect, I need to add additional interface functions to improve functionality. If this in any way would be useful for you, pleas feel free to use it. I building now two examples, f.e. http://cdash.eng.gla.ac.uk/mofem/elasticity_8cpp_source.html, see Register/Create and SetUp DM; 303 DMType dm_name = "ELASTIC_PROB"; 304 ierr = DMRegister_MGViaApproxOrders(dm_name); CHKERRQ(ierr); 305 // ierr = DMRegister_MoFEM(dm_name); CHKERRQ(ierr); 306 307 //craete dm instance 308 DM dm; 309 ierr = DMCreate(PETSC_COMM_WORLD,&dm);CHKERRQ(ierr); 310 ierr = DMSetType(dm,dm_name);CHKERRQ(ierr); 311 //set dm datastruture whict created mofem datastructures 312 ierr = DMMoFEMCreateMoFEM(dm,&m_field,dm_name,bit_level0); CHKERRQ(ierr); 313 ierr = DMSetFromOptions(dm); CHKERRQ(ierr); 314 ierr = DMMoFEMSetIsPartitioned(dm,is_partitioned); CHKERRQ(ierr); 315 //add elements to dm 316 ierr = DMMoFEMAddElement(dm,"ELASTIC"); CHKERRQ(ierr); 317 ierr = DMMoFEMAddElement(dm,"BODY_FORCE"); CHKERRQ(ierr); 318 ierr = DMMoFEMAddElement(dm,"FLUID_PRESSURE_FE"); CHKERRQ(ierr); 319 ierr = DMMoFEMAddElement(dm,"FORCE_FE"); CHKERRQ(ierr); 320 ierr = DMMoFEMAddElement(dm,"PRESSURE_FE"); CHKERRQ(ierr); 321 ierr = DMSetUp(dm); CHKERRQ(ierr); Matrices and vector via DM, 326 Vec F,D,D0; 327 ierr = DMCreateGlobalVector(dm,&F); CHKERRQ(ierr); 328 ierr = VecDuplicate(F,&D); CHKERRQ(ierr); 329 ierr = VecDuplicate(F,&D0); CHKERRQ(ierr); 330 Mat Aij; 331 ierr = DMCreateMatrix(dm,&Aij); CHKERRQ(ierr); SetUP solver, 421 KSP solver; 422 ierr = KSPCreate(PETSC_COMM_WORLD,&solver); CHKERRQ(ierr); 423 ierr = KSPSetDM(solver,dm); CHKERRQ(ierr); 424 ierr = KSPSetFromOptions(solver); CHKERRQ(ierr); 425 ierr = KSPSetOperators(solver,Aij,Aij); CHKERRQ(ierr); 426 { 427 //from PETSc example ex42.c 428 PetscBool same = PETSC_FALSE; 429 PC pc; 430 ierr = KSPGetPC(solver,&pc); CHKERRQ(ierr); 431 PetscObjectTypeCompare((PetscObject)pc,PCMG,&same); 432 if (same) { 433 PCMGSetUpViaApproxOrdersCtx pc_ctx(dm,Aij); 434 ierr = PCMGSetUpViaApproxOrders(pc,&pc_ctx); CHKERRQ(ierr); 435 } else { 436 // Operators are already set, do not use DM for doing that 437 ierr = KSPSetDMActive(solver,PETSC_FALSE); CHKERRQ(ierr); 438 } 439 } 440 ierr = KSPSetInitialGuessKnoll(solver,PETSC_FALSE); CHKERRQ(ierr); 441 ierr = KSPSetInitialGuessNonzero(solver,PETSC_TRUE); CHKERRQ(ierr); 442 ierr = KSPSetUp(solver); CHKERRQ(ierr); Kind regards, Lukasz On 7 Mar 2016, at 17:41, Lukasz Kaczmarczyk > wrote: On 7 Mar 2016, at 16:42, Matthew Knepley > wrote: On 
Mon, Mar 7, 2016 at 10:28 AM, Lukasz Kaczmarczyk > wrote: Many thanks all for help. I started to implement function for DM. I understand that minimal implementation is that for the DM i need to have, is to have DMCoarsen and in each level for all DMs, set operators DMKSPSetComputeOperators and DMCreateInterpolation. Matrix matrix free P from DMCreateInterpolation have to have operators for mult and mult_traspose. Is that is all? Yes, that should be it. It would be nice to have some example that does this if you would be willing to contribute some version of your code. No problem, I will do that will pleasure. Kind regards, Lukasz On 7 Mar 2016, at 15:55, Mark Adams > wrote: You can just set the coarse grid matrix/operator instead of using Galerkin. If you have a shell (matrix free) P then you will need to create and set this yourself. Our Galerkin requires a matrix P. On Mon, Mar 7, 2016 at 9:32 AM, Lukasz Kaczmarczyk > wrote: > On 7 Mar 2016, at 14:21, Lawrence Mitchell > wrote: > > On 07/03/16 14:16, Lukasz Kaczmarczyk wrote: >> >>> On 7 Mar 2016, at 13:50, Matthew Knepley >>> >> wrote: >>> >>> On Mon, Mar 7, 2016 at 6:58 AM, Lukasz Kaczmarczyk >>> >>> >> wrote: >>> >>> Hello, >>> >>> I run multi-grid solver, with adaptivity, works well, however It >>> is some space for improving efficiency. I using hierarchical >>> approximation basis, for which >>> construction of interpolation operators is simple, it is simple >>> injection. >>> >>> After each refinement level (increase of order of approximation >>> on some element) I rebuild multigrid pre-conditioner with >>> additional level. It is a way to add dynamically new levels >>> without need of rebuilding whole MG pre-conditioner. >>> >>> >>> That does not currently exist, however it would not be hard to add, >>> since the MG structure jsut consists of >>> arrays of pointers. >>> >>> >>> Looking at execution profile I noticed that 50%-60% of time is >>> spent on MatPtAP function during PCSetUP stage. >>> >>> >>> Which means you are using a Galerkin projection to define the coarse >>> operator. Do you have a direct way of defining >>> this operator (rediscretization)? >> >> Matt, >> >> >> Thanks for swift response. You are right, I using Galerkin projection. >> >> Yes, I have a way to get directly coarse operator, it is some sub >> matrix of whole matrix. I taking advantage here form hierarchical >> approximation. >> >> I could reimplement PCSetUp_MG to set the MG structure directly, but >> this probably not good approach, since my implementation which will >> work with current petsc version could be incompatible which future >> changes in native MG data structures. The alternative option is to >> hack MatPtAP itself, and until petsc MG will use this, whatever >> changes you will make in MG in the future my code will work. > > Why not provide a shell DM to the KSP that knows how to compute the > operators (and how to refine/coarsen and therefore > restrict/interpolate). Then there's no need to use Galerkin coarse > grid operators, and the KSP will just call back to your code to create > the appropriate matrices. Hello Lawrence, Thanks, it is good advice. I have already my DM shell, however I have not looked yet how make it in the context of MG. Now is probably time to do that. DM shell http://userweb.eng.gla.ac.uk/lukasz.kaczmarczyk/MoFem/html/group__dm.html Regards, Lukasz -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
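A matrix-free P of the kind described here can be sketched with MatCreateShell plus MatShellSetOperation for MATOP_MULT and MATOP_MULT_TRANSPOSE; the routine names and the context contents below are illustrative placeholders, not MoFEM code.

#include <petscmat.h>

typedef struct {
  void *data;   /* whatever is needed to map between low-order (coarse)
                   and high-order (fine) degrees of freedom              */
} InterpCtx;

/* y = P x : inject coarse (low-order) coefficients into the fine (high-order) space */
static PetscErrorCode MyProlongate(Mat P, Vec xcoarse, Vec yfine)
{
  InterpCtx *ctx;
  MatShellGetContext(P, &ctx);
  /* ... apply the injection using ctx ... */
  return 0;
}

/* y = P^T x : the transpose of the injection, used for restriction */
static PetscErrorCode MyRestrict(Mat P, Vec xfine, Vec ycoarse)
{
  InterpCtx *ctx;
  MatShellGetContext(P, &ctx);
  /* ... apply the transpose of the injection using ctx ... */
  return 0;
}

/* Returned from the DM's DMCreateInterpolation hook: an nfine x ncoarse shell matrix */
static PetscErrorCode CreateMatrixFreeInterpolation(MPI_Comm comm, PetscInt nfine, PetscInt ncoarse,
                                                    PetscInt Nfine, PetscInt Ncoarse,
                                                    InterpCtx *ctx, Mat *P)
{
  MatCreateShell(comm, nfine, ncoarse, Nfine, Ncoarse, ctx, P);
  MatShellSetOperation(*P, MATOP_MULT, (void (*)(void))MyProlongate);
  MatShellSetOperation(*P, MATOP_MULT_TRANSPOSE, (void (*)(void))MyRestrict);
  return 0;
}

With such a P, PCMG applies prolongation through MatMult and restriction through MatMultTranspose, which is why only those two operations are needed; and, as Mark notes above, the coarse operators then have to be set directly rather than formed by Galerkin MatPtAP.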
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlcch at dtu.dk Tue Mar 15 06:33:25 2016 From: mlcch at dtu.dk (Max la Cour Christensen) Date: Tue, 15 Mar 2016 11:33:25 +0000 Subject: [petsc-users] Using TS In-Reply-To: <56E4D85E.6060102@mcs.anl.gov> References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> <56E4D1B1.4040901@mcs.anl.gov> <56E4D85E.6060102@mcs.anl.gov> Message-ID: <38B910AE-0DC0-4C61-97E9-5C678D736B50@dtu.dk> Barry, Matt, Emil, thanks for the explanations! They are very useful. Emil: The application is the flow of oil and gas in the subsurface. The code uses a model of the rock in the subsurface and given a setup of wells, it computes a prediction for the production of oil and gas. It can also be used to model CO2 storage in the subsurface. I think we will attempt to switch to TS. We are working with different formulations of the equations. In the following, I have summarised how I think IFunction and IJacobian should be implemented for the formulation of the most interest to us. If you can bare with me and confirm this that would be great. The equations are: dm_o/dt = f_o(m,p) dm_g/dt = f_g(m,p) S_o(m,p) + S_g(m,p) = 1, where m_o is the mass of oil m_g is the mass of gas t is time, f(m_x,p) contains fluxes for mass transfer between adjacent cells and source terms for wells S_o is the oil saturation S_g is the gas saturation p is the pressure. In this formulation, the primary variables are m_o, m_g and p. The above looks simpler than it is. The code to evaluate these things is around 90k lines of Fortran. If I understood correctly, IFunction and IJacobian would implement the following: IFunction: f_o(m,p) - dm_o/dt f_g(m,p) - dm_g/dt S_o(m,p) + S_g(m,p) - 1, where dm_o/dt and dm_g/dt is given by the time derivative (u_t in PETSc) from IFunction() IJacobian: First 2 equations: \partial f(m,p) / \partial m - a and \partial f(m,p) / \partial p Last equation: \partial (S_o(m,p)+S_g(m,p)-1) / \partial m and \partial (S_o(m,p)+S_g(m,p)-1) / \partial p, where a comes from PETSc IJacobian(). Is this correct usage? Emil: Now for the adjoint implementation, we will need to provide a function for TSAdjointSetRHSJacobian, which computes the derivative of our system wrt. our control variables (control variables being the p in the PETSc manual) and some additional cost stuff: TSSetCostGradients() and possibly TSSetCostIntegrand(). Is this correct? Our cost function would typically be to maximize profit of an oil reservoir. In a simple way, this could be to maximize the number of oil barrels produced over the lifetime of the reservoir (e.g. 50 years). In the equations above, the number of oil barrels is contained in the source terms for the wells. Many thanks, Max On Mar 13, 2016, at 4:02 AM, Emil Constantinescu > wrote: On 3/12/16 8:37 PM, Matthew Knepley wrote: On Sat, Mar 12, 2016 at 8:34 PM, Emil Constantinescu > wrote: I also find it useful to go through one of the simple examples available for TS: http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/index.html (ex8 may be a good start). As Barry suggested, you need to implement IFunction and IJacobian. The argument "u" is S_o, S_w, and p stacked together and "u_t" their corresponding time derivatives. The rest is calculus, but following an example usually helps a lot in the beginning. Are you guys saying that IFunction and IJacobian are enough to do the adjoint system as well? Pretty much yes, but it depends on the cost function. 
This is the beauty of discrete adjoints - if you have the Jacobian (transpose, done internally through KSP) you're done. You need IJacobian for sure to do the backward propagation. If you have that, the rest is usually trivial. Mr. Hong Zhang (my postdoc) set up quite a few simple examples. Emil Matt Out of curiosity, what is the application? Emil On 3/12/16 3:19 PM, Barry Smith wrote: This is only a starting point, Jed and Emil can fix my mistakes and provide additional details. In your case you will not provide a TSSetRHSFunction and TSSetRHSJacobian since everything should be treated implicitly as a DAE. First move everything in the three equations to the left side and then differentiate through the \partial/\partial t so that it only applies to the S_o, S_w, and p. For example for the first equation using the product rule twice you get something like \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o \partial \rho_o( p ) \partial t + \rho_o( p ) S_o \partial \phi( p ) \partial t - F_o = 0 \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) S_o \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p ) \partial p \partial t - F_o = 0 The two vector arguments to your IFunction are exactly the S_o, S_w, and p and \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t so it is immediate to code up your IFunction once you have the analytic form above For the IJacobian and the "shift business" just remember that dF/dU means take the derivative of the IFunction with respect to S_o, S_w, and p treating the \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t as if they were independent of S_o, S_w, and p. For the dF/dU_t that means taking the derivate with respect to the \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t treating the S_o, S_w, and p as independent of \partial S_o/ \partial t , \partial S_w/ \partial t, and \partial p/ \partial t. Then you just need to form the sum of the two parts with the a "shift" scaling dF/dU + a*dF/dU_t For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = 1 dF/dp = 0 dF/d (\partial S_o)/\partial t = 0 (\partial S_w)/\partial t = 0 (\partial p)/\partial t = 0 Computations for the first two equations are messy but straightforward. For example for the first equation dF/dS_o = phi( p ) \rho_o'(p) \partial p \partial t + \rho_o( p ) \phi'( p ) \partial p + dF_o/dS_o and dF/d (\partial S_o)/\partial t) = \phi( p ) \rho_o( p ) Barry On Mar 12, 2016, at 12:04 PM, Matthew Knepley > wrote: On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen > wrote: Hi guys, We are making preparations to implement adjoint based optimisation in our in-house oil and gas reservoir simulator. Currently our code uses PETSc's DMPlex, Vec, Mat, KSP and PC. We are still not using SNES and TS, but instead we have our own backward Euler and Newton-Raphson implementation. Due to the upcoming implementation of adjoints, we are considering changing the code and begin using TS and SNES. After examining the PETSc manual and examples, we are still not completely clear on how to apply TS to our system of PDEs. 
In a simplified formulation, it can be written as: \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) S_o + S_w = 1, where p is the pressure, \phi( p ) is a porosity function depending on pressure, \rho_x( p ) is a density function depending on pressure, S_o is the saturation of oil, S_g is the saturation of gas, t is time, F_x(p,S) is a function containing fluxes and source terms. The primary variables are p, S_o and S_w. We are using a lowest order Finite Volume discretisation. Now for implementing this in TS (with the prospect of later using TSAdjoint), we are not sure if we need all of the functions: TSSetIFunction, TSSetRHSFunction, TSSetIJacobian and TSSetRHSJacobian and what parts of the equations go where. Especially we are unsure of how to use the concept of a shifted jacobian (TSSetIJacobian). Any advice you could provide will be highly appreciated. Barry and Emil, I am also interested in this, since I don't know how to do it. Thanks, Matt Many thanks, Max la Cour Christensen PhD student, Technical University of Denmark -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From emconsta at mcs.anl.gov Tue Mar 15 10:12:22 2016 From: emconsta at mcs.anl.gov (Emil Constantinescu) Date: Tue, 15 Mar 2016 10:12:22 -0500 Subject: [petsc-users] Using TS In-Reply-To: <38B910AE-0DC0-4C61-97E9-5C678D736B50@dtu.dk> References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> <56E4D1B1.4040901@mcs.anl.gov> <56E4D85E.6060102@mcs.anl.gov> <38B910AE-0DC0-4C61-97E9-5C678D736B50@dtu.dk> Message-ID: <56E82656.9090408@mcs.anl.gov> On 3/15/16 6:33 AM, Max la Cour Christensen wrote: > > Barry, Matt, Emil, thanks for the explanations! They are very useful. > > *Emil:* The application is the flow of oil and gas in the subsurface. > The code uses a model of the rock in the subsurface and given a setup of > wells, it computes a prediction for the production of oil and gas. It > can also be used to model CO2 storage in the subsurface. > > I think we will attempt to switch to TS. We are working with different > formulations of the equations. In the following, I have summarised how I > think IFunction and IJacobian should be implemented for the formulation > of the most interest to us. If you can bare with me and confirm this > that would be great. The equations are: > > dm_o/dt = f_o(m,p) > dm_g/dt = f_g(m,p) > S_o(m,p) + S_g(m,p) = 1, > > where > > m_o is the mass of oil > m_g is the mass of gas > t is time, > f(m_x,p) contains fluxes for mass transfer between adjacent cells and > source terms for wells > S_o is the oil saturation > S_g is the gas saturation > p is the pressure. > > In this formulation, the primary variables are m_o, m_g and p. So then in PETSc, the u variable will be u=[m_o,m_g,p]^T and u_t=[dm_o/dt, dm_g/dt, dp/dt]^T. See the RoberFunction in http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex8.c.html where you can think of [m_o,m_g,p] = [x[0], x[1], x[2]] and [dm_o/dt, dm_g/dt, dp/dt] = [xdot[0], xdot[1], xdot[2]]. > The above looks simpler than it is. The code to evaluate these things is > around 90k lines of Fortran. 
This is a good level of abstraction. > If I understood correctly, IFunction and IJacobian would implement the > following: > > *IFunction: * > > f_o(m,p) - dm_o/dt > f_g(m,p) - dm_g/dt > S_o(m,p) + S_g(m,p) - 1, Yes, this looks fine. > where dm_o/dt and dm_g/dt is given by the time derivative (u_t in PETSc) > from IFunction() > > *IJacobian:* > > First 2 equations: \partial f(m,p) / \partial m - a and \partial f(m,p) > / \partial p > Last equation: \partial (S_o(m,p)+S_g(m,p)-1) / \partial m and \partial > (S_o(m,p)+S_g(m,p)-1) / \partial p, > > where a comes from PETSc IJacobian(). > > Is this correct usage? Okay, so here you need the Jacobian matrix. This is matrix with rows corresponding to "equations" and columns corresponding to "derivatives." You have at least tow options here: store the Jacobian (see ex36.c in TS for a simple DAE example: IFunction and IJacobian) or if it's too big, you can compute its action on a vector. In both cases you would need to do the same calculations but in the former and you can think of the Jacobian matrix by blocks: in your case you will have 3x3 blocks. The first row corresponding to your first equation will be something like: J[0][0] = \partial f(m,p) / \partial m_o + a*(-dm_o/dt)/dm_o J[0][1] = \partial f(m,p) / \partial m_g + a*( dm_o/dt)/dm_g J[0][2] = \partial f(m,p) / \partial p + a*( dm_o/dt)/dp so you'd get something like J[0][0] = \partial f(m,p) / \partial m_o - a (times identity) J[0][1] = \partial f(m,p) / \partial m_g J[0][2] = \partial f(m,p) / \partial p and so on. I think ex8 is really good at driving this point if there is any ambiguity. Note that there are 3 problems defined in that file. > *Emil: *Now for the adjoint implementation, we will need to provide a > function for TSAdjointSetRHSJacobian, which computes the derivative of > our system wrt. our control variables (control variables being the p in > the PETSc manual) and some additional cost stuff: TSSetCostGradients() > and possibly TSSetCostIntegrand(). Is this correct? Our cost function > would typically be to maximize profit of an oil reservoir. In a simple > way, this could be to maximize the number of oil barrels produced over > the lifetime of the reservoir (e.g. 50 years). In the equations above, > the number of oil barrels is contained in the source terms for the wells. Yups, that sounds reasonable. Note that we have simple examples in PETSc that might treat exactly this situation (including the optimization). Although they are not documented yet [Hong promised yesterday to add more details ;)] if you see exXX.c and then exXXadj.c exXXopt_ic.c exXXopt_p.c they correspond to exXX.c with sensitivity, optimization for initial conditions and optimization for parameters. Hong is the main driver behind the adjoint implementation and can help on much shorter notice than I can. Hope this helps, Emil > Many thanks, > Max > > > On Mar 13, 2016, at 4:02 AM, Emil Constantinescu > > wrote: > >> On 3/12/16 8:37 PM, Matthew Knepley wrote: >>> On Sat, Mar 12, 2016 at 8:34 PM, Emil Constantinescu >>> >>> > wrote: >>> >>> I also find it useful to go through one of the simple examples >>> available for TS: >>> http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/index.html >>> (ex8 may be a good start). >>> >>> As Barry suggested, you need to implement IFunction and IJacobian. >>> The argument "u" is S_o, S_w, and p stacked together and "u_t" >>> their corresponding time derivatives. The rest is calculus, but >>> following an example usually helps a lot in the beginning. 
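To make the layout sketched above concrete, here is a bare-bones, single-process illustration of the two callbacks and their registration for the m_o, m_g, p formulation. It is only a sketch of where the pieces go: the f_o_eval/f_g_eval/S_o_eval/S_g_eval helpers and the numbers that end up in the Jacobian rows are made-up stand-ins for the real reservoir model (which would call into the existing Fortran routines), not anything taken from PETSc or from the actual simulator, and the time-stepping choices in main() are just one plausible setup.

#include <petscts.h>

/* Hypothetical stand-ins for the reservoir model; a real code would call its
   own evaluation routines here instead. */
static PetscScalar f_o_eval(const PetscScalar *u) { return -0.1*u[0]; }  /* plays the role of f_o(m,p) */
static PetscScalar f_g_eval(const PetscScalar *u) { return -0.2*u[1]; }  /* plays the role of f_g(m,p) */
static PetscScalar S_o_eval(const PetscScalar *u) { return u[2]*u[0]; }  /* plays the role of S_o(m,p) */
static PetscScalar S_g_eval(const PetscScalar *u) { return u[2]*u[1]; }  /* plays the role of S_g(m,p) */

/* Residual F(t,U,Udot) = 0 with U = [m_o, m_g, p]; written for a single process. */
static PetscErrorCode IFunction(TS ts,PetscReal t,Vec U,Vec Udot,Vec F,void *ctx)
{
  const PetscScalar *u,*udot;
  PetscScalar       *f;
  PetscErrorCode    ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArrayRead(U,&u);CHKERRQ(ierr);
  ierr = VecGetArrayRead(Udot,&udot);CHKERRQ(ierr);
  ierr = VecGetArray(F,&f);CHKERRQ(ierr);
  f[0] = f_o_eval(u) - udot[0];               /* f_o(m,p) - dm_o/dt */
  f[1] = f_g_eval(u) - udot[1];               /* f_g(m,p) - dm_g/dt */
  f[2] = S_o_eval(u) + S_g_eval(u) - 1.0;     /* algebraic constraint, no time derivative */
  ierr = VecRestoreArray(F,&f);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(Udot,&udot);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(U,&u);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Shifted Jacobian J = dF/dU + a*dF/dUdot, filled by the 3x3 block layout discussed above. */
static PetscErrorCode IJacobian(TS ts,PetscReal t,Vec U,Vec Udot,PetscReal a,Mat A,Mat B,void *ctx)
{
  const PetscScalar *u;
  PetscScalar       J[3][3];
  PetscInt          idx[3] = {0,1,2};
  PetscErrorCode    ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArrayRead(U,&u);CHKERRQ(ierr);
  J[0][0] = -0.1 - a; J[0][1] = 0.0;      J[0][2] = 0.0;         /* df_o/dm_o - a, df_o/dm_g, df_o/dp */
  J[1][0] = 0.0;      J[1][1] = -0.2 - a; J[1][2] = 0.0;         /* df_g/dm_o, df_g/dm_g - a, df_g/dp */
  J[2][0] = u[2];     J[2][1] = u[2];     J[2][2] = u[0]+u[1];   /* constraint row: no "a" contribution */
  ierr = VecRestoreArrayRead(U,&u);CHKERRQ(ierr);
  ierr = MatSetValues(B,3,idx,3,idx,&J[0][0],INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  if (A != B) {
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

int main(int argc,char **argv)
{
  TS             ts;
  Vec            U;
  Mat            J;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = VecCreate(PETSC_COMM_WORLD,&U);CHKERRQ(ierr);
  ierr = VecSetSizes(U,PETSC_DECIDE,3);CHKERRQ(ierr);
  ierr = VecSetFromOptions(U);CHKERRQ(ierr);
  /* consistent initial condition for the toy model: S_o + S_g = p*(m_o + m_g) = 1 */
  ierr = VecSetValue(U,0,0.7,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(U,1,0.3,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(U,2,1.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(U);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(U);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD,&J);CHKERRQ(ierr);
  ierr = MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,3,3);CHKERRQ(ierr);
  ierr = MatSetFromOptions(J);CHKERRQ(ierr);
  ierr = MatSetUp(J);CHKERRQ(ierr);
  ierr = TSCreate(PETSC_COMM_WORLD,&ts);CHKERRQ(ierr);
  ierr = TSSetProblemType(ts,TS_NONLINEAR);CHKERRQ(ierr);
  ierr = TSSetType(ts,TSBEULER);CHKERRQ(ierr);
  ierr = TSSetIFunction(ts,NULL,IFunction,NULL);CHKERRQ(ierr);
  ierr = TSSetIJacobian(ts,J,J,IJacobian,NULL);CHKERRQ(ierr);
  ierr = TSSetTimeStep(ts,0.01);CHKERRQ(ierr);
  ierr = TSSetDuration(ts,100,1.0);CHKERRQ(ierr);
  ierr = TSSetExactFinalTime(ts,TS_EXACTFINALTIME_MATCHSTEP);CHKERRQ(ierr);
  ierr = TSSetFromOptions(ts);CHKERRQ(ierr);
  ierr = TSSolve(ts,U);CHKERRQ(ierr);
  ierr = VecView(U,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = TSDestroy(&ts);CHKERRQ(ierr);
  ierr = MatDestroy(&J);CHKERRQ(ierr);
  ierr = VecDestroy(&U);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

The only structural points this is meant to show are the residual form f(m,p) - dm/dt with a purely algebraic third row, and that the shift a enters only the rows that contain a time derivative (here just on their diagonal); everything problem-specific is a placeholder.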
>>> >>> >>> Are you guys saying that IFunction and IJacobian are enough to do the >>> adjoint system as well? >>> >> >> Pretty much yes, but it depends on the cost function. This is the >> beauty of discrete adjoints - if you have the Jacobian (transpose, >> done internally through KSP) you're done. You need IJacobian for sure >> to do the backward propagation. If you have that, the rest is usually >> trivial. Mr. Hong Zhang (my postdoc) set up quite a few simple examples. >> >> Emil >> >>> Matt >>> >>> Out of curiosity, what is the application? >>> >>> Emil >>> >>> >>> On 3/12/16 3:19 PM, Barry Smith wrote: >>> >>> >>> This is only a starting point, Jed and Emil can fix my >>> mistakes and provide additional details. >>> >>> >>> In your case you will not provide a TSSetRHSFunction and >>> TSSetRHSJacobian since everything should be treated implicitly >>> as a DAE. >>> >>> First move everything in the three equations to the left >>> side and then differentiate through the \partial/\partial t so >>> that it only applies to the S_o, S_w, and p. For example for the >>> first equation using the product rule twice you get something like >>> >>> \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) >>> S_o \partial \rho_o( p ) \partial t + \rho_o( p ) S_o >>> \partial \phi( p ) \partial t - F_o = 0 >>> >>> \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) >>> S_o \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p >>> ) \partial p \partial t - F_o = 0 >>> >>> The two vector arguments to your IFunction are exactly the S_o, >>> S_w, and p and \partial S_o/ \partial t , \partial S_w/ >>> \partial t, and \partial p/ \partial t so it is immediate to >>> code up your IFunction once you have the analytic form above >>> >>> For the IJacobian and the "shift business" just remember that >>> dF/dU means take the derivative of the IFunction with respect to >>> S_o, S_w, and p treating the \partial S_o/ \partial t , >>> \partial S_w/ \partial t, and \partial p/ \partial t as if they >>> were independent of S_o, S_w, and p. For the dF/dU_t that means >>> taking the derivate with respect to the \partial S_o/ \partial t >>> , \partial S_w/ \partial t, and \partial p/ \partial t >>> treating the S_o, S_w, and p as independent of \partial S_o/ >>> \partial t , \partial S_w/ \partial t, and \partial p/ >>> \partial t. Then you just need to form the sum of the two >>> parts with the a "shift" scaling dF/dU + a*dF/dU_t >>> >>> For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = >>> 1 dF/dp = 0 dF/d (\partial S_o)/\partial t = 0 (\partial >>> S_w)/\partial t = 0 (\partial p)/\partial t = 0 >>> >>> Computations for the first two equations are messy but >>> straightforward. For example for the first equation dF/dS_o = >>> phi( p ) \rho_o'(p) \partial p \partial t + \rho_o( p ) \phi'( >>> p ) \partial p + dF_o/dS_o and dF/d (\partial S_o)/\partial t) >>> = \phi( p ) \rho_o( p ) >>> >>> >>> Barry >>> >>> On Mar 12, 2016, at 12:04 PM, Matthew Knepley >>> >>> > wrote: >>> >>> On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen >>> > >>> wrote: >>> >>> Hi guys, >>> >>> We are making preparations to implement adjoint based >>> optimisation in our in-house oil and gas reservoir >>> simulator. Currently our code uses PETSc's DMPlex, Vec, Mat, >>> KSP and PC. We are still not using SNES and TS, but instead >>> we have our own backward Euler and Newton-Raphson >>> implementation. 
Due to the upcoming implementation of >>> adjoints, we are considering changing the code and begin >>> using TS and SNES. >>> >>> After examining the PETSc manual and examples, we are still >>> not completely clear on how to apply TS to our system of >>> PDEs. In a simplified formulation, it can be written as: >>> >>> \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) >>> \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) >>> S_o + S_w = 1, >>> >>> where p is the pressure, >>> \phi( p ) is a porosity function depending on pressure, >>> \rho_x( p ) is a density function depending on pressure, >>> S_o is the saturation of oil, >>> S_g is the saturation of gas, >>> t is time, >>> F_x(p,S) is a function containing fluxes and source terms. >>> The primary variables are p, S_o and S_w. >>> >>> We are using a lowest order Finite Volume discretisation. >>> >>> Now for implementing this in TS (with the prospect of later >>> using TSAdjoint), we are not sure if we need all of the >>> functions: TSSetIFunction, TSSetRHSFunction, TSSetIJacobian >>> and TSSetRHSJacobian and what parts of the equations go >>> where. Especially we are unsure of how to use the concept of >>> a shifted jacobian (TSSetIJacobian). >>> >>> Any advice you could provide will be highly appreciated. >>> >>> Barry and Emil, >>> >>> I am also interested in this, since I don't know how to do it. >>> >>> Thanks, >>> >>> Matt >>> >>> Many thanks, >>> Max la Cour Christensen >>> PhD student, Technical University of Denmark >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener > From mlcch at dtu.dk Tue Mar 15 10:35:39 2016 From: mlcch at dtu.dk (Max la Cour Christensen) Date: Tue, 15 Mar 2016 15:35:39 +0000 Subject: [petsc-users] Using TS In-Reply-To: <56E82656.9090408@mcs.anl.gov> References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> <56E4D1B1.4040901@mcs.anl.gov> <56E4D85E.6060102@mcs.anl.gov> <38B910AE-0DC0-4C61-97E9-5C678D736B50@dtu.dk> <56E82656.9090408@mcs.anl.gov> Message-ID: Great, many thanks! This would only be minor adjustments to our existing code for the evaluation of residual and Jacobian. The majority of our work will be in migrating time step size controllers, initialisation, controls for the wells, well outputs, 3D outputs and error/warning outputs. /Max On Mar 15, 2016, at 4:12 PM, Emil Constantinescu wrote: > On 3/15/16 6:33 AM, Max la Cour Christensen wrote: >> >> Barry, Matt, Emil, thanks for the explanations! They are very useful. >> >> *Emil:* The application is the flow of oil and gas in the subsurface. >> The code uses a model of the rock in the subsurface and given a setup of >> wells, it computes a prediction for the production of oil and gas. It >> can also be used to model CO2 storage in the subsurface. >> >> I think we will attempt to switch to TS. We are working with different >> formulations of the equations. In the following, I have summarised how I >> think IFunction and IJacobian should be implemented for the formulation >> of the most interest to us. If you can bare with me and confirm this >> that would be great. 
The equations are: >> >> dm_o/dt = f_o(m,p) >> dm_g/dt = f_g(m,p) >> S_o(m,p) + S_g(m,p) = 1, >> >> where >> >> m_o is the mass of oil >> m_g is the mass of gas >> t is time, >> f(m_x,p) contains fluxes for mass transfer between adjacent cells and >> source terms for wells >> S_o is the oil saturation >> S_g is the gas saturation >> p is the pressure. >> >> In this formulation, the primary variables are m_o, m_g and p. > > So then in PETSc, the u variable will be u=[m_o,m_g,p]^T and u_t=[dm_o/dt, dm_g/dt, dp/dt]^T. See the RoberFunction in > > http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex8.c.html > > where you can think of > > [m_o,m_g,p] = [x[0], x[1], x[2]] and > [dm_o/dt, dm_g/dt, dp/dt] = [xdot[0], xdot[1], xdot[2]]. > > >> The above looks simpler than it is. The code to evaluate these things is >> around 90k lines of Fortran. > > This is a good level of abstraction. > >> If I understood correctly, IFunction and IJacobian would implement the >> following: >> >> *IFunction: * >> >> f_o(m,p) - dm_o/dt >> f_g(m,p) - dm_g/dt >> S_o(m,p) + S_g(m,p) - 1, > > Yes, this looks fine. > >> where dm_o/dt and dm_g/dt is given by the time derivative (u_t in PETSc) >> from IFunction() >> >> *IJacobian:* >> >> First 2 equations: \partial f(m,p) / \partial m - a and \partial f(m,p) >> / \partial p >> Last equation: \partial (S_o(m,p)+S_g(m,p)-1) / \partial m and \partial >> (S_o(m,p)+S_g(m,p)-1) / \partial p, >> >> where a comes from PETSc IJacobian(). >> >> Is this correct usage? > > Okay, so here you need the Jacobian matrix. This is matrix with rows corresponding to "equations" and columns corresponding to "derivatives." > > You have at least tow options here: store the Jacobian (see ex36.c in TS for a simple DAE example: IFunction and IJacobian) or if it's too big, you can compute its action on a vector. > > In both cases you would need to do the same calculations but in the former and you can think of the Jacobian matrix by blocks: in your case you will have 3x3 blocks. The first row corresponding to your first equation will be something like: > > J[0][0] = \partial f(m,p) / \partial m_o + a*(-dm_o/dt)/dm_o > J[0][1] = \partial f(m,p) / \partial m_g + a*( dm_o/dt)/dm_g > J[0][2] = \partial f(m,p) / \partial p + a*( dm_o/dt)/dp > > so you'd get something like > > J[0][0] = \partial f(m,p) / \partial m_o - a (times identity) > J[0][1] = \partial f(m,p) / \partial m_g > J[0][2] = \partial f(m,p) / \partial p > > and so on. > > I think ex8 is really good at driving this point if there is any ambiguity. Note that there are 3 problems defined in that file. > >> *Emil: *Now for the adjoint implementation, we will need to provide a >> function for TSAdjointSetRHSJacobian, which computes the derivative of >> our system wrt. our control variables (control variables being the p in >> the PETSc manual) and some additional cost stuff: TSSetCostGradients() >> and possibly TSSetCostIntegrand(). Is this correct? Our cost function >> would typically be to maximize profit of an oil reservoir. In a simple >> way, this could be to maximize the number of oil barrels produced over >> the lifetime of the reservoir (e.g. 50 years). In the equations above, >> the number of oil barrels is contained in the source terms for the wells. > > Yups, that sounds reasonable. Note that we have simple examples in PETSc that might treat exactly this situation (including the optimization). 
> > Although they are not documented yet [Hong promised yesterday to add more details ;)] if you see exXX.c and then exXXadj.c exXXopt_ic.c exXXopt_p.c they correspond to exXX.c with sensitivity, optimization for initial conditions and optimization for parameters. > > Hong is the main driver behind the adjoint implementation and can help on much shorter notice than I can. > > Hope this helps, > Emil > >> Many thanks, >> Max >> >> >> On Mar 13, 2016, at 4:02 AM, Emil Constantinescu > > >> wrote: >> >>> On 3/12/16 8:37 PM, Matthew Knepley wrote: >>>> On Sat, Mar 12, 2016 at 8:34 PM, Emil Constantinescu >>>> >>>> > wrote: >>>> >>>> I also find it useful to go through one of the simple examples >>>> available for TS: >>>> http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/index.html >>>> (ex8 may be a good start). >>>> >>>> As Barry suggested, you need to implement IFunction and IJacobian. >>>> The argument "u" is S_o, S_w, and p stacked together and "u_t" >>>> their corresponding time derivatives. The rest is calculus, but >>>> following an example usually helps a lot in the beginning. >>>> >>>> >>>> Are you guys saying that IFunction and IJacobian are enough to do the >>>> adjoint system as well? >>>> >>> >>> Pretty much yes, but it depends on the cost function. This is the >>> beauty of discrete adjoints - if you have the Jacobian (transpose, >>> done internally through KSP) you're done. You need IJacobian for sure >>> to do the backward propagation. If you have that, the rest is usually >>> trivial. Mr. Hong Zhang (my postdoc) set up quite a few simple examples. >>> >>> Emil >>> >>>> Matt >>>> >>>> Out of curiosity, what is the application? >>>> >>>> Emil >>>> >>>> >>>> On 3/12/16 3:19 PM, Barry Smith wrote: >>>> >>>> >>>> This is only a starting point, Jed and Emil can fix my >>>> mistakes and provide additional details. >>>> >>>> >>>> In your case you will not provide a TSSetRHSFunction and >>>> TSSetRHSJacobian since everything should be treated implicitly >>>> as a DAE. >>>> >>>> First move everything in the three equations to the left >>>> side and then differentiate through the \partial/\partial t so >>>> that it only applies to the S_o, S_w, and p. For example for the >>>> first equation using the product rule twice you get something like >>>> >>>> \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) >>>> S_o \partial \rho_o( p ) \partial t + \rho_o( p ) S_o >>>> \partial \phi( p ) \partial t - F_o = 0 >>>> >>>> \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) >>>> S_o \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p >>>> ) \partial p \partial t - F_o = 0 >>>> >>>> The two vector arguments to your IFunction are exactly the S_o, >>>> S_w, and p and \partial S_o/ \partial t , \partial S_w/ >>>> \partial t, and \partial p/ \partial t so it is immediate to >>>> code up your IFunction once you have the analytic form above >>>> >>>> For the IJacobian and the "shift business" just remember that >>>> dF/dU means take the derivative of the IFunction with respect to >>>> S_o, S_w, and p treating the \partial S_o/ \partial t , >>>> \partial S_w/ \partial t, and \partial p/ \partial t as if they >>>> were independent of S_o, S_w, and p. For the dF/dU_t that means >>>> taking the derivate with respect to the \partial S_o/ \partial t >>>> , \partial S_w/ \partial t, and \partial p/ \partial t >>>> treating the S_o, S_w, and p as independent of \partial S_o/ >>>> \partial t , \partial S_w/ \partial t, and \partial p/ >>>> \partial t. 
Then you just need to form the sum of the two >>>> parts with the a "shift" scaling dF/dU + a*dF/dU_t >>>> >>>> For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = >>>> 1 dF/dp = 0 dF/d (\partial S_o)/\partial t = 0 (\partial >>>> S_w)/\partial t = 0 (\partial p)/\partial t = 0 >>>> >>>> Computations for the first two equations are messy but >>>> straightforward. For example for the first equation dF/dS_o = >>>> phi( p ) \rho_o'(p) \partial p \partial t + \rho_o( p ) \phi'( >>>> p ) \partial p + dF_o/dS_o and dF/d (\partial S_o)/\partial t) >>>> = \phi( p ) \rho_o( p ) >>>> >>>> >>>> Barry >>>> >>>> On Mar 12, 2016, at 12:04 PM, Matthew Knepley >>>> >>>> > wrote: >>>> >>>> On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen >>>> > >>>> wrote: >>>> >>>> Hi guys, >>>> >>>> We are making preparations to implement adjoint based >>>> optimisation in our in-house oil and gas reservoir >>>> simulator. Currently our code uses PETSc's DMPlex, Vec, Mat, >>>> KSP and PC. We are still not using SNES and TS, but instead >>>> we have our own backward Euler and Newton-Raphson >>>> implementation. Due to the upcoming implementation of >>>> adjoints, we are considering changing the code and begin >>>> using TS and SNES. >>>> >>>> After examining the PETSc manual and examples, we are still >>>> not completely clear on how to apply TS to our system of >>>> PDEs. In a simplified formulation, it can be written as: >>>> >>>> \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) >>>> \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) >>>> S_o + S_w = 1, >>>> >>>> where p is the pressure, >>>> \phi( p ) is a porosity function depending on pressure, >>>> \rho_x( p ) is a density function depending on pressure, >>>> S_o is the saturation of oil, >>>> S_g is the saturation of gas, >>>> t is time, >>>> F_x(p,S) is a function containing fluxes and source terms. >>>> The primary variables are p, S_o and S_w. >>>> >>>> We are using a lowest order Finite Volume discretisation. >>>> >>>> Now for implementing this in TS (with the prospect of later >>>> using TSAdjoint), we are not sure if we need all of the >>>> functions: TSSetIFunction, TSSetRHSFunction, TSSetIJacobian >>>> and TSSetRHSJacobian and what parts of the equations go >>>> where. Especially we are unsure of how to use the concept of >>>> a shifted jacobian (TSSetIJacobian). >>>> >>>> Any advice you could provide will be highly appreciated. >>>> >>>> Barry and Emil, >>>> >>>> I am also interested in this, since I don't know how to do it. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Many thanks, >>>> Max la Cour Christensen >>>> PhD student, Technical University of Denmark >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin >>>> their experiments is infinitely more interesting than any >>>> results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which >>>> their experiments lead. >>>> -- Norbert Wiener >> From steena.hpc at gmail.com Tue Mar 15 10:54:40 2016 From: steena.hpc at gmail.com (Steena Monteiro) Date: Tue, 15 Mar 2016 08:54:40 -0700 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: Thank you, Dave. Matt: I understand the inconsistency but MatMult with non divisible block sizes (here, 2) does not throw any errors and fail, when MatSetSize is commented out. 
Implying that 1139905 global size does work with block size 2. On 15 March 2016 at 00:12, Dave May wrote: > > On 15 March 2016 at 04:46, Matthew Knepley wrote: > >> On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro >> wrote: >> >>> Hello, >>> >>> I am having difficulty getting MatSetSize to work prior to using MatMult. >>> >>> For matrix A with rows=cols=1,139,905 and block size = 2, >>> >> >> It is inconsistent to have a row/col size that is not divisible by the >> block size. >> > > > To be honest, I don't think the error message being thrown clearly > indicates what the actual problem is (hence the email from Steena). What > about > > "Cannot change/reset row sizes to 400000 local 1139906 global after > previously setting them to 400000 local 1139905 global. Local and global > sizes must be divisible by the block size" > > >> >> Matt >> >> >>> rank 0 gets 400000 rows and rank 1 739905 rows, like so: >>> >>> /*Matrix setup*/ >>> >>> ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >>> ierr = MatCreate(PETSC_COMM_WORLD,&A); >>> ierr = MatSetFromOptions(A); >>> ierr = MatSetType(A,MATBAIJ); >>> ierr = MatSetBlockSize(A,2); >>> >>> /*Unequal row assignment*/ >>> >>> if (!rank) { >>> ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >>> 1139905,1139905);CHKERRQ(ierr); >>> } >>> else { >>> ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >>> 1139905,1139905);CHKERRQ(ierr); >>> } >>> >>> MatMult (A,x,y); >>> >>> /************************************/ >>> >>> Error message: >>> >>> 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this >>> object type >>> Cannot change/reset row sizes to 400000 local 1139906 global after >>> previously setting them to 400000 local 1139905 global >>> >>> [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 >>> local 1139906 global after previously setting them to 739905 local 1139905 >>> global >>> >>> -Without messing with row assignment, MatMult works fine on this matrix >>> for block size = 2, presumably because an extra padded row is automatically >>> added to facilitate blocking. >>> >>> -The above code snippet works well for block size = 1. >>> >>> Is it possible to do unequal row distribution *while using blocking*? >>> >>> Thank you for any advice. >>> >>> -Steena >>> >>> >>> >>> >>> >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 15 10:58:05 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 15 Mar 2016 10:58:05 -0500 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro wrote: > Thank you, Dave. > > Matt: I understand the inconsistency but MatMult with non divisible block > sizes (here, 2) does not throw any errors and fail, when MatSetSize is > commented out. Implying that 1139905 global size does work with block size > 2. > If you comment out MatSetSize(), how does it know what size the Mat is? Matt > On 15 March 2016 at 00:12, Dave May wrote: > >> >> On 15 March 2016 at 04:46, Matthew Knepley wrote: >> >>> On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro >>> wrote: >>> >>>> Hello, >>>> >>>> I am having difficulty getting MatSetSize to work prior to using >>>> MatMult. 
>>>> >>>> For matrix A with rows=cols=1,139,905 and block size = 2, >>>> >>> >>> It is inconsistent to have a row/col size that is not divisible by the >>> block size. >>> >> >> >> To be honest, I don't think the error message being thrown clearly >> indicates what the actual problem is (hence the email from Steena). What >> about >> >> "Cannot change/reset row sizes to 400000 local 1139906 global after >> previously setting them to 400000 local 1139905 global. Local and global >> sizes must be divisible by the block size" >> >> >>> >>> Matt >>> >>> >>>> rank 0 gets 400000 rows and rank 1 739905 rows, like so: >>>> >>>> /*Matrix setup*/ >>>> >>>> ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >>>> ierr = MatCreate(PETSC_COMM_WORLD,&A); >>>> ierr = MatSetFromOptions(A); >>>> ierr = MatSetType(A,MATBAIJ); >>>> ierr = MatSetBlockSize(A,2); >>>> >>>> /*Unequal row assignment*/ >>>> >>>> if (!rank) { >>>> ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >>>> 1139905,1139905);CHKERRQ(ierr); >>>> } >>>> else { >>>> ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >>>> 1139905,1139905);CHKERRQ(ierr); >>>> } >>>> >>>> MatMult (A,x,y); >>>> >>>> /************************************/ >>>> >>>> Error message: >>>> >>>> 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this >>>> object type >>>> Cannot change/reset row sizes to 400000 local 1139906 global after >>>> previously setting them to 400000 local 1139905 global >>>> >>>> [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 >>>> local 1139906 global after previously setting them to 739905 local 1139905 >>>> global >>>> >>>> -Without messing with row assignment, MatMult works fine on this >>>> matrix for block size = 2, presumably because an extra padded row is >>>> automatically added to facilitate blocking. >>>> >>>> -The above code snippet works well for block size = 1. >>>> >>>> Is it possible to do unequal row distribution *while using blocking*? >>>> >>>> Thank you for any advice. >>>> >>>> -Steena >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steena.hpc at gmail.com Tue Mar 15 11:04:36 2016 From: steena.hpc at gmail.com (Steena Monteiro) Date: Tue, 15 Mar 2016 09:04:36 -0700 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: I pass a binary, matrix data file at the command line and load it into the matrix: PetscInitialize(&argc,&args,(char*)0,help); ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); /* converted mtx to dat file*/ ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); CHKERRQ(ierr); if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat file with -f"); /* Load matrices */ ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); ierr = MatSetFromOptions(A);CHKERRQ(ierr); Thanks, Steena On 15 March 2016 at 08:58, Matthew Knepley wrote: > On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro > wrote: > >> Thank you, Dave. >> >> Matt: I understand the inconsistency but MatMult with non divisible block >> sizes (here, 2) does not throw any errors and fail, when MatSetSize is >> commented out. Implying that 1139905 global size does work with block size >> 2. >> > > If you comment out MatSetSize(), how does it know what size the Mat is? > > Matt > > >> On 15 March 2016 at 00:12, Dave May wrote: >> >>> >>> On 15 March 2016 at 04:46, Matthew Knepley wrote: >>> >>>> On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro >>> > wrote: >>>> >>>>> Hello, >>>>> >>>>> I am having difficulty getting MatSetSize to work prior to using >>>>> MatMult. >>>>> >>>>> For matrix A with rows=cols=1,139,905 and block size = 2, >>>>> >>>> >>>> It is inconsistent to have a row/col size that is not divisible by the >>>> block size. >>>> >>> >>> >>> To be honest, I don't think the error message being thrown clearly >>> indicates what the actual problem is (hence the email from Steena). What >>> about >>> >>> "Cannot change/reset row sizes to 400000 local 1139906 global after >>> previously setting them to 400000 local 1139905 global. 
Local and global >>> sizes must be divisible by the block size" >>> >>> >>>> >>>> Matt >>>> >>>> >>>>> rank 0 gets 400000 rows and rank 1 739905 rows, like so: >>>>> >>>>> /*Matrix setup*/ >>>>> >>>>> ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >>>>> ierr = MatCreate(PETSC_COMM_WORLD,&A); >>>>> ierr = MatSetFromOptions(A); >>>>> ierr = MatSetType(A,MATBAIJ); >>>>> ierr = MatSetBlockSize(A,2); >>>>> >>>>> /*Unequal row assignment*/ >>>>> >>>>> if (!rank) { >>>>> ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >>>>> 1139905,1139905);CHKERRQ(ierr); >>>>> } >>>>> else { >>>>> ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >>>>> 1139905,1139905);CHKERRQ(ierr); >>>>> } >>>>> >>>>> MatMult (A,x,y); >>>>> >>>>> /************************************/ >>>>> >>>>> Error message: >>>>> >>>>> 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this >>>>> object type >>>>> Cannot change/reset row sizes to 400000 local 1139906 global after >>>>> previously setting them to 400000 local 1139905 global >>>>> >>>>> [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to >>>>> 739905 local 1139906 global after previously setting them to 739905 local >>>>> 1139905 global >>>>> >>>>> -Without messing with row assignment, MatMult works fine on this >>>>> matrix for block size = 2, presumably because an extra padded row is >>>>> automatically added to facilitate blocking. >>>>> >>>>> -The above code snippet works well for block size = 1. >>>>> >>>>> Is it possible to do unequal row distribution *while using blocking*? >>>>> >>>>> Thank you for any advice. >>>>> >>>>> -Steena >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 15 11:15:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 15 Mar 2016 11:15:01 -0500 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: On Tue, Mar 15, 2016 at 11:04 AM, Steena Monteiro wrote: > I pass a binary, matrix data file at the command line and load it into the > matrix: > > PetscInitialize(&argc,&args,(char*)0,help); > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > > /* converted mtx to dat file*/ > ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); > CHKERRQ(ierr); > > if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat file > with -f"); > > /* Load matrices */ > ierr = > PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); > ierr = > PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); > ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); > ierr = MatSetFromOptions(A);CHKERRQ(ierr); > Nothing above loads a matrix. Do you also call MatLoad()? Matt > Thanks, > Steena > > On 15 March 2016 at 08:58, Matthew Knepley wrote: > >> On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro >> wrote: >> >>> Thank you, Dave. 
>>> >>> Matt: I understand the inconsistency but MatMult with non divisible >>> block sizes (here, 2) does not throw any errors and fail, when MatSetSize >>> is commented out. Implying that 1139905 global size does work with block >>> size 2. >>> >> >> If you comment out MatSetSize(), how does it know what size the Mat is? >> >> Matt >> >> >>> On 15 March 2016 at 00:12, Dave May wrote: >>> >>>> >>>> On 15 March 2016 at 04:46, Matthew Knepley wrote: >>>> >>>>> On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro < >>>>> steena.hpc at gmail.com> wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> I am having difficulty getting MatSetSize to work prior to using >>>>>> MatMult. >>>>>> >>>>>> For matrix A with rows=cols=1,139,905 and block size = 2, >>>>>> >>>>> >>>>> It is inconsistent to have a row/col size that is not divisible by the >>>>> block size. >>>>> >>>> >>>> >>>> To be honest, I don't think the error message being thrown clearly >>>> indicates what the actual problem is (hence the email from Steena). What >>>> about >>>> >>>> "Cannot change/reset row sizes to 400000 local 1139906 global after >>>> previously setting them to 400000 local 1139905 global. Local and global >>>> sizes must be divisible by the block size" >>>> >>>> >>>>> >>>>> Matt >>>>> >>>>> >>>>>> rank 0 gets 400000 rows and rank 1 739905 rows, like so: >>>>>> >>>>>> /*Matrix setup*/ >>>>>> >>>>>> ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >>>>>> ierr = MatCreate(PETSC_COMM_WORLD,&A); >>>>>> ierr = MatSetFromOptions(A); >>>>>> ierr = MatSetType(A,MATBAIJ); >>>>>> ierr = MatSetBlockSize(A,2); >>>>>> >>>>>> /*Unequal row assignment*/ >>>>>> >>>>>> if (!rank) { >>>>>> ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >>>>>> 1139905,1139905);CHKERRQ(ierr); >>>>>> } >>>>>> else { >>>>>> ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >>>>>> 1139905,1139905);CHKERRQ(ierr); >>>>>> } >>>>>> >>>>>> MatMult (A,x,y); >>>>>> >>>>>> /************************************/ >>>>>> >>>>>> Error message: >>>>>> >>>>>> 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this >>>>>> object type >>>>>> Cannot change/reset row sizes to 400000 local 1139906 global after >>>>>> previously setting them to 400000 local 1139905 global >>>>>> >>>>>> [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to >>>>>> 739905 local 1139906 global after previously setting them to 739905 local >>>>>> 1139905 global >>>>>> >>>>>> -Without messing with row assignment, MatMult works fine on this >>>>>> matrix for block size = 2, presumably because an extra padded row is >>>>>> automatically added to facilitate blocking. >>>>>> >>>>>> -The above code snippet works well for block size = 1. >>>>>> >>>>>> Is it possible to do unequal row distribution *while using blocking*? >>>>>> >>>>>> Thank you for any advice. >>>>>> >>>>>> -Steena >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Tue Mar 15 11:15:54 2016 From: hongzhang at anl.gov (Hong Zhang) Date: Tue, 15 Mar 2016 11:15:54 -0500 Subject: [petsc-users] Using TS In-Reply-To: <38B910AE-0DC0-4C61-97E9-5C678D736B50@dtu.dk> References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> <56E4D1B1.4040901@mcs.anl.gov> <56E4D85E.6060102@mcs.anl.gov> <38B910AE-0DC0-4C61-97E9-5C678D736B50@dtu.dk> Message-ID: <65987AF5-986D-4389-9B3E-08D9AE9E65E1@anl.gov> > On Mar 15, 2016, at 6:33 AM, Max la Cour Christensen wrote: > > > Barry, Matt, Emil, thanks for the explanations! They are very useful. > > Emil: The application is the flow of oil and gas in the subsurface. The code uses a model of the rock in the subsurface and given a setup of wells, it computes a prediction for the production of oil and gas. It can also be used to model CO2 storage in the subsurface. > > I think we will attempt to switch to TS. We are working with different formulations of the equations. In the following, I have summarised how I think IFunction and IJacobian should be implemented for the formulation of the most interest to us. If you can bare with me and confirm this that would be great. The equations are: > > dm_o/dt = f_o(m,p) > dm_g/dt = f_g(m,p) > S_o(m,p) + S_g(m,p) = 1, > > where > > m_o is the mass of oil > m_g is the mass of gas > t is time, > f(m_x,p) contains fluxes for mass transfer between adjacent cells and source terms for wells > S_o is the oil saturation > S_g is the gas saturation > p is the pressure. > > In this formulation, the primary variables are m_o, m_g and p. > > The above looks simpler than it is. The code to evaluate these things is around 90k lines of Fortran. > > If I understood correctly, IFunction and IJacobian would implement the following: > > IFunction: > > f_o(m,p) - dm_o/dt > f_g(m,p) - dm_g/dt > S_o(m,p) + S_g(m,p) - 1, > > where dm_o/dt and dm_g/dt is given by the time derivative (u_t in PETSc) from IFunction() > > IJacobian: > > First 2 equations: \partial f(m,p) / \partial m - a and \partial f(m,p) / \partial p > Last equation: \partial (S_o(m,p)+S_g(m,p)-1) / \partial m and \partial (S_o(m,p)+S_g(m,p)-1) / \partial p, > > where a comes from PETSc IJacobian(). > > Is this correct usage? > > Emil: Now for the adjoint implementation, we will need to provide a function for TSAdjointSetRHSJacobian, which computes the derivative of our system wrt. our control variables (control variables being the p in the PETSc manual) and some additional cost stuff: TSSetCostGradients() and possibly TSSetCostIntegrand(). Is this correct? Our cost function would typically be to maximize profit of an oil reservoir. In a simple way, this could be to maximize the number of oil barrels produced over the lifetime of the reservoir (e.g. 50 years). In the equations above, the number of oil barrels is contained in the source terms for the wells. If your cost function depends on the intermediate states of the simulation, e.g. in an integral form \int F(m_o,m_g,p;c) dt, where c is your control variable, then the integrand F() needs to be provided with TSSetCostIntegrand() so that PETSc can evaluate the cost function and its sensitivities to the control variables. If the cost function depends only on the final states, e.g. the values of m_o,m_g,p at the end of simulation, this function is not needed. By the way, is the number of oil barrels represented in integer or real values? 
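To spell the distinction out for this application, using the notation of the equations above: an objective of integral type, say \Psi(c) = \int_0^T q_well(m_o(t), m_g(t), p(t); c) dt, where q_well stands (purely for illustration, it is not a name used in PETSc or in the code discussed here) for the well source term that carries the produced barrels and c for the control variables, depends on the intermediate states and is therefore the TSSetCostIntegrand() case. An objective evaluated only at the final time, say \Psi(c) = g(m_o(T), m_g(T), p(T)), would need TSSetCostGradients() alone.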
Hong > Many thanks, > Max > > > On Mar 13, 2016, at 4:02 AM, Emil Constantinescu > > wrote: > >> On 3/12/16 8:37 PM, Matthew Knepley wrote: >>> On Sat, Mar 12, 2016 at 8:34 PM, Emil Constantinescu >>> >> wrote: >>> >>> I also find it useful to go through one of the simple examples >>> available for TS: >>> http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/index.html >>> (ex8 may be a good start). >>> >>> As Barry suggested, you need to implement IFunction and IJacobian. >>> The argument "u" is S_o, S_w, and p stacked together and "u_t" >>> their corresponding time derivatives. The rest is calculus, but >>> following an example usually helps a lot in the beginning. >>> >>> >>> Are you guys saying that IFunction and IJacobian are enough to do the >>> adjoint system as well? >>> >> >> Pretty much yes, but it depends on the cost function. This is the beauty of discrete adjoints - if you have the Jacobian (transpose, done internally through KSP) you're done. You need IJacobian for sure to do the backward propagation. If you have that, the rest is usually trivial. Mr. Hong Zhang (my postdoc) set up quite a few simple examples. >> >> Emil >> >>> Matt >>> >>> Out of curiosity, what is the application? >>> >>> Emil >>> >>> >>> On 3/12/16 3:19 PM, Barry Smith wrote: >>> >>> >>> This is only a starting point, Jed and Emil can fix my >>> mistakes and provide additional details. >>> >>> >>> In your case you will not provide a TSSetRHSFunction and >>> TSSetRHSJacobian since everything should be treated implicitly >>> as a DAE. >>> >>> First move everything in the three equations to the left >>> side and then differentiate through the \partial/\partial t so >>> that it only applies to the S_o, S_w, and p. For example for the >>> first equation using the product rule twice you get something like >>> >>> \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) >>> S_o \partial \rho_o( p ) \partial t + \rho_o( p ) S_o >>> \partial \phi( p ) \partial t - F_o = 0 >>> >>> \phi( p ) \rho_o( p ) \partial S_o/ \partial t + phi( p ) >>> S_o \rho_o'(p) \partial p \partial t + \rho_o( p ) S_o \phi'( p >>> ) \partial p \partial t - F_o = 0 >>> >>> The two vector arguments to your IFunction are exactly the S_o, >>> S_w, and p and \partial S_o/ \partial t , \partial S_w/ >>> \partial t, and \partial p/ \partial t so it is immediate to >>> code up your IFunction once you have the analytic form above >>> >>> For the IJacobian and the "shift business" just remember that >>> dF/dU means take the derivative of the IFunction with respect to >>> S_o, S_w, and p treating the \partial S_o/ \partial t , >>> \partial S_w/ \partial t, and \partial p/ \partial t as if they >>> were independent of S_o, S_w, and p. For the dF/dU_t that means >>> taking the derivate with respect to the \partial S_o/ \partial t >>> , \partial S_w/ \partial t, and \partial p/ \partial t >>> treating the S_o, S_w, and p as independent of \partial S_o/ >>> \partial t , \partial S_w/ \partial t, and \partial p/ >>> \partial t. Then you just need to form the sum of the two >>> parts with the a "shift" scaling dF/dU + a*dF/dU_t >>> >>> For the third equation everything is easy. dF/dS_o = 1 dF/dS_w = >>> 1 dF/dp = 0 dF/d (\partial S_o)/\partial t = 0 (\partial >>> S_w)/\partial t = 0 (\partial p)/\partial t = 0 >>> >>> Computations for the first two equations are messy but >>> straightforward. 
For example for the first equation dF/dS_o = >>> phi( p ) \rho_o'(p) \partial p \partial t + \rho_o( p ) \phi'( >>> p ) \partial p + dF_o/dS_o and dF/d (\partial S_o)/\partial t) >>> = \phi( p ) \rho_o( p ) >>> >>> >>> Barry >>> >>> On Mar 12, 2016, at 12:04 PM, Matthew Knepley >>> >> wrote: >>> >>> On Sat, Mar 12, 2016 at 5:41 AM, Max la Cour Christensen >>> >> wrote: >>> >>> Hi guys, >>> >>> We are making preparations to implement adjoint based >>> optimisation in our in-house oil and gas reservoir >>> simulator. Currently our code uses PETSc's DMPlex, Vec, Mat, >>> KSP and PC. We are still not using SNES and TS, but instead >>> we have our own backward Euler and Newton-Raphson >>> implementation. Due to the upcoming implementation of >>> adjoints, we are considering changing the code and begin >>> using TS and SNES. >>> >>> After examining the PETSc manual and examples, we are still >>> not completely clear on how to apply TS to our system of >>> PDEs. In a simplified formulation, it can be written as: >>> >>> \partial( \phi( p ) \rho_o( p ) S_o )/ \partial t = F_o(p,S) >>> \partial( \phi( p ) \rho_w( p ) S_w )/ \partial t = F_w(p,S) >>> S_o + S_w = 1, >>> >>> where p is the pressure, >>> \phi( p ) is a porosity function depending on pressure, >>> \rho_x( p ) is a density function depending on pressure, >>> S_o is the saturation of oil, >>> S_g is the saturation of gas, >>> t is time, >>> F_x(p,S) is a function containing fluxes and source terms. >>> The primary variables are p, S_o and S_w. >>> >>> We are using a lowest order Finite Volume discretisation. >>> >>> Now for implementing this in TS (with the prospect of later >>> using TSAdjoint), we are not sure if we need all of the >>> functions: TSSetIFunction, TSSetRHSFunction, TSSetIJacobian >>> and TSSetRHSJacobian and what parts of the equations go >>> where. Especially we are unsure of how to use the concept of >>> a shifted jacobian (TSSetIJacobian). >>> >>> Any advice you could provide will be highly appreciated. >>> >>> Barry and Emil, >>> >>> I am also interested in this, since I don't know how to do it. >>> >>> Thanks, >>> >>> Matt >>> >>> Many thanks, >>> Max la Cour Christensen >>> PhD student, Technical University of Denmark >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emconsta at mcs.anl.gov Tue Mar 15 11:47:25 2016 From: emconsta at mcs.anl.gov (Emil Constantinescu) Date: Tue, 15 Mar 2016 11:47:25 -0500 Subject: [petsc-users] Using TS In-Reply-To: <65987AF5-986D-4389-9B3E-08D9AE9E65E1@anl.gov> References: <988B5ECF-8F1E-4D23-8620-37B06E328EF0@dtu.dk> <56E4D1B1.4040901@mcs.anl.gov> <56E4D85E.6060102@mcs.anl.gov> <38B910AE-0DC0-4C61-97E9-5C678D736B50@dtu.dk> <65987AF5-986D-4389-9B3E-08D9AE9E65E1@anl.gov> Message-ID: <56E83C9D.4020702@mcs.anl.gov> On 3/15/16 11:15 AM, Hong Zhang wrote: > By the way, is the number of oil barrels represented in integer or real > values? I assume they refer to the unit of measure (1 barrel = about 160 liters) not the actual container, so we should be good there. 
Emil From steena.hpc at gmail.com Tue Mar 15 12:14:16 2016 From: steena.hpc at gmail.com (Steena Monteiro) Date: Tue, 15 Mar 2016 10:14:16 -0700 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: Hi Matt, Yes, the rest of the code is: ierr = MatSetBlockSize(A,2);CHKERRQ(ierr); * ierr = MatLoad(A,fd);CHKERRQ(ierr);* ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr); ierr = VecSetRandom(x,NULL); CHKERRQ(ierr); ierr = VecSet(y,zero); CHKERRQ(ierr); ierr = MatMult(A,x,y); CHKERRQ(ierr); ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); ierr = MatDestroy(&A);CHKERRQ(ierr); ierr = VecDestroy(&x);CHKERRQ(ierr); ierr = VecDestroy(&y);CHKERRQ(ierr); Thanks, Steena On 15 March 2016 at 09:15, Matthew Knepley wrote: > On Tue, Mar 15, 2016 at 11:04 AM, Steena Monteiro > wrote: > >> I pass a binary, matrix data file at the command line and load it into >> the matrix: >> >> PetscInitialize(&argc,&args,(char*)0,help); >> ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); >> >> /* converted mtx to dat file*/ >> ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); >> CHKERRQ(ierr); >> >> if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat >> file with -f"); >> >> /* Load matrices */ >> ierr = >> PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); >> ierr = >> PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); >> ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); >> ierr = MatSetFromOptions(A);CHKERRQ(ierr); >> > > Nothing above loads a matrix. Do you also call MatLoad()? > > Matt > > >> Thanks, >> Steena >> >> On 15 March 2016 at 08:58, Matthew Knepley wrote: >> >>> On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro >>> wrote: >>> >>>> Thank you, Dave. >>>> >>>> Matt: I understand the inconsistency but MatMult with non divisible >>>> block sizes (here, 2) does not throw any errors and fail, when MatSetSize >>>> is commented out. Implying that 1139905 global size does work with block >>>> size 2. >>>> >>> >>> If you comment out MatSetSize(), how does it know what size the Mat is? >>> >>> Matt >>> >>> >>>> On 15 March 2016 at 00:12, Dave May wrote: >>>> >>>>> >>>>> On 15 March 2016 at 04:46, Matthew Knepley wrote: >>>>> >>>>>> On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro < >>>>>> steena.hpc at gmail.com> wrote: >>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> I am having difficulty getting MatSetSize to work prior to using >>>>>>> MatMult. >>>>>>> >>>>>>> For matrix A with rows=cols=1,139,905 and block size = 2, >>>>>>> >>>>>> >>>>>> It is inconsistent to have a row/col size that is not divisible by >>>>>> the block size. >>>>>> >>>>> >>>>> >>>>> To be honest, I don't think the error message being thrown clearly >>>>> indicates what the actual problem is (hence the email from Steena). What >>>>> about >>>>> >>>>> "Cannot change/reset row sizes to 400000 local 1139906 global after >>>>> previously setting them to 400000 local 1139905 global. 
Local and global >>>>> sizes must be divisible by the block size" >>>>> >>>>> >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> rank 0 gets 400000 rows and rank 1 739905 rows, like so: >>>>>>> >>>>>>> /*Matrix setup*/ >>>>>>> >>>>>>> ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >>>>>>> ierr = MatCreate(PETSC_COMM_WORLD,&A); >>>>>>> ierr = MatSetFromOptions(A); >>>>>>> ierr = MatSetType(A,MATBAIJ); >>>>>>> ierr = MatSetBlockSize(A,2); >>>>>>> >>>>>>> /*Unequal row assignment*/ >>>>>>> >>>>>>> if (!rank) { >>>>>>> ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >>>>>>> 1139905,1139905);CHKERRQ(ierr); >>>>>>> } >>>>>>> else { >>>>>>> ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >>>>>>> 1139905,1139905);CHKERRQ(ierr); >>>>>>> } >>>>>>> >>>>>>> MatMult (A,x,y); >>>>>>> >>>>>>> /************************************/ >>>>>>> >>>>>>> Error message: >>>>>>> >>>>>>> 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for >>>>>>> this object type >>>>>>> Cannot change/reset row sizes to 400000 local 1139906 global after >>>>>>> previously setting them to 400000 local 1139905 global >>>>>>> >>>>>>> [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to >>>>>>> 739905 local 1139906 global after previously setting them to 739905 local >>>>>>> 1139905 global >>>>>>> >>>>>>> -Without messing with row assignment, MatMult works fine on this >>>>>>> matrix for block size = 2, presumably because an extra padded row is >>>>>>> automatically added to facilitate blocking. >>>>>>> >>>>>>> -The above code snippet works well for block size = 1. >>>>>>> >>>>>>> Is it possible to do unequal row distribution *while using blocking* >>>>>>> ? >>>>>>> >>>>>>> Thank you for any advice. >>>>>>> >>>>>>> -Steena >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 15 12:17:29 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 15 Mar 2016 12:17:29 -0500 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: On Tue, Mar 15, 2016 at 12:14 PM, Steena Monteiro wrote: > Hi Matt, > > Yes, the rest of the code is: > > ierr = MatSetBlockSize(A,2);CHKERRQ(ierr); > * ierr = MatLoad(A,fd);CHKERRQ(ierr);* > So, are you saying that 1) You have a matrix with odd total dimension 2) You set the block size of the initial matrix to 2 3) You load the matrix and there is no error? Can you make a simple example with a matrix of size 5? I can put in the relevant error checking. 
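For reference, a reproducer of the kind being asked for could be as small as the sketch below. It is untested illustration code, not something taken from the program discussed in this thread, and whether PETSc stops you at MatSetSizes(), at MatSetUp(), or only later is exactly the question such a reproducer would answer:

#include <petscmat.h>

int main(int argc,char **argv)
{
  Mat            A;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetType(A,MATBAIJ);CHKERRQ(ierr);
  ierr = MatSetBlockSize(A,2);CHKERRQ(ierr);                         /* block size 2 */
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,5,5);CHKERRQ(ierr); /* global size 5, not divisible by 2 */
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Replacing MatSetUp() with a MatLoad() from a small 5x5 binary file would mimic the path taken in the original code more closely.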
Thanks, Matt > ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr); > > > ierr = VecSetRandom(x,NULL); CHKERRQ(ierr); > ierr = VecSet(y,zero); CHKERRQ(ierr); > ierr = MatMult(A,x,y); CHKERRQ(ierr); > > > ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); > ierr = MatDestroy(&A);CHKERRQ(ierr); > ierr = VecDestroy(&x);CHKERRQ(ierr); > ierr = VecDestroy(&y);CHKERRQ(ierr); > > Thanks, > Steena > > > On 15 March 2016 at 09:15, Matthew Knepley wrote: > >> On Tue, Mar 15, 2016 at 11:04 AM, Steena Monteiro >> wrote: >> >>> I pass a binary, matrix data file at the command line and load it into >>> the matrix: >>> >>> PetscInitialize(&argc,&args,(char*)0,help); >>> ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); >>> >>> /* converted mtx to dat file*/ >>> ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); >>> CHKERRQ(ierr); >>> >>> if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat >>> file with -f"); >>> >>> /* Load matrices */ >>> ierr = >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); >>> ierr = >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); >>> ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); >>> ierr = MatSetFromOptions(A);CHKERRQ(ierr); >>> >> >> Nothing above loads a matrix. Do you also call MatLoad()? >> >> Matt >> >> >>> Thanks, >>> Steena >>> >>> On 15 March 2016 at 08:58, Matthew Knepley wrote: >>> >>>> On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro >>> > wrote: >>>> >>>>> Thank you, Dave. >>>>> >>>>> Matt: I understand the inconsistency but MatMult with non divisible >>>>> block sizes (here, 2) does not throw any errors and fail, when MatSetSize >>>>> is commented out. Implying that 1139905 global size does work with block >>>>> size 2. >>>>> >>>> >>>> If you comment out MatSetSize(), how does it know what size the Mat is? >>>> >>>> Matt >>>> >>>> >>>>> On 15 March 2016 at 00:12, Dave May wrote: >>>>> >>>>>> >>>>>> On 15 March 2016 at 04:46, Matthew Knepley wrote: >>>>>> >>>>>>> On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro < >>>>>>> steena.hpc at gmail.com> wrote: >>>>>>> >>>>>>>> Hello, >>>>>>>> >>>>>>>> I am having difficulty getting MatSetSize to work prior to using >>>>>>>> MatMult. >>>>>>>> >>>>>>>> For matrix A with rows=cols=1,139,905 and block size = 2, >>>>>>>> >>>>>>> >>>>>>> It is inconsistent to have a row/col size that is not divisible by >>>>>>> the block size. >>>>>>> >>>>>> >>>>>> >>>>>> To be honest, I don't think the error message being thrown clearly >>>>>> indicates what the actual problem is (hence the email from Steena). What >>>>>> about >>>>>> >>>>>> "Cannot change/reset row sizes to 400000 local 1139906 global after >>>>>> previously setting them to 400000 local 1139905 global. 
Local and global >>>>>> sizes must be divisible by the block size" >>>>>> >>>>>> >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> rank 0 gets 400000 rows and rank 1 739905 rows, like so: >>>>>>>> >>>>>>>> /*Matrix setup*/ >>>>>>>> >>>>>>>> >>>>>>>> ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >>>>>>>> ierr = MatCreate(PETSC_COMM_WORLD,&A); >>>>>>>> ierr = MatSetFromOptions(A); >>>>>>>> ierr = MatSetType(A,MATBAIJ); >>>>>>>> ierr = MatSetBlockSize(A,2); >>>>>>>> >>>>>>>> /*Unequal row assignment*/ >>>>>>>> >>>>>>>> if (!rank) { >>>>>>>> ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >>>>>>>> 1139905,1139905);CHKERRQ(ierr); >>>>>>>> } >>>>>>>> else { >>>>>>>> ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >>>>>>>> 1139905,1139905);CHKERRQ(ierr); >>>>>>>> } >>>>>>>> >>>>>>>> MatMult (A,x,y); >>>>>>>> >>>>>>>> /************************************/ >>>>>>>> >>>>>>>> Error message: >>>>>>>> >>>>>>>> 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for >>>>>>>> this object type >>>>>>>> Cannot change/reset row sizes to 400000 local 1139906 global after >>>>>>>> previously setting them to 400000 local 1139905 global >>>>>>>> >>>>>>>> [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to >>>>>>>> 739905 local 1139906 global after previously setting them to 739905 local >>>>>>>> 1139905 global >>>>>>>> >>>>>>>> -Without messing with row assignment, MatMult works fine on this >>>>>>>> matrix for block size = 2, presumably because an extra padded row is >>>>>>>> automatically added to facilitate blocking. >>>>>>>> >>>>>>>> -The above code snippet works well for block size = 1. >>>>>>>> >>>>>>>> Is it possible to do unequal row distribution *while using >>>>>>>> blocking*? >>>>>>>> >>>>>>>> Thank you for any advice. >>>>>>>> >>>>>>>> -Steena >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.R.Smith at jhuapl.edu Tue Mar 15 12:24:12 2016 From: Kevin.R.Smith at jhuapl.edu (Smith, Kevin R.) Date: Tue, 15 Mar 2016 17:24:12 +0000 Subject: [petsc-users] Matrix reuse with changing structure Message-ID: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> Hello, Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I'm hoping to avoid destroying and allocating new matrices each time the structure changes. Thanks, Kevin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Mar 15 12:28:57 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 15 Mar 2016 12:28:57 -0500 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> Message-ID: On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. wrote: > Hello, > > > > Is it possible to reuse a sparse matrix and not reuse its non-zero > structure? I have a matrix whose sparse structure changes each time. I?m > hoping to avoid destroying and allocating new matrices each time the > structure changes. > Hmm, I can't find a toplevel API that does this (it would be something like MatReset()). You can get this effect using MatSetType(A, MATSHELL) MatSetType(A, ) A little messy but it will work. Thanks, Matt > > > Thanks, > > Kevin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 15 12:47:53 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 15 Mar 2016 12:47:53 -0500 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> Message-ID: <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> > On Mar 15, 2016, at 12:28 PM, Matthew Knepley wrote: > > On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. wrote: > Hello, > > > > Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I?m hoping to avoid destroying and allocating new matrices each time the structure changes. Sure you can just add new nonzero locations at a later time. But it won't take out any current entries even if they are zero. So effectively the nonzero structure grows over time. Barry > > > Hmm, I can't find a toplevel API that does this (it would be something like MatReset()). You can get this effect using > > MatSetType(A, MATSHELL) > MatSetType(A, ) > > A little messy but it will work. > > Thanks, > > Matt > > > > Thanks, > > Kevin > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From Kevin.R.Smith at jhuapl.edu Tue Mar 15 12:56:28 2016 From: Kevin.R.Smith at jhuapl.edu (Smith, Kevin R.) Date: Tue, 15 Mar 2016 17:56:28 +0000 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> Message-ID: <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> Barry - Yeah, I suspected it was doing this. In my original implementation I would get allocation errors. Matt - Thanks, I will try this solution out. Thanks for your help, Kevin -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Tuesday, March 15, 2016 1:48 PM To: Matthew Knepley Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix reuse with changing structure > On Mar 15, 2016, at 12:28 PM, Matthew Knepley wrote: > > On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. 
wrote: > Hello, > > > > Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I?m hoping to avoid destroying and allocating new matrices each time the structure changes. Sure you can just add new nonzero locations at a later time. But it won't take out any current entries even if they are zero. So effectively the nonzero structure grows over time. Barry > > > Hmm, I can't find a toplevel API that does this (it would be something like MatReset()). You can get this effect using > > MatSetType(A, MATSHELL) > MatSetType(A, ) > > A little messy but it will work. > > Thanks, > > Matt > > > > Thanks, > > Kevin > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Tue Mar 15 13:01:43 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 15 Mar 2016 13:01:43 -0500 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> Message-ID: <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. wrote: > > > Barry - Yeah, I suspected it was doing this. In my original implementation I would get allocation errors. You need to call a MatSetOption() to tell the matrix that it is allowed to allocate new nonzeros. > > Matt - Thanks, I will try this solution out. > > Thanks for your help, > > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, March 15, 2016 1:48 PM > To: Matthew Knepley > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley wrote: >> >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. wrote: >> Hello, >> >> >> >> Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I?m hoping to avoid destroying and allocating new matrices each time the structure changes. > > Sure you can just add new nonzero locations at a later time. But it won't take out any current entries even if they are zero. So effectively the nonzero structure grows over time. > > Barry > >> >> >> Hmm, I can't find a toplevel API that does this (it would be something like MatReset()). You can get this effect using >> >> MatSetType(A, MATSHELL) >> MatSetType(A, ) >> >> A little messy but it will work. >> >> Thanks, >> >> Matt >> >> >> >> Thanks, >> >> Kevin >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From bsmith at mcs.anl.gov Tue Mar 15 21:54:13 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 15 Mar 2016 21:54:13 -0500 Subject: [petsc-users] Tao TRON solver tolerances In-Reply-To: References: Message-ID: <58AB78DA-3E7F-471F-872D-04ADAEB2EB75@mcs.anl.gov> > On Mar 10, 2016, at 12:03 PM, Justin Chang wrote: > > Hi again, > > I was reading through the TAO manual and the impression I am getting is that the KSP solver computes the gradient/projection, not necessarily the solution itself. 
Meaning it matters not how accurate this projection is, so long as the actual objective tolerance is met. Not sure what you mean by this. The KSP solver computes an "update" to the solution. So long as the update is "enough of" a descent direction then you will get convergence of the optimization problem. > > Is this a correct assessment of why one can get away with a less stringent KSP tolerance and still attain an accurate solution? Of course knowing how accurate the KSP must be to insure the update is "enough of" a descent direction is impossible :-) Barry > > Thanks, > Justin > > On Tuesday, March 8, 2016, Justin Chang wrote: > Hi all, > > So I am solving a convex optimization problem of the following form: > > min 1/2 x^T*H*x - x^T*f > s.t. 0 < x < 1 > > Using the TAOTRON solver, I also have CG/ILU for KSP/PC. The following TAO solver tolerances are used for my specific problem: > > -tao_gatol 1e-12 > -tao_grtol 1e-7 > > I noticed that the KSP tolerance truly defines the performance of this solver. Attached are three run cases with -ksp_rtol 1e-7, 1e-3, and 1e-1 with "-ksp_converged_reason -ksp_monitor_true_residual -tao_view -tao_converged_reason -log_view". It seems that the lower the KSP tolerance, the faster the time-to-solution where the number of KSP/TAO solve iterations remains roughly the same. > > So my question is, is this "normal"? That is, if using TRON, one may relax the KSP tolerances because the convergence of the solver is primarily due to the objective functional from TRON and not necessarily the KSP solve itself? Is there a general rule of thumb for this, because it would seem to me that for any TRON solve I do, i could just set a really low KSP rtol and still get roughly the same performance. > > Thanks, > Justin > From jychang48 at gmail.com Tue Mar 15 22:31:42 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 15 Mar 2016 22:31:42 -0500 Subject: [petsc-users] Tao TRON solver tolerances In-Reply-To: <58AB78DA-3E7F-471F-872D-04ADAEB2EB75@mcs.anl.gov> References: <58AB78DA-3E7F-471F-872D-04ADAEB2EB75@mcs.anl.gov> Message-ID: Thanks Barry. Yes, KSP solver computing an "update" to the solution makes sense. I also get that it's impossible to know whether this update is "enough of" a descent direction. What I am wondering, however, is why TAO no longer has -tao_fatol or -tao_frtol options. 9 months ago, TAO had those options and that's what I used for my problems. Back then, when I used BLMVM, it took ~900 tao iterations and ~5000 seconds of wall-clock time for one of my problems. Today, when I run the exact same problem but with -tao_grtol, it now takes ~1900 iterations and ~1000 seconds. Same solution, but twice the amount of work. I am guessing this means that the gradient tolerance need not be as stringent as the objective functional tolerance. But do you guys know why it was removed in the first place? Thanks, Justin On Tue, Mar 15, 2016 at 9:54 PM, Barry Smith wrote: > > > On Mar 10, 2016, at 12:03 PM, Justin Chang wrote: > > > > Hi again, > > > > I was reading through the TAO manual and the impression I am getting is > that the KSP solver computes the gradient/projection, not necessarily the > solution itself. Meaning it matters not how accurate this projection is, so > long as the actual objective tolerance is met. > > Not sure what you mean by this. > > The KSP solver computes an "update" to the solution. So long as the > update is "enough of" a descent direction then you will get convergence of > the optimization problem. 
> > > > > Is this a correct assessment of why one can get away with a less > stringent KSP tolerance and still attain an accurate solution? > > Of course knowing how accurate the KSP must be to insure the update is > "enough of" a descent direction is impossible :-) > > Barry > > > > > Thanks, > > Justin > > > > On Tuesday, March 8, 2016, Justin Chang wrote: > > Hi all, > > > > So I am solving a convex optimization problem of the following form: > > > > min 1/2 x^T*H*x - x^T*f > > s.t. 0 < x < 1 > > > > Using the TAOTRON solver, I also have CG/ILU for KSP/PC. The following > TAO solver tolerances are used for my specific problem: > > > > -tao_gatol 1e-12 > > -tao_grtol 1e-7 > > > > I noticed that the KSP tolerance truly defines the performance of this > solver. Attached are three run cases with -ksp_rtol 1e-7, 1e-3, and 1e-1 > with "-ksp_converged_reason -ksp_monitor_true_residual -tao_view > -tao_converged_reason -log_view". It seems that the lower the KSP > tolerance, the faster the time-to-solution where the number of KSP/TAO > solve iterations remains roughly the same. > > > > So my question is, is this "normal"? That is, if using TRON, one may > relax the KSP tolerances because the convergence of the solver is primarily > due to the objective functional from TRON and not necessarily the KSP solve > itself? Is there a general rule of thumb for this, because it would seem to > me that for any TRON solve I do, i could just set a really low KSP rtol and > still get roughly the same performance. > > > > Thanks, > > Justin > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 15 23:00:43 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 15 Mar 2016 23:00:43 -0500 Subject: [petsc-users] Tao TRON solver tolerances In-Reply-To: References: <58AB78DA-3E7F-471F-872D-04ADAEB2EB75@mcs.anl.gov> Message-ID: > On Mar 15, 2016, at 10:31 PM, Justin Chang wrote: > > Thanks Barry. Yes, KSP solver computing an "update" to the solution makes sense. I also get that it's impossible to know whether this update is "enough of" a descent direction. > > What I am wondering, however, is why TAO no longer has -tao_fatol or -tao_frtol options. 9 months ago, TAO had those options and that's what I used for my problems. Back then, when I used BLMVM, it took ~900 tao iterations and ~5000 seconds of wall-clock time for one of my problems. Today, when I run the exact same problem but with -tao_grtol, it now takes ~1900 iterations and ~1000 seconds. Same solution, but twice the amount of work. > > I am guessing this means that the gradient tolerance need not be as stringent as the objective functional tolerance. But do you guys know why it was removed in the first place? We considered them redundant and not a good measure of the actual quality of the solution. The problem with all of the convergence criteria is that they do not measure the error of your solution, they are only some "indirect" measure of the error, that is we measure something that we can measure and then "hope" that what we measure actually has something to do with the actual error. You should be able to use a lesser tolerance with the grtol then you currently use and yet still get as "quality" a solution as you previously got. I suggest "calibrating" what you should set for grtol by running "the old code" with fa or frtol and determining what you need to use for grtol to get a similar reduction and then use that "calibrated" grtol for future runs. 
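For readers who want to apply the calibration idea described above in code rather than on the command line, a minimal sketch follows. It assumes the post-3.7 interface in which TaoSetTolerances() takes only the gradient-based tolerances (gatol, grtol, gttol); the 1e-12/1e-7 values simply echo the options used earlier in this thread and stand in for whatever the calibration run suggests.

#include <petsctao.h>

/* Sketch of the calibration workflow:
   1. run the old fatol/frtol-based setup once with -tao_monitor and note the
      gradient norm at the iteration where it used to stop;
   2. use that gradient reduction as grtol in subsequent runs. */
PetscErrorCode ConfigureTaoTolerances(Tao tao)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = TaoSetTolerances(tao,1e-12,1e-7,0.0);CHKERRQ(ierr); /* gatol, grtol, gttol (0 = unused) */
  ierr = TaoSetFromOptions(tao);CHKERRQ(ierr);               /* lets -tao_gatol/-tao_grtol override */
  PetscFunctionReturn(0);
}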
Barry > > Thanks, > Justin > > On Tue, Mar 15, 2016 at 9:54 PM, Barry Smith wrote: > > > On Mar 10, 2016, at 12:03 PM, Justin Chang wrote: > > > > Hi again, > > > > I was reading through the TAO manual and the impression I am getting is that the KSP solver computes the gradient/projection, not necessarily the solution itself. Meaning it matters not how accurate this projection is, so long as the actual objective tolerance is met. > > Not sure what you mean by this. > > The KSP solver computes an "update" to the solution. So long as the update is "enough of" a descent direction then you will get convergence of the optimization problem. > > > > > Is this a correct assessment of why one can get away with a less stringent KSP tolerance and still attain an accurate solution? > > Of course knowing how accurate the KSP must be to insure the update is "enough of" a descent direction is impossible :-) > > Barry > > > > > Thanks, > > Justin > > > > On Tuesday, March 8, 2016, Justin Chang wrote: > > Hi all, > > > > So I am solving a convex optimization problem of the following form: > > > > min 1/2 x^T*H*x - x^T*f > > s.t. 0 < x < 1 > > > > Using the TAOTRON solver, I also have CG/ILU for KSP/PC. The following TAO solver tolerances are used for my specific problem: > > > > -tao_gatol 1e-12 > > -tao_grtol 1e-7 > > > > I noticed that the KSP tolerance truly defines the performance of this solver. Attached are three run cases with -ksp_rtol 1e-7, 1e-3, and 1e-1 with "-ksp_converged_reason -ksp_monitor_true_residual -tao_view -tao_converged_reason -log_view". It seems that the lower the KSP tolerance, the faster the time-to-solution where the number of KSP/TAO solve iterations remains roughly the same. > > > > So my question is, is this "normal"? That is, if using TRON, one may relax the KSP tolerances because the convergence of the solver is primarily due to the objective functional from TRON and not necessarily the KSP solve itself? Is there a general rule of thumb for this, because it would seem to me that for any TRON solve I do, i could just set a really low KSP rtol and still get roughly the same performance. > > > > Thanks, > > Justin > > > > From Kevin.R.Smith at jhuapl.edu Wed Mar 16 08:23:51 2016 From: Kevin.R.Smith at jhuapl.edu (Smith, Kevin R.) Date: Wed, 16 Mar 2016 13:23:51 +0000 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> Message-ID: <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> Barry - I need to avoid dynamic allocation because it results in too much of a slowdown. Matt - The MatSetType thing did not work for some reason. It seems like PETSc wants me to set the preallocation after calling this method (suggested by the error below). I'm trying to reuse the original block of preallocated memory for the sparse matrix and blow away the structure each time. This would let me avoid repeated deallocate/allocate calls throughout my simulation. 
PETSc Function: MatGetOwnershipRange PETSc File: /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c PETSc Line: 6289 PETSc Error Code: 73 PETSc Error reference: http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatGetOwnershipRange() Kind regards, Kevin -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Tuesday, March 15, 2016 2:02 PM To: Smith, Kevin R. Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix reuse with changing structure > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. wrote: > > > Barry - Yeah, I suspected it was doing this. In my original implementation I would get allocation errors. You need to call a MatSetOption() to tell the matrix that it is allowed to allocate new nonzeros. > > Matt - Thanks, I will try this solution out. > > Thanks for your help, > > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, March 15, 2016 1:48 PM > To: Matthew Knepley > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley wrote: >> >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. wrote: >> Hello, >> >> >> >> Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I?m hoping to avoid destroying and allocating new matrices each time the structure changes. > > Sure you can just add new nonzero locations at a later time. But it won't take out any current entries even if they are zero. So effectively the nonzero structure grows over time. > > Barry > >> >> >> Hmm, I can't find a toplevel API that does this (it would be something like MatReset()). You can get this effect using >> >> MatSetType(A, MATSHELL) >> MatSetType(A, ) >> >> A little messy but it will work. >> >> Thanks, >> >> Matt >> >> >> >> Thanks, >> >> Kevin >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From knepley at gmail.com Wed Mar 16 09:36:38 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 16 Mar 2016 09:36:38 -0500 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> Message-ID: On Wed, Mar 16, 2016 at 8:23 AM, Smith, Kevin R. wrote: > Barry - I need to avoid dynamic allocation because it results in too much > of a slowdown. > Matt - The MatSetType thing did not work for some reason. It seems like > PETSc wants me to set the preallocation after calling this method > (suggested by the error below). I'm trying to reuse the original block of > preallocated memory for the sparse matrix and blow away the structure each > time. This would let me avoid repeated deallocate/allocate calls throughout > my simulation. > We need to understand the use case better. How can you guarantee that your new matrix will fit in the old matrix memroy? 
If you cannot, then we have to reallocate anyway. Thanks, Matt > PETSc Function: MatGetOwnershipRange > PETSc File: > /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c > PETSc Line: 6289 > PETSc Error Code: 73 > PETSc Error reference: > http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html > PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatGetOwnershipRange() > > Kind regards, > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, March 15, 2016 2:02 PM > To: Smith, Kevin R. > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. > wrote: > > > > > > Barry - Yeah, I suspected it was doing this. In my original > implementation I would get allocation errors. > > You need to call a MatSetOption() to tell the matrix that it is allowed > to allocate new nonzeros. > > > > > > Matt - Thanks, I will try this solution out. > > > > Thanks for your help, > > > > Kevin > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Tuesday, March 15, 2016 1:48 PM > > To: Matthew Knepley > > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley > wrote: > >> > >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > >> Hello, > >> > >> > >> > >> Is it possible to reuse a sparse matrix and not reuse its non-zero > structure? I have a matrix whose sparse structure changes each time. I?m > hoping to avoid destroying and allocating new matrices each time the > structure changes. > > > > Sure you can just add new nonzero locations at a later time. But it > won't take out any current entries even if they are zero. So effectively > the nonzero structure grows over time. > > > > Barry > > > >> > >> > >> Hmm, I can't find a toplevel API that does this (it would be something > like MatReset()). You can get this effect using > >> > >> MatSetType(A, MATSHELL) > >> MatSetType(A, ) > >> > >> A little messy but it will work. > >> > >> Thanks, > >> > >> Matt > >> > >> > >> > >> Thanks, > >> > >> Kevin > >> > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >> -- Norbert Wiener > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From chih-hao.chen2 at mail.mcgill.ca Wed Mar 16 10:51:21 2016 From: chih-hao.chen2 at mail.mcgill.ca (Chih-Hao Chen) Date: Wed, 16 Mar 2016 15:51:21 +0000 Subject: [petsc-users] Questions about ASM in petsc4py In-Reply-To: References: Message-ID: Hello Matt, Thanks for the information. After using -log_summary to print out for both cases, I found both of them(openMpi and Mvapich2) give same performance. On the other hand, I still cannot get clear ideas how to get the returned subKsps by the function of getASMSubKSP. It seems the return subKsp are saved in a list of KSP objects. So I used : subksp = pc.getASMSubKsp() subksp[0].setType(?richardson') subksp[1].setType(?richardson?) with 2 processors. 
But it showed "list index out of range?. As you said on each rank, I can get the local KSPs. Would you mind telling me how to get them? Thanks very much. Best, Chih-Hao On Mar 14, 2016, at 3:02 PM, Matthew Knepley > wrote: On Mon, Mar 14, 2016 at 12:58 PM, Chih-Hao Chen > wrote: Hell Matt, Thanks for the information. I am still trying it now. But I observed an interesting thing about the performance difference when running ex8.c about the ASM. When using Mvapich2 for mph, its convergence speed is much faster than OpenMPI. Is it becasue the ASM function has been optimized based on Mvapich2? No, it is probably because you have a different partition on the two cases. For performance questions, we need to see the output of -log_summary for each case. Thanks, Matt Thanks very much. Best, Chih-Hao On Mar 10, 2016, at 4:28 PM, Matthew Knepley > wrote: On Thu, Mar 10, 2016 at 3:23 PM, Chih-Hao Chen > wrote: Hello PETSc members, Sorry for asking for help about the ASM in petsc4py. Currently I am using your ASM as my preconditioned in my solver. I know how to setup the PCASM based on the ex8.c in the following link. http://www.mcs.anl.gov/petsc/petsc-3.4/src/ksp/ksp/examples/tutorials/ex8.c But when using the function ?getASMSubKSP? in petsc4py, I have tried several methods, but still cannot get the subksp from each mpi. The subKSPs do not have to do with MPI. On each rank, you get the local KSPs. Here is the snippet of the code of the function. def getASMSubKSP(self): cdef PetscInt i = 0, n = 0 cdef PetscKSP *p = NULL CHKERR( PCASMGetSubKSP(self.pc, &n, NULL, &p) ) return [ref_KSP(p[i]) for i from 0 <= i From Kevin.R.Smith at jhuapl.edu Wed Mar 16 11:19:14 2016 From: Kevin.R.Smith at jhuapl.edu (Smith, Kevin R.) Date: Wed, 16 Mar 2016 16:19:14 +0000 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> Message-ID: <4763937cf3a7426399ce811517c395be@APLEX07.dom1.jhuapl.edu> The use case is dynamic overset mesh coupling. I can guarantee that all matrices will fit since I have all the information I need to do so at initialization. The component meshes do not themselves change, only the overset assembly changes. I have to over-allocate memory to ensure the matrix always fits to cover the range of motion. I did figure out a way to avoid deleting the matrix every time I solve in my system, so that provides some level of an optimization if PETSc doesn?t support restructuring sparse matrices out of the box. With this approach I can also avoid over-allocating memory. However, my use case is unlikely to hit the memory limits, even if I have to over-subscribe the preallocation to cover the entire range of motion. So I think there will still be a clear benefit to doing the allocation once, and avoiding the reallocation each time structure changes. Not sure if this figure will get filtered out by the mailing list, but this shows a basic overset matrix structure. The green portions are the overset coupling regions, which may move around the matrix as the body moves. The number of coupling coefficients has a maximum N which is known, so I can preallocate to cover the entire range of motion. Hope that helps describe the use case. 
[cid:image001.png at 01D17F7E.0D243AA0] From: Matthew Knepley [mailto:knepley at gmail.com] Sent: Wednesday, March 16, 2016 10:37 AM To: Smith, Kevin R. Cc: Barry Smith; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix reuse with changing structure On Wed, Mar 16, 2016 at 8:23 AM, Smith, Kevin R. > wrote: Barry - I need to avoid dynamic allocation because it results in too much of a slowdown. Matt - The MatSetType thing did not work for some reason. It seems like PETSc wants me to set the preallocation after calling this method (suggested by the error below). I'm trying to reuse the original block of preallocated memory for the sparse matrix and blow away the structure each time. This would let me avoid repeated deallocate/allocate calls throughout my simulation. We need to understand the use case better. How can you guarantee that your new matrix will fit in the old matrix memroy? If you cannot, then we have to reallocate anyway. Thanks, Matt PETSc Function: MatGetOwnershipRange PETSc File: /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c PETSc Line: 6289 PETSc Error Code: 73 PETSc Error reference: http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatGetOwnershipRange() Kind regards, Kevin -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Tuesday, March 15, 2016 2:02 PM To: Smith, Kevin R. Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix reuse with changing structure > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. > wrote: > > > Barry - Yeah, I suspected it was doing this. In my original implementation I would get allocation errors. You need to call a MatSetOption() to tell the matrix that it is allowed to allocate new nonzeros. > > Matt - Thanks, I will try this solution out. > > Thanks for your help, > > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, March 15, 2016 1:48 PM > To: Matthew Knepley > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley > wrote: >> >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. > wrote: >> Hello, >> >> >> >> Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I?m hoping to avoid destroying and allocating new matrices each time the structure changes. > > Sure you can just add new nonzero locations at a later time. But it won't take out any current entries even if they are zero. So effectively the nonzero structure grows over time. > > Barry > >> >> >> Hmm, I can't find a toplevel API that does this (it would be something like MatReset()). You can get this effect using >> >> MatSetType(A, MATSHELL) >> MatSetType(A, ) >> >> A little messy but it will work. >> >> Thanks, >> >> Matt >> >> >> >> Thanks, >> >> Kevin >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 394372 bytes Desc: image001.png URL: From jed at jedbrown.org Wed Mar 16 11:47:56 2016 From: jed at jedbrown.org (Jed Brown) Date: Wed, 16 Mar 2016 16:47:56 +0000 Subject: [petsc-users] Questions about ASM in petsc4py In-Reply-To: References: Message-ID: <87mvpyqshv.fsf@jedbrown.org> Chih-Hao Chen writes: > Hello Matt, > > > Thanks for the information. > After using -log_summary to print out for both cases, I found both of them(openMpi and Mvapich2) give same performance. > > On the other hand, I still cannot get clear ideas how to get the returned subKsps by the function of getASMSubKSP. > It seems the return subKsp are saved in a list of KSP objects. > So I used : > > subksp = pc.getASMSubKsp() > subksp[0].setType(?richardson') > subksp[1].setType(?richardson?) > > with 2 processors. > But it showed "list index out of range?. If you do ASM with one subdomain per process, then of course only subksp[0] is valid. ASM can also be run with multiple subdomains per process, it just isn't the default. > As you said on each rank, I can get the local KSPs. > Would you mind telling me how to get them? You got them. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From chih-hao.chen2 at mail.mcgill.ca Wed Mar 16 12:03:59 2016 From: chih-hao.chen2 at mail.mcgill.ca (Chih-Hao Chen) Date: Wed, 16 Mar 2016 17:03:59 +0000 Subject: [petsc-users] Questions about ASM in petsc4py In-Reply-To: <87mvpyqshv.fsf@jedbrown.org> References: <87mvpyqshv.fsf@jedbrown.org> Message-ID: <80A24D71-FB63-45CC-B6C9-CB874EDAB8C3@mail.mcgill.ca> Hello Jed, Thanks for the information. In ex8.c about ASM example, we can use ?FOR? loop to get the subKsp from each domain with : ierr = PCASMGetSubKSP(kspPc, &nlocal, &first, &subksp) where subksp can be accessed through its index like subksp[i]. Do yo mean if I run two subdomains with two processors successfully, then I would be able to use subksp[1] to get 2nd subKsp from the 2nd processor? Am I correct? Thanks very much. Best, Chih-Hao On Mar 16, 2016, at 12:47 PM, Jed Brown > wrote: Chih-Hao Chen > writes: Hello Matt, Thanks for the information. After using -log_summary to print out for both cases, I found both of them(openMpi and Mvapich2) give same performance. On the other hand, I still cannot get clear ideas how to get the returned subKsps by the function of getASMSubKSP. It seems the return subKsp are saved in a list of KSP objects. So I used : subksp = pc.getASMSubKsp() subksp[0].setType(?richardson') subksp[1].setType(?richardson?) with 2 processors. But it showed "list index out of range?. If you do ASM with one subdomain per process, then of course only subksp[0] is valid. ASM can also be run with multiple subdomains per process, it just isn't the default. As you said on each rank, I can get the local KSPs. Would you mind telling me how to get them? You got them. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Wed Mar 16 12:15:15 2016 From: jed at jedbrown.org (Jed Brown) Date: Wed, 16 Mar 2016 17:15:15 +0000 Subject: [petsc-users] Questions about ASM in petsc4py In-Reply-To: <80A24D71-FB63-45CC-B6C9-CB874EDAB8C3@mail.mcgill.ca> References: <87mvpyqshv.fsf@jedbrown.org> <80A24D71-FB63-45CC-B6C9-CB874EDAB8C3@mail.mcgill.ca> Message-ID: <87h9g6qr8c.fsf@jedbrown.org> Chih-Hao Chen writes: > Hello Jed, > > > Thanks for the information. > In ex8.c about ASM example, > we can use ?FOR? loop to get the subKsp from each domain with : > ierr = PCASMGetSubKSP(kspPc, &nlocal, &first, &subksp) > where subksp can be accessed through its index like subksp[i]. > > Do yo mean if I run two subdomains with two processors successfully, > then I would be able to use subksp[1] to get 2nd subKsp from the 2nd processor? > Am I correct? It sounds like you are not familiar with how distributed memory/MPI works. Each process has a separate address space, so there is no way that rank 0 can have a serial object that lives on rank 1. It would be well worth your time to check out the tutorials or other MPI resources to get the hang of thinking in distributed memory. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From chih-hao.chen2 at mail.mcgill.ca Wed Mar 16 12:30:39 2016 From: chih-hao.chen2 at mail.mcgill.ca (Chih-Hao Chen) Date: Wed, 16 Mar 2016 17:30:39 +0000 Subject: [petsc-users] Questions about ASM in petsc4py In-Reply-To: <87h9g6qr8c.fsf@jedbrown.org> References: <87mvpyqshv.fsf@jedbrown.org> <80A24D71-FB63-45CC-B6C9-CB874EDAB8C3@mail.mcgill.ca> <87h9g6qr8c.fsf@jedbrown.org> Message-ID: <84A37AA3-B2AF-4F2D-A43E-8D8AAA84D5BE@mail.mcgill.ca> Hello Jed, Thanks very much, and you are correct. I will check the related material you suggested. Cheers. Best, Chih-Hao > On Mar 16, 2016, at 1:15 PM, Jed Brown wrote: > > Chih-Hao Chen writes: > >> Hello Jed, >> >> >> Thanks for the information. >> In ex8.c about ASM example, >> we can use ?FOR? loop to get the subKsp from each domain with : >> ierr = PCASMGetSubKSP(kspPc, &nlocal, &first, &subksp) >> where subksp can be accessed through its index like subksp[i]. >> >> Do yo mean if I run two subdomains with two processors successfully, >> then I would be able to use subksp[1] to get 2nd subKsp from the 2nd processor? >> Am I correct? > > It sounds like you are not familiar with how distributed memory/MPI > works. Each process has a separate address space, so there is no way > that rank 0 can have a serial object that lives on rank 1. It would be > well worth your time to check out the tutorials or other MPI resources > to get the hang of thinking in distributed memory. 
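To make the point about locally owned sub-solvers concrete, here is a minimal C sketch in the spirit of the ex8.c example cited earlier in this thread. It assumes the KSP has already been given its operators; PCASMGetSubKSP() returns only the sub-KSPs owned by the calling rank, so the loop runs over nlocal, which is 1 per rank with the default one-subdomain-per-process ASM. The same loop over the locally returned list is what one would write with petsc4py's getASMSubKSP().

#include <petscksp.h>

PetscErrorCode SetLocalASMSubSolvers(KSP ksp)
{
  PC             pc;
  KSP           *subksp;
  PetscInt       nlocal,first,i;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);
  ierr = KSPSetUp(ksp);CHKERRQ(ierr);                    /* subdomains exist only after setup */
  ierr = PCASMGetSubKSP(pc,&nlocal,&first,&subksp);CHKERRQ(ierr);
  for (i = 0; i < nlocal; i++) {                         /* never index past nlocal-1 */
    ierr = KSPSetType(subksp[i],KSPRICHARDSON);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}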
From bsmith at mcs.anl.gov Wed Mar 16 13:10:07 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 16 Mar 2016 13:10:07 -0500 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <4763937cf3a7426399ce811517c395be@APLEX07.dom1.jhuapl.edu> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> <4763937cf3a7426399ce811517c395be@APLEX07.dom1.jhuapl.edu> Message-ID: You need to do two things 1) over preallocate 2) fill up all possible nonzero entries in the matrix at the beginning before calling the first MatAssemblyBegin/End() then it should run fine and efficiently (except it will always compute with those extra zeros but so long as they are relatively small amounts that is ok). The reason you need to do 2 is that Mat "throws away" any "extra" preallocation that you provided but did not set values into in the first MatAssemblyBegin/End. If you do not do 2) then it will need to do all the mallocs later when you set the new nonzero locations and so the over preallocation doesn't help at all. Barry > On Mar 16, 2016, at 11:19 AM, Smith, Kevin R. wrote: > > The use case is dynamic overset mesh coupling. I can guarantee that all matrices will fit since I have all the information I need to do so at initialization. The component meshes do not themselves change, only the overset assembly changes. I have to over-allocate memory to ensure the matrix always fits to cover the range of motion. > > I did figure out a way to avoid deleting the matrix every time I solve in my system, so that provides some level of an optimization if PETSc doesn?t support restructuring sparse matrices out of the box. With this approach I can also avoid over-allocating memory. However, my use case is unlikely to hit the memory limits, even if I have to over-subscribe the preallocation to cover the entire range of motion. So I think there will still be a clear benefit to doing the allocation once, and avoiding the reallocation each time structure changes. > > Not sure if this figure will get filtered out by the mailing list, but this shows a basic overset matrix structure. The green portions are the overset coupling regions, which may move around the matrix as the body moves. The number of coupling coefficients has a maximum N which is known, so I can preallocate to cover the entire range of motion. Hope that helps describe the use case. > > > > From: Matthew Knepley [mailto:knepley at gmail.com] > Sent: Wednesday, March 16, 2016 10:37 AM > To: Smith, Kevin R. > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > On Wed, Mar 16, 2016 at 8:23 AM, Smith, Kevin R. wrote: > Barry - I need to avoid dynamic allocation because it results in too much of a slowdown. > Matt - The MatSetType thing did not work for some reason. It seems like PETSc wants me to set the preallocation after calling this method (suggested by the error below). I'm trying to reuse the original block of preallocated memory for the sparse matrix and blow away the structure each time. This would let me avoid repeated deallocate/allocate calls throughout my simulation. > > We need to understand the use case better. How can you guarantee that your new matrix will > fit in the old matrix memroy? If you cannot, then we have to reallocate anyway. 
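A minimal sketch of the two-step strategy described above (illustrative sizes and hypothetical array names, not the poster's code): it assumes MatMPIAIJSetPreallocation() or the BAIJ equivalent has already been called with the worst-case per-row counts, it inserts explicit zeros in every location a row could ever use before the first assembly so that the pattern is retained, and it then relaxes the new-nonzero allocation error as a safety net, which is presumably the MatSetOption() mentioned earlier in this thread.

#include <petscmat.h>

PetscErrorCode FillWorstCasePattern(Mat A,PetscInt rstart,PetscInt rend,PetscInt maxcols,
                                    const PetscInt **possible_cols,const PetscInt *ncols_per_row)
{
  PetscInt       row;
  PetscScalar   *zeros;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscCalloc1(maxcols,&zeros);CHKERRQ(ierr);     /* one row's worth of explicit zeros */
  for (row = rstart; row < rend; row++) {
    /* explicit zeros are kept in the nonzero structure by default, so every
       column listed here survives the first MatAssemblyBegin/End */
    ierr = MatSetValues(A,1,&row,ncols_per_row[row-rstart],possible_cols[row-rstart],
                        zeros,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* safety net: allow (rather than error on) any location that falls outside
     the pre-set pattern later, at the cost of a malloc when it happens */
  ierr = MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr);
  ierr = PetscFree(zeros);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}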
> > Thanks, > > Matt > > PETSc Function: MatGetOwnershipRange > PETSc File: /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c > PETSc Line: 6289 > PETSc Error Code: 73 > PETSc Error reference: http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html > PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatGetOwnershipRange() > > Kind regards, > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, March 15, 2016 2:02 PM > To: Smith, Kevin R. > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. wrote: > > > > > > Barry - Yeah, I suspected it was doing this. In my original implementation I would get allocation errors. > > You need to call a MatSetOption() to tell the matrix that it is allowed to allocate new nonzeros. > > > > > > Matt - Thanks, I will try this solution out. > > > > Thanks for your help, > > > > Kevin > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Tuesday, March 15, 2016 1:48 PM > > To: Matthew Knepley > > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley wrote: > >> > >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. wrote: > >> Hello, > >> > >> > >> > >> Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I?m hoping to avoid destroying and allocating new matrices each time the structure changes. > > > > Sure you can just add new nonzero locations at a later time. But it won't take out any current entries even if they are zero. So effectively the nonzero structure grows over time. > > > > Barry > > > >> > >> > >> Hmm, I can't find a toplevel API that does this (it would be something like MatReset()). You can get this effect using > >> > >> MatSetType(A, MATSHELL) > >> MatSetType(A, ) > >> > >> A little messy but it will work. > >> > >> Thanks, > >> > >> Matt > >> > >> > >> > >> Thanks, > >> > >> Kevin > >> > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >> -- Norbert Wiener > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From e.tadeu at gmail.com Wed Mar 16 15:22:39 2016 From: e.tadeu at gmail.com (E. Tadeu) Date: Wed, 16 Mar 2016 17:22:39 -0300 Subject: [petsc-users] Build PETSc as shared library on Windows ("traditional" build or CMake) In-Reply-To: References: Message-ID: Thanks Satish, Just to let you know, with a few minor modifications and extra parameters, I've managed to build it using configure under Cygwin, and then, outside cygwin, running non-cygwin cmake inside the "arch-..." folder, targeting MSVC. On Mon, Mar 14, 2016 at 6:31 PM, Satish Balay wrote: > you could try precompiled petsc from > http://www.msic.ch/Downloads/Software > [its old 3.5 version though] > > We don't have any changes wrt .dlls or cmake on windows.. > > Satish > > On Mon, 14 Mar 2016, E. Tadeu wrote: > > > Hi, > > > > What is the current status of building PETSc as a shared library on > > Windows? 
It seems non-trivial: `--with-shared-libraries=1` won't work, > > since Cygwin's `ld` fails, and `win32fe` also fails. > > > > Also, what is the status of building PETSc with CMake on Windows? > Perhaps > > through using CMake it would be easier to generate the .DLL's, but I > > couldn't find documentation on how to do this. > > > > > > Thanks! > > Edson > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Mar 16 15:27:26 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 16 Mar 2016 15:27:26 -0500 Subject: [petsc-users] Build PETSc as shared library on Windows ("traditional" build or CMake) In-Reply-To: References: Message-ID: <895EE3E2-941B-407A-BF73-79CBD1345F4E@mcs.anl.gov> > On Mar 16, 2016, at 3:22 PM, E. Tadeu wrote: > > Thanks Satish, > > Just to let you know, with a few minor modifications and extra parameters, I've managed to build it using configure under Cygwin, and then, outside cygwin, running non-cygwin cmake inside the "arch-..." folder, targeting MSVC. Awesome, can you send any changes you needed to make to PETSc so other users can benefit? Barry > > > On Mon, Mar 14, 2016 at 6:31 PM, Satish Balay wrote: > you could try precompiled petsc from > http://www.msic.ch/Downloads/Software > [its old 3.5 version though] > > We don't have any changes wrt .dlls or cmake on windows.. > > Satish > > On Mon, 14 Mar 2016, E. Tadeu wrote: > > > Hi, > > > > What is the current status of building PETSc as a shared library on > > Windows? It seems non-trivial: `--with-shared-libraries=1` won't work, > > since Cygwin's `ld` fails, and `win32fe` also fails. > > > > Also, what is the status of building PETSc with CMake on Windows? Perhaps > > through using CMake it would be easier to generate the .DLL's, but I > > couldn't find documentation on how to do this. > > > > > > Thanks! > > Edson > > > > From e.tadeu at gmail.com Wed Mar 16 15:48:01 2016 From: e.tadeu at gmail.com (E. Tadeu) Date: Wed, 16 Mar 2016 17:48:01 -0300 Subject: [petsc-users] Build PETSc as shared library on Windows ("traditional" build or CMake) In-Reply-To: <895EE3E2-941B-407A-BF73-79CBD1345F4E@mcs.anl.gov> References: <895EE3E2-941B-407A-BF73-79CBD1345F4E@mcs.anl.gov> Message-ID: Of course, this is what I did (I'm building it for MSVC 2010, but it should be similar for other versions): - call `setenv.cmd` (or `vcvarsall`) to configure the compiler environment (this should be configured at all steps) - enter Cygwin, call `configure` normally from inside Cygwin, adding `--with-shared-libraries=1` (the output says "shared libraries: disabled", but no problem) - modify PETScConfig.cmake to replace `/cygdrive/c/` with `C:/` - if needed, change `iomp5md` to `libiomp5md` - add `add_definitions(-DPETSC_USE_SHARED_LIBRARIES=1)` after the project line of CMakeLists.txt - outside of Cygwin, call cmake like this, from inside the "arch-mswin-c-opt" directory: cmake .. -G "Visual Studio 10 2010 Win64" -DPETSC_CMAKE_ARCH=arch-mswin-c-opt -DCMAKE_C_FLAGS="-wd4996" -DBUILD_SHARED_LIBS:BOOL=ON - build the generated PETSc.sln from inside Visual Studio On Wed, Mar 16, 2016 at 5:27 PM, Barry Smith wrote: > > > On Mar 16, 2016, at 3:22 PM, E. Tadeu wrote: > > > > Thanks Satish, > > > > Just to let you know, with a few minor modifications and extra > parameters, I've managed to build it using configure under Cygwin, and > then, outside cygwin, running non-cygwin cmake inside the "arch-..." > folder, targeting MSVC. 
> > Awesome, can you send any changes you needed to make to PETSc so other > users can benefit? > > Barry > > > > > > > On Mon, Mar 14, 2016 at 6:31 PM, Satish Balay wrote: > > you could try precompiled petsc from > > http://www.msic.ch/Downloads/Software > > [its old 3.5 version though] > > > > We don't have any changes wrt .dlls or cmake on windows.. > > > > Satish > > > > On Mon, 14 Mar 2016, E. Tadeu wrote: > > > > > Hi, > > > > > > What is the current status of building PETSc as a shared library on > > > Windows? It seems non-trivial: `--with-shared-libraries=1` won't work, > > > since Cygwin's `ld` fails, and `win32fe` also fails. > > > > > > Also, what is the status of building PETSc with CMake on Windows? > Perhaps > > > through using CMake it would be easier to generate the .DLL's, but I > > > couldn't find documentation on how to do this. > > > > > > > > > Thanks! > > > Edson > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Mar 16 15:54:04 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 16 Mar 2016 15:54:04 -0500 Subject: [petsc-users] DMShellSetCreateRestriction In-Reply-To: <56E3DCDA.5010407@uni-mainz.de> References: <56E2FC53.1040709@uni-mainz.de> <2373A243-EF0A-4139-9BB1-8BC809F9B8D1@uni-mainz.de> <56E3DCDA.5010407@uni-mainz.de> Message-ID: Ok, it is finally available in the branch barry/add-dmshellcreaterestriction sorry for the delay I needed some emergency surgery for Toby due to overly complex reference counting. See src/ksp/ksp/examples/tutorials/ex65.c that is now a full example of using DMSHELL and uses this new functionality. So if you provide a DMCreateRestriction function for SHELL (or any DM in fact) it will use this restriction operator automatically in PCMG Please let us know if you have any difficulties. Barry > On Mar 12, 2016, at 3:09 AM, anton wrote: > > > > On 03/11/2016 11:25 PM, Barry Smith wrote: >> Boris, >> >> We will add this support to the DMShell and its usage from PCMG within a few days. >> >> Barry >> > > Tanks Barry. This is super-fast and very helpful. > > > Cheers, > Anton >>> On Mar 11, 2016, at 3:39 PM, Boris Kaus wrote: >>> >>> >>>> On Mar 11, 2016, at 8:53 PM, Matthew Knepley wrote: >>>> >>>> On Fri, Mar 11, 2016 at 12:26 PM, Dave May wrote: >>>> On 11 March 2016 at 18:11, anton wrote: >>>> Hi team, >>>> >>>> I'm implementing staggered grid in a PETSc-canonical way, trying to build a custom DM object, attach it to SNES, that should later transfered it further to KSP and PC. >>>> >>>> Yet, the Galerking coarsening for staggered grid is non-symmetric. The question is how possible is it that DMShellSetCreateRestriction can be implemented and included in 3.7 release? >>>> >>>> It's a little more work than just adding a new method within the DM and a new APIs for DMCreateRestriction() and DMShellSetCreateRestriction(). >>>> PCMG needs to be modified to call DMCreateRestriction(). >>>> >>>> Dave is correct. Currently, PCMG only calls DMCreateInterpolation(). We would need to add a DMCreateRestriction() call. >>> The PCMG object already uses a restriction operator that is different from the interpolation parameter if it is specified with PCMGSetRestriction. >>> For consistency, one would expect a similar DMCreateRestriction object, not? I realize that this is not relevant for FEM codes, but for staggered FD it makes quite some difference. 
>>> >>> Other suggestions on how to best integrate staggered finite differences within the current PETSc framework are ofcourse also highly welcome. >>> Our current thinking was to pack it into a DMSHELL (which has the problem of not having a restriction interface). >>> >>> thanks, >>> Boris >>> >>> >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> Please, please. >>>> >>>> Thanks, >>>> Anton >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener > From Kevin.R.Smith at jhuapl.edu Thu Mar 17 08:54:01 2016 From: Kevin.R.Smith at jhuapl.edu (Smith, Kevin R.) Date: Thu, 17 Mar 2016 13:54:01 +0000 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> <4763937cf3a7426399ce811517c395be@APLEX07.dom1.jhuapl.edu> Message-ID: <51f7358e4da44c0cb97e34eab1ff1a5c@APLEX07.dom1.jhuapl.edu> Hi Barry, Thanks! So this approach will allow the non-zero structure to change while avoiding reallocations? From what I understood from the documentation and discussion so far is that the non-zero structure is locked in once you do SetValue calls. I would think that while this approach does prevent PETSc from throwing away the oversubscribed pre-allocation, it would still not allow the matrix structure to change. I can only predict the preallocation beforehand, not the structure of the matrix during its entire lifetime. Thank you very much for taking the time to discuss this with me. Kind regards, Kevin -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Wednesday, March 16, 2016 2:10 PM To: Smith, Kevin R. Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix reuse with changing structure You need to do two things 1) over preallocate 2) fill up all possible nonzero entries in the matrix at the beginning before calling the first MatAssemblyBegin/End() then it should run fine and efficiently (except it will always compute with those extra zeros but so long as they are relatively small amounts that is ok). The reason you need to do 2 is that Mat "throws away" any "extra" preallocation that you provided but did not set values into in the first MatAssemblyBegin/End. If you do not do 2) then it will need to do all the mallocs later when you set the new nonzero locations and so the over preallocation doesn't help at all. Barry > On Mar 16, 2016, at 11:19 AM, Smith, Kevin R. wrote: > > The use case is dynamic overset mesh coupling. I can guarantee that all matrices will fit since I have all the information I need to do so at initialization. The component meshes do not themselves change, only the overset assembly changes. I have to over-allocate memory to ensure the matrix always fits to cover the range of motion. > > I did figure out a way to avoid deleting the matrix every time I solve in my system, so that provides some level of an optimization if PETSc doesn?t support restructuring sparse matrices out of the box. With this approach I can also avoid over-allocating memory. However, my use case is unlikely to hit the memory limits, even if I have to over-subscribe the preallocation to cover the entire range of motion. 
So I think there will still be a clear benefit to doing the allocation once, and avoiding the reallocation each time structure changes. > > Not sure if this figure will get filtered out by the mailing list, but this shows a basic overset matrix structure. The green portions are the overset coupling regions, which may move around the matrix as the body moves. The number of coupling coefficients has a maximum N which is known, so I can preallocate to cover the entire range of motion. Hope that helps describe the use case. > > > > From: Matthew Knepley [mailto:knepley at gmail.com] > Sent: Wednesday, March 16, 2016 10:37 AM > To: Smith, Kevin R. > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > On Wed, Mar 16, 2016 at 8:23 AM, Smith, Kevin R. wrote: > Barry - I need to avoid dynamic allocation because it results in too much of a slowdown. > Matt - The MatSetType thing did not work for some reason. It seems like PETSc wants me to set the preallocation after calling this method (suggested by the error below). I'm trying to reuse the original block of preallocated memory for the sparse matrix and blow away the structure each time. This would let me avoid repeated deallocate/allocate calls throughout my simulation. > > We need to understand the use case better. How can you guarantee that > your new matrix will fit in the old matrix memroy? If you cannot, then we have to reallocate anyway. > > Thanks, > > Matt > > PETSc Function: MatGetOwnershipRange > PETSc File: > /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c > PETSc Line: 6289 > PETSc Error Code: 73 > PETSc Error reference: > http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html > PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatGetOwnershipRange() > > Kind regards, > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, March 15, 2016 2:02 PM > To: Smith, Kevin R. > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. wrote: > > > > > > Barry - Yeah, I suspected it was doing this. In my original implementation I would get allocation errors. > > You need to call a MatSetOption() to tell the matrix that it is allowed to allocate new nonzeros. > > > > > > Matt - Thanks, I will try this solution out. > > > > Thanks for your help, > > > > Kevin > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Tuesday, March 15, 2016 1:48 PM > > To: Matthew Knepley > > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley wrote: > >> > >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. wrote: > >> Hello, > >> > >> > >> > >> Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I?m hoping to avoid destroying and allocating new matrices each time the structure changes. > > > > Sure you can just add new nonzero locations at a later time. But it won't take out any current entries even if they are zero. So effectively the nonzero structure grows over time. > > > > Barry > > > >> > >> > >> Hmm, I can't find a toplevel API that does this (it would be > >> something like MatReset()). 
You can get this effect using > >> > >> MatSetType(A, MATSHELL) > >> MatSetType(A, ) > >> > >> A little messy but it will work. > >> > >> Thanks, > >> > >> Matt > >> > >> > >> > >> Thanks, > >> > >> Kevin > >> > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >> -- Norbert Wiener > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From mfadams at lbl.gov Thu Mar 17 08:56:49 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 17 Mar 2016 09:56:49 -0400 Subject: [petsc-users] libtools mismatch on osx Message-ID: I manually installed libtools yesterday and all seemed well but I cloned a new PETSc just now and now get this error. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1897821 bytes Desc: not available URL: From knepley at gmail.com Thu Mar 17 09:06:50 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 17 Mar 2016 09:06:50 -0500 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <51f7358e4da44c0cb97e34eab1ff1a5c@APLEX07.dom1.jhuapl.edu> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> <4763937cf3a7426399ce811517c395be@APLEX07.dom1.jhuapl.edu> <51f7358e4da44c0cb97e34eab1ff1a5c@APLEX07.dom1.jhuapl.edu> Message-ID: On Thu, Mar 17, 2016 at 8:54 AM, Smith, Kevin R. wrote: > Hi Barry, > > Thanks! So this approach will allow the non-zero structure to change > while avoiding reallocations? From what I understood from the documentation > and discussion so far is that the non-zero structure is locked in once you > do SetValue calls. I would think that while this approach does prevent > PETSc from throwing away the oversubscribed pre-allocation, it would still > not allow the matrix structure to change. I can only predict the > preallocation beforehand, not the structure of the matrix during its entire > lifetime. > The strategy Barry suggested means that you have to 0 out all potential columns at the start. So suppose you translate your small box over the entire coarse mesh. You might say, at any one time I never fill up more than C columns, however the strategy above would mean you would need to 0 out every column in all those rows. It sounds like what you want is something like this. At each iteration, 1) Wipe out nonzero structure, but keep allocation. This returns us to the situation right after MatSetPreallocation*() 2) Fill up matrix, including enough 0s to fill up the allocation If that is what you want, we can show you how to do it. Thanks, Matt > Thank you very much for taking the time to discuss this with me. > > Kind regards, > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Wednesday, March 16, 2016 2:10 PM > To: Smith, Kevin R. 
> Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > You need to do two things > > 1) over preallocate > 2) fill up all possible nonzero entries in the matrix at the beginning > before calling the first MatAssemblyBegin/End() > > then it should run fine and efficiently (except it will always compute > with those extra zeros but so long as they are relatively small amounts > that is ok). > > The reason you need to do 2 is that Mat "throws away" any "extra" > preallocation that you provided but did not set values into in the first > MatAssemblyBegin/End. If you do not do 2) then it will need to do all the > mallocs later when you set the new nonzero locations and so the over > preallocation doesn't help at all. > > Barry > > > On Mar 16, 2016, at 11:19 AM, Smith, Kevin R. > wrote: > > > > The use case is dynamic overset mesh coupling. I can guarantee that all > matrices will fit since I have all the information I need to do so at > initialization. The component meshes do not themselves change, only the > overset assembly changes. I have to over-allocate memory to ensure the > matrix always fits to cover the range of motion. > > > > I did figure out a way to avoid deleting the matrix every time I solve > in my system, so that provides some level of an optimization if PETSc > doesn?t support restructuring sparse matrices out of the box. With this > approach I can also avoid over-allocating memory. However, my use case is > unlikely to hit the memory limits, even if I have to over-subscribe the > preallocation to cover the entire range of motion. So I think there will > still be a clear benefit to doing the allocation once, and avoiding the > reallocation each time structure changes. > > > > Not sure if this figure will get filtered out by the mailing list, but > this shows a basic overset matrix structure. The green portions are the > overset coupling regions, which may move around the matrix as the body > moves. The number of coupling coefficients has a maximum N which is known, > so I can preallocate to cover the entire range of motion. Hope that helps > describe the use case. > > > > > > > > From: Matthew Knepley [mailto:knepley at gmail.com] > > Sent: Wednesday, March 16, 2016 10:37 AM > > To: Smith, Kevin R. > > Cc: Barry Smith; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > On Wed, Mar 16, 2016 at 8:23 AM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > Barry - I need to avoid dynamic allocation because it results in too > much of a slowdown. > > Matt - The MatSetType thing did not work for some reason. It seems like > PETSc wants me to set the preallocation after calling this method > (suggested by the error below). I'm trying to reuse the original block of > preallocated memory for the sparse matrix and blow away the structure each > time. This would let me avoid repeated deallocate/allocate calls throughout > my simulation. > > > > We need to understand the use case better. How can you guarantee that > > your new matrix will fit in the old matrix memroy? If you cannot, then > we have to reallocate anyway. 
> > > > Thanks, > > > > Matt > > > > PETSc Function: MatGetOwnershipRange > > PETSc File: > > /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c > > PETSc Line: 6289 > > PETSc Error Code: 73 > > PETSc Error reference: > > http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html > > PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on > > argument 1 "mat" before MatGetOwnershipRange() > > > > Kind regards, > > Kevin > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Tuesday, March 15, 2016 2:02 PM > > To: Smith, Kevin R. > > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > > > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > > > > > > > > Barry - Yeah, I suspected it was doing this. In my original > implementation I would get allocation errors. > > > > You need to call a MatSetOption() to tell the matrix that it is > allowed to allocate new nonzeros. > > > > > > > > > > Matt - Thanks, I will try this solution out. > > > > > > Thanks for your help, > > > > > > Kevin > > > > > > -----Original Message----- > > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > > Sent: Tuesday, March 15, 2016 1:48 PM > > > To: Matthew Knepley > > > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > > > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley > wrote: > > >> > > >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > >> Hello, > > >> > > >> > > >> > > >> Is it possible to reuse a sparse matrix and not reuse its non-zero > structure? I have a matrix whose sparse structure changes each time. I?m > hoping to avoid destroying and allocating new matrices each time the > structure changes. > > > > > > Sure you can just add new nonzero locations at a later time. But it > won't take out any current entries even if they are zero. So effectively > the nonzero structure grows over time. > > > > > > Barry > > > > > >> > > >> > > >> Hmm, I can't find a toplevel API that does this (it would be > > >> something like MatReset()). You can get this effect using > > >> > > >> MatSetType(A, MATSHELL) > > >> MatSetType(A, ) > > >> > > >> A little messy but it will work. > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> > > >> > > >> Thanks, > > >> > > >> Kevin > > >> > > >> > > >> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > >> -- Norbert Wiener > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.R.Smith at jhuapl.edu Thu Mar 17 09:29:13 2016 From: Kevin.R.Smith at jhuapl.edu (Smith, Kevin R.) 
Date: Thu, 17 Mar 2016 14:29:13 +0000 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> <4763937cf3a7426399ce811517c395be@APLEX07.dom1.jhuapl.edu> <51f7358e4da44c0cb97e34eab1ff1a5c@APLEX07.dom1.jhuapl.edu> Message-ID: <8f078ad2b4e94fa3b4ee5ae2b72f7777@APLEX07.dom1.jhuapl.edu> Hi Matt, Yeah, if I need to put zeros in all potential columns at the start, I think that matrix will become essentially dense. The size of my problem is such that I need to keep the matrix sparse. Yes, what you described is exactly what I?m after. Thanks, Kevin From: Matthew Knepley [mailto:knepley at gmail.com] Sent: Thursday, March 17, 2016 10:07 AM To: Smith, Kevin R. Cc: Barry Smith; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix reuse with changing structure On Thu, Mar 17, 2016 at 8:54 AM, Smith, Kevin R. > wrote: Hi Barry, Thanks! So this approach will allow the non-zero structure to change while avoiding reallocations? From what I understood from the documentation and discussion so far is that the non-zero structure is locked in once you do SetValue calls. I would think that while this approach does prevent PETSc from throwing away the oversubscribed pre-allocation, it would still not allow the matrix structure to change. I can only predict the preallocation beforehand, not the structure of the matrix during its entire lifetime. The strategy Barry suggested means that you have to 0 out all potential columns at the start. So suppose you translate your small box over the entire coarse mesh. You might say, at any one time I never fill up more than C columns, however the strategy above would mean you would need to 0 out every column in all those rows. It sounds like what you want is something like this. At each iteration, 1) Wipe out nonzero structure, but keep allocation. This returns us to the situation right after MatSetPreallocation*() 2) Fill up matrix, including enough 0s to fill up the allocation If that is what you want, we can show you how to do it. Thanks, Matt Thank you very much for taking the time to discuss this with me. Kind regards, Kevin -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Wednesday, March 16, 2016 2:10 PM To: Smith, Kevin R. Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix reuse with changing structure You need to do two things 1) over preallocate 2) fill up all possible nonzero entries in the matrix at the beginning before calling the first MatAssemblyBegin/End() then it should run fine and efficiently (except it will always compute with those extra zeros but so long as they are relatively small amounts that is ok). The reason you need to do 2 is that Mat "throws away" any "extra" preallocation that you provided but did not set values into in the first MatAssemblyBegin/End. If you do not do 2) then it will need to do all the mallocs later when you set the new nonzero locations and so the over preallocation doesn't help at all. Barry > On Mar 16, 2016, at 11:19 AM, Smith, Kevin R. > wrote: > > The use case is dynamic overset mesh coupling. I can guarantee that all matrices will fit since I have all the information I need to do so at initialization. 
The component meshes do not themselves change, only the overset assembly changes. I have to over-allocate memory to ensure the matrix always fits to cover the range of motion. > > I did figure out a way to avoid deleting the matrix every time I solve in my system, so that provides some level of an optimization if PETSc doesn?t support restructuring sparse matrices out of the box. With this approach I can also avoid over-allocating memory. However, my use case is unlikely to hit the memory limits, even if I have to over-subscribe the preallocation to cover the entire range of motion. So I think there will still be a clear benefit to doing the allocation once, and avoiding the reallocation each time structure changes. > > Not sure if this figure will get filtered out by the mailing list, but this shows a basic overset matrix structure. The green portions are the overset coupling regions, which may move around the matrix as the body moves. The number of coupling coefficients has a maximum N which is known, so I can preallocate to cover the entire range of motion. Hope that helps describe the use case. > > > > From: Matthew Knepley [mailto:knepley at gmail.com] > Sent: Wednesday, March 16, 2016 10:37 AM > To: Smith, Kevin R. > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > On Wed, Mar 16, 2016 at 8:23 AM, Smith, Kevin R. > wrote: > Barry - I need to avoid dynamic allocation because it results in too much of a slowdown. > Matt - The MatSetType thing did not work for some reason. It seems like PETSc wants me to set the preallocation after calling this method (suggested by the error below). I'm trying to reuse the original block of preallocated memory for the sparse matrix and blow away the structure each time. This would let me avoid repeated deallocate/allocate calls throughout my simulation. > > We need to understand the use case better. How can you guarantee that > your new matrix will fit in the old matrix memroy? If you cannot, then we have to reallocate anyway. > > Thanks, > > Matt > > PETSc Function: MatGetOwnershipRange > PETSc File: > /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c > PETSc Line: 6289 > PETSc Error Code: 73 > PETSc Error reference: > http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html > PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatGetOwnershipRange() > > Kind regards, > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, March 15, 2016 2:02 PM > To: Smith, Kevin R. > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. > wrote: > > > > > > Barry - Yeah, I suspected it was doing this. In my original implementation I would get allocation errors. > > You need to call a MatSetOption() to tell the matrix that it is allowed to allocate new nonzeros. > > > > > > Matt - Thanks, I will try this solution out. > > > > Thanks for your help, > > > > Kevin > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Tuesday, March 15, 2016 1:48 PM > > To: Matthew Knepley > > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley > wrote: > >> > >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. 
> wrote: > >> Hello, > >> > >> > >> > >> Is it possible to reuse a sparse matrix and not reuse its non-zero structure? I have a matrix whose sparse structure changes each time. I?m hoping to avoid destroying and allocating new matrices each time the structure changes. > > > > Sure you can just add new nonzero locations at a later time. But it won't take out any current entries even if they are zero. So effectively the nonzero structure grows over time. > > > > Barry > > > >> > >> > >> Hmm, I can't find a toplevel API that does this (it would be > >> something like MatReset()). You can get this effect using > >> > >> MatSetType(A, MATSHELL) > >> MatSetType(A, ) > >> > >> A little messy but it will work. > >> > >> Thanks, > >> > >> Matt > >> > >> > >> > >> Thanks, > >> > >> Kevin > >> > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >> -- Norbert Wiener > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Mar 17 09:59:15 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 17 Mar 2016 10:59:15 -0400 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <8f078ad2b4e94fa3b4ee5ae2b72f7777@APLEX07.dom1.jhuapl.edu> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> <4763937cf3a7426399ce811517c395be@APLEX07.dom1.jhuapl.edu> <51f7358e4da44c0cb97e34eab1ff1a5c@APLEX07.dom1.jhuapl.edu> <8f078ad2b4e94fa3b4ee5ae2b72f7777@APLEX07.dom1.jhuapl.edu> Message-ID: I don't think this will work w/o a lot of pain. You would have to figure out how much of the preallocation you did not use in each row and add random (new) zero columns to each row as needed. And is this really buying you much? PETSc will have to redo all the maps anyway. This is just saving one big malloc and maybe a few reductions. Right? On Thu, Mar 17, 2016 at 10:29 AM, Smith, Kevin R. wrote: > Hi Matt, > > > > Yeah, if I need to put zeros in all potential columns at the start, I > think that matrix will become essentially dense. The size of my problem is > such that I need to keep the matrix sparse. > > > > Yes, what you described is exactly what I?m after. > > > > Thanks, > > Kevin > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Thursday, March 17, 2016 10:07 AM > > *To:* Smith, Kevin R. > *Cc:* Barry Smith; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Matrix reuse with changing structure > > > > On Thu, Mar 17, 2016 at 8:54 AM, Smith, Kevin R. > wrote: > > Hi Barry, > > Thanks! So this approach will allow the non-zero structure to change > while avoiding reallocations? From what I understood from the documentation > and discussion so far is that the non-zero structure is locked in once you > do SetValue calls. 
I would think that while this approach does prevent > PETSc from throwing away the oversubscribed pre-allocation, it would still > not allow the matrix structure to change. I can only predict the > preallocation beforehand, not the structure of the matrix during its entire > lifetime. > > > > The strategy Barry suggested means that you have to 0 out all potential > columns at the start. > > > > So suppose you translate your small box over the entire coarse mesh. You > might say, at any > > one time I never fill up more than C columns, however the strategy above > would mean you > > would need to 0 out every column in all those rows. > > > > It sounds like what you want is something like this. At each iteration, > > > > 1) Wipe out nonzero structure, but keep allocation. This returns us to > > the situation right after MatSetPreallocation*() > > > > 2) Fill up matrix, including enough 0s to fill up the allocation > > > > If that is what you want, we can show you how to do it. > > > > Thanks, > > > > Matt > > > > Thank you very much for taking the time to discuss this with me. > > Kind regards, > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Wednesday, March 16, 2016 2:10 PM > To: Smith, Kevin R. > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > You need to do two things > > 1) over preallocate > 2) fill up all possible nonzero entries in the matrix at the beginning > before calling the first MatAssemblyBegin/End() > > then it should run fine and efficiently (except it will always compute > with those extra zeros but so long as they are relatively small amounts > that is ok). > > The reason you need to do 2 is that Mat "throws away" any "extra" > preallocation that you provided but did not set values into in the first > MatAssemblyBegin/End. If you do not do 2) then it will need to do all the > mallocs later when you set the new nonzero locations and so the over > preallocation doesn't help at all. > > Barry > > > On Mar 16, 2016, at 11:19 AM, Smith, Kevin R. > wrote: > > > > The use case is dynamic overset mesh coupling. I can guarantee that all > matrices will fit since I have all the information I need to do so at > initialization. The component meshes do not themselves change, only the > overset assembly changes. I have to over-allocate memory to ensure the > matrix always fits to cover the range of motion. > > > > I did figure out a way to avoid deleting the matrix every time I solve > in my system, so that provides some level of an optimization if PETSc > doesn?t support restructuring sparse matrices out of the box. With this > approach I can also avoid over-allocating memory. However, my use case is > unlikely to hit the memory limits, even if I have to over-subscribe the > preallocation to cover the entire range of motion. So I think there will > still be a clear benefit to doing the allocation once, and avoiding the > reallocation each time structure changes. > > > > Not sure if this figure will get filtered out by the mailing list, but > this shows a basic overset matrix structure. The green portions are the > overset coupling regions, which may move around the matrix as the body > moves. The number of coupling coefficients has a maximum N which is known, > so I can preallocate to cover the entire range of motion. Hope that helps > describe the use case. 
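A rough, untested sketch of that per-iteration cycle: wipe the nonzero structure but keep the allocation, then refill, padding any unused preallocated slot with an explicit 0.0. The wipe step follows the Mat_SeqAIJ-internals snippet Matt posts further down the thread, so the a->ailen field name comes from there and should be checked against the aij.h of the installed PETSc; it also needs the PETSc source include path. Sequential AIJ only; an MPIAIJ matrix would need the same reset on both its diagonal and off-diagonal blocks, which is not shown. The rows/cols/vals arrays are illustrative application data.

#include <petscmat.h>
#include <../src/mat/impls/aij/seq/aij.h>   /* private header, needs -I into the PETSc source tree */

static PetscErrorCode WipeRowLengths(Mat A)
{
  Mat_SeqAIJ     *a = (Mat_SeqAIJ *) A->data;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  /* zero the per-row "used length" counters so the rows look empty again,
     without touching the allocated storage */
  ierr = PetscMemzero(a->ailen, A->rmap->n * sizeof(PetscInt));CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

static PetscErrorCode RefillOverset(Mat A, PetscInt nent, const PetscInt rows[],
                                    const PetscInt cols[], const PetscScalar vals[])
{
  PetscErrorCode ierr;
  PetscInt       k;

  PetscFunctionBeginUser;
  ierr = WipeRowLengths(A);CHKERRQ(ierr);          /* step 1: structure gone, memory kept */
  for (k = 0; k < nent; ++k) {                     /* step 2: current coupling entries */
    ierr = MatSetValues(A, 1, &rows[k], 1, &cols[k], &vals[k], INSERT_VALUES);CHKERRQ(ierr);
  }
  /* plus explicit zeros for any preallocated slot not listed in rows/cols,
     as noted further down the thread */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}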
> > > > > > > > From: Matthew Knepley [mailto:knepley at gmail.com] > > Sent: Wednesday, March 16, 2016 10:37 AM > > To: Smith, Kevin R. > > Cc: Barry Smith; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > On Wed, Mar 16, 2016 at 8:23 AM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > Barry - I need to avoid dynamic allocation because it results in too > much of a slowdown. > > Matt - The MatSetType thing did not work for some reason. It seems like > PETSc wants me to set the preallocation after calling this method > (suggested by the error below). I'm trying to reuse the original block of > preallocated memory for the sparse matrix and blow away the structure each > time. This would let me avoid repeated deallocate/allocate calls throughout > my simulation. > > > > We need to understand the use case better. How can you guarantee that > > your new matrix will fit in the old matrix memroy? If you cannot, then > we have to reallocate anyway. > > > > Thanks, > > > > Matt > > > > PETSc Function: MatGetOwnershipRange > > PETSc File: > > /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c > > PETSc Line: 6289 > > PETSc Error Code: 73 > > PETSc Error reference: > > http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html > > PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on > > argument 1 "mat" before MatGetOwnershipRange() > > > > Kind regards, > > Kevin > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Tuesday, March 15, 2016 2:02 PM > > To: Smith, Kevin R. > > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > > > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > > > > > > > > Barry - Yeah, I suspected it was doing this. In my original > implementation I would get allocation errors. > > > > You need to call a MatSetOption() to tell the matrix that it is > allowed to allocate new nonzeros. > > > > > > > > > > Matt - Thanks, I will try this solution out. > > > > > > Thanks for your help, > > > > > > Kevin > > > > > > -----Original Message----- > > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > > Sent: Tuesday, March 15, 2016 1:48 PM > > > To: Matthew Knepley > > > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > > > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley > wrote: > > >> > > >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > >> Hello, > > >> > > >> > > >> > > >> Is it possible to reuse a sparse matrix and not reuse its non-zero > structure? I have a matrix whose sparse structure changes each time. I?m > hoping to avoid destroying and allocating new matrices each time the > structure changes. > > > > > > Sure you can just add new nonzero locations at a later time. But it > won't take out any current entries even if they are zero. So effectively > the nonzero structure grows over time. > > > > > > Barry > > > > > >> > > >> > > >> Hmm, I can't find a toplevel API that does this (it would be > > >> something like MatReset()). You can get this effect using > > >> > > >> MatSetType(A, MATSHELL) > > >> MatSetType(A, ) > > >> > > >> A little messy but it will work. 
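In concrete terms, that re-type trick might look like the sketch below; the MATMPIAIJ type and the preallocation arguments are illustrative. The key point, which the "Must call MatXXXSetPreallocation() or MatSetUp()" error quoted elsewhere in this thread hints at, is that after the type is switched back the matrix behaves as if freshly created, so preallocation has to be supplied again before any MatSetValues().

/* Sketch: "reset" a matrix by switching its type away and back, then
   re-preallocating before inserting values again. */
#include <petscmat.h>

PetscErrorCode ResetMatStructure(Mat A, PetscInt d_nz, PetscInt o_nz)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatSetType(A, MATSHELL);CHKERRQ(ierr);    /* drops the old AIJ data */
  ierr = MatSetType(A, MATMPIAIJ);CHKERRQ(ierr);   /* fresh, empty AIJ of the same sizes */
  ierr = MatMPIAIJSetPreallocation(A, d_nz, NULL, o_nz, NULL);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}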
> > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> > > >> > > >> Thanks, > > >> > > >> Kevin > > >> > > >> > > >> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > >> -- Norbert Wiener > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 17 10:00:24 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 17 Mar 2016 10:00:24 -0500 Subject: [petsc-users] Matrix reuse with changing structure In-Reply-To: <8f078ad2b4e94fa3b4ee5ae2b72f7777@APLEX07.dom1.jhuapl.edu> References: <0ccc6b0afec24436abd257a2f026ff9c@APLEX07.dom1.jhuapl.edu> <7D2C4864-3630-454A-8B1C-BA089B19402D@mcs.anl.gov> <2da59a4a42b241eca9a7f5593ae3faff@APLEX07.dom1.jhuapl.edu> <3CA31214-0B64-451F-8371-22463926FDF3@mcs.anl.gov> <2c93a0a2cf84462cbd055774bad0e0f7@APLEX07.dom1.jhuapl.edu> <4763937cf3a7426399ce811517c395be@APLEX07.dom1.jhuapl.edu> <51f7358e4da44c0cb97e34eab1ff1a5c@APLEX07.dom1.jhuapl.edu> <8f078ad2b4e94fa3b4ee5ae2b72f7777@APLEX07.dom1.jhuapl.edu> Message-ID: On Thu, Mar 17, 2016 at 9:29 AM, Smith, Kevin R. wrote: > Hi Matt, > > > > Yeah, if I need to put zeros in all potential columns at the start, I > think that matrix will become essentially dense. The size of my problem is > such that I need to keep the matrix sparse. > > > > Yes, what you described is exactly what I?m after. > Okay, I think it may be enough to 0 out the row lengths and then carry out your MatSetValues() again. Here is how to try it #include <../src/mat/impls/aij/seq/aij.h> #undef __FUNCT__ #define __FUNCT__ "ClearStructure" PetscErrorCode ClearStructure(Mat A) { Mat_SeqAIJ *a = (Mat_SeqAIJ *) A->data; PetscFunctionBeginUser; ierr = PetscMemzero(a->ailen, A->rmap->n * sizeof(PetscInt));CHKERRQ(ierr); PetscFunctionReturn(0); } Make sure you remember to put in zero for any missing spaces before assembling the matrix. Let us know how it goes. Thanks, Matt > Thanks, > > Kevin > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Thursday, March 17, 2016 10:07 AM > > *To:* Smith, Kevin R. > *Cc:* Barry Smith; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Matrix reuse with changing structure > > > > On Thu, Mar 17, 2016 at 8:54 AM, Smith, Kevin R. > wrote: > > Hi Barry, > > Thanks! So this approach will allow the non-zero structure to change > while avoiding reallocations? From what I understood from the documentation > and discussion so far is that the non-zero structure is locked in once you > do SetValue calls. I would think that while this approach does prevent > PETSc from throwing away the oversubscribed pre-allocation, it would still > not allow the matrix structure to change. I can only predict the > preallocation beforehand, not the structure of the matrix during its entire > lifetime. > > > > The strategy Barry suggested means that you have to 0 out all potential > columns at the start. > > > > So suppose you translate your small box over the entire coarse mesh. 
You > might say, at any > > one time I never fill up more than C columns, however the strategy above > would mean you > > would need to 0 out every column in all those rows. > > > > It sounds like what you want is something like this. At each iteration, > > > > 1) Wipe out nonzero structure, but keep allocation. This returns us to > > the situation right after MatSetPreallocation*() > > > > 2) Fill up matrix, including enough 0s to fill up the allocation > > > > If that is what you want, we can show you how to do it. > > > > Thanks, > > > > Matt > > > > Thank you very much for taking the time to discuss this with me. > > Kind regards, > Kevin > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Wednesday, March 16, 2016 2:10 PM > To: Smith, Kevin R. > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > You need to do two things > > 1) over preallocate > 2) fill up all possible nonzero entries in the matrix at the beginning > before calling the first MatAssemblyBegin/End() > > then it should run fine and efficiently (except it will always compute > with those extra zeros but so long as they are relatively small amounts > that is ok). > > The reason you need to do 2 is that Mat "throws away" any "extra" > preallocation that you provided but did not set values into in the first > MatAssemblyBegin/End. If you do not do 2) then it will need to do all the > mallocs later when you set the new nonzero locations and so the over > preallocation doesn't help at all. > > Barry > > > On Mar 16, 2016, at 11:19 AM, Smith, Kevin R. > wrote: > > > > The use case is dynamic overset mesh coupling. I can guarantee that all > matrices will fit since I have all the information I need to do so at > initialization. The component meshes do not themselves change, only the > overset assembly changes. I have to over-allocate memory to ensure the > matrix always fits to cover the range of motion. > > > > I did figure out a way to avoid deleting the matrix every time I solve > in my system, so that provides some level of an optimization if PETSc > doesn?t support restructuring sparse matrices out of the box. With this > approach I can also avoid over-allocating memory. However, my use case is > unlikely to hit the memory limits, even if I have to over-subscribe the > preallocation to cover the entire range of motion. So I think there will > still be a clear benefit to doing the allocation once, and avoiding the > reallocation each time structure changes. > > > > Not sure if this figure will get filtered out by the mailing list, but > this shows a basic overset matrix structure. The green portions are the > overset coupling regions, which may move around the matrix as the body > moves. The number of coupling coefficients has a maximum N which is known, > so I can preallocate to cover the entire range of motion. Hope that helps > describe the use case. > > > > > > > > From: Matthew Knepley [mailto:knepley at gmail.com] > > Sent: Wednesday, March 16, 2016 10:37 AM > > To: Smith, Kevin R. > > Cc: Barry Smith; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > On Wed, Mar 16, 2016 at 8:23 AM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > Barry - I need to avoid dynamic allocation because it results in too > much of a slowdown. > > Matt - The MatSetType thing did not work for some reason. 
It seems like > PETSc wants me to set the preallocation after calling this method > (suggested by the error below). I'm trying to reuse the original block of > preallocated memory for the sparse matrix and blow away the structure each > time. This would let me avoid repeated deallocate/allocate calls throughout > my simulation. > > > > We need to understand the use case better. How can you guarantee that > > your new matrix will fit in the old matrix memroy? If you cannot, then > we have to reallocate anyway. > > > > Thanks, > > > > Matt > > > > PETSc Function: MatGetOwnershipRange > > PETSc File: > > /share/maul-data/smithkr1/src/petsc-3.6.0/src/mat/interface/matrix.c > > PETSc Line: 6289 > > PETSc Error Code: 73 > > PETSc Error reference: > > http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html > > PETSc Message: Must call MatXXXSetPreallocation() or MatSetUp() on > > argument 1 "mat" before MatGetOwnershipRange() > > > > Kind regards, > > Kevin > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: Tuesday, March 15, 2016 2:02 PM > > To: Smith, Kevin R. > > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > > > On Mar 15, 2016, at 12:56 PM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > > > > > > > > Barry - Yeah, I suspected it was doing this. In my original > implementation I would get allocation errors. > > > > You need to call a MatSetOption() to tell the matrix that it is > allowed to allocate new nonzeros. > > > > > > > > > > Matt - Thanks, I will try this solution out. > > > > > > Thanks for your help, > > > > > > Kevin > > > > > > -----Original Message----- > > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > > Sent: Tuesday, March 15, 2016 1:48 PM > > > To: Matthew Knepley > > > Cc: Smith, Kevin R.; petsc-users at mcs.anl.gov > > > Subject: Re: [petsc-users] Matrix reuse with changing structure > > > > > > > > >> On Mar 15, 2016, at 12:28 PM, Matthew Knepley > wrote: > > >> > > >> On Tue, Mar 15, 2016 at 12:24 PM, Smith, Kevin R. < > Kevin.R.Smith at jhuapl.edu> wrote: > > >> Hello, > > >> > > >> > > >> > > >> Is it possible to reuse a sparse matrix and not reuse its non-zero > structure? I have a matrix whose sparse structure changes each time. I?m > hoping to avoid destroying and allocating new matrices each time the > structure changes. > > > > > > Sure you can just add new nonzero locations at a later time. But it > won't take out any current entries even if they are zero. So effectively > the nonzero structure grows over time. > > > > > > Barry > > > > > >> > > >> > > >> Hmm, I can't find a toplevel API that does this (it would be > > >> something like MatReset()). You can get this effect using > > >> > > >> MatSetType(A, MATSHELL) > > >> MatSetType(A, ) > > >> > > >> A little messy but it will work. > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> > > >> > > >> Thanks, > > >> > > >> Kevin > > >> > > >> > > >> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > >> -- Norbert Wiener > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> > -- Norbert Wiener > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Thu Mar 17 10:24:25 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 17 Mar 2016 11:24:25 -0400 Subject: [petsc-users] libtools mismatch on osx In-Reply-To: References: Message-ID: Mark, I remember having similar libtool-related issues before when configuring p4est on OS X. The way I ended up fixing it is using libtool from homebrew. Just a note that homebrew creates 'glibtool' and 'glibtoolize' to avoid conflict with Apple. As a side note p4est itself (along with petsc and many of useful external packages) are also available through homebrew/science tap as binaries. Hope this helps. Mohammad On Thu, Mar 17, 2016 at 9:56 AM, Mark Adams wrote: > I manually installed libtools yesterday and all seemed well but I cloned a > new PETSc just now and now get this error. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Mar 17 11:08:51 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 17 Mar 2016 11:08:51 -0500 Subject: [petsc-users] libtools mismatch on osx In-Reply-To: References: Message-ID: After installing 'libtool, autoconf, automake' from brew - --download-p4est=1 worked for me. Satish On Thu, 17 Mar 2016, Mark Adams wrote: > I manually installed libtools yesterday and all seemed well but I cloned a > new PETSc just now and now get this error. > From mfadams at lbl.gov Thu Mar 17 11:22:33 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 17 Mar 2016 12:22:33 -0400 Subject: [petsc-users] libtools mismatch on osx In-Reply-To: References: Message-ID: OK, I tied homebrew, it did not work, I tried manual installation. The problem was that I did not install autoconf and automake with brew. Seems to be working, Thanks On Thu, Mar 17, 2016 at 12:08 PM, Satish Balay wrote: > After installing 'libtool, autoconf, automake' from brew - > --download-p4est=1 worked for me. > > Satish > > On Thu, 17 Mar 2016, Mark Adams wrote: > > > I manually installed libtools yesterday and all seemed well but I cloned > a > > new PETSc just now and now get this error. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doougsini at gmail.com Thu Mar 17 19:20:58 2016 From: doougsini at gmail.com (Seungbum Koo) Date: Thu, 17 Mar 2016 19:20:58 -0500 Subject: [petsc-users] Will unassigned slots of Mat Created by MatCreateAIJ be removed during MatAssemblyBegin - End? Message-ID: I used to create matrix by calling MatCreateAIJ without knowing the exact nonzero pattern of the matrix. So I preallocate maximum nonzeros that can appear when creating a matrix. After setting nonzero values by MatSetValues I call MatAssemblyBegin and MatAssemblyEnd for actual linear algebra. Now, will the unassigned, or assigned by zero preallocated slots in matrix be removed? Or Will it stay with containing zeros? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Mar 17 19:33:11 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 17 Mar 2016 19:33:11 -0500 Subject: [petsc-users] Will unassigned slots of Mat Created by MatCreateAIJ be removed during MatAssemblyBegin - End? In-Reply-To: References: Message-ID: On Thu, Mar 17, 2016 at 7:20 PM, Seungbum Koo wrote: > I used to create matrix by calling MatCreateAIJ without knowing the > exact nonzero pattern of the matrix. So I preallocate maximum nonzeros that > can appear when creating a matrix. > > After setting nonzero values by MatSetValues I call MatAssemblyBegin and > MatAssemblyEnd for actual linear algebra. Now, will the unassigned, or > assigned by zero preallocated slots in matrix be removed? Or Will it stay > with containing zeros? > If you explicitly put a 0 in there, it will stay. However, extra space you allocated that is not filled up will be removed on MatAssembly. MAtt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From tabrezali at gmail.com Thu Mar 17 21:33:25 2016 From: tabrezali at gmail.com (Tabrez Ali) Date: Thu, 17 Mar 2016 21:33:25 -0500 Subject: [petsc-users] New matrix with same distribution as an old submatrix Message-ID: <56EB68F5.3040809@gmail.com> Hello Lets say I create a submatrix K_sub from a larger matrix K using MatGetSubMatrix. Now how do I create a new matrix M which has the same parallel distribution as K_sub? Thanks in advance. Tabrez -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Mar 17 21:25:50 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 17 Mar 2016 21:25:50 -0500 Subject: [petsc-users] New matrix with same distribution as an old submatrix In-Reply-To: <56EB68F5.3040809@gmail.com> References: <56EB68F5.3040809@gmail.com> Message-ID: <02425F84-803B-4949-9E17-0F9CB51DB84B@mcs.anl.gov> Use MatGetLocalSizes() in the K_sub and then create the new matrix with those local sizes. > On Mar 17, 2016, at 9:33 PM, Tabrez Ali wrote: > > Hello > > Lets say I create a submatrix K_sub from a larger matrix K using MatGetSubMatrix. Now how do I create a new matrix M which has the same parallel distribution as K_sub? > > Thanks in advance. > > Tabrez From overholt at capesim.com Fri Mar 18 09:49:02 2016 From: overholt at capesim.com (Matthew Overholt) Date: Fri, 18 Mar 2016 10:49:02 -0400 Subject: [petsc-users] makefile for compiling multiple sources Message-ID: <003801d18125$51af21a0$f50d64e0$@capesim.com> Hi, I'm just getting started with PETSc and have been able to configure and run the examples, but as I'm starting to put together a more substantial code with multiple source files I haven't been able to find or create a makefile which works and follows PETSc guidelines. I've configured and am building with the Intel compilers and MKL for blas & lapack, as per the installation example, and with the C++ language. 
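A minimal sketch of Barry's suggestion for the submatrix question above: K_sub and M are the names from that exchange, MatGetLocalSize() supplies the local sizes, and the AIJ type plus plain MatSetUp() are placeholders for whatever type and preallocation the application actually wants.

#include <petscmat.h>

PetscErrorCode CreateLikeSubmatrix(Mat K_sub, Mat *M)
{
  PetscErrorCode ierr;
  PetscInt       m, n;

  PetscFunctionBeginUser;
  ierr = MatGetLocalSize(K_sub, &m, &n);CHKERRQ(ierr);            /* local row/column sizes of K_sub */
  ierr = MatCreate(PetscObjectComm((PetscObject)K_sub), M);CHKERRQ(ierr);
  ierr = MatSetSizes(*M, m, n, PETSC_DETERMINE, PETSC_DETERMINE);CHKERRQ(ierr);
  ierr = MatSetType(*M, MATAIJ);CHKERRQ(ierr);
  ierr = MatSetUp(*M);CHKERRQ(ierr);                              /* or a proper *AIJSetPreallocation() */
  PetscFunctionReturn(0);
}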
Here's my makefile -------------------------------------- # # makefile for femex1 - PETSc/kde/ex1.c adapted for FEA # # Usage: make all # make clean # # PETSc was configured using the Intel Compilers, MKL, and C++ # CFLAGS = CPPFLAGS = CLFLAGS = LIBFILES = ${PETSC_KSP_LIB} TARGET = femex1 CLEANFILES = $(TARGET) include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules OBJFILES = femex1.o \ shapes.o all: $(TARGET) $(TARGET) : $(OBJFILES) $(LIBFILES) ${CLINKER} -o $(TARGET) $(OBJFILES) $(LIBFILES) # femex1.cpp has 'main' and PETSc calls femex1.o: femex1.cpp lists.h shapes.h chkopts ${PETSC_CXXCOMPILE} femex1.cpp # shapes.cpp does not have any PETSc calls shapes.o: shapes.cpp shapes.h icpc -c shapes.cpp And here's the result ------------------------------------ [Matt at HPCL1 mycode]$ make all /opt/petsc/petsc-3.6.3/linux-gnu-intel/bin/mpicxx -c -wd1572 -g -fPIC -I/opt/petsc/petsc-3.6.3/include -I/opt/petsc/petsc-3.6.3/linux-gnu-intel/include femex1.cpp icpc -c shapes.cpp make: *** No rule to make target `-Wl,-rpath,/opt/petsc/petsc-3.6.3/linux-gnu-intel/lib', needed by `femex1'. Stop. --------------------------------------------------------------- So it compiles the two sources, but then fails on the linking step. Thanks in advance. Matt Overholt CapeSym, Inc. --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 18 10:04:17 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 18 Mar 2016 10:04:17 -0500 Subject: [petsc-users] makefile for compiling multiple sources In-Reply-To: <003801d18125$51af21a0$f50d64e0$@capesim.com> References: <003801d18125$51af21a0$f50d64e0$@capesim.com> Message-ID: On Fri, Mar 18, 2016 at 9:49 AM, Matthew Overholt wrote: > Hi, > > > > I?m just getting started with PETSc and have been able to configure and > run the examples, but as I?m starting to put together a more substantial > code with multiple source files I haven?t been able to find or create a > makefile which works and follows PETSc guidelines. > > > > I?ve configured and am building with the Intel compilers and MKL for blas > & lapack, as per the installation example, and with the C++ language. 
> Use this: CFLAGS = CPPFLAGS = LIBFILES = TARGET = femex1 OBJFILES = femex1.o shapes.o CLEANFILES = $(TARGET) include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules all: $(TARGET) $(TARGET) : $(OBJFILES) ${CLINKER} -o $(TARGET) $(OBJFILES) ${PETSC_KSP_LIB} and do not mess around with putting in header file dependencies yourself, use https://ccache.samba.org/ Thanks, Matt > Here?s my makefile -------------------------------------- > > # > > # makefile for femex1 - PETSc/kde/ex1.c adapted for FEA > > # > > # Usage: make all > > # make clean > > # > > # PETSc was configured using the Intel Compilers, MKL, and C++ > > # > > CFLAGS = > > CPPFLAGS = > > CLFLAGS = > > LIBFILES = ${PETSC_KSP_LIB} > > TARGET = femex1 > > CLEANFILES = $(TARGET) > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > OBJFILES = femex1.o \ > > shapes.o > > > > all: $(TARGET) > > > > $(TARGET) : $(OBJFILES) $(LIBFILES) > > ${CLINKER} -o $(TARGET) $(OBJFILES) $(LIBFILES) > > > > # femex1.cpp has 'main' and PETSc calls > > femex1.o: femex1.cpp lists.h shapes.h chkopts > > ${PETSC_CXXCOMPILE} femex1.cpp > > > > # shapes.cpp does not have any PETSc calls > > shapes.o: shapes.cpp shapes.h > > icpc -c shapes.cpp > > > > And here?s the result ------------------------------------ > > > > [Matt at HPCL1 mycode]$ make all > > /opt/petsc/petsc-3.6.3/linux-gnu-intel/bin/mpicxx -c -wd1572 -g -fPIC > -I/opt/petsc/petsc-3.6.3/include > -I/opt/petsc/petsc-3.6.3/linux-gnu-intel/include femex1.cpp > > icpc -c shapes.cpp > > make: *** No rule to make target > `-Wl,-rpath,/opt/petsc/petsc-3.6.3/linux-gnu-intel/lib', needed by > `femex1'. Stop. > > > > --------------------------------------------------------------- > > So it compiles the two sources, but then fails on the linking step. > > > > Thanks in advance. > > Matt Overholt > > CapeSym, Inc. > > > This > email is safe. www.avast.com > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From overholt at capesim.com Fri Mar 18 10:17:40 2016 From: overholt at capesim.com (Matthew Overholt) Date: Fri, 18 Mar 2016 11:17:40 -0400 Subject: [petsc-users] makefile for compiling multiple sources In-Reply-To: References: <003801d18125$51af21a0$f50d64e0$@capesim.com> Message-ID: <004c01d18129$51acb3d0$f5061b70$@capesim.com> Thank you, Matt, removing the library from the target dependencies did the trick. Thanks also for the tip on ccache. Matt? From: Matthew Knepley [mailto:knepley at gmail.com] Sent: Friday, March 18, 2016 11:04 AM To: overholt at capesim.com Cc: PETSc Subject: Re: [petsc-users] makefile for compiling multiple sources On Fri, Mar 18, 2016 at 9:49 AM, Matthew Overholt wrote: Hi, I?m just getting started with PETSc and have been able to configure and run the examples, but as I?m starting to put together a more substantial code with multiple source files I haven?t been able to find or create a makefile which works and follows PETSc guidelines. I?ve configured and am building with the Intel compilers and MKL for blas & lapack, as per the installation example, and with the C++ language. 
Use this: CFLAGS = CPPFLAGS = LIBFILES = TARGET = femex1 OBJFILES = femex1.o shapes.o CLEANFILES = $(TARGET) include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules all: $(TARGET) $(TARGET) : $(OBJFILES) ${CLINKER} -o $(TARGET) $(OBJFILES) ${PETSC_KSP_LIB} and do not mess around with putting in header file dependencies yourself, use https://ccache.samba.org/ Thanks, Matt Here?s my makefile -------------------------------------- # # makefile for femex1 - PETSc/kde/ex1.c adapted for FEA # # Usage: make all # make clean # # PETSc was configured using the Intel Compilers, MKL, and C++ # CFLAGS = CPPFLAGS = CLFLAGS = LIBFILES = ${PETSC_KSP_LIB} TARGET = femex1 CLEANFILES = $(TARGET) include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules OBJFILES = femex1.o \ shapes.o all: $(TARGET) $(TARGET) : $(OBJFILES) $(LIBFILES) ${CLINKER} -o $(TARGET) $(OBJFILES) $(LIBFILES) # femex1.cpp has 'main' and PETSc calls femex1.o: femex1.cpp lists.h shapes.h chkopts ${PETSC_CXXCOMPILE} femex1.cpp # shapes.cpp does not have any PETSc calls shapes.o: shapes.cpp shapes.h icpc -c shapes.cpp And here?s the result ------------------------------------ [Matt at HPCL1 mycode]$ make all /opt/petsc/petsc-3.6.3/linux-gnu-intel/bin/mpicxx -c -wd1572 -g -fPIC -I/opt/petsc/petsc-3.6.3/include -I/opt/petsc/petsc-3.6.3/linux-gnu-intel/include femex1.cpp icpc -c shapes.cpp make: *** No rule to make target `-Wl,-rpath,/opt/petsc/petsc-3.6.3/linux-gnu-intel/lib', needed by `femex1'. Stop. --------------------------------------------------------------- So it compiles the two sources, but then fails on the linking step. Thanks in advance. Matt Overholt CapeSym, Inc. This email is safe. www.avast.com -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Mar 18 11:11:53 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 18 Mar 2016 11:11:53 -0500 Subject: [petsc-users] makefile for compiling multiple sources In-Reply-To: References: <003801d18125$51af21a0$f50d64e0$@capesim.com> Message-ID: On Fri, 18 Mar 2016, Matthew Knepley wrote: > On Fri, Mar 18, 2016 at 9:49 AM, Matthew Overholt > wrote: > > > Hi, > > > > > > > > I?m just getting started with PETSc and have been able to configure and > > run the examples, but as I?m starting to put together a more substantial > > code with multiple source files I haven?t been able to find or create a > > makefile which works and follows PETSc guidelines. > > > > > > > > I?ve configured and am building with the Intel compilers and MKL for blas > > & lapack, as per the installation example, and with the C++ language. > > > > Use this: > > CFLAGS = > CPPFLAGS = > LIBFILES = > TARGET = femex1 > OBJFILES = femex1.o shapes.o > CLEANFILES = $(TARGET) > > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > all: $(TARGET) > > $(TARGET) : $(OBJFILES) > ${CLINKER} -o $(TARGET) $(OBJFILES) ${PETSC_KSP_LIB} > > > and do not mess around with putting in header file dependencies yourself, > use https://ccache.samba.org/ Not sure how ccache helps with header dependencies. 
One can add in the dependencies to the makefile femex1.o: femex1.cpp lists.h shapes.h shapes.o: shapes.cpp shapes.h Satish > > Thanks, > > Matt > > > > Here?s my makefile -------------------------------------- > > > > # > > > > # makefile for femex1 - PETSc/kde/ex1.c adapted for FEA > > > > # > > > > # Usage: make all > > > > # make clean > > > > # > > > > # PETSc was configured using the Intel Compilers, MKL, and C++ > > > > # > > > > CFLAGS = > > > > CPPFLAGS = > > > > CLFLAGS = > > > > LIBFILES = ${PETSC_KSP_LIB} > > > > TARGET = femex1 > > > > CLEANFILES = $(TARGET) > > > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > > > > > OBJFILES = femex1.o \ > > > > shapes.o > > > > > > > > all: $(TARGET) > > > > > > > > $(TARGET) : $(OBJFILES) $(LIBFILES) > > > > ${CLINKER} -o $(TARGET) $(OBJFILES) $(LIBFILES) > > > > > > > > # femex1.cpp has 'main' and PETSc calls > > > > femex1.o: femex1.cpp lists.h shapes.h chkopts > > > > ${PETSC_CXXCOMPILE} femex1.cpp > > > > > > > > # shapes.cpp does not have any PETSc calls > > > > shapes.o: shapes.cpp shapes.h > > > > icpc -c shapes.cpp > > > > > > > > And here?s the result ------------------------------------ > > > > > > > > [Matt at HPCL1 mycode]$ make all > > > > /opt/petsc/petsc-3.6.3/linux-gnu-intel/bin/mpicxx -c -wd1572 -g -fPIC > > -I/opt/petsc/petsc-3.6.3/include > > -I/opt/petsc/petsc-3.6.3/linux-gnu-intel/include femex1.cpp > > > > icpc -c shapes.cpp > > > > make: *** No rule to make target > > `-Wl,-rpath,/opt/petsc/petsc-3.6.3/linux-gnu-intel/lib', needed by > > `femex1'. Stop. > > > > > > > > --------------------------------------------------------------- > > > > So it compiles the two sources, but then fails on the linking step. > > > > > > > > Thanks in advance. > > > > Matt Overholt > > > > CapeSym, Inc. > > > > > > This > > email is safe. www.avast.com > > > > > > > > From davydden at gmail.com Fri Mar 18 12:14:23 2016 From: davydden at gmail.com (Denis Davydov) Date: Fri, 18 Mar 2016 18:14:23 +0100 Subject: [petsc-users] Cannot find a C preprocessor OSX Message-ID: Dear all, Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. Log is attached. Kind regards, Denis -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log.zip Type: application/zip Size: 17045 bytes Desc: not available URL: From balay at mcs.anl.gov Fri Mar 18 13:01:35 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 18 Mar 2016 13:01:35 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: your mpicc is barfing stuff on stderr - thus confusing petsc configure. I don't see this issue with the old openmpi I have access to.. What do you have for: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show touch foo.c /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show I'm not sure what to suggest.. Perhaps you can use --download-mpich [or --download-openmpi] - but then I see you are building all these packages using spack - so perhaps you should check with 'spack' folk.. 
Satish --------- Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c Possible ERROR while running compiler: stderr: clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' On Fri, 18 Mar 2016, Denis Davydov wrote: > Dear all, > > Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. > Log is attached. > > Kind regards, > Denis > > From balay at mcs.anl.gov Fri Mar 18 13:13:56 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 18 Mar 2016 13:13:56 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: Or you can workarround by using additional PETSc configure option: CPPFLAGS=-Qunused-arguments [or CFLAGS?] Satish On Fri, 18 Mar 2016, Satish Balay wrote: > your mpicc is barfing stuff on stderr - thus confusing petsc configure. > > I don't see this issue with the old openmpi I have access to.. 
> > What do you have for: > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show > touch foo.c > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show > > I'm not sure what to suggest.. Perhaps you can use --download-mpich > [or --download-openmpi] - but then I see you are building all these > packages using spack - so perhaps you should check with 'spack' folk.. > > Satish > > --------- > > Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > Possible ERROR while running compiler: > stderr: > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > On Fri, 18 Mar 2016, Denis Davydov wrote: > > > Dear all, > > > > Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. > > Log is attached. 
> > 
> > Kind regards,
> > Denis
> > 
> > 

From davydden at gmail.com  Fri Mar 18 13:14:35 2016
From: davydden at gmail.com (Denis Davydov)
Date: Fri, 18 Mar 2016 19:14:35 +0100
Subject: [petsc-users] Cannot find a C preprocessor OSX
In-Reply-To: 
References: 
Message-ID: <101FF0CB-7736-4164-B333-DD3DDE801336@gmail.com>

Hi Satish,

> On 18 Mar 2016, at 19:01, Satish Balay wrote:
> 
> your mpicc is barfing stuff on stderr - thus confusing petsc configure.

Yes, I noticed that. I don't have that issue when using Homebrew's build of OpenMPI, and thus no confusion to petsc config. Also, those things are just warnings; why they are reported to stderr is not clear. I will try suppressing it with -Qunused-arguments ...

> 
> I don't see this issue with the old openmpi I have access to..
> 
> What do you have for:

Looks reasonable to me:

> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show
> touch foo.c

/usr/bin/clang touch foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include -L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib -lmpi

> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show

/usr/bin/clang -c foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include

> 
> I'm not sure what to suggest.. Perhaps you can use --download-mpich
> [or --download-openmpi] - but then I see you are building all these
> packages using spack - so perhaps you should check with 'spack' folk..

Will certainly do that. I am just toying with spack to see how good the support for OS-X is.

Regards,
Denis.
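For reference, the workaround Satish suggests above would be passed on the PETSc configure command line roughly as follows. This is only a sketch: the --with-cc/--with-cxx/--with-fc options stand in for whatever the spack recipe already passes, and, as the follow-up messages below show, CPPFLAGS also reaches the Fortran compiler, where gfortran rejects this flag.

./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 \
            CPPFLAGS=-Qunused-arguments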
> > Satish > > --------- > > Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > Possible ERROR while running compiler: > stderr: > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > On Fri, 18 Mar 2016, Denis Davydov wrote: > >> Dear all, >> >> Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. >> Log is attached. >> >> Kind regards, >> Denis >> >> From rupp at iue.tuwien.ac.at Sat Mar 19 02:29:18 2016 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Sat, 19 Mar 2016 08:29:18 +0100 Subject: [petsc-users] Student Travel Grants for PETSc User Meeting 2016 Message-ID: <56ECFFCE.70704@iue.tuwien.ac.at> Dear PETSc users, we are proud to announce the availability of a limited number of student travel grants for the PETSc User Meeting 2016 in Vienna, Austria. 
Interested students, who want to join the meeting but have no (or only partial) funding available, are welcome to apply by April 20, 2016. Full details are available on the User Meeting webpage: http://www.mcs.anl.gov/petsc/meetings/2016/ Please forward this message to your students as you see fit. Best regards, Karl Rupp and the PETSc team -------- Original Message -------- Subject: PETSc User Meeting 2016 Date: Fri, 5 Feb 2016 07:21:30 +0100 From: Karl Rupp To: petsc-announce at mcs.anl.gov, For users of the development version of PETSc , PETSc users list Dear PETSc users, we would like to invite you to join us for a PETSc User Meeting held in Vienna, Austria, on June 28-30, 2016. The first day consists of tutorials on various aspects and features of PETSc. The second and third day are devoted to exchange, discussions, and a refinement of strategies for the future with our users. We encourage you to present work illustrating your own use of PETSc, for example in applications or in libraries built on top of PETSc. Registration for the PETSc User Meeting 2016 is free of charge. We can host a maximum of 120 participants, so register soon (but no later than May 1). We are still in the process of acquiring money for providing student travel grants; prospective sponsors, please contact us. Website with full information: http://www.mcs.anl.gov/petsc/meetings/2016/ We are looking forward to welcoming you in Vienna! Karl Rupp and the PETSc team From davydden at gmail.com Sat Mar 19 02:37:26 2016 From: davydden at gmail.com (Denis Davydov) Date: Sat, 19 Mar 2016 08:37:26 +0100 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: Hi Satish, I think you have a bug in you config system as "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when I try to explicitly specify "FPPFLAGS=" to nothing : TEST checkFortranCompiler from config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) TESTING: checkFortranCompiler from config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) Locate a functional Fortran compiler Checking for program /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found Defined make macro "FC" to "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" Pushing language FC Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers -Qunused-arguments /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F Possible ERROR while running compiler: exit code 256 stderr: gfortran: error: unrecognized command line option '-Qunused-arguments? Kind regards, Denis > On 18 Mar 2016, at 19:13, Satish Balay wrote: > > Or you can workarround by using additional PETSc configure option: > > CPPFLAGS=-Qunused-arguments > [or CFLAGS?] > > Satish > > > On Fri, 18 Mar 2016, Satish Balay wrote: > >> your mpicc is barfing stuff on stderr - thus confusing petsc configure. 
>> >> I don't see this issue with the old openmpi I have access to.. >> >> What do you have for: >> >> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show >> touch foo.c >> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show >> >> I'm not sure what to suggest.. Perhaps you can use --download-mpich >> [or --download-openmpi] - but then I see you are building all these >> packages using spack - so perhaps you should check with 'spack' folk.. >> >> Satish >> >> --------- >> >> Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c >> Possible ERROR while running compiler: >> stderr: >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' >> >> On Fri, 18 Mar 2016, Denis Davydov wrote: >> >>> Dear all, >>> >>> Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as 
`mpicc -E` seems to be ok. >>> Log is attached. >>> >>> Kind regards, >>> Denis >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sat Mar 19 11:29:03 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 19 Mar 2016 11:29:03 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: This code is a bit convoluted.. There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or FPPFLAGS.. And then base.py has: [which is wrong for 'Cxx', 'FC'?] def getPreprocessorFlagsName(self, language): if language in ['C', 'Cxx', 'FC']: flagsArg = 'CPPFLAGS' elif language == 'CUDA': flagsArg = 'CUDAPPFLAGS' else: raise RuntimeError('Unknown language: '+language) return flagsArg And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]: class Preprocessor(config.compile.C.Preprocessor): '''The Fortran preprocessor, which now is just the C preprocessor''' def __init__(self, argDB): config.compile.C.Preprocessor.__init__(self, argDB) Matt - should we have FPP,FPPFLAGS supported here? Perhaps using CPPFLAGS with FC is a bug? So we should atleast do: diff --git a/config/BuildSystem/config/compile/FC.py b/config/BuildSystem/config/compile/FC.py index 3d0bf74..7bae24d 100644 --- a/config/BuildSystem/config/compile/FC.py +++ b/config/BuildSystem/config/compile/FC.py @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor): config.compile.C.Preprocessor.__init__(self, argDB) self.language = 'FC' self.targetExtension = '.F' + self.flagsName = '' self.includeDirectories = sets.Set() return And then [I'm not sure where this gets used..] $ git diff config/BuildSystem/config/base.py |cat diff --git a/config/BuildSystem/config/base.py b/config/BuildSystem/config/base.py index b18a173..8b1129d 100644 --- a/config/BuildSystem/config/base.py +++ b/config/BuildSystem/config/base.py @@ -454,8 +454,12 @@ class Configure(script.Script): # Should be static def getPreprocessorFlagsName(self, language): - if language in ['C', 'Cxx', 'FC']: + if language == 'C': flagsArg = 'CPPFLAGS' + elif language == 'Cxx': + flagsArg = 'CXXCPPFLAGS' + elif language == 'FC': + flagsArg = '' elif language == 'CUDA': flagsArg = 'CUDAPPFLAGS' else: Satish On Sat, 19 Mar 2016, Denis Davydov wrote: > Hi Satish, > > I think you have a bug in you config system as "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when I try to explicitly specify "FPPFLAGS=" to nothing : > > TEST checkFortranCompiler from config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > TESTING: checkFortranCompiler from config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > Locate a functional Fortran compiler > Checking for program /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > Defined make macro "FC" to "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > Pushing language FC > Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o 
-I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers -Qunused-arguments /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > Possible ERROR while running compiler: exit code 256 > stderr: > gfortran: error: unrecognized command line option '-Qunused-arguments? > > Kind regards, > Denis > > > On 18 Mar 2016, at 19:13, Satish Balay wrote: > > > > Or you can workarround by using additional PETSc configure option: > > > > CPPFLAGS=-Qunused-arguments > > [or CFLAGS?] > > > > Satish > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc configure. > >> > >> I don't see this issue with the old openmpi I have access to.. > >> > >> What do you have for: > >> > >> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show > >> touch foo.c > >> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show > >> > >> I'm not sure what to suggest.. Perhaps you can use --download-mpich > >> [or --download-openmpi] - but then I see you are building all these > >> packages using spack - so perhaps you should check with 'spack' folk.. > >> > >> Satish > >> > >> --------- > >> > >> Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > >> Possible ERROR while running compiler: > >> stderr: > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > >> clang: warning: argument unused during compilation: 
'-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > >> > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > >> > >>> Dear all, > >>> > >>> Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. > >>> Log is attached. > >>> > >>> Kind regards, > >>> Denis > >>> > >>> > >> > > From knepley at gmail.com Sat Mar 19 11:38:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 19 Mar 2016 11:38:32 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay wrote: > This code is a bit convoluted.. > > There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or FPPFLAGS.. > Is this really true? Is the Fortran preprocessors not the C preprocessor? I have never encountered this. If you want flags which are different for C, use CFLAGS. Am I missing something? Matt > And then base.py has: [which is wrong for 'Cxx', 'FC'?] > > def getPreprocessorFlagsName(self, language): > if language in ['C', 'Cxx', 'FC']: > flagsArg = 'CPPFLAGS' > elif language == 'CUDA': > flagsArg = 'CUDAPPFLAGS' > else: > raise RuntimeError('Unknown language: '+language) > return flagsArg > > And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]: > > class Preprocessor(config.compile.C.Preprocessor): > '''The Fortran preprocessor, which now is just the C preprocessor''' > def __init__(self, argDB): > config.compile.C.Preprocessor.__init__(self, argDB) > > Matt - should we have FPP,FPPFLAGS supported here? > > Perhaps using CPPFLAGS with FC is a bug? So we should atleast do: > > diff --git a/config/BuildSystem/config/compile/FC.py > b/config/BuildSystem/config/compile/FC.py > index 3d0bf74..7bae24d 100644 > --- a/config/BuildSystem/config/compile/FC.py > +++ b/config/BuildSystem/config/compile/FC.py > @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor): > config.compile.C.Preprocessor.__init__(self, argDB) > self.language = 'FC' > self.targetExtension = '.F' > + self.flagsName = '' > self.includeDirectories = sets.Set() > return > > And then [I'm not sure where this gets used..] 
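A small illustration of the Fortran preprocessor question raised above; the file name, macro name, and gfortran invocation are made up for this sketch, while the xlf spelling is the -WF,-D form Satish brings up later in the thread. Most compilers do run .F files through a C-style preprocessor, but the flag syntax differs between compilers, which is why a single CPPFLAGS value cannot simply be reused for FC.

! demo.F - relies on a C-style preprocessor pass
      program demo
#ifdef HAVE_FEATURE
      print *, 'feature enabled'
#else
      print *, 'feature disabled'
#endif
      end program demo

gfortran -c -DHAVE_FEATURE demo.F      (gfortran preprocesses .F files and accepts -D directly)
xlf -c -WF,-DHAVE_FEATURE demo.F       (xlf expects the -WF, escape instead of a bare -D)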
> > $ git diff config/BuildSystem/config/base.py |cat > diff --git a/config/BuildSystem/config/base.py > b/config/BuildSystem/config/base.py > index b18a173..8b1129d 100644 > --- a/config/BuildSystem/config/base.py > +++ b/config/BuildSystem/config/base.py > @@ -454,8 +454,12 @@ class Configure(script.Script): > > # Should be static > def getPreprocessorFlagsName(self, language): > - if language in ['C', 'Cxx', 'FC']: > + if language == 'C': > flagsArg = 'CPPFLAGS' > + elif language == 'Cxx': > + flagsArg = 'CXXCPPFLAGS' > + elif language == 'FC': > + flagsArg = '' > elif language == 'CUDA': > flagsArg = 'CUDAPPFLAGS' > else: > > > Satish > > On Sat, 19 Mar 2016, Denis Davydov wrote: > > > Hi Satish, > > > > I think you have a bug in you config system as > "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when I try > to explicitly specify "FPPFLAGS=" to nothing : > > > > TEST checkFortranCompiler from > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > TESTING: checkFortranCompiler from > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > Locate a functional Fortran compiler > > Checking for program > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > > Defined make macro "FC" to > "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > > Pushing language FC > > Executing: > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 > -c -o > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers > -Qunused-arguments > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > > Possible ERROR while running compiler: exit code 256 > > stderr: > > gfortran: error: unrecognized command line option '-Qunused-arguments? > > > > Kind regards, > > Denis > > > > > On 18 Mar 2016, at 19:13, Satish Balay wrote: > > > > > > Or you can workarround by using additional PETSc configure option: > > > > > > CPPFLAGS=-Qunused-arguments > > > [or CFLAGS?] > > > > > > Satish > > > > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc > configure. > > >> > > >> I don't see this issue with the old openmpi I have access to.. > > >> > > >> What do you have for: > > >> > > >> > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > -show > > >> touch foo.c > > >> > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > -c foo.c -show > > >> > > >> I'm not sure what to suggest.. Perhaps you can use --download-mpich > > >> [or --download-openmpi] - but then I see you are building all these > > >> packages using spack - so perhaps you should check with 'spack' folk.. 
> > >> > > >> Satish > > >> > > >> --------- > > >> > > >> Executing: > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > -c -o > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > >> Possible ERROR while running compiler: > > >> stderr: > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > >> clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > >> > > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > > >> > > >>> Dear all, > > >>> > > >>> Although I saw this issue on the mailing list, that was related to > broken compilers, i don?t think it?s the case here as `mpicc -E` seems to > be ok. > > >>> Log is attached. > > >>> > > >>> Kind regards, > > >>> Denis > > >>> > > >>> > > >> > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Sat Mar 19 11:49:10 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 19 Mar 2016 11:49:10 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: On Sat, 19 Mar 2016, Matthew Knepley wrote: > On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay wrote: > > > This code is a bit convoluted.. > > > > There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or FPPFLAGS.. > > > > Is this really true? Is the Fortran preprocessors not the C preprocessor? I > have never encountered this. > If you want flags which are different for C, use CFLAGS. Am I missing > something? Even if we assume CPPFLAGS are -D only - it wont' work with all FC compilers. For ex: xlf does not recognize -D option - its -WF,-D. So we cannont assume FC always uses CPP [or CPPFLAGS] And some flags have to be used at compile time only - and some can be used at both compile & linktime - and some at linktime only. So we have CPPFLAGS, CFLAGS, LDFLAGS [not compiler specific?] In this case -Qunused-arguments can go into CFLAGS - but presumably there are other compiler flags that cannot be used at link time - so have to use with CPPFLAGS only? And this whole thread started with clang barfing on using a link flag during compile time.. Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c Possible ERROR while running compiler: stderr: clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' So having proper support for CFLAGS, CPPFLAGS, FLAGS, FPPFLAGS might be the correct thing to do.. Satish > > Matt > > > > And then base.py has: [which is wrong for 'Cxx', 'FC'?] > > > > def getPreprocessorFlagsName(self, language): > > if language in ['C', 'Cxx', 'FC']: > > flagsArg = 'CPPFLAGS' > > elif language == 'CUDA': > > flagsArg = 'CUDAPPFLAGS' > > else: > > raise RuntimeError('Unknown language: '+language) > > return flagsArg > > > > And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]: > > > > class Preprocessor(config.compile.C.Preprocessor): > > '''The Fortran preprocessor, which now is just the C preprocessor''' > > def __init__(self, argDB): > > config.compile.C.Preprocessor.__init__(self, argDB) > > > > Matt - should we have FPP,FPPFLAGS supported here? > > > > Perhaps using CPPFLAGS with FC is a bug? 
So we should atleast do: > > > > diff --git a/config/BuildSystem/config/compile/FC.py > > b/config/BuildSystem/config/compile/FC.py > > index 3d0bf74..7bae24d 100644 > > --- a/config/BuildSystem/config/compile/FC.py > > +++ b/config/BuildSystem/config/compile/FC.py > > @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor): > > config.compile.C.Preprocessor.__init__(self, argDB) > > self.language = 'FC' > > self.targetExtension = '.F' > > + self.flagsName = '' > > self.includeDirectories = sets.Set() > > return > > > > And then [I'm not sure where this gets used..] > > > > $ git diff config/BuildSystem/config/base.py |cat > > diff --git a/config/BuildSystem/config/base.py > > b/config/BuildSystem/config/base.py > > index b18a173..8b1129d 100644 > > --- a/config/BuildSystem/config/base.py > > +++ b/config/BuildSystem/config/base.py > > @@ -454,8 +454,12 @@ class Configure(script.Script): > > > > # Should be static > > def getPreprocessorFlagsName(self, language): > > - if language in ['C', 'Cxx', 'FC']: > > + if language == 'C': > > flagsArg = 'CPPFLAGS' > > + elif language == 'Cxx': > > + flagsArg = 'CXXCPPFLAGS' > > + elif language == 'FC': > > + flagsArg = '' > > elif language == 'CUDA': > > flagsArg = 'CUDAPPFLAGS' > > else: > > > > > > Satish > > > > On Sat, 19 Mar 2016, Denis Davydov wrote: > > > > > Hi Satish, > > > > > > I think you have a bug in you config system as > > "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when I try > > to explicitly specify "FPPFLAGS=" to nothing : > > > > > > TEST checkFortranCompiler from > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > TESTING: checkFortranCompiler from > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > Locate a functional Fortran compiler > > > Checking for program > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > > > Defined make macro "FC" to > > "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > > > Pushing language FC > > > Executing: > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 > > -c -o > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers > > -Qunused-arguments > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > > > Possible ERROR while running compiler: exit code 256 > > > stderr: > > > gfortran: error: unrecognized command line option '-Qunused-arguments? > > > > > > Kind regards, > > > Denis > > > > > > > On 18 Mar 2016, at 19:13, Satish Balay wrote: > > > > > > > > Or you can workarround by using additional PETSc configure option: > > > > > > > > CPPFLAGS=-Qunused-arguments > > > > [or CFLAGS?] > > > > > > > > Satish > > > > > > > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > > > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc > > configure. > > > >> > > > >> I don't see this issue with the old openmpi I have access to.. 
> > > >> > > > >> What do you have for: > > > >> > > > >> > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > -show > > > >> touch foo.c > > > >> > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > -c foo.c -show > > > >> > > > >> I'm not sure what to suggest.. Perhaps you can use --download-mpich > > > >> [or --download-openmpi] - but then I see you are building all these > > > >> packages using spack - so perhaps you should check with 'spack' folk.. > > > >> > > > >> Satish > > > >> > > > >> --------- > > > >> > > > >> Executing: > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > -c -o > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > > >> Possible ERROR while running compiler: > > > >> stderr: > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > > >> clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > > >> clang: warning: argument unused during compilation: > > 
'-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > > >> > > > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > > > >> > > > >>> Dear all, > > > >>> > > > >>> Although I saw this issue on the mailing list, that was related to > > broken compilers, i don?t think it?s the case here as `mpicc -E` seems to > > be ok. > > > >>> Log is attached. > > > >>> > > > >>> Kind regards, > > > >>> Denis > > > >>> > > > >>> > > > >> > > > > > > > > > > > > From knepley at gmail.com Sat Mar 19 11:52:23 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 19 Mar 2016 11:52:23 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: On Sat, Mar 19, 2016 at 11:49 AM, Satish Balay wrote: > On Sat, 19 Mar 2016, Matthew Knepley wrote: > > > On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay > wrote: > > > > > This code is a bit convoluted.. > > > > > > There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or > FPPFLAGS.. > > > > > > > Is this really true? Is the Fortran preprocessors not the C > preprocessor? I > > have never encountered this. > > If you want flags which are different for C, use CFLAGS. Am I missing > > something? > > Even if we assume CPPFLAGS are -D only - it wont' work with all FC > compilers. For ex: xlf does not recognize -D option - its -WF,-D. > So we cannont assume FC always uses CPP [or CPPFLAGS] > > And some flags have to be used at compile time only - and some can be > used at both compile & linktime - and some at linktime only. > > So we have CPPFLAGS, CFLAGS, LDFLAGS [not compiler specific?] > > In this case -Qunused-arguments can go into CFLAGS - but presumably > there are other compiler flags that cannot be used at link time - so > have to use with CPPFLAGS only? > > And this whole thread started with clang barfing on using a link flag > during compile time.. > > Executing: > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > -c -o > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > Possible ERROR while running compiler: > stderr: > clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > clang: warning: argument unused during compilation: > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > So having proper support for CFLAGS, CPPFLAGS, FLAGS, FPPFLAGS might be > the correct thing to do.. Won't that be incredibly confusing for the 99% of people who use Fortran that uses the C preprocessor? Matt > > Satish > > > > > Matt > > > > > > > And then base.py has: [which is wrong for 'Cxx', 'FC'?] 
> > > > > > def getPreprocessorFlagsName(self, language): > > > if language in ['C', 'Cxx', 'FC']: > > > flagsArg = 'CPPFLAGS' > > > elif language == 'CUDA': > > > flagsArg = 'CUDAPPFLAGS' > > > else: > > > raise RuntimeError('Unknown language: '+language) > > > return flagsArg > > > > > > And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]: > > > > > > class Preprocessor(config.compile.C.Preprocessor): > > > '''The Fortran preprocessor, which now is just the C preprocessor''' > > > def __init__(self, argDB): > > > config.compile.C.Preprocessor.__init__(self, argDB) > > > > > > Matt - should we have FPP,FPPFLAGS supported here? > > > > > > Perhaps using CPPFLAGS with FC is a bug? So we should atleast do: > > > > > > diff --git a/config/BuildSystem/config/compile/FC.py > > > b/config/BuildSystem/config/compile/FC.py > > > index 3d0bf74..7bae24d 100644 > > > --- a/config/BuildSystem/config/compile/FC.py > > > +++ b/config/BuildSystem/config/compile/FC.py > > > @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor): > > > config.compile.C.Preprocessor.__init__(self, argDB) > > > self.language = 'FC' > > > self.targetExtension = '.F' > > > + self.flagsName = '' > > > self.includeDirectories = sets.Set() > > > return > > > > > > And then [I'm not sure where this gets used..] > > > > > > $ git diff config/BuildSystem/config/base.py |cat > > > diff --git a/config/BuildSystem/config/base.py > > > b/config/BuildSystem/config/base.py > > > index b18a173..8b1129d 100644 > > > --- a/config/BuildSystem/config/base.py > > > +++ b/config/BuildSystem/config/base.py > > > @@ -454,8 +454,12 @@ class Configure(script.Script): > > > > > > # Should be static > > > def getPreprocessorFlagsName(self, language): > > > - if language in ['C', 'Cxx', 'FC']: > > > + if language == 'C': > > > flagsArg = 'CPPFLAGS' > > > + elif language == 'Cxx': > > > + flagsArg = 'CXXCPPFLAGS' > > > + elif language == 'FC': > > > + flagsArg = '' > > > elif language == 'CUDA': > > > flagsArg = 'CUDAPPFLAGS' > > > else: > > > > > > > > > Satish > > > > > > On Sat, 19 Mar 2016, Denis Davydov wrote: > > > > > > > Hi Satish, > > > > > > > > I think you have a bug in you config system as > > > "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when > I try > > > to explicitly specify "FPPFLAGS=" to nothing : > > > > > > > > TEST checkFortranCompiler from > > > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > > TESTING: checkFortranCompiler from > > > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > > Locate a functional Fortran compiler > > > > Checking for program > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > > > > Defined make macro "FC" to > > > > "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > > > > Pushing language FC > > > > Executing: > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 > > > -c -o > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o > > > > 
-I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers > > > -Qunused-arguments > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > > > > Possible ERROR while running compiler: exit code 256 > > > > stderr: > > > > gfortran: error: unrecognized command line option > '-Qunused-arguments? > > > > > > > > Kind regards, > > > > Denis > > > > > > > > > On 18 Mar 2016, at 19:13, Satish Balay wrote: > > > > > > > > > > Or you can workarround by using additional PETSc configure option: > > > > > > > > > > CPPFLAGS=-Qunused-arguments > > > > > [or CFLAGS?] > > > > > > > > > > Satish > > > > > > > > > > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > > > > > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc > > > configure. > > > > >> > > > > >> I don't see this issue with the old openmpi I have access to.. > > > > >> > > > > >> What do you have for: > > > > >> > > > > >> > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > -show > > > > >> touch foo.c > > > > >> > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > -c foo.c -show > > > > >> > > > > >> I'm not sure what to suggest.. Perhaps you can use > --download-mpich > > > > >> [or --download-openmpi] - but then I see you are building all > these > > > > >> packages using spack - so perhaps you should check with 'spack' > folk.. > > > > >> > > > > >> Satish > > > > >> > > > > >> --------- > > > > >> > > > > >> Executing: > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > -c -o > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > > > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > > > >> Possible ERROR while running compiler: > > > > >> stderr: > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > > > >> clang: warning: argument unused during 
compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > > > >> clang: warning: argument unused during compilation: > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > > > >> > > > > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > > > > >> > > > > >>> Dear all, > > > > >>> > > > > >>> Although I saw this issue on the mailing list, that was related > to > > > broken compilers, i don?t think it?s the case here as `mpicc -E` seems > to > > > be ok. > > > > >>> Log is attached. > > > > >>> > > > > >>> Kind regards, > > > > >>> Denis > > > > >>> > > > > >>> > > > > >> > > > > > > > > > > > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From davydden at gmail.com Sat Mar 19 11:52:27 2016 From: davydden at gmail.com (Denis Davydov) Date: Sat, 19 Mar 2016 17:52:27 +0100 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: <7986BE1A-3EC6-440D-8746-8A59ED13FE51@gmail.com> > On 19 Mar 2016, at 17:38, Matthew Knepley wrote: > > On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay > wrote: > This code is a bit convoluted.. > > There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or FPPFLAGS.. > > Is this really true? Is the Fortran preprocessors not the C preprocessor? I have never encountered this. > If you want flags which are different for C, use CFLAGS. Am I missing something? AFAIK CFLAGS are not use in PETSc config system to check working C preprocessor and settings CFLAGS=-Qunused-arguments still leads to Cannot find a C preprocessor . which is the original issue. Regards, Denis. > > Matt > > And then base.py has: [which is wrong for 'Cxx', 'FC'?] > > def getPreprocessorFlagsName(self, language): > if language in ['C', 'Cxx', 'FC']: > flagsArg = 'CPPFLAGS' > elif language == 'CUDA': > flagsArg = 'CUDAPPFLAGS' > else: > raise RuntimeError('Unknown language: '+language) > return flagsArg > > And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]: > > class Preprocessor(config.compile.C.Preprocessor): > '''The Fortran preprocessor, which now is just the C preprocessor''' > def __init__(self, argDB): > config.compile.C.Preprocessor.__init__(self, argDB) > > Matt - should we have FPP,FPPFLAGS supported here? > > Perhaps using CPPFLAGS with FC is a bug? 
So we should atleast do: > > diff --git a/config/BuildSystem/config/compile/FC.py b/config/BuildSystem/config/compile/FC.py > index 3d0bf74..7bae24d 100644 > --- a/config/BuildSystem/config/compile/FC.py > +++ b/config/BuildSystem/config/compile/FC.py > @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor): > config.compile.C.Preprocessor.__init__(self, argDB) > self.language = 'FC' > self.targetExtension = '.F' > + self.flagsName = '' > self.includeDirectories = sets.Set() > return > > And then [I'm not sure where this gets used..] > > $ git diff config/BuildSystem/config/base.py |cat > diff --git a/config/BuildSystem/config/base.py b/config/BuildSystem/config/base.py > index b18a173..8b1129d 100644 > --- a/config/BuildSystem/config/base.py > +++ b/config/BuildSystem/config/base.py > @@ -454,8 +454,12 @@ class Configure(script.Script): > > # Should be static > def getPreprocessorFlagsName(self, language): > - if language in ['C', 'Cxx', 'FC']: > + if language == 'C': > flagsArg = 'CPPFLAGS' > + elif language == 'Cxx': > + flagsArg = 'CXXCPPFLAGS' > + elif language == 'FC': > + flagsArg = '' > elif language == 'CUDA': > flagsArg = 'CUDAPPFLAGS' > else: > > > Satish > > On Sat, 19 Mar 2016, Denis Davydov wrote: > > > Hi Satish, > > > > I think you have a bug in you config system as "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when I try to explicitly specify "FPPFLAGS=" to nothing : > > > > TEST checkFortranCompiler from config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > TESTING: checkFortranCompiler from config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > Locate a functional Fortran compiler > > Checking for program /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > > Defined make macro "FC" to "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > > Pushing language FC > > Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers -Qunused-arguments /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > > Possible ERROR while running compiler: exit code 256 > > stderr: > > gfortran: error: unrecognized command line option '-Qunused-arguments? > > > > Kind regards, > > Denis > > > > > On 18 Mar 2016, at 19:13, Satish Balay > wrote: > > > > > > Or you can workarround by using additional PETSc configure option: > > > > > > CPPFLAGS=-Qunused-arguments > > > [or CFLAGS?] > > > > > > Satish > > > > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc configure. > > >> > > >> I don't see this issue with the old openmpi I have access to.. 
> > >> > > >> What do you have for: > > >> > > >> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show > > >> touch foo.c > > >> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show > > >> > > >> I'm not sure what to suggest.. Perhaps you can use --download-mpich > > >> [or --download-openmpi] - but then I see you are building all these > > >> packages using spack - so perhaps you should check with 'spack' folk.. > > >> > > >> Satish > > >> > > >> --------- > > >> > > >> Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > >> Possible ERROR while running compiler: > > >> stderr: > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > >> > > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > > >> > > >>> Dear all, > > >>> > > >>> Although I saw this issue on the mailing 
list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. > > >>> Log is attached. > > >>> > > >>> Kind regards, > > >>> Denis > > >>> > > >>> > > >> > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From davydden at gmail.com Sat Mar 19 11:57:45 2016 From: davydden at gmail.com (Denis Davydov) Date: Sat, 19 Mar 2016 17:57:45 +0100 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: > On 19 Mar 2016, at 17:52, Matthew Knepley wrote: > > On Sat, Mar 19, 2016 at 11:49 AM, Satish Balay > wrote: > On Sat, 19 Mar 2016, Matthew Knepley wrote: > > > On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay > wrote: > > > > > This code is a bit convoluted.. > > > > > > There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or FPPFLAGS.. > > > > > > > Is this really true? Is the Fortran preprocessors not the C preprocessor? I > > have never encountered this. > > If you want flags which are different for C, use CFLAGS. Am I missing > > something? > > Even if we assume CPPFLAGS are -D only - it wont' work with all FC > compilers. For ex: xlf does not recognize -D option - its -WF,-D. > So we cannont assume FC always uses CPP [or CPPFLAGS] > > And some flags have to be used at compile time only - and some can be > used at both compile & linktime - and some at linktime only. > > So we have CPPFLAGS, CFLAGS, LDFLAGS [not compiler specific?] > > In this case -Qunused-arguments can go into CFLAGS - but presumably > there are other compiler flags that cannot be used at link time - so > have to use with CPPFLAGS only? > > And this whole thread started with clang barfing on using a link flag during compile time.. > > Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > Possible ERROR while running compiler: > stderr: > clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > clang: warning: argument unused during compilation: > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > So having proper support for CFLAGS, CPPFLAGS, FLAGS, FPPFLAGS might be the correct thing to do.. > > Won't that be incredibly confusing for the 99% of people who use Fortran that uses the C preprocessor? one can always set FPPFLAGS to CPPFLAGS inside if the former is not specified. Not confusion for anyone in that case. Regards, Denis. > > Matt > > > Satish > > > > > Matt > > > > > > > And then base.py has: [which is wrong for 'Cxx', 'FC'?] 
> > > > > > def getPreprocessorFlagsName(self, language): > > > if language in ['C', 'Cxx', 'FC']: > > > flagsArg = 'CPPFLAGS' > > > elif language == 'CUDA': > > > flagsArg = 'CUDAPPFLAGS' > > > else: > > > raise RuntimeError('Unknown language: '+language) > > > return flagsArg > > > > > > And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]: > > > > > > class Preprocessor(config.compile.C.Preprocessor): > > > '''The Fortran preprocessor, which now is just the C preprocessor''' > > > def __init__(self, argDB): > > > config.compile.C.Preprocessor.__init__(self, argDB) > > > > > > Matt - should we have FPP,FPPFLAGS supported here? > > > > > > Perhaps using CPPFLAGS with FC is a bug? So we should atleast do: > > > > > > diff --git a/config/BuildSystem/config/compile/FC.py > > > b/config/BuildSystem/config/compile/FC.py > > > index 3d0bf74..7bae24d 100644 > > > --- a/config/BuildSystem/config/compile/FC.py > > > +++ b/config/BuildSystem/config/compile/FC.py > > > @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor): > > > config.compile.C.Preprocessor.__init__(self, argDB) > > > self.language = 'FC' > > > self.targetExtension = '.F' > > > + self.flagsName = '' > > > self.includeDirectories = sets.Set() > > > return > > > > > > And then [I'm not sure where this gets used..] > > > > > > $ git diff config/BuildSystem/config/base.py |cat > > > diff --git a/config/BuildSystem/config/base.py > > > b/config/BuildSystem/config/base.py > > > index b18a173..8b1129d 100644 > > > --- a/config/BuildSystem/config/base.py > > > +++ b/config/BuildSystem/config/base.py > > > @@ -454,8 +454,12 @@ class Configure(script.Script): > > > > > > # Should be static > > > def getPreprocessorFlagsName(self, language): > > > - if language in ['C', 'Cxx', 'FC']: > > > + if language == 'C': > > > flagsArg = 'CPPFLAGS' > > > + elif language == 'Cxx': > > > + flagsArg = 'CXXCPPFLAGS' > > > + elif language == 'FC': > > > + flagsArg = '' > > > elif language == 'CUDA': > > > flagsArg = 'CUDAPPFLAGS' > > > else: > > > > > > > > > Satish > > > > > > On Sat, 19 Mar 2016, Denis Davydov wrote: > > > > > > > Hi Satish, > > > > > > > > I think you have a bug in you config system as > > > "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when I try > > > to explicitly specify "FPPFLAGS=" to nothing : > > > > > > > > TEST checkFortranCompiler from > > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > > TESTING: checkFortranCompiler from > > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > > Locate a functional Fortran compiler > > > > Checking for program > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > > > > Defined make macro "FC" to > > > "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > > > > Pushing language FC > > > > Executing: > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 > > > -c -o > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o > > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers > > > 
-Qunused-arguments > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > > > > Possible ERROR while running compiler: exit code 256 > > > > stderr: > > > > gfortran: error: unrecognized command line option '-Qunused-arguments? > > > > > > > > Kind regards, > > > > Denis > > > > > > > > > On 18 Mar 2016, at 19:13, Satish Balay > wrote: > > > > > > > > > > Or you can workarround by using additional PETSc configure option: > > > > > > > > > > CPPFLAGS=-Qunused-arguments > > > > > [or CFLAGS?] > > > > > > > > > > Satish > > > > > > > > > > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > > > > > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc > > > configure. > > > > >> > > > > >> I don't see this issue with the old openmpi I have access to.. > > > > >> > > > > >> What do you have for: > > > > >> > > > > >> > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > -show > > > > >> touch foo.c > > > > >> > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > -c foo.c -show > > > > >> > > > > >> I'm not sure what to suggest.. Perhaps you can use --download-mpich > > > > >> [or --download-openmpi] - but then I see you are building all these > > > > >> packages using spack - so perhaps you should check with 'spack' folk.. > > > > >> > > > > >> Satish > > > > >> > > > > >> --------- > > > > >> > > > > >> Executing: > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > -c -o > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > > > >> Possible ERROR while running compiler: > > > > >> stderr: > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > > > >> clang: warning: argument unused during compilation: > > > 
'-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > > > >> clang: warning: argument unused during compilation: > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > > > >> > > > > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > > > > >> > > > > >>> Dear all, > > > > >>> > > > > >>> Although I saw this issue on the mailing list, that was related to > > > broken compilers, i don?t think it?s the case here as `mpicc -E` seems to > > > be ok. > > > > >>> Log is attached. > > > > >>> > > > > >>> Kind regards, > > > > >>> Denis > > > > >>> > > > > >>> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sat Mar 19 11:58:40 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 19 Mar 2016 11:58:40 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: On Sat, 19 Mar 2016, Matthew Knepley wrote: > On Sat, Mar 19, 2016 at 11:49 AM, Satish Balay wrote: > > > On Sat, 19 Mar 2016, Matthew Knepley wrote: > > > > > On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay > > wrote: > > > > > > > This code is a bit convoluted.. > > > > > > > > There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or > > FPPFLAGS.. > > > > > > > > > > Is this really true? Is the Fortran preprocessors not the C > > preprocessor? I > > > have never encountered this. > > > If you want flags which are different for C, use CFLAGS. Am I missing > > > something? > > > > Even if we assume CPPFLAGS are -D only - it wont' work with all FC > > compilers. For ex: xlf does not recognize -D option - its -WF,-D. > > So we cannont assume FC always uses CPP [or CPPFLAGS] > > > > And some flags have to be used at compile time only - and some can be > > used at both compile & linktime - and some at linktime only. > > > > So we have CPPFLAGS, CFLAGS, LDFLAGS [not compiler specific?] > > > > In this case -Qunused-arguments can go into CFLAGS - but presumably > > there are other compiler flags that cannot be used at link time - so > > have to use with CPPFLAGS only? > > > > And this whole thread started with clang barfing on using a link flag > > during compile time.. 
> > Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c
> > Possible ERROR while running compiler:
> > stderr:
> > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib'
> > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib'
> > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib'
> >
> > So having proper support for CFLAGS, CPPFLAGS, FLAGS, FPPFLAGS might be the correct thing to do..
>
> Won't that be incredibly confusing for the 99% of people who use Fortran that uses the C preprocessor?

I don't see how. We use CPPFLAGS, FPPFLAGS separately in our makefiles [presumably petsc users using the petsc makefile format are already using this notation..]

I think you are thinking more in terms of CPP vs FPP [CPPFLAGS vs FPPFLAGS]. I'm thinking more in terms of 'cc -c [CPPFLAGS]' and 'fc -c [FPPFLAGS]' - which are how the compile targets use these flags..

Satish

> Matt
>
> > Satish
> >
> > > Matt
> > >
> > > > And then base.py has: [which is wrong for 'Cxx', 'FC'?]
> > > >
> > > >   def getPreprocessorFlagsName(self, language):
> > > >     if language in ['C', 'Cxx', 'FC']:
> > > >       flagsArg = 'CPPFLAGS'
> > > >     elif language == 'CUDA':
> > > >       flagsArg = 'CUDAPPFLAGS'
> > > >     else:
> > > >       raise RuntimeError('Unknown language: '+language)
> > > >     return flagsArg
> > > >
> > > > And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]:
> > > >
> > > > class Preprocessor(config.compile.C.Preprocessor):
> > > >   '''The Fortran preprocessor, which now is just the C preprocessor'''
> > > >   def __init__(self, argDB):
> > > >     config.compile.C.Preprocessor.__init__(self, argDB)
> > > >
> > > > Matt - should we have FPP,FPPFLAGS supported here?
> > > >
> > > > Perhaps using CPPFLAGS with FC is a bug? So we should atleast do:
> > > >
> > > > diff --git a/config/BuildSystem/config/compile/FC.py b/config/BuildSystem/config/compile/FC.py
> > > > index 3d0bf74..7bae24d 100644
> > > > --- a/config/BuildSystem/config/compile/FC.py
> > > > +++ b/config/BuildSystem/config/compile/FC.py
> > > > @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor):
> > > >     config.compile.C.Preprocessor.__init__(self, argDB)
> > > >     self.language = 'FC'
> > > >     self.targetExtension = '.F'
> > > > +   self.flagsName = ''
> > > >     self.includeDirectories = sets.Set()
> > > >     return
> > > >
> > > > And then [I'm not sure where this gets used..]
> > > > > > > > $ git diff config/BuildSystem/config/base.py |cat > > > > diff --git a/config/BuildSystem/config/base.py > > > > b/config/BuildSystem/config/base.py > > > > index b18a173..8b1129d 100644 > > > > --- a/config/BuildSystem/config/base.py > > > > +++ b/config/BuildSystem/config/base.py > > > > @@ -454,8 +454,12 @@ class Configure(script.Script): > > > > > > > > # Should be static > > > > def getPreprocessorFlagsName(self, language): > > > > - if language in ['C', 'Cxx', 'FC']: > > > > + if language == 'C': > > > > flagsArg = 'CPPFLAGS' > > > > + elif language == 'Cxx': > > > > + flagsArg = 'CXXCPPFLAGS' > > > > + elif language == 'FC': > > > > + flagsArg = '' > > > > elif language == 'CUDA': > > > > flagsArg = 'CUDAPPFLAGS' > > > > else: > > > > > > > > > > > > Satish > > > > > > > > On Sat, 19 Mar 2016, Denis Davydov wrote: > > > > > > > > > Hi Satish, > > > > > > > > > > I think you have a bug in you config system as > > > > "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when > > I try > > > > to explicitly specify "FPPFLAGS=" to nothing : > > > > > > > > > > TEST checkFortranCompiler from > > > > > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > > > TESTING: checkFortranCompiler from > > > > > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > > > Locate a functional Fortran compiler > > > > > Checking for program > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > > > > > Defined make macro "FC" to > > > > > > "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > > > > > Pushing language FC > > > > > Executing: > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 > > > > -c -o > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o > > > > > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers > > > > -Qunused-arguments > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > > > > > Possible ERROR while running compiler: exit code 256 > > > > > stderr: > > > > > gfortran: error: unrecognized command line option > > '-Qunused-arguments? > > > > > > > > > > Kind regards, > > > > > Denis > > > > > > > > > > > On 18 Mar 2016, at 19:13, Satish Balay wrote: > > > > > > > > > > > > Or you can workarround by using additional PETSc configure option: > > > > > > > > > > > > CPPFLAGS=-Qunused-arguments > > > > > > [or CFLAGS?] > > > > > > > > > > > > Satish > > > > > > > > > > > > > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > > > > > > > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc > > > > configure. > > > > > >> > > > > > >> I don't see this issue with the old openmpi I have access to.. 
> > > > > >> > > > > > >> What do you have for: > > > > > >> > > > > > >> > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > > -show > > > > > >> touch foo.c > > > > > >> > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > > -c foo.c -show > > > > > >> > > > > > >> I'm not sure what to suggest.. Perhaps you can use > > --download-mpich > > > > > >> [or --download-openmpi] - but then I see you are building all > > these > > > > > >> packages using spack - so perhaps you should check with 'spack' > > folk.. > > > > > >> > > > > > >> Satish > > > > > >> > > > > > >> --------- > > > > > >> > > > > > >> Executing: > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > > -c -o > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > > > > > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > > > > >> Possible ERROR while running compiler: > > > > > >> stderr: > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > 
'-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > > > > >> clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > > > > >> > > > > > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > > > > > >> > > > > > >>> Dear all, > > > > > >>> > > > > > >>> Although I saw this issue on the mailing list, that was related > > to > > > > broken compilers, i don?t think it?s the case here as `mpicc -E` seems > > to > > > > be ok. > > > > > >>> Log is attached. > > > > > >>> > > > > > >>> Kind regards, > > > > > >>> Denis > > > > > >>> > > > > > >>> > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From balay at mcs.anl.gov Sat Mar 19 12:09:20 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 19 Mar 2016 12:09:20 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: Message-ID: On Sat, 19 Mar 2016, Satish Balay wrote: > On Sat, 19 Mar 2016, Matthew Knepley wrote: > > > On Sat, Mar 19, 2016 at 11:49 AM, Satish Balay wrote: > > > > > On Sat, 19 Mar 2016, Matthew Knepley wrote: > > > > > > > On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay > > > wrote: > > > > > > > > > This code is a bit convoluted.. > > > > > > > > > > There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or > > > FPPFLAGS.. > > > > > > > > > > > > > Is this really true? Is the Fortran preprocessors not the C > > > preprocessor? I > > > > have never encountered this. > > > > If you want flags which are different for C, use CFLAGS. Am I missing > > > > something? > > > > > > Even if we assume CPPFLAGS are -D only - it wont' work with all FC > > > compilers. For ex: xlf does not recognize -D option - its -WF,-D. > > > So we cannont assume FC always uses CPP [or CPPFLAGS] > > > > > > And some flags have to be used at compile time only - and some can be > > > used at both compile & linktime - and some at linktime only. > > > > > > So we have CPPFLAGS, CFLAGS, LDFLAGS [not compiler specific?] > > > > > > In this case -Qunused-arguments can go into CFLAGS - but presumably > > > there are other compiler flags that cannot be used at link time - so > > > have to use with CPPFLAGS only? > > > > > > And this whole thread started with clang barfing on using a link flag > > > during compile time.. 
> > > > > > Executing: > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > -c -o > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > > > > > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > > Possible ERROR while running compiler: > > > stderr: > > > clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > > clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > > clang: warning: argument unused during compilation: > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > > > > > > > So having proper support for CFLAGS, CPPFLAGS, FLAGS, FPPFLAGS might be > > > the correct thing to do.. > > > > > > Won't that be incredibly confusing for the 99% of people who use Fortran > > that uses the C preprocessor? > > I don't see how. We use CPPFLAGS, FPPFLAGS separatey in our makefiles [presumably petsc users > using petscmakefile format are already using this notation..] > > I think you are thinking more in terms of CPP vs FPP [CPPFLAGS vs FPPFLAGS]. I'm thinking more > in terms of 'cc -c [CPPFLAGS]' and 'fc -c [FPPFLAGS'] - which are how the compile targets > use these flags.. gnumake appears to use CPPFLAGS with FC :( balay at asterix /home/balay/junk $ touch makefile balay at asterix /home/balay/junk $ make -p |grep PREPROCESS make: *** No targets. Stop. PREPROCESS.F = $(FC) $(FFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -F PREPROCESS.r = $(FC) $(FFLAGS) $(RFLAGS) $(TARGET_ARCH) -F PREPROCESS.S = $(CC) -E $(CPPFLAGS) $(PREPROCESS.F) $(OUTPUT_OPTION) $< $(PREPROCESS.r) $(OUTPUT_OPTION) $< $(PREPROCESS.S) $< > $@ $(PREPROCESS.r) $(OUTPUT_OPTION) $< $(PREPROCESS.S) $< > $@ $(PREPROCESS.F) $(OUTPUT_OPTION) $< Satish > > Satish > > > > > Matt > > > > > > > > > > Satish > > > > > > > > > > > Matt > > > > > > > > > > > > > And then base.py has: [which is wrong for 'Cxx', 'FC'?] > > > > > > > > > > def getPreprocessorFlagsName(self, language): > > > > > if language in ['C', 'Cxx', 'FC']: > > > > > flagsArg = 'CPPFLAGS' > > > > > elif language == 'CUDA': > > > > > flagsArg = 'CUDAPPFLAGS' > > > > > else: > > > > > raise RuntimeError('Unknown language: '+language) > > > > > return flagsArg > > > > > > > > > > And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]: > > > > > > > > > > class Preprocessor(config.compile.C.Preprocessor): > > > > > '''The Fortran preprocessor, which now is just the C preprocessor''' > > > > > def __init__(self, argDB): > > > > > config.compile.C.Preprocessor.__init__(self, argDB) > > > > > > > > > > Matt - should we have FPP,FPPFLAGS supported here? > > > > > > > > > > Perhaps using CPPFLAGS with FC is a bug? 
So we should atleast do: > > > > > > > > > > diff --git a/config/BuildSystem/config/compile/FC.py > > > > > b/config/BuildSystem/config/compile/FC.py > > > > > index 3d0bf74..7bae24d 100644 > > > > > --- a/config/BuildSystem/config/compile/FC.py > > > > > +++ b/config/BuildSystem/config/compile/FC.py > > > > > @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor): > > > > > config.compile.C.Preprocessor.__init__(self, argDB) > > > > > self.language = 'FC' > > > > > self.targetExtension = '.F' > > > > > + self.flagsName = '' > > > > > self.includeDirectories = sets.Set() > > > > > return > > > > > > > > > > And then [I'm not sure where this gets used..] > > > > > > > > > > $ git diff config/BuildSystem/config/base.py |cat > > > > > diff --git a/config/BuildSystem/config/base.py > > > > > b/config/BuildSystem/config/base.py > > > > > index b18a173..8b1129d 100644 > > > > > --- a/config/BuildSystem/config/base.py > > > > > +++ b/config/BuildSystem/config/base.py > > > > > @@ -454,8 +454,12 @@ class Configure(script.Script): > > > > > > > > > > # Should be static > > > > > def getPreprocessorFlagsName(self, language): > > > > > - if language in ['C', 'Cxx', 'FC']: > > > > > + if language == 'C': > > > > > flagsArg = 'CPPFLAGS' > > > > > + elif language == 'Cxx': > > > > > + flagsArg = 'CXXCPPFLAGS' > > > > > + elif language == 'FC': > > > > > + flagsArg = '' > > > > > elif language == 'CUDA': > > > > > flagsArg = 'CUDAPPFLAGS' > > > > > else: > > > > > > > > > > > > > > > Satish > > > > > > > > > > On Sat, 19 Mar 2016, Denis Davydov wrote: > > > > > > > > > > > Hi Satish, > > > > > > > > > > > > I think you have a bug in you config system as > > > > > "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when > > > I try > > > > > to explicitly specify "FPPFLAGS=" to nothing : > > > > > > > > > > > > TEST checkFortranCompiler from > > > > > > > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > > > > TESTING: checkFortranCompiler from > > > > > > > > config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > > > > Locate a functional Fortran compiler > > > > > > Checking for program > > > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > > > > > > Defined make macro "FC" to > > > > > > > > "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > > > > > > Pushing language FC > > > > > > Executing: > > > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 > > > > > -c -o > > > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o > > > > > > > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers > > > > > -Qunused-arguments > > > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > > > > > > Possible ERROR while running compiler: exit code 256 > > > > > > stderr: > > > > > > gfortran: error: unrecognized command line option > > > '-Qunused-arguments? 
> > > > > > > > > > > > Kind regards, > > > > > > Denis > > > > > > > > > > > > > On 18 Mar 2016, at 19:13, Satish Balay wrote: > > > > > > > > > > > > > > Or you can workarround by using additional PETSc configure option: > > > > > > > > > > > > > > CPPFLAGS=-Qunused-arguments > > > > > > > [or CFLAGS?] > > > > > > > > > > > > > > Satish > > > > > > > > > > > > > > > > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > > > > > > > > > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc > > > > > configure. > > > > > > >> > > > > > > >> I don't see this issue with the old openmpi I have access to.. > > > > > > >> > > > > > > >> What do you have for: > > > > > > >> > > > > > > >> > > > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > > > -show > > > > > > >> touch foo.c > > > > > > >> > > > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > > > -c foo.c -show > > > > > > >> > > > > > > >> I'm not sure what to suggest.. Perhaps you can use > > > --download-mpich > > > > > > >> [or --download-openmpi] - but then I see you are building all > > > these > > > > > > >> packages using spack - so perhaps you should check with 'spack' > > > folk.. > > > > > > >> > > > > > > >> Satish > > > > > > >> > > > > > > >> --------- > > > > > > >> > > > > > > >> Executing: > > > > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc > > > > > -c -o > > > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o > > > > > > > > -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers > > > > > > > > /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > > > > > >> Possible ERROR while running compiler: > > > > > > >> stderr: > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > 
'-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > > > > > >> clang: warning: argument unused during compilation: > > > > > > > > '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > > > > > >> > > > > > > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > > > > > > >> > > > > > > >>> Dear all, > > > > > > >>> > > > > > > >>> Although I saw this issue on the mailing list, that was related > > > to > > > > > broken compilers, i don?t think it?s the case here as `mpicc -E` seems > > > to > > > > > be ok. > > > > > > >>> Log is attached. > > > > > > >>> > > > > > > >>> Kind regards, > > > > > > >>> Denis > > > > > > >>> > > > > > > >>> > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From balay at mcs.anl.gov Sat Mar 19 12:15:13 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 19 Mar 2016 12:15:13 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: <7986BE1A-3EC6-440D-8746-8A59ED13FE51@gmail.com> References: <7986BE1A-3EC6-440D-8746-8A59ED13FE51@gmail.com> Message-ID: On Sat, 19 Mar 2016, Denis Davydov wrote: > > > On 19 Mar 2016, at 17:38, Matthew Knepley wrote: > > > > On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay > wrote: > > This code is a bit convoluted.. > > > > There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or FPPFLAGS.. > > > > Is this really true? Is the Fortran preprocessors not the C preprocessor? I have never encountered this. > > If you want flags which are different for C, use CFLAGS. Am I missing something? > > AFAIK CFLAGS are not use in PETSc config system to check working C preprocessor and settings > CFLAGS=-Qunused-arguments > still leads to Cannot find a C preprocessor . > which is the original issue. > oh well. you could use --with-cc='mpicc -Qunused-arguments' Satish > Regards, > Denis. > > > > Matt > > > > And then base.py has: [which is wrong for 'Cxx', 'FC'?] > > > > def getPreprocessorFlagsName(self, language): > > if language in ['C', 'Cxx', 'FC']: > > flagsArg = 'CPPFLAGS' > > elif language == 'CUDA': > > flagsArg = 'CUDAPPFLAGS' > > else: > > raise RuntimeError('Unknown language: '+language) > > return flagsArg > > > > And config/compile/FC.py has [which reuses CPP,CPPFLAGS for FC]: > > > > class Preprocessor(config.compile.C.Preprocessor): > > '''The Fortran preprocessor, which now is just the C preprocessor''' > > def __init__(self, argDB): > > config.compile.C.Preprocessor.__init__(self, argDB) > > > > Matt - should we have FPP,FPPFLAGS supported here? > > > > Perhaps using CPPFLAGS with FC is a bug? 
So we should atleast do: > > > > diff --git a/config/BuildSystem/config/compile/FC.py b/config/BuildSystem/config/compile/FC.py > > index 3d0bf74..7bae24d 100644 > > --- a/config/BuildSystem/config/compile/FC.py > > +++ b/config/BuildSystem/config/compile/FC.py > > @@ -13,6 +13,7 @@ class Preprocessor(config.compile.C.Preprocessor): > > config.compile.C.Preprocessor.__init__(self, argDB) > > self.language = 'FC' > > self.targetExtension = '.F' > > + self.flagsName = '' > > self.includeDirectories = sets.Set() > > return > > > > And then [I'm not sure where this gets used..] > > > > $ git diff config/BuildSystem/config/base.py |cat > > diff --git a/config/BuildSystem/config/base.py b/config/BuildSystem/config/base.py > > index b18a173..8b1129d 100644 > > --- a/config/BuildSystem/config/base.py > > +++ b/config/BuildSystem/config/base.py > > @@ -454,8 +454,12 @@ class Configure(script.Script): > > > > # Should be static > > def getPreprocessorFlagsName(self, language): > > - if language in ['C', 'Cxx', 'FC']: > > + if language == 'C': > > flagsArg = 'CPPFLAGS' > > + elif language == 'Cxx': > > + flagsArg = 'CXXCPPFLAGS' > > + elif language == 'FC': > > + flagsArg = '' > > elif language == 'CUDA': > > flagsArg = 'CUDAPPFLAGS' > > else: > > > > > > Satish > > > > On Sat, 19 Mar 2016, Denis Davydov wrote: > > > > > Hi Satish, > > > > > > I think you have a bug in you config system as "CPPFLAGS=-Qunused-arguments" propagate to fortran compiler even when I try to explicitly specify "FPPFLAGS=" to nothing : > > > > > > TEST checkFortranCompiler from config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > TESTING: checkFortranCompiler from config.setCompilers(/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jNBTET/petsc-3.6.3/config/BuildSystem/config/setCompilers.py:919) > > > Locate a functional Fortran compiler > > > Checking for program /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90...found > > > Defined make macro "FC" to "/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90" > > > Pushing language FC > > > Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpif90 -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers -Qunused-arguments /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-zwyqnZ/config.setCompilers/conftest.F > > > Possible ERROR while running compiler: exit code 256 > > > stderr: > > > gfortran: error: unrecognized command line option '-Qunused-arguments? > > > > > > Kind regards, > > > Denis > > > > > > > On 18 Mar 2016, at 19:13, Satish Balay > wrote: > > > > > > > > Or you can workarround by using additional PETSc configure option: > > > > > > > > CPPFLAGS=-Qunused-arguments > > > > [or CFLAGS?] > > > > > > > > Satish > > > > > > > > > > > > On Fri, 18 Mar 2016, Satish Balay wrote: > > > > > > > >> your mpicc is barfing stuff on stderr - thus confusing petsc configure. > > > >> > > > >> I don't see this issue with the old openmpi I have access to.. 
> > > >> > > > >> What do you have for: > > > >> > > > >> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show > > > >> touch foo.c > > > >> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show > > > >> > > > >> I'm not sure what to suggest.. Perhaps you can use --download-mpich > > > >> [or --download-openmpi] - but then I see you are building all these > > > >> packages using spack - so perhaps you should check with 'spack' folk.. > > > >> > > > >> Satish > > > >> > > > >> --------- > > > >> > > > >> Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > > >> Possible ERROR while running compiler: > > > >> stderr: > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > > >> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > > >> > > > >> On Fri, 18 Mar 2016, Denis Davydov wrote: > > > >> > > > 
>>> Dear all, > > > >>> > > > >>> Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. > > > >>> Log is attached. > > > >>> > > > >>> Kind regards, > > > >>> Denis > > > >>> > > > >>> > > > >> > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > From balay at mcs.anl.gov Sat Mar 19 12:20:44 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 19 Mar 2016 12:20:44 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: <101FF0CB-7736-4164-B333-DD3DDE801336@gmail.com> References: <101FF0CB-7736-4164-B333-DD3DDE801336@gmail.com> Message-ID: If '-show' is not listing these -L options - where is mpicc picking them up from? Satish On Fri, 18 Mar 2016, Denis Davydov wrote: > Hi Satish, > > > On 18 Mar 2016, at 19:01, Satish Balay wrote: > > > > your mpicc is barfing stuff on stderr - thus confusing petsc configure. > > yes, i noticed that. I don?t have that issue when using Homebrew?s build OpenMPI, and thus no confusion to petsc config. > Also those things are just warnings, why are they reported to stderr is not clear? > I will try suppressing it with -Qunused-arguments ... > > > > > I don't see this issue with the old openmpi I have access to.. > > > > What do you have for: > > Looks reasonable to me: > > > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show > > touch foo.c > /usr/bin/clang touch foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include -L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib -lmpi > > > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show > > /usr/bin/clang -c foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include > > > > > I'm not sure what to suggest.. Perhaps you can use --download-mpich > > [or --download-openmpi] - but then I see you are building all these > > packages using spack - so perhaps you should check with 'spack' folk.. > > Will certainly do that. I am just toying with spack to see how good is the support for OS-X. > > Regards, > Denis. 
> > > > > Satish > > > > --------- > > > > Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > > Possible ERROR while running compiler: > > stderr: > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > > clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > > > > On Fri, 18 Mar 2016, Denis Davydov wrote: > > > >> Dear all, > >> > >> Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. > >> Log is attached. 
> >> > >> Kind regards, > >> Denis > >> > >> > > From davydden at gmail.com Sat Mar 19 13:30:43 2016 From: davydden at gmail.com (Denis Davydov) Date: Sat, 19 Mar 2016 19:30:43 +0100 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: <7986BE1A-3EC6-440D-8746-8A59ED13FE51@gmail.com> Message-ID: > On 19 Mar 2016, at 18:15, Satish Balay wrote: > > On Sat, 19 Mar 2016, Denis Davydov wrote: > >> >>> On 19 Mar 2016, at 17:38, Matthew Knepley wrote: >>> >>> On Sat, Mar 19, 2016 at 11:29 AM, Satish Balay > wrote: >>> This code is a bit convoluted.. >>> >>> There is CPP, CPPFLAGS, CXXCPP, CXXCPPFLAGS. But then no FPP or FPPFLAGS.. >>> >>> Is this really true? Is the Fortran preprocessors not the C preprocessor? I have never encountered this. >>> If you want flags which are different for C, use CFLAGS. Am I missing something? >> >> AFAIK CFLAGS are not use in PETSc config system to check working C preprocessor and settings >> CFLAGS=-Qunused-arguments >> still leads to Cannot find a C preprocessor . >> which is the original issue. >> > > oh well. you could use --with-cc='mpicc -Qunused-arguments? Ended up doing this, both for cc and cxx. That did the trick. Regards, Denis. From davydden at gmail.com Sat Mar 19 13:32:05 2016 From: davydden at gmail.com (Denis Davydov) Date: Sat, 19 Mar 2016 19:32:05 +0100 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: <101FF0CB-7736-4164-B333-DD3DDE801336@gmail.com> Message-ID: > On 19 Mar 2016, at 18:20, Satish Balay wrote: > > If '-show' is not listing these -L options - where is mpicc picking them up from? that?s how Spack does things somewhere behind the curtains. Regards, Denis. > > Satish > > On Fri, 18 Mar 2016, Denis Davydov wrote: > >> Hi Satish, >> >>> On 18 Mar 2016, at 19:01, Satish Balay wrote: >>> >>> your mpicc is barfing stuff on stderr - thus confusing petsc configure. >> >> yes, i noticed that. I don?t have that issue when using Homebrew?s build OpenMPI, and thus no confusion to petsc config. >> Also those things are just warnings, why are they reported to stderr is not clear? >> I will try suppressing it with -Qunused-arguments ... >> >>> >>> I don't see this issue with the old openmpi I have access to.. >>> >>> What do you have for: >> >> Looks reasonable to me: >> >>> >>> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show >>> touch foo.c >> /usr/bin/clang touch foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include -L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib -lmpi >> >>> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show >> >> /usr/bin/clang -c foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include >> >>> >>> I'm not sure what to suggest.. Perhaps you can use --download-mpich >>> [or --download-openmpi] - but then I see you are building all these >>> packages using spack - so perhaps you should check with 'spack' folk.. >> >> Will certainly do that. I am just toying with spack to see how good is the support for OS-X. >> >> Regards, >> Denis. 
>> >>> >>> Satish >>> >>> --------- >>> >>> Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c >>> Possible ERROR while running compiler: >>> stderr: >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' >>> >>> On Fri, 18 Mar 2016, Denis Davydov wrote: >>> >>>> Dear all, >>>> >>>> Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. >>>> Log is attached. >>>> >>>> Kind regards, >>>> Denis >>>> >>>> >> >> From balay at mcs.anl.gov Sat Mar 19 14:00:47 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 19 Mar 2016 14:00:47 -0500 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: <101FF0CB-7736-4164-B333-DD3DDE801336@gmail.com> Message-ID: But I would think 'mpicc -show' should show this. [if spack is setting up these options via MPI wrappers] Sometimes compilers pick up extra config from env variables - or from default config locations. 
I don't see anything in env thats significant.. > CC=/Users/davydden/spack/lib/spack/env/clang/clang Ok - spack is building its own clang. > /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show > /usr/bin/clang -c foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include But mpicc is using apple's clang. [so spack can't be modifying apple's clang's config file..] What do you get for: mpicc -c -v foo.c [for any simple source file - say src/benchmarks/sizeof.c] Satish On Sat, 19 Mar 2016, Denis Davydov wrote: > > > > On 19 Mar 2016, at 18:20, Satish Balay wrote: > > > > If '-show' is not listing these -L options - where is mpicc picking them up from? > > that?s how Spack does things somewhere behind the curtains. > > Regards, > Denis. > > > > > Satish > > > > On Fri, 18 Mar 2016, Denis Davydov wrote: > > > >> Hi Satish, > >> > >>> On 18 Mar 2016, at 19:01, Satish Balay wrote: > >>> > >>> your mpicc is barfing stuff on stderr - thus confusing petsc configure. > >> > >> yes, i noticed that. I don?t have that issue when using Homebrew?s build OpenMPI, and thus no confusion to petsc config. > >> Also those things are just warnings, why are they reported to stderr is not clear? > >> I will try suppressing it with -Qunused-arguments ... > >> > >>> > >>> I don't see this issue with the old openmpi I have access to.. > >>> > >>> What do you have for: > >> > >> Looks reasonable to me: > >> > >>> > >>> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -show > >>> touch foo.c > >> /usr/bin/clang touch foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include -L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib -lmpi > >> > >>> /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c foo.c -show > >> > >> /usr/bin/clang -c foo.c -I/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/include > >> > >>> > >>> I'm not sure what to suggest.. Perhaps you can use --download-mpich > >>> [or --download-openmpi] - but then I see you are building all these > >>> packages using spack - so perhaps you should check with 'spack' folk.. > >> > >> Will certainly do that. I am just toying with spack to see how good is the support for OS-X. > >> > >> Regards, > >> Denis. 
> >> > >>> > >>> Satish > >>> > >>> --------- > >>> > >>> Executing: /Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/bin/mpicc -c -o /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.o -I/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers /var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/petsc-8RPaEA/config.setCompilers/conftest.c > >>> Possible ERROR while running compiler: > >>> stderr: > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/boost-1.60.0-eo7nn3v27nxa7lxqv5tttjzikshwt56i/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/bzip2-1.0.6-leelnsg3humpngfeofkrdpgtsofrm5ya/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openmpi-1.10.2-37xieeupgsteaq6btru6wmhxfi44xqtn/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hwloc-1.11.2-pxsmp4nhfdjc3jb7odj5lhppu7wqna5b/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/libpciaccess-0.13.4-erc6tr3ghndi5ed3gbj6wtvw6zl4vobf/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/zlib-1.2.8-cyvcqvrzlgurne424y55hxvfucvz2354/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hdf5-1.8.16-diujq2w7ew4qviquh67b3lkcqsxtf77y/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/hypre-2.10.1-4kql32qmjpp7ict2qczkyyv6a4mbrgbl/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openblas-0.2.16-qplsxnxlbaqnz2iltdo7fdhlayvuaxel/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/metis-5.1.0-i5y5b6r5ca4f77u5tketagpkn6joxasp/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/ncurses-6.0-sabhdwxbdbbapfo6wodglfmyo6u3c3hj/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/openssl-1.0.2g-answvmhu3lodpmgulgzryygkkimmsn34/lib' > >>> clang: warning: argument unused during compilation: '-L/Users/davydden/spack/opt/spack/darwin-x86_64/clang-7.0.2-apple/parmetis-4.0.3-auubrjcwhqmqnpoqjwgwgz4bcjnxzunx/lib' > >>> > >>> On Fri, 18 Mar 2016, Denis Davydov wrote: > >>> > >>>> Dear all, > >>>> > >>>> Although I saw this issue on the mailing list, that was related to broken compilers, i don?t think it?s the case here as `mpicc -E` seems to be ok. > >>>> Log is attached. 
> >>>> > >>>> Kind regards, > >>>> Denis > >>>> > >>>> > >> > >> > > From davydden at gmail.com Mon Mar 21 03:37:04 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 21 Mar 2016 09:37:04 +0100 Subject: [petsc-users] Cannot find a C preprocessor OSX In-Reply-To: References: <101FF0CB-7736-4164-B333-DD3DDE801336@gmail.com> Message-ID: <5BC88654-4005-46A3-91C0-3187C31F5F23@gmail.com> > On 19 Mar 2016, at 20:00, Satish Balay wrote: > > But I would think 'mpicc -show' should show this. [if spack is setting > up these options via MPI wrappers] they set up compiler wrappers (clang in this case), where they pass -L and -I flags for all libraries used to build a package http://software.llnl.gov/spack/packaging_guide.html#compiler-interceptors And then compiled OpenMPI is set to use those wrappers. That?s why there are those unused flags. I would try to add -Qunused-arguments to their wrappers in case of clang compiler so that clang does not complain about unused flags in any package. Regards, Denis. From Sander.Arens at UGent.be Mon Mar 21 03:49:28 2016 From: Sander.Arens at UGent.be (Sander Arens) Date: Mon, 21 Mar 2016 09:49:28 +0100 Subject: [petsc-users] PetscDSSetJacobianPreconditioner causing DIVERGED_LINE_SEARCH for multi-field problem In-Reply-To: References: Message-ID: Matt, I think the problem is in DMCreateSubDM_Section_Private. The block size can be counted there (very similar as in DMCreateMatrix_Plex) and then passed down to the IS. I can create a pull request for this if you'd like (or should I just add this to my other pull request for the Neumann bc's)? On 3 March 2016 at 18:02, Sander Arens wrote: > Ok, thanks for clearing that up! > > On 3 March 2016 at 18:00, Matthew Knepley wrote: > >> On Thu, Mar 3, 2016 at 10:48 AM, Sander Arens >> wrote: >> >>> So is the block size here refering to the dimension of the >>> near-nullspace or has it something to do with the coarsening? >>> >>> I would have thought to also see the block size in this part: >>> >> >> Dang, that is a problem. It should have the correct block size when FS >> pulls it out of the overall matrix. This should be >> passed down using the block size of the IS. Something has been broken >> here. I will put it on my list of things to look at. >> >> Thanks, >> >> Matt >> >> >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=1701, cols=1701 >>> total: nonzeros=109359, allocated nonzeros=109359 >>> total number of mallocs used during MatSetValues calls >>> =0 >>> has attached near null space >>> using I-node routines: found 567 nodes, limit used >>> is 5 >>> >>> Thanks, >>> Sander >>> >>> >>> On 3 March 2016 at 17:29, Matthew Knepley wrote: >>> >>>> You can see the block size in the output >>>> >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=12, cols=12, bs=6 >>>> total: nonzeros=144, allocated nonzeros=144 >>>> total number of mallocs used during MatSetValues >>>> calls =0 >>>> using I-node routines: found 3 nodes, limit used >>>> is 5 >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=12, cols=12, bs=6 >>>> total: nonzeros=144, allocated nonzeros=144 >>>> total number of mallocs used during MatSetValues calls >>>> =0 >>>> using I-node routines: found 3 nodes, limit used is 5 >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On Thu, Mar 3, 2016 at 10:28 AM, Sander Arens >>>> wrote: >>>> >>>>> Sure, here it is. 
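As an aside on the GAMG setup being discussed here, a minimal sketch (assumed variable names dm and P; a sketch only, not the code used in this thread) of attaching rigid body modes as the near-nullspace of the preconditioning matrix, which is what the "has attached near null space" line in the quoted -snes_view output reflects:

  #include <petscdmplex.h>

  MatNullSpace rbm;
  DMPlexCreateRigidBody(dm, &rbm);   /* rigid body modes of the mesh, e.g. 6 of them in 3D */
  MatSetNearNullSpace(P, rbm);       /* P: the matrix GAMG builds its hierarchy from */
  MatNullSpaceDestroy(&rbm);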
>>>>> >>>>> Thanks, >>>>> Sander >>>>> >>>>> On 3 March 2016 at 15:33, Matthew Knepley wrote: >>>>> >>>>>> On Thu, Mar 3, 2016 at 8:05 AM, Sander Arens >>>>>> wrote: >>>>>> >>>>>>> Yes, the matrix is created by one of the assembly functions in Plex, >>>>>>> so that answers my question. I was confused by this because I couldn't see >>>>>>> this information when using -snes_view. >>>>>>> >>>>>> >>>>>> Hmm, then something is wrong because the block size should be printed >>>>>> for the matrix at the end of the solver in -snes_view, Can you >>>>>> send that to me? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks for the helpful replies, >>>>>>> Sander >>>>>>> >>>>>>> On 3 March 2016 at 14:52, Matthew Knepley wrote: >>>>>>> >>>>>>>> On Thu, Mar 3, 2016 at 7:49 AM, Sander Arens >>>>>>> > wrote: >>>>>>>> >>>>>>>>> And how can I do this? Because when I look at all the options with >>>>>>>>> -help I can strangely enough only find -fieldsplit_pressure_mat_block_size >>>>>>>>> and not -fieldsplit_displacement_mat_block_size. >>>>>>>>> >>>>>>>> >>>>>>>> Maybe I am missing something here. The matrix from which you >>>>>>>> calculate the preconditioner using GAMG must be created somewhere. Why >>>>>>>> is the block size not specified at creation time? If the matrix is >>>>>>>> created by one of the assembly functions in Plex, and the PetscSection has >>>>>>>> a number of field components, then the block size will already be >>>>>>>> set. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Sander >>>>>>>>> >>>>>>>>> On 3 March 2016 at 14:21, Matthew Knepley >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> On Thu, Mar 3, 2016 at 6:20 AM, Sander Arens < >>>>>>>>>> Sander.Arens at ugent.be> wrote: >>>>>>>>>> >>>>>>>>>>> Ok, I forgot to call SNESSetJacobian(snes, J, P, NULL, NULL) >>>>>>>>>>> with J != P, which caused to write the mass matrix into the (otherwise >>>>>>>>>>> zero) (1,1) block of the Jacobian and which was the reason for the >>>>>>>>>>> linesearch to fail. >>>>>>>>>>> However, after fixing that and trying to solve it with >>>>>>>>>>> FieldSplit with LU factorization for the (0,0) block it failed because >>>>>>>>>>> there were zero pivots for all rows. >>>>>>>>>>> >>>>>>>>>>> Anyway, I found out that attaching the mass matrix to the >>>>>>>>>>> Lagrange multiplier field also worked. >>>>>>>>>>> >>>>>>>>>>> Another related question for my elasticity problem: after >>>>>>>>>>> creating the rigid body modes with DMPlexCreateRigidBody and attaching it >>>>>>>>>>> to the displacement field, does the matrix block size of the (0,0) block >>>>>>>>>>> still have to be set for good performance with gamg? If so, how can I do >>>>>>>>>>> this? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yes, it should be enough to set the block size of the >>>>>>>>>> preconditioner matrix. >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Sander >>>>>>>>>>> >>>>>>>>>>> On 2 March 2016 at 12:25, Matthew Knepley >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> On Wed, Mar 2, 2016 at 5:13 AM, Sander Arens < >>>>>>>>>>>> Sander.Arens at ugent.be> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> I'm trying to set a mass matrix preconditioner for the Schur >>>>>>>>>>>>> complement of an incompressible finite elasticity problem. I tried using >>>>>>>>>>>>> the command PetscDSSetJacobianPreconditioner(prob, 1, 1, g0_pre_mass_pp, >>>>>>>>>>>>> NULL, NULL, NULL) (field 1 is the Lagrange multiplier field). 
>>>>>>>>>>>>> However, this causes a DIVERGED_LINE_SEARCH due to to Nan or >>>>>>>>>>>>> Inf in the function evaluation after Newton iteration 1. (Btw, I'm using >>>>>>>>>>>>> the next branch). >>>>>>>>>>>>> >>>>>>>>>>>>> Is this because I didn't use PetscDSSetJacobianPreconditioner >>>>>>>>>>>>> for the other blocks (which uses the Jacobian itself for preconditioning)? >>>>>>>>>>>>> If so, how can I tell Petsc to use the Jacobian for those blocks? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 1) I put that code in very recently, and do not even have >>>>>>>>>>>> sufficient test, so it may be buggy >>>>>>>>>>>> >>>>>>>>>>>> 2) If you are using FieldSplit, you can control which blocks >>>>>>>>>>>> come from A and which come from the preconditioner P >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetDiagUseAmat.html#PCFieldSplitSetDiagUseAmat >>>>>>>>>>>> >>>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetOffDiagUseAmat.html#PCFieldSplitSetOffDiagUseAmat >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> I guess when using PetscDSSetJacobianPreconditioner the >>>>>>>>>>>>> preconditioner is recomputed at every Newton step, so for a constant mass >>>>>>>>>>>>> matrix this might not be ideal. How can I avoid recomputing this at every >>>>>>>>>>>>> Newton iteration? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Maybe we need another flag like >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html >>>>>>>>>>>> >>>>>>>>>>>> or we need to expand >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html >>>>>>>>>>>> >>>>>>>>>>>> to separately cover the preconditioner matrix. However, both >>>>>>>>>>>> matrices are computed by one call so this would >>>>>>>>>>>> involve interface changes to user code, which we do not like to >>>>>>>>>>>> do. Right now it seems like a small optimization. >>>>>>>>>>>> I would want to wait and see whether it would really be >>>>>>>>>>>> maningful. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Sander >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. >>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. 
>>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From s1humahd at stmail.uni-bayreuth.de Mon Mar 21 09:51:34 2016 From: s1humahd at stmail.uni-bayreuth.de (s1humahd) Date: Mon, 21 Mar 2016 07:51:34 -0700 Subject: [petsc-users] Error when Using KSP with Matrix -free Message-ID: Hello All, I'm trying to run a very simple program to get familiar with solving linear system using matrix free structure before I use it in my implementation. I already created a sparse matrix A and a vector b and set values to them. then I created a free-matrix matrix using MatCreateShell() and I set the function which provide the operation which currently it is merely multiply matrix A to the vector b, ( it is just an attemp to get familiar,similar to the example ex14f.F in KSP ). However I'm getting the flowing error when I call the KSPSolve() routine. I will be grateful if you could help me to recognize the reason for this error. also the related parts of my code is here. MatCreate(PETSC_COMM_WORLD,&A); MatSetSizes(A,nlocal,nlocal,8,8); . . . VecCreate(PETSC_COMM_WORLD,&b); VecSetSizes(x,PETSC_DECIDE,8); . . MatCreateShell(PETSC_COMM_WORLD,nlocal,nlocal,8,8, (void *)&A,&j_free); MatShellSetOperation(j_free,MATOP_MULT, (void(*) (void)) (my_mult)(j_free,b,x) ); . . . KSPCreate(PETSC_COMM_WORLD,&ksp); KSPSetOperators(ksp,j_free,A); KSPSetFromOptions(ksp); KSPSolve(ksp,x,sol); . . . PetscErrorCode my_mult(Mat j_free, Vec b, Vec x) { void *ptr; Mat *ptr2; MatShellGetContext(j_free, &ptr); // the context is matrix A ptr2 = (Mat*) ptr; MatMult(*ptr2,b,x); return 0; } The error as follow: [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: This matrix type does not have a multiply defined [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.6.3, unknown [0]PETSC ERROR: ./jac_free on a arch-linux2-c-debug named humam-VirtualBox by humam Mon Mar 21 23:37:14 2016 [0]PETSC ERROR: Configure options --download-hdf5=1 --with-blas-lapack-dir=/usr/lib --with-mpi-dir=/usr [0]PETSC ERROR: #1 MatMult() line 2223 in /home/humam/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #2 PCApplyBAorAB() line 727 in /home/humam/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #3 KSP_PCApplyBAorAB() line 272 in /home/humam/petsc/include/petsc/private/kspimpl.h [0]PETSC ERROR: #4 KSPGMRESCycle() line 155 in /home/humam/petsc/src/ksp/ksp/impls/gmres/gmres.c [0]PETSC ERROR: #5 KSPSolve_GMRES() line 236 in /home/humam/petsc/src/ksp/ksp/impls/gmres/gmres.c [0]PETSC ERROR: #6 KSPSolve() line 604 in /home/humam/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #7 main() line 293 in /home/humam/jac_free.c [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0 Thanks, Humam From bsmith at mcs.anl.gov Mon Mar 21 12:58:51 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 21 Mar 2016 12:58:51 -0500 Subject: [petsc-users] Error when Using KSP with Matrix -free In-Reply-To: References: Message-ID: > On Mar 21, 2016, at 9:51 AM, s1humahd wrote: > > Hello All, > > I'm trying to run a very simple program to get familiar with solving linear system using matrix free structure before I use it in my implementation. > > I already created a sparse matrix A and a vector b and set values to them. then I created a free-matrix matrix using MatCreateShell() and I set the function which provide the operation which currently it is merely multiply matrix A to the vector b, ( it is just an attemp to get familiar,similar to the example ex14f.F in KSP ). > However I'm getting the flowing error when I call the KSPSolve() routine. I will be grateful if you could help me to recognize the reason for this error. also the related parts of my code is here. > > MatCreate(PETSC_COMM_WORLD,&A); > MatSetSizes(A,nlocal,nlocal,8,8); > > . > . > . > > VecCreate(PETSC_COMM_WORLD,&b); > VecSetSizes(x,PETSC_DECIDE,8); > > . > . > MatCreateShell(PETSC_COMM_WORLD,nlocal,nlocal,8,8, (void *)&A,&j_free); > MatShellSetOperation(j_free,MATOP_MULT, (void(*) (void)) (my_mult)(j_free,b,x) ); (void(*) (void)) (my_mult)(j_free,b,x) should be just (void(*) (void)) my_mult as written you are actually calling the function my_mult here which returns a 0 (since it ran correctly) and so it is setting a 0 as the operation for MATOP_MULT > > . > . > . > KSPCreate(PETSC_COMM_WORLD,&ksp); > KSPSetOperators(ksp,j_free,A); > KSPSetFromOptions(ksp); > KSPSolve(ksp,x,sol); > . > . > . > > PetscErrorCode my_mult(Mat j_free, Vec b, Vec x) > { > void *ptr; > Mat *ptr2; > > MatShellGetContext(j_free, &ptr); // the context is matrix A > ptr2 = (Mat*) ptr; > MatMult(*ptr2,b,x); > > return 0; > > } > > > > The error as follow: > > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: This matrix type does not have a multiply defined > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
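For reference, a minimal sketch of the corrected registration described in the reply above: pass the bare function pointer to MatShellSetOperation(), do not call the function. The names A, j_free, nlocal and my_mult are taken from the quoted snippet; this is a sketch, not a complete program:

  #include <petscmat.h>

  /* multiply routine for the shell matrix; the context is the assembled matrix A */
  PetscErrorCode my_mult(Mat j_free, Vec b, Vec x)
  {
    void *ptr;
    MatShellGetContext(j_free, &ptr);
    MatMult(*(Mat *)ptr, b, x);      /* x = A*b */
    return 0;
  }

  /* ... after A, b and nlocal are set up ... */
  MatCreateShell(PETSC_COMM_WORLD, nlocal, nlocal, 8, 8, (void *)&A, &j_free);
  MatShellSetOperation(j_free, MATOP_MULT, (void (*)(void)) my_mult);  /* pointer only, no call */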
> [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [0]PETSC ERROR: ./jac_free on a arch-linux2-c-debug named humam-VirtualBox by humam Mon Mar 21 23:37:14 2016 > [0]PETSC ERROR: Configure options --download-hdf5=1 --with-blas-lapack-dir=/usr/lib --with-mpi-dir=/usr > [0]PETSC ERROR: #1 MatMult() line 2223 in /home/humam/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: #2 PCApplyBAorAB() line 727 in /home/humam/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #3 KSP_PCApplyBAorAB() line 272 in /home/humam/petsc/include/petsc/private/kspimpl.h > [0]PETSC ERROR: #4 KSPGMRESCycle() line 155 in /home/humam/petsc/src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: #5 KSPSolve_GMRES() line 236 in /home/humam/petsc/src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: #6 KSPSolve() line 604 in /home/humam/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #7 main() line 293 in /home/humam/jac_free.c > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0 > > > Thanks, > Humam From rongliang.chan at gmail.com Mon Mar 21 22:10:21 2016 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Tue, 22 Mar 2016 11:10:21 +0800 Subject: [petsc-users] question about the PetscFVLeastSquaresPseudoInverseSVD Message-ID: <56F0B79D.8030408@gmail.com> Hello All, Can anyone help me to understand the output of PetscFVLeastSquaresPseudoInverseSVD? I tested six matrices and the results confused me. The results are given below and the source codes are attached. For the cases m>n, the results make sense to me, and for the m<=n cases, I do not understand the results. For example, for the m=n=3 cases, I think I should get an identity matrix from this function, but I get two different outputs for the same input matrix.
Best regards, Rongliang ----------------------------------- Initialize the matrix A (is a 3 x 4 matrix): 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 ----------------------------------- Initialize the matrix A (is a 3 x 4 matrix): 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 ----------------------------------- Initialize the matrix A (is a 3 x 3 matrix): 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 0.500000 0.000000 -0.500000 0.500000 0.000000 -0.500000 0.000000 0.000000 1.000000 ----------------------------------- Initialize the matrix A (is a 3 x 3 matrix): 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 0.500000 -0.707107 0.000000 0.500000 -0.707107 0.000000 -0.000000 1.414214 0.000000 ----------------------------------- Initialize the matrix A (is a 3 x 2 matrix): 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 -0.838728 0.000000 0.000000 -0.000000 1.305169 ----------------------------------- Initialize the matrix A (is a 3 x 2 matrix): 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 -0.293611 0.000000 0.000000 -0.000000 1.000000 -------------- next part -------------- #include #include #include #include #include #include static char help[] = "Test the SVD function.\n"; #undef __FUNCT__ #define __FUNCT__ "PetscFVLeastSquaresPseudoInverseSVD" /* Overwrites A. Can handle degenerate problems and m Hi, Is it correct that a norm of a preconditioned RHS vector is used to compute a relative error in BCGS? I'm testing BCGS + BoomerAMG. With "-info", PETSc says "initial right hand side norm" is 2.223619476717e+10 (see below) but an actual norm of the RHS I passed is 4.059007e-02. If yes, is there any way to get a norm of a preconditioned RHS? [0] KSPConvergedDefault(): Linear solver has converged. 
Residual norm 2.036064453512e-02 is less than relative tolerance 9.999999960042e-13 times initial right hand side norm 2.223619476717e+10 at iteration 6 Regards, Nori -- Norihiro Watanabe From jychang48 at gmail.com Tue Mar 22 03:28:35 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 22 Mar 2016 03:28:35 -0500 Subject: [petsc-users] Status on parallel mesh reader in DMPlex In-Reply-To: References: Message-ID: Hi Matt, Is this still on track to come out in the summer? Or is it going to be here sooner than that? Thanks, Justin On Fri, Dec 18, 2015 at 6:05 PM, Matthew Knepley wrote: > I am doing it with Michael Lange. We plan to have it done by the summer. > There is not much left to do. > > Thanks, > > Matt > > On Fri, Dec 18, 2015 at 8:21 AM, Justin Chang wrote: > >> Hi all, >> >> What's the status on the implementation of the parallel >> mesh reader/generator for DMPlex meshes? Is anyone actively working on >> this? If so is there a branch that I can peek into? >> >> Thanks, >> Justin >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 22 07:51:27 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 22 Mar 2016 07:51:27 -0500 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe wrote: > Hi, > > Is it correct that a norm of a preconditioned RHS vector is used to > compute a relative error in BCGS? > Yes, but you can verify this using -ksp_view > I'm testing BCGS + BoomerAMG. With "-info", PETSc says "initial right > hand side norm" is 2.223619476717e+10 (see below) but an actual norm > of the RHS I passed is 4.059007e-02. If yes, is there any way to get a > norm of a preconditioned RHS? > Do you mean unpreconditioned? You can try http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html or use -ksp_monitor_true_residual Thanks, Matt > [0] KSPConvergedDefault(): Linear solver has converged. Residual norm > 2.036064453512e-02 is less than relative tolerance 9.999999960042e-13 > times initial right hand side norm 2.223619476717e+10 at iteration 6 > > > Regards, > Nori > > -- > Norihiro Watanabe > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 22 07:57:33 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 22 Mar 2016 07:57:33 -0500 Subject: [petsc-users] Status on parallel mesh reader in DMPlex In-Reply-To: References: Message-ID: On Tue, Mar 22, 2016 at 3:28 AM, Justin Chang wrote: > Hi Matt, > > Is this still on track to come out in the summer? Or is it going to be > here sooner than that? > I don't think we will have it sooner than that. Thanks, Matt > Thanks, > Justin > > On Fri, Dec 18, 2015 at 6:05 PM, Matthew Knepley > wrote: > >> I am doing it with Michael Lange. We plan to have it done by the summer. >> There is not much left to do. >> >> Thanks, >> >> Matt >> >> On Fri, Dec 18, 2015 at 8:21 AM, Justin Chang >> wrote: >> >>> Hi all, >>> >>> What's the status on the implementation of the parallel >>> mesh reader/generator for DMPlex meshes? 
Is anyone actively working on >>> this? If so is there a branch that I can peek into? >>> >>> Thanks, >>> Justin >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From norihiro.w at gmail.com Tue Mar 22 08:25:21 2016 From: norihiro.w at gmail.com (Norihiro Watanabe) Date: Tue, 22 Mar 2016 14:25:21 +0100 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: Thank you Matt! Actually I don't want to change a norm type used in a convergence check. I just want to output a relative error which PETSc actually used for a convergence check (for log output in my program without -ksp_*) and thought I need to have a norm of a preconditioned RHS to compute it by myself. Or is there any function available in PETSc which returns the relative error or the tolerance multiplied by the norm of a preconditioned RHS? I couldn't find it. Best, Nori On Tue, Mar 22, 2016 at 1:51 PM, Matthew Knepley wrote: > On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe > wrote: >> >> Hi, >> >> Is it correct that a norm of a preconditioned RHS vector is used to >> compute a relative error in BCGS? > > > Yes, but you can verify this using -ksp_view > >> >> I'm testing BCGS + BoomerAMG. With "-info", PETSc says "initial right >> hand side norm" is 2.223619476717e+10 (see below) but an actual norm >> of the RHS I passed is 4.059007e-02. If yes, is there any way to get a >> norm of a preconditioned RHS? > > > Do you mean unpreconditioned? You can try > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html > > or use > > -ksp_monitor_true_residual > > Thanks, > > Matt > >> >> [0] KSPConvergedDefault(): Linear solver has converged. Residual norm >> 2.036064453512e-02 is less than relative tolerance 9.999999960042e-13 >> times initial right hand side norm 2.223619476717e+10 at iteration 6 >> >> >> Regards, >> Nori >> >> -- >> Norihiro Watanabe > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener -- Norihiro Watanabe From knepley at gmail.com Tue Mar 22 08:45:50 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 22 Mar 2016 08:45:50 -0500 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: On Tue, Mar 22, 2016 at 8:25 AM, Norihiro Watanabe wrote: > Thank you Matt! > > Actually I don't want to change a norm type used in a convergence > check. I just want to output a relative error which PETSc actually > used for a convergence check (for log output in my program without > -ksp_*) and thought I need to have a norm of a preconditioned RHS to > compute it by myself. Or is there any function available in PETSc > which returns the relative error or the tolerance multiplied by the > norm of a preconditioned RHS? I couldn't find it. > If you want the action of the preconditioner, you can pull it out KSPGetPC() and apply it PCApply() but I still do not understand why you want this. Do you want to check the norms yourself? The PCApply() could be expensive to calculate again. 
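A minimal sketch of the KSPGetPC()/PCApply() approach mentioned in this thread, for reference (the variable names ksp and b are assumed; the preconditioner must already be set up, e.g. after KSPSetUp() or a previous KSPSolve(), and this costs one extra application of the preconditioner):

  #include <petscksp.h>

  PC        pc;
  Vec       z;
  PetscReal pbnorm;
  KSPGetPC(ksp, &pc);
  VecDuplicate(b, &z);
  PCApply(pc, b, z);             /* z = B^{-1} b, the preconditioned right-hand side */
  VecNorm(z, NORM_2, &pbnorm);   /* norm of the preconditioned RHS discussed in this thread */
  VecDestroy(&z);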
Thanks, Matt > > Best, > Nori > > > On Tue, Mar 22, 2016 at 1:51 PM, Matthew Knepley > wrote: > > On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe > > > wrote: > >> > >> Hi, > >> > >> Is it correct that a norm of a preconditioned RHS vector is used to > >> compute a relative error in BCGS? > > > > > > Yes, but you can verify this using -ksp_view > > > >> > >> I'm testing BCGS + BoomerAMG. With "-info", PETSc says "initial right > >> hand side norm" is 2.223619476717e+10 (see below) but an actual norm > >> of the RHS I passed is 4.059007e-02. If yes, is there any way to get a > >> norm of a preconditioned RHS? > > > > > > Do you mean unpreconditioned? You can try > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html > > > > or use > > > > -ksp_monitor_true_residual > > > > Thanks, > > > > Matt > > > >> > >> [0] KSPConvergedDefault(): Linear solver has converged. Residual norm > >> 2.036064453512e-02 is less than relative tolerance 9.999999960042e-13 > >> times initial right hand side norm 2.223619476717e+10 at iteration 6 > >> > >> > >> Regards, > >> Nori > >> > >> -- > >> Norihiro Watanabe > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. > > -- Norbert Wiener > > > > -- > Norihiro Watanabe > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From norihiro.w at gmail.com Tue Mar 22 09:02:02 2016 From: norihiro.w at gmail.com (Norihiro Watanabe) Date: Tue, 22 Mar 2016 15:02:02 +0100 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: What I wanted to do is displaying final converged errors without using -ksp_monitor. Because my problem includes a lot of time steps and nonlinear iterations, log output from -ksp_monitor for each linear solve is sometimes too much. But you are right. It doesn't make sense to call the expensive function just for the log output. Thanks, Nori On Tue, Mar 22, 2016 at 2:45 PM, Matthew Knepley wrote: > On Tue, Mar 22, 2016 at 8:25 AM, Norihiro Watanabe > wrote: >> >> Thank you Matt! >> >> Actually I don't want to change a norm type used in a convergence >> check. I just want to output a relative error which PETSc actually >> used for a convergence check (for log output in my program without >> -ksp_*) and thought I need to have a norm of a preconditioned RHS to >> compute it by myself. Or is there any function available in PETSc >> which returns the relative error or the tolerance multiplied by the >> norm of a preconditioned RHS? I couldn't find it. > > > If you want the action of the preconditioner, you can pull it out > > KSPGetPC() > > and apply it > > PCApply() > > but I still do not understand why you want this. Do you want to check the > norms > yourself? The PCApply() could be expensive to calculate again. > > Thanks, > > Matt > >> >> >> Best, >> Nori >> >> >> On Tue, Mar 22, 2016 at 1:51 PM, Matthew Knepley >> wrote: >> > On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe >> > >> > wrote: >> >> >> >> Hi, >> >> >> >> Is it correct that a norm of a preconditioned RHS vector is used to >> >> compute a relative error in BCGS? >> > >> > >> > Yes, but you can verify this using -ksp_view >> > >> >> >> >> I'm testing BCGS + BoomerAMG. 
With "-info", PETSc says "initial right >> >> hand side norm" is 2.223619476717e+10 (see below) but an actual norm >> >> of the RHS I passed is 4.059007e-02. If yes, is there any way to get a >> >> norm of a preconditioned RHS? >> > >> > >> > Do you mean unpreconditioned? You can try >> > >> > >> > >> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html >> > >> > or use >> > >> > -ksp_monitor_true_residual >> > >> > Thanks, >> > >> > Matt >> > >> >> >> >> [0] KSPConvergedDefault(): Linear solver has converged. Residual norm >> >> 2.036064453512e-02 is less than relative tolerance 9.999999960042e-13 >> >> times initial right hand side norm 2.223619476717e+10 at iteration 6 >> >> >> >> >> >> Regards, >> >> Nori >> >> >> >> -- >> >> Norihiro Watanabe >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments >> > is infinitely more interesting than any results to which their >> > experiments >> > lead. >> > -- Norbert Wiener >> >> >> >> -- >> Norihiro Watanabe > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener -- Norihiro Watanabe From knepley at gmail.com Tue Mar 22 09:06:37 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 22 Mar 2016 09:06:37 -0500 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: On Tue, Mar 22, 2016 at 9:02 AM, Norihiro Watanabe wrote: > What I wanted to do is displaying final converged errors without using > -ksp_monitor. Because my problem includes a lot of time steps and > nonlinear iterations, log output from -ksp_monitor for each linear > solve is sometimes too much. But you are right. It doesn't make sense > to call the expensive function just for the log output. > Maybe something like -ksp_converged_reason? Thanks, Matt > Thanks, > Nori > > On Tue, Mar 22, 2016 at 2:45 PM, Matthew Knepley > wrote: > > On Tue, Mar 22, 2016 at 8:25 AM, Norihiro Watanabe > > > wrote: > >> > >> Thank you Matt! > >> > >> Actually I don't want to change a norm type used in a convergence > >> check. I just want to output a relative error which PETSc actually > >> used for a convergence check (for log output in my program without > >> -ksp_*) and thought I need to have a norm of a preconditioned RHS to > >> compute it by myself. Or is there any function available in PETSc > >> which returns the relative error or the tolerance multiplied by the > >> norm of a preconditioned RHS? I couldn't find it. > > > > > > If you want the action of the preconditioner, you can pull it out > > > > KSPGetPC() > > > > and apply it > > > > PCApply() > > > > but I still do not understand why you want this. Do you want to check the > > norms > > yourself? The PCApply() could be expensive to calculate again. > > > > Thanks, > > > > Matt > > > >> > >> > >> Best, > >> Nori > >> > >> > >> On Tue, Mar 22, 2016 at 1:51 PM, Matthew Knepley > >> wrote: > >> > On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe > >> > > >> > wrote: > >> >> > >> >> Hi, > >> >> > >> >> Is it correct that a norm of a preconditioned RHS vector is used to > >> >> compute a relative error in BCGS? > >> > > >> > > >> > Yes, but you can verify this using -ksp_view > >> > > >> >> > >> >> I'm testing BCGS + BoomerAMG. 
With "-info", PETSc says "initial right > >> >> hand side norm" is 2.223619476717e+10 (see below) but an actual norm > >> >> of the RHS I passed is 4.059007e-02. If yes, is there any way to get > a > >> >> norm of a preconditioned RHS? > >> > > >> > > >> > Do you mean unpreconditioned? You can try > >> > > >> > > >> > > >> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html > >> > > >> > or use > >> > > >> > -ksp_monitor_true_residual > >> > > >> > Thanks, > >> > > >> > Matt > >> > > >> >> > >> >> [0] KSPConvergedDefault(): Linear solver has converged. Residual norm > >> >> 2.036064453512e-02 is less than relative tolerance 9.999999960042e-13 > >> >> times initial right hand side norm 2.223619476717e+10 at iteration 6 > >> >> > >> >> > >> >> Regards, > >> >> Nori > >> >> > >> >> -- > >> >> Norihiro Watanabe > >> > > >> > > >> > > >> > > >> > -- > >> > What most experimenters take for granted before they begin their > >> > experiments > >> > is infinitely more interesting than any results to which their > >> > experiments > >> > lead. > >> > -- Norbert Wiener > >> > >> > >> > >> -- > >> Norihiro Watanabe > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. > > -- Norbert Wiener > > > > -- > Norihiro Watanabe > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From norihiro.w at gmail.com Tue Mar 22 09:08:53 2016 From: norihiro.w at gmail.com (Norihiro Watanabe) Date: Tue, 22 Mar 2016 15:08:53 +0100 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: Unfortunately -ksp_converged_reason prints the number of iterations but no information about final errors. Thanks, Nori On Tue, Mar 22, 2016 at 3:06 PM, Matthew Knepley wrote: > On Tue, Mar 22, 2016 at 9:02 AM, Norihiro Watanabe > wrote: >> >> What I wanted to do is displaying final converged errors without using >> -ksp_monitor. Because my problem includes a lot of time steps and >> nonlinear iterations, log output from -ksp_monitor for each linear >> solve is sometimes too much. But you are right. It doesn't make sense >> to call the expensive function just for the log output. > > > Maybe something like -ksp_converged_reason? > > Thanks, > > Matt > >> >> Thanks, >> Nori >> >> On Tue, Mar 22, 2016 at 2:45 PM, Matthew Knepley >> wrote: >> > On Tue, Mar 22, 2016 at 8:25 AM, Norihiro Watanabe >> > >> > wrote: >> >> >> >> Thank you Matt! >> >> >> >> Actually I don't want to change a norm type used in a convergence >> >> check. I just want to output a relative error which PETSc actually >> >> used for a convergence check (for log output in my program without >> >> -ksp_*) and thought I need to have a norm of a preconditioned RHS to >> >> compute it by myself. Or is there any function available in PETSc >> >> which returns the relative error or the tolerance multiplied by the >> >> norm of a preconditioned RHS? I couldn't find it. >> > >> > >> > If you want the action of the preconditioner, you can pull it out >> > >> > KSPGetPC() >> > >> > and apply it >> > >> > PCApply() >> > >> > but I still do not understand why you want this. Do you want to check >> > the >> > norms >> > yourself? The PCApply() could be expensive to calculate again. 
>> > >> > Thanks, >> > >> > Matt >> > >> >> >> >> >> >> Best, >> >> Nori >> >> >> >> >> >> On Tue, Mar 22, 2016 at 1:51 PM, Matthew Knepley >> >> wrote: >> >> > On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe >> >> > >> >> > wrote: >> >> >> >> >> >> Hi, >> >> >> >> >> >> Is it correct that a norm of a preconditioned RHS vector is used to >> >> >> compute a relative error in BCGS? >> >> > >> >> > >> >> > Yes, but you can verify this using -ksp_view >> >> > >> >> >> >> >> >> I'm testing BCGS + BoomerAMG. With "-info", PETSc says "initial >> >> >> right >> >> >> hand side norm" is 2.223619476717e+10 (see below) but an actual norm >> >> >> of the RHS I passed is 4.059007e-02. If yes, is there any way to get >> >> >> a >> >> >> norm of a preconditioned RHS? >> >> > >> >> > >> >> > Do you mean unpreconditioned? You can try >> >> > >> >> > >> >> > >> >> > >> >> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html >> >> > >> >> > or use >> >> > >> >> > -ksp_monitor_true_residual >> >> > >> >> > Thanks, >> >> > >> >> > Matt >> >> > >> >> >> >> >> >> [0] KSPConvergedDefault(): Linear solver has converged. Residual >> >> >> norm >> >> >> 2.036064453512e-02 is less than relative tolerance >> >> >> 9.999999960042e-13 >> >> >> times initial right hand side norm 2.223619476717e+10 at iteration 6 >> >> >> >> >> >> >> >> >> Regards, >> >> >> Nori >> >> >> >> >> >> -- >> >> >> Norihiro Watanabe >> >> > >> >> > >> >> > >> >> > >> >> > -- >> >> > What most experimenters take for granted before they begin their >> >> > experiments >> >> > is infinitely more interesting than any results to which their >> >> > experiments >> >> > lead. >> >> > -- Norbert Wiener >> >> >> >> >> >> >> >> -- >> >> Norihiro Watanabe >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments >> > is infinitely more interesting than any results to which their >> > experiments >> > lead. >> > -- Norbert Wiener >> >> >> >> -- >> Norihiro Watanabe > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener -- Norihiro Watanabe From knepley at gmail.com Tue Mar 22 09:20:17 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 22 Mar 2016 09:20:17 -0500 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: On Tue, Mar 22, 2016 at 9:08 AM, Norihiro Watanabe wrote: > Unfortunately -ksp_converged_reason prints the number of iterations > but no information about final errors. > If you want the actual residuals (not errrors), you could use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPGetResidualHistory.html Matt > Thanks, > Nori > > On Tue, Mar 22, 2016 at 3:06 PM, Matthew Knepley > wrote: > > On Tue, Mar 22, 2016 at 9:02 AM, Norihiro Watanabe > > > wrote: > >> > >> What I wanted to do is displaying final converged errors without using > >> -ksp_monitor. Because my problem includes a lot of time steps and > >> nonlinear iterations, log output from -ksp_monitor for each linear > >> solve is sometimes too much. But you are right. It doesn't make sense > >> to call the expensive function just for the log output. > > > > > > Maybe something like -ksp_converged_reason? 
> > > > Thanks, > > > > Matt > > > >> > >> Thanks, > >> Nori > >> > >> On Tue, Mar 22, 2016 at 2:45 PM, Matthew Knepley > >> wrote: > >> > On Tue, Mar 22, 2016 at 8:25 AM, Norihiro Watanabe > >> > > >> > wrote: > >> >> > >> >> Thank you Matt! > >> >> > >> >> Actually I don't want to change a norm type used in a convergence > >> >> check. I just want to output a relative error which PETSc actually > >> >> used for a convergence check (for log output in my program without > >> >> -ksp_*) and thought I need to have a norm of a preconditioned RHS to > >> >> compute it by myself. Or is there any function available in PETSc > >> >> which returns the relative error or the tolerance multiplied by the > >> >> norm of a preconditioned RHS? I couldn't find it. > >> > > >> > > >> > If you want the action of the preconditioner, you can pull it out > >> > > >> > KSPGetPC() > >> > > >> > and apply it > >> > > >> > PCApply() > >> > > >> > but I still do not understand why you want this. Do you want to check > >> > the > >> > norms > >> > yourself? The PCApply() could be expensive to calculate again. > >> > > >> > Thanks, > >> > > >> > Matt > >> > > >> >> > >> >> > >> >> Best, > >> >> Nori > >> >> > >> >> > >> >> On Tue, Mar 22, 2016 at 1:51 PM, Matthew Knepley > >> >> wrote: > >> >> > On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe > >> >> > > >> >> > wrote: > >> >> >> > >> >> >> Hi, > >> >> >> > >> >> >> Is it correct that a norm of a preconditioned RHS vector is used > to > >> >> >> compute a relative error in BCGS? > >> >> > > >> >> > > >> >> > Yes, but you can verify this using -ksp_view > >> >> > > >> >> >> > >> >> >> I'm testing BCGS + BoomerAMG. With "-info", PETSc says "initial > >> >> >> right > >> >> >> hand side norm" is 2.223619476717e+10 (see below) but an actual > norm > >> >> >> of the RHS I passed is 4.059007e-02. If yes, is there any way to > get > >> >> >> a > >> >> >> norm of a preconditioned RHS? > >> >> > > >> >> > > >> >> > Do you mean unpreconditioned? You can try > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html > >> >> > > >> >> > or use > >> >> > > >> >> > -ksp_monitor_true_residual > >> >> > > >> >> > Thanks, > >> >> > > >> >> > Matt > >> >> > > >> >> >> > >> >> >> [0] KSPConvergedDefault(): Linear solver has converged. Residual > >> >> >> norm > >> >> >> 2.036064453512e-02 is less than relative tolerance > >> >> >> 9.999999960042e-13 > >> >> >> times initial right hand side norm 2.223619476717e+10 at > iteration 6 > >> >> >> > >> >> >> > >> >> >> Regards, > >> >> >> Nori > >> >> >> > >> >> >> -- > >> >> >> Norihiro Watanabe > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > -- > >> >> > What most experimenters take for granted before they begin their > >> >> > experiments > >> >> > is infinitely more interesting than any results to which their > >> >> > experiments > >> >> > lead. > >> >> > -- Norbert Wiener > >> >> > >> >> > >> >> > >> >> -- > >> >> Norihiro Watanabe > >> > > >> > > >> > > >> > > >> > -- > >> > What most experimenters take for granted before they begin their > >> > experiments > >> > is infinitely more interesting than any results to which their > >> > experiments > >> > lead. > >> > -- Norbert Wiener > >> > >> > >> > >> -- > >> Norihiro Watanabe > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. 
> > -- Norbert Wiener > > > > -- > Norihiro Watanabe > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From norihiro.w at gmail.com Tue Mar 22 09:23:15 2016 From: norihiro.w at gmail.com (Norihiro Watanabe) Date: Tue, 22 Mar 2016 15:23:15 +0100 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: Can't I use KSPGetResidualNorm()? I mean if I'm interested only in the last residual. On Tue, Mar 22, 2016 at 3:20 PM, Matthew Knepley wrote: > On Tue, Mar 22, 2016 at 9:08 AM, Norihiro Watanabe > wrote: >> >> Unfortunately -ksp_converged_reason prints the number of iterations >> but no information about final errors. > > > If you want the actual residuals (not errrors), you could use > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPGetResidualHistory.html > > Matt > >> >> Thanks, >> Nori >> >> On Tue, Mar 22, 2016 at 3:06 PM, Matthew Knepley >> wrote: >> > On Tue, Mar 22, 2016 at 9:02 AM, Norihiro Watanabe >> > >> > wrote: >> >> >> >> What I wanted to do is displaying final converged errors without using >> >> -ksp_monitor. Because my problem includes a lot of time steps and >> >> nonlinear iterations, log output from -ksp_monitor for each linear >> >> solve is sometimes too much. But you are right. It doesn't make sense >> >> to call the expensive function just for the log output. >> > >> > >> > Maybe something like -ksp_converged_reason? >> > >> > Thanks, >> > >> > Matt >> > >> >> >> >> Thanks, >> >> Nori >> >> >> >> On Tue, Mar 22, 2016 at 2:45 PM, Matthew Knepley >> >> wrote: >> >> > On Tue, Mar 22, 2016 at 8:25 AM, Norihiro Watanabe >> >> > >> >> > wrote: >> >> >> >> >> >> Thank you Matt! >> >> >> >> >> >> Actually I don't want to change a norm type used in a convergence >> >> >> check. I just want to output a relative error which PETSc actually >> >> >> used for a convergence check (for log output in my program without >> >> >> -ksp_*) and thought I need to have a norm of a preconditioned RHS to >> >> >> compute it by myself. Or is there any function available in PETSc >> >> >> which returns the relative error or the tolerance multiplied by the >> >> >> norm of a preconditioned RHS? I couldn't find it. >> >> > >> >> > >> >> > If you want the action of the preconditioner, you can pull it out >> >> > >> >> > KSPGetPC() >> >> > >> >> > and apply it >> >> > >> >> > PCApply() >> >> > >> >> > but I still do not understand why you want this. Do you want to check >> >> > the >> >> > norms >> >> > yourself? The PCApply() could be expensive to calculate again. >> >> > >> >> > Thanks, >> >> > >> >> > Matt >> >> > >> >> >> >> >> >> >> >> >> Best, >> >> >> Nori >> >> >> >> >> >> >> >> >> On Tue, Mar 22, 2016 at 1:51 PM, Matthew Knepley >> >> >> wrote: >> >> >> > On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe >> >> >> > >> >> >> > wrote: >> >> >> >> >> >> >> >> Hi, >> >> >> >> >> >> >> >> Is it correct that a norm of a preconditioned RHS vector is used >> >> >> >> to >> >> >> >> compute a relative error in BCGS? >> >> >> > >> >> >> > >> >> >> > Yes, but you can verify this using -ksp_view >> >> >> > >> >> >> >> >> >> >> >> I'm testing BCGS + BoomerAMG. With "-info", PETSc says "initial >> >> >> >> right >> >> >> >> hand side norm" is 2.223619476717e+10 (see below) but an actual >> >> >> >> norm >> >> >> >> of the RHS I passed is 4.059007e-02. 
If yes, is there any way to >> >> >> >> get >> >> >> >> a >> >> >> >> norm of a preconditioned RHS? >> >> >> > >> >> >> > >> >> >> > Do you mean unpreconditioned? You can try >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html >> >> >> > >> >> >> > or use >> >> >> > >> >> >> > -ksp_monitor_true_residual >> >> >> > >> >> >> > Thanks, >> >> >> > >> >> >> > Matt >> >> >> > >> >> >> >> >> >> >> >> [0] KSPConvergedDefault(): Linear solver has converged. Residual >> >> >> >> norm >> >> >> >> 2.036064453512e-02 is less than relative tolerance >> >> >> >> 9.999999960042e-13 >> >> >> >> times initial right hand side norm 2.223619476717e+10 at >> >> >> >> iteration 6 >> >> >> >> >> >> >> >> >> >> >> >> Regards, >> >> >> >> Nori >> >> >> >> >> >> >> >> -- >> >> >> >> Norihiro Watanabe >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> > -- >> >> >> > What most experimenters take for granted before they begin their >> >> >> > experiments >> >> >> > is infinitely more interesting than any results to which their >> >> >> > experiments >> >> >> > lead. >> >> >> > -- Norbert Wiener >> >> >> >> >> >> >> >> >> >> >> >> -- >> >> >> Norihiro Watanabe >> >> > >> >> > >> >> > >> >> > >> >> > -- >> >> > What most experimenters take for granted before they begin their >> >> > experiments >> >> > is infinitely more interesting than any results to which their >> >> > experiments >> >> > lead. >> >> > -- Norbert Wiener >> >> >> >> >> >> >> >> -- >> >> Norihiro Watanabe >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments >> > is infinitely more interesting than any results to which their >> > experiments >> > lead. >> > -- Norbert Wiener >> >> >> >> -- >> Norihiro Watanabe > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener -- Norihiro Watanabe From knepley at gmail.com Tue Mar 22 09:26:21 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 22 Mar 2016 09:26:21 -0500 Subject: [petsc-users] Norm of RHS in BCGS In-Reply-To: References: Message-ID: On Tue, Mar 22, 2016 at 9:23 AM, Norihiro Watanabe wrote: > Can't I use KSPGetResidualNorm()? I mean if I'm interested only in the > last residual. > Yes, definitely. Thanks, Matt > On Tue, Mar 22, 2016 at 3:20 PM, Matthew Knepley > wrote: > > On Tue, Mar 22, 2016 at 9:08 AM, Norihiro Watanabe > > > wrote: > >> > >> Unfortunately -ksp_converged_reason prints the number of iterations > >> but no information about final errors. > > > > > > If you want the actual residuals (not errrors), you could use > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPGetResidualHistory.html > > > > Matt > > > >> > >> Thanks, > >> Nori > >> > >> On Tue, Mar 22, 2016 at 3:06 PM, Matthew Knepley > >> wrote: > >> > On Tue, Mar 22, 2016 at 9:02 AM, Norihiro Watanabe > >> > > >> > wrote: > >> >> > >> >> What I wanted to do is displaying final converged errors without > using > >> >> -ksp_monitor. Because my problem includes a lot of time steps and > >> >> nonlinear iterations, log output from -ksp_monitor for each linear > >> >> solve is sometimes too much. But you are right. It doesn't make sense > >> >> to call the expensive function just for the log output. > >> > > >> > > >> > Maybe something like -ksp_converged_reason? 
> >> > > >> > Thanks, > >> > > >> > Matt > >> > > >> >> > >> >> Thanks, > >> >> Nori > >> >> > >> >> On Tue, Mar 22, 2016 at 2:45 PM, Matthew Knepley > >> >> wrote: > >> >> > On Tue, Mar 22, 2016 at 8:25 AM, Norihiro Watanabe > >> >> > > >> >> > wrote: > >> >> >> > >> >> >> Thank you Matt! > >> >> >> > >> >> >> Actually I don't want to change a norm type used in a convergence > >> >> >> check. I just want to output a relative error which PETSc actually > >> >> >> used for a convergence check (for log output in my program without > >> >> >> -ksp_*) and thought I need to have a norm of a preconditioned RHS > to > >> >> >> compute it by myself. Or is there any function available in PETSc > >> >> >> which returns the relative error or the tolerance multiplied by > the > >> >> >> norm of a preconditioned RHS? I couldn't find it. > >> >> > > >> >> > > >> >> > If you want the action of the preconditioner, you can pull it out > >> >> > > >> >> > KSPGetPC() > >> >> > > >> >> > and apply it > >> >> > > >> >> > PCApply() > >> >> > > >> >> > but I still do not understand why you want this. Do you want to > check > >> >> > the > >> >> > norms > >> >> > yourself? The PCApply() could be expensive to calculate again. > >> >> > > >> >> > Thanks, > >> >> > > >> >> > Matt > >> >> > > >> >> >> > >> >> >> > >> >> >> Best, > >> >> >> Nori > >> >> >> > >> >> >> > >> >> >> On Tue, Mar 22, 2016 at 1:51 PM, Matthew Knepley < > knepley at gmail.com> > >> >> >> wrote: > >> >> >> > On Tue, Mar 22, 2016 at 2:50 AM, Norihiro Watanabe > >> >> >> > > >> >> >> > wrote: > >> >> >> >> > >> >> >> >> Hi, > >> >> >> >> > >> >> >> >> Is it correct that a norm of a preconditioned RHS vector is > used > >> >> >> >> to > >> >> >> >> compute a relative error in BCGS? > >> >> >> > > >> >> >> > > >> >> >> > Yes, but you can verify this using -ksp_view > >> >> >> > > >> >> >> >> > >> >> >> >> I'm testing BCGS + BoomerAMG. With "-info", PETSc says "initial > >> >> >> >> right > >> >> >> >> hand side norm" is 2.223619476717e+10 (see below) but an actual > >> >> >> >> norm > >> >> >> >> of the RHS I passed is 4.059007e-02. If yes, is there any way > to > >> >> >> >> get > >> >> >> >> a > >> >> >> >> norm of a preconditioned RHS? > >> >> >> > > >> >> >> > > >> >> >> > Do you mean unpreconditioned? You can try > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNormType.html > >> >> >> > > >> >> >> > or use > >> >> >> > > >> >> >> > -ksp_monitor_true_residual > >> >> >> > > >> >> >> > Thanks, > >> >> >> > > >> >> >> > Matt > >> >> >> > > >> >> >> >> > >> >> >> >> [0] KSPConvergedDefault(): Linear solver has converged. > Residual > >> >> >> >> norm > >> >> >> >> 2.036064453512e-02 is less than relative tolerance > >> >> >> >> 9.999999960042e-13 > >> >> >> >> times initial right hand side norm 2.223619476717e+10 at > >> >> >> >> iteration 6 > >> >> >> >> > >> >> >> >> > >> >> >> >> Regards, > >> >> >> >> Nori > >> >> >> >> > >> >> >> >> -- > >> >> >> >> Norihiro Watanabe > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > -- > >> >> >> > What most experimenters take for granted before they begin their > >> >> >> > experiments > >> >> >> > is infinitely more interesting than any results to which their > >> >> >> > experiments > >> >> >> > lead. 
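A minimal sketch of the KSPGetResidualNorm() route confirmed above, for the record. This is not code from the thread and assumes the usual names ksp, b, and x for the solver, right-hand side, and solution:

  PetscReal rnorm;
  PetscInt  its;
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPGetResidualNorm(ksp,&rnorm);CHKERRQ(ierr);    /* last residual norm seen by the convergence test */
  ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"linear solve: %D iterations, final residual norm %g\n",its,(double)rnorm);CHKERRQ(ierr);

The norm returned is the one the convergence test used (the preconditioned residual norm for the default left-preconditioned solvers), so dividing it by the preconditioned right-hand-side norm gives the relative quantity discussed above.
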
> >> >> >> > -- Norbert Wiener > >> >> >> > >> >> >> > >> >> >> > >> >> >> -- > >> >> >> Norihiro Watanabe > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > -- > >> >> > What most experimenters take for granted before they begin their > >> >> > experiments > >> >> > is infinitely more interesting than any results to which their > >> >> > experiments > >> >> > lead. > >> >> > -- Norbert Wiener > >> >> > >> >> > >> >> > >> >> -- > >> >> Norihiro Watanabe > >> > > >> > > >> > > >> > > >> > -- > >> > What most experimenters take for granted before they begin their > >> > experiments > >> > is infinitely more interesting than any results to which their > >> > experiments > >> > lead. > >> > -- Norbert Wiener > >> > >> > >> > >> -- > >> Norihiro Watanabe > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. > > -- Norbert Wiener > > > > -- > Norihiro Watanabe > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 22 20:49:32 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 22 Mar 2016 20:49:32 -0500 Subject: [petsc-users] Error when Using KSP with Matrix -free In-Reply-To: <5f35f5a10e7640243845733283c0c9c9@stmail.uni-bayreuth.de> References: <5f35f5a10e7640243845733283c0c9c9@stmail.uni-bayreuth.de> Message-ID: Run on one process with the option -start_in_debugger noxterm and put a break point in my_mult and then step through to see what it is doing. Barry > On Mar 22, 2016, at 9:52 AM, s1humahd wrote: > > Thank you very much for your reply, > now it it is working but the solution is completely wrong. I noticed that the vector "x" is still keeping its initial values whereas it should be changed ( inside my_mult() ) to x=Ab . It seems to me like my_mult still not working , do you have any > idea about that? or do I misunderstood something ? > > > Best Regards, > Humam > > On 2016-03-21 18:58, Barry Smith wrote: >>> On Mar 21, 2016, at 9:51 AM, s1humahd wrote: >>> Hello All, >>> I'm trying to run a very simple program to get familiar with solving linear system using matrix free structure before I use it in my implementation. >>> I already created a sparse matrix A and a vector b and set values to them. then I created a free-matrix matrix using MatCreateShell() and I set the function which provide the operation which currently it is merely multiply matrix A to the vector b, ( it is just an attemp to get familiar,similar to the example ex14f.F in KSP ). >>> However I'm getting the flowing error when I call the KSPSolve() routine. I will be grateful if you could help me to recognize the reason for this error. also the related parts of my code is here. >>> MatCreate(PETSC_COMM_WORLD,&A); >>> MatSetSizes(A,nlocal,nlocal,8,8); >>> . >>> . >>> . >>> VecCreate(PETSC_COMM_WORLD,&b); >>> VecSetSizes(x,PETSC_DECIDE,8); >>> . >>> . 
>>> MatCreateShell(PETSC_COMM_WORLD,nlocal,nlocal,8,8, (void *)&A,&j_free); >>> MatShellSetOperation(j_free,MATOP_MULT, (void(*) (void)) (my_mult)(j_free,b,x) ); >> (void(*) (void)) (my_mult)(j_free,b,x) should be just (void(*) >> (void)) my_mult as written you are actually calling the function >> my_mult here which returns a 0 (since it ran correctly) and so it is >> setting a 0 as the operation for MATOP_MULT >>> . >>> . >>> . >>> KSPCreate(PETSC_COMM_WORLD,&ksp); >>> KSPSetOperators(ksp,j_free,A); >>> KSPSetFromOptions(ksp); >>> KSPSolve(ksp,x,sol); >>> . >>> . >>> . >>> PetscErrorCode my_mult(Mat j_free, Vec b, Vec x) >>> { >>> void *ptr; >>> Mat *ptr2; >>> MatShellGetContext(j_free, &ptr); // the context is matrix A >>> ptr2 = (Mat*) ptr; >>> MatMult(*ptr2,b,x); >>> return 0; >>> } >>> The error as follow: >>> [0]PETSC ERROR: No support for this operation for this object type >>> [0]PETSC ERROR: This matrix type does not have a multiply defined >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> [0]PETSC ERROR: ./jac_free on a arch-linux2-c-debug named humam-VirtualBox by humam Mon Mar 21 23:37:14 2016 >>> [0]PETSC ERROR: Configure options --download-hdf5=1 --with-blas-lapack-dir=/usr/lib --with-mpi-dir=/usr >>> [0]PETSC ERROR: #1 MatMult() line 2223 in /home/humam/petsc/src/mat/interface/matrix.c >>> [0]PETSC ERROR: #2 PCApplyBAorAB() line 727 in /home/humam/petsc/src/ksp/pc/interface/precon.c >>> [0]PETSC ERROR: #3 KSP_PCApplyBAorAB() line 272 in /home/humam/petsc/include/petsc/private/kspimpl.h >>> [0]PETSC ERROR: #4 KSPGMRESCycle() line 155 in /home/humam/petsc/src/ksp/ksp/impls/gmres/gmres.c >>> [0]PETSC ERROR: #5 KSPSolve_GMRES() line 236 in /home/humam/petsc/src/ksp/ksp/impls/gmres/gmres.c >>> [0]PETSC ERROR: #6 KSPSolve() line 604 in /home/humam/petsc/src/ksp/ksp/interface/itfunc.c >>> [0]PETSC ERROR: #7 main() line 293 in /home/humam/jac_free.c >>> [0]PETSC ERROR: No PETSc Option Table entries >>> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >>> application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0 >>> Thanks, >>> Humam From steena.hpc at gmail.com Tue Mar 22 20:59:40 2016 From: steena.hpc at gmail.com (Steena Monteiro) Date: Tue, 22 Mar 2016 18:59:40 -0700 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: > So, are you saying that > > 1) You have a matrix with odd total dimension > > 2) You set the block size of the initial matrix to 2 > > 3) You load the matrix > > and there is no error? Can you make a simple example with a matrix of size > 5? > I can put in the relevant error checking. > > > Hi Matt, Thank you for the suggestion. I used cage3, a 5x5 matrix from the UFL collection, converted it to binary, and tested the code for block sizes 1 through 7. I added printfs inside all the MatMult_SeqBAIJ_*s in baij2.c and also logged some counts (blocked rows and blocks). The counts make sense if the matrix is being padded somewhere to accommodate for block sizes that are not completely divisible by matrix dimensions. surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 2 Inside SeqBAIJ_2 surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 3 Inside MatMult_SeqBAIJ_3 ... ... 
surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 7 Inside MatMult_SeqBAIJ_7 Table for different block sizes listing corresponding number of blocked rows and number of blocks inside the rows for cage3. Block size No. of blocked rows No. of nnz blocks in each blocked row. 1 5 5,3,3,4,4 2 3 3,3,3 3 2 2,2 4 2 2,2 5 1 1 6 1 1 7 1 1 I am attaching cage3.dat and cage3.mtx. Thanks, Steena > > >> ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr); >> >> >> ierr = VecSetRandom(x,NULL); CHKERRQ(ierr); >> ierr = VecSet(y,zero); CHKERRQ(ierr); >> ierr = MatMult(A,x,y); CHKERRQ(ierr); >> >> >> ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); >> ierr = MatDestroy(&A);CHKERRQ(ierr); >> ierr = VecDestroy(&x);CHKERRQ(ierr); >> ierr = VecDestroy(&y);CHKERRQ(ierr); >> >> Thanks, >> Steena >> >> >> On 15 March 2016 at 09:15, Matthew Knepley wrote: >> >>> On Tue, Mar 15, 2016 at 11:04 AM, Steena Monteiro >>> wrote: >>> >>>> I pass a binary, matrix data file at the command line and load it into >>>> the matrix: >>>> >>>> PetscInitialize(&argc,&args,(char*)0,help); >>>> ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); >>>> >>>> /* converted mtx to dat file*/ >>>> ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); >>>> CHKERRQ(ierr); >>>> >>>> if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat >>>> file with -f"); >>>> >>>> /* Load matrices */ >>>> ierr = >>>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); >>>> ierr = >>>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); >>>> ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); >>>> ierr = MatSetFromOptions(A);CHKERRQ(ierr); >>>> >>> >>> Nothing above loads a matrix. Do you also call MatLoad()? >>> >>> Matt >>> >>> >>>> Thanks, >>>> Steena >>>> >>>> On 15 March 2016 at 08:58, Matthew Knepley wrote: >>>> >>>>> On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro < >>>>> steena.hpc at gmail.com> wrote: >>>>> >>>>>> Thank you, Dave. >>>>>> >>>>>> Matt: I understand the inconsistency but MatMult with non divisible >>>>>> block sizes (here, 2) does not throw any errors and fail, when MatSetSize >>>>>> is commented out. Implying that 1139905 global size does work with block >>>>>> size 2. >>>>>> >>>>> >>>>> If you comment out MatSetSize(), how does it know what size the Mat is? >>>>> >>>>> Matt >>>>> >>>>> >>>>>> On 15 March 2016 at 00:12, Dave May wrote: >>>>>> >>>>>>> >>>>>>> On 15 March 2016 at 04:46, Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro < >>>>>>>> steena.hpc at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> I am having difficulty getting MatSetSize to work prior to using >>>>>>>>> MatMult. >>>>>>>>> >>>>>>>>> For matrix A with rows=cols=1,139,905 and block size = 2, >>>>>>>>> >>>>>>>> >>>>>>>> It is inconsistent to have a row/col size that is not divisible by >>>>>>>> the block size. >>>>>>>> >>>>>>> >>>>>>> >>>>>>> To be honest, I don't think the error message being thrown clearly >>>>>>> indicates what the actual problem is (hence the email from Steena). What >>>>>>> about >>>>>>> >>>>>>> "Cannot change/reset row sizes to 400000 local 1139906 global after >>>>>>> previously setting them to 400000 local 1139905 global. 
Local and global >>>>>>> sizes must be divisible by the block size" >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> rank 0 gets 400000 rows and rank 1 739905 rows, like so: >>>>>>>>> >>>>>>>>> /*Matrix setup*/ >>>>>>>>> >>>>>>>>> >>>>>>>>> ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >>>>>>>>> ierr = MatCreate(PETSC_COMM_WORLD,&A); >>>>>>>>> ierr = MatSetFromOptions(A); >>>>>>>>> ierr = MatSetType(A,MATBAIJ); >>>>>>>>> ierr = MatSetBlockSize(A,2); >>>>>>>>> >>>>>>>>> /*Unequal row assignment*/ >>>>>>>>> >>>>>>>>> if (!rank) { >>>>>>>>> ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >>>>>>>>> 1139905,1139905);CHKERRQ(ierr); >>>>>>>>> } >>>>>>>>> else { >>>>>>>>> ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >>>>>>>>> 1139905,1139905);CHKERRQ(ierr); >>>>>>>>> } >>>>>>>>> >>>>>>>>> MatMult (A,x,y); >>>>>>>>> >>>>>>>>> /************************************/ >>>>>>>>> >>>>>>>>> Error message: >>>>>>>>> >>>>>>>>> 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for >>>>>>>>> this object type >>>>>>>>> Cannot change/reset row sizes to 400000 local 1139906 global after >>>>>>>>> previously setting them to 400000 local 1139905 global >>>>>>>>> >>>>>>>>> [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to >>>>>>>>> 739905 local 1139906 global after previously setting them to 739905 local >>>>>>>>> 1139905 global >>>>>>>>> >>>>>>>>> -Without messing with row assignment, MatMult works fine on this >>>>>>>>> matrix for block size = 2, presumably because an extra padded row is >>>>>>>>> automatically added to facilitate blocking. >>>>>>>>> >>>>>>>>> -The above code snippet works well for block size = 1. >>>>>>>>> >>>>>>>>> Is it possible to do unequal row distribution *while using >>>>>>>>> blocking*? >>>>>>>>> >>>>>>>>> Thank you for any advice. >>>>>>>>> >>>>>>>>> -Steena >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cage3.dat Type: application/octet-stream Size: 264 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: cage3.mtx Type: application/octet-stream Size: 967 bytes Desc: not available URL: From justin.c.droba at nasa.gov Wed Mar 23 09:26:53 2016 From: justin.c.droba at nasa.gov (Justin Droba (JSC-EG3) [JETS]) Date: Wed, 23 Mar 2016 09:26:53 -0500 Subject: [petsc-users] ASM Index Sets Question Message-ID: <56F2A7AD.7060902@nasa.gov> Dear all, Thank you for maintaining this wonderful software and this user mailing list! I am attempting to use an additive Schwartz preconditioner with multiple blocks per processor. I've been able to get it to work with one block per processor but not been successful with multiple. I believe I am not understanding correctly the construction of index sets for this case. Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: Image I want to put $Omega_1$ and $Omega_2$ on processor 1 and the other two on the second processor. -- --------------------------------------------- Justin Droba, PhD Applied Aeroscience and CFD Branch (EG3) NASA Lyndon B. Johnson Space Center JETS/Jacobs Technology, Inc. and HX5, LLC *Office:* Building 16, Room 142 *Phone*: 281-483-1451 --------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Untitled.png Type: image/png Size: 164927 bytes Desc: not available URL: From justin.c.droba at nasa.gov Wed Mar 23 09:30:07 2016 From: justin.c.droba at nasa.gov (Justin Droba (JSC-EG3) [JETS]) Date: Wed, 23 Mar 2016 09:30:07 -0500 Subject: [petsc-users] Fwd: ASM Index Sets Question In-Reply-To: <56F2A7AD.7060902@nasa.gov> References: <56F2A7AD.7060902@nasa.gov> Message-ID: <56F2A86F.6040605@nasa.gov> Dear all, SORRY ABOUT THE LAST MESSAGE! Hit send before I meant to! Thank you for maintaining this wonderful software and this user mailing list! I am attempting to use an additive Schwartz preconditioner with multiple blocks per processor. I've been able to get it to work with one block per processor but not been successful with multiple. I believe I am not understanding correctly the construction of index sets for this case. Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: Image I want to put $\Omega_1$ and $\Omega_2$ on processor 1 and the other two on the second processor. Do I make my index sets on processor 1 IS = { {1,2,3,5,6,7,9,10,11}, {2,3,4,6,7,8,10,11,12} } IS_LOCAL = { {1,2,5,6}, {3,4,7,8} } and similarly for the other processor? I tried this and it isn't quite working out. Do I misunderstand something? Thank you! Best regards, Justin -- --------------------------------------------- Justin Droba, PhD Applied Aeroscience and CFD Branch (EG3) NASA Lyndon B. Johnson Space Center JETS/Jacobs Technology, Inc. and HX5, LLC *Office:* Building 16, Room 142 *Phone*: 281-483-1451 --------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 164927 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tblatex-5.png Type: image/png Size: 791 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: tblatex-4.png Type: image/png Size: 824 bytes Desc: not available URL: From justin.c.droba at nasa.gov Wed Mar 23 09:37:59 2016 From: justin.c.droba at nasa.gov (Justin Droba (JSC-EG3) [JETS]) Date: Wed, 23 Mar 2016 09:37:59 -0500 Subject: [petsc-users] ASM Index Sets [Repost with Correction] In-Reply-To: <56F2A86F.6040605@nasa.gov> References: <56F2A86F.6040605@nasa.gov> Message-ID: <56F2AA47.1050902@nasa.gov> Dear all, Very sorry to flood the mailing list, one final post because the image will not come through if you're on digest or using plain text. Thank you for maintaining this wonderful software and this user mailing list! I am attempting to use an additive Schwartz preconditioner with multiple blocks per processor. I've been able to get it to work with one block per processor but not been successful with multiple. I believe I am not understanding correctly the construction of index sets for this case. Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: Image https://farm2.staticflickr.com/1633/25912726941_428d61ae87_o.png I want to put Omega_1 and Omega_2 on processor 1 and the other two on the second processor. Do I make my index sets on processor 1 IS = { {1,2,3,5,6,7,9,10,11}, {2,3,4,6,7,8,10,11,12} } IS_LOCAL = { {1,2,5,6}, {3,4,7,8} } and similarly for the other processor? I tried this and it isn't quite working out. Do I misunderstand something? Thank you! Best regards, Justin -- --------------------------------------------- Justin Droba, PhD Applied Aeroscience and CFD Branch (EG3) NASA Lyndon B. Johnson Space Center JETS/Jacobs Technology, Inc. and HX5, LLC *Office:* Building 16, Room 142 *Phone*: 281-483-1451 --------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 164927 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Mar 23 10:29:57 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 23 Mar 2016 10:29:57 -0500 Subject: [petsc-users] ASM Index Sets [Repost with Correction] In-Reply-To: <56F2AA47.1050902@nasa.gov> References: <56F2A86F.6040605@nasa.gov> <56F2AA47.1050902@nasa.gov> Message-ID: Seems ok. You'll need to be more specific about "isn't quite working." Barry > On Mar 23, 2016, at 9:37 AM, Justin Droba (JSC-EG3) [JETS] wrote: > > Dear all, > > Very sorry to flood the mailing list, one final post because the image will not come through if you're on digest or using plain text. > > Thank you for maintaining this wonderful software and this user mailing list! > > I am attempting to use an additive Schwartz preconditioner with multiple blocks per processor. I've been able to get it to work with one block per processor but not been successful with multiple. I believe I am not understanding correctly the construction of index sets for this case. > > Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: > > > > https://farm2.staticflickr.com/1633/25912726941_428d61ae87_o.png > > > I want to put Omega_1 and Omega_2 on processor 1 and the other two on the second processor. Do I make my index sets on processor 1 > > IS = { {1,2,3,5,6,7,9,10,11}, {2,3,4,6,7,8,10,11,12} } > IS_LOCAL = { {1,2,5,6}, {3,4,7,8} } > > and similarly for the other processor? I tried this and it isn't quite working out. Do I misunderstand something? > > Thank you! 
> > Best regards, > Justin > > -- > --------------------------------------------- > Justin Droba, PhD > Applied Aeroscience and CFD Branch (EG3) > NASA Lyndon B. Johnson Space Center > JETS/Jacobs Technology, Inc. and HX5, LLC > > Office: Building 16, Room 142 > Phone: 281-483-1451 > --------------------------------------------- > > > > From knepley at gmail.com Wed Mar 23 11:04:26 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 Mar 2016 11:04:26 -0500 Subject: [petsc-users] Fwd: ASM Index Sets Question In-Reply-To: <56F2A86F.6040605@nasa.gov> References: <56F2A7AD.7060902@nasa.gov> <56F2A86F.6040605@nasa.gov> Message-ID: On Wed, Mar 23, 2016 at 9:30 AM, Justin Droba (JSC-EG3) [JETS] < justin.c.droba at nasa.gov> wrote: > Dear all, > > SORRY ABOUT THE LAST MESSAGE! Hit send before I meant to! > > Thank you for maintaining this wonderful software and this user mailing > list! > > I am attempting to use an additive Schwartz preconditioner with multiple > blocks per processor. I've been able to get it to work with one block per > processor but not been successful with multiple. I believe I am not > understanding correctly the construction of index sets for this case. > > Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: > > [image: Image] > > > I want to put [image: $\Omega_1$] and [image: $\Omega_2$] on processor 1 > and the other two on the second processor. Do I make my index sets on > processor 1 > > IS = { {1,2,3,5,6,7,9,10,11}, {2,3,4,6,7,8,10,11,12} } > IS_LOCAL = { {1,2,5,6}, {3,4,7,8} } > > and similarly for the other processor? I tried this and it isn't quite > working out. Do I misunderstand something? > That looks right to me, but you could also just use -pc_asm_blocks 4 unless you need fine control over how the blocks are divided. Thanks, Matt > Thank you! > > Best regards, > Justin > > -- > --------------------------------------------- > Justin Droba, PhD > Applied Aeroscience and CFD Branch (EG3) > NASA Lyndon B. Johnson Space Center > JETS/Jacobs Technology, Inc. and HX5, LLC > > *Office:* Building 16, Room 142 > *Phone*: 281-483-1451 > --------------------------------------------- > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tblatex-5.png Type: image/png Size: 791 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tblatex-4.png Type: image/png Size: 824 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 164927 bytes Desc: not available URL: From jed at jedbrown.org Wed Mar 23 13:37:25 2016 From: jed at jedbrown.org (Jed Brown) Date: Wed, 23 Mar 2016 18:37:25 +0000 Subject: [petsc-users] makefile for compiling multiple sources In-Reply-To: References: <003801d18125$51af21a0$f50d64e0$@capesim.com> Message-ID: <871t71xcpm.fsf@jedbrown.org> Satish Balay writes: > Not sure how ccache helps with header dependencies. It returns the cached version if all the dependencies (including headers) are unchanged. 
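
A minimal sketch of what case (3) might look like for the 4x4 example earlier in this thread, with two blocks on one rank. This is not code from the thread; it assumes the PC object is named preconditioner as above, that this rank owns global rows 0 through 7, and it uses the 0-based equivalents of the index sets listed earlier (PETSc index sets are 0-based):

  IS       is[2], is_local[2];
  PetscInt ov0[] = {0,1,2,4,5,6,8,9,10};    /* overlapping Omega_1 */
  PetscInt ov1[] = {1,2,3,5,6,7,9,10,11};   /* overlapping Omega_2 */
  PetscInt in0[] = {0,1,4,5};               /* non-overlapping part of Omega_1 */
  PetscInt in1[] = {2,3,6,7};               /* non-overlapping part of Omega_2 */
  ierr = ISCreateGeneral(PETSC_COMM_SELF,9,ov0,PETSC_COPY_VALUES,&is[0]);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_SELF,9,ov1,PETSC_COPY_VALUES,&is[1]);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_SELF,4,in0,PETSC_COPY_VALUES,&is_local[0]);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_SELF,4,in1,PETSC_COPY_VALUES,&is_local[1]);CHKERRQ(ierr);
  ierr = PCASMSetLocalSubdomains(preconditioner,2,is,is_local);CHKERRQ(ierr);

The other rank would do the same with its two subdomains; the index sets live on PETSC_COMM_SELF, use the global (parallel) numbering of the matrix, and every locally owned row should appear in exactly one of the is_local sets.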
> One can add in the dependencies to the makefile > > femex1.o: femex1.cpp lists.h shapes.h > shapes.o: shapes.cpp shapes.h This is an insane maintenance burden that is doomed to be eternally incorrect and out of date. You can generate correct header dependencies using the -MMD option (or variants for other compilers; defined as C_DEPFLAG in PETSc makefiles and used by gmakefile). It's simple; why not do it right? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Mar 23 17:46:59 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 23 Mar 2016 17:46:59 -0500 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: The baij and sbaij MatLoad() do automatically pad the matrix with rows/columns of the identity to make it divisible by the block size. This is why you are seeing what you are seeing. Barry Is this a good idea? Maybe not > On Mar 22, 2016, at 8:59 PM, Steena Monteiro wrote: > > > > > So, are you saying that > > 1) You have a matrix with odd total dimension > > 2) You set the block size of the initial matrix to 2 > > 3) You load the matrix > > and there is no error? Can you make a simple example with a matrix of size 5? > I can put in the relevant error checking. > > > > Hi Matt, > > Thank you for the suggestion. > > I used cage3, a 5x5 matrix from the UFL collection, converted it to binary, and tested the code for block sizes 1 through 7. I added printfs inside all the MatMult_SeqBAIJ_*s in baij2.c and also logged some counts (blocked rows and blocks). The counts make sense if the matrix is being padded somewhere to accommodate for block sizes that are not completely divisible by matrix dimensions. > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 2 > Inside SeqBAIJ_2 > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 3 > Inside MatMult_SeqBAIJ_3 > ... > ... > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 7 > Inside MatMult_SeqBAIJ_7 > > Table for different block sizes listing corresponding number of blocked rows and number of blocks inside the rows for cage3. > > > Block size No. of blocked rows No. of nnz blocks in each blocked row. > 1 5 5,3,3,4,4 > 2 3 3,3,3 > 3 2 2,2 > 4 2 2,2 > 5 1 1 > 6 1 1 > 7 1 1 > > I am attaching cage3.dat and cage3.mtx. 
> > Thanks, > Steena > > > > > > > ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr); > > > ierr = VecSetRandom(x,NULL); CHKERRQ(ierr); > ierr = VecSet(y,zero); CHKERRQ(ierr); > ierr = MatMult(A,x,y); CHKERRQ(ierr); > > > ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); > ierr = MatDestroy(&A);CHKERRQ(ierr); > ierr = VecDestroy(&x);CHKERRQ(ierr); > ierr = VecDestroy(&y);CHKERRQ(ierr); > > Thanks, > Steena > > > On 15 March 2016 at 09:15, Matthew Knepley wrote: > On Tue, Mar 15, 2016 at 11:04 AM, Steena Monteiro wrote: > I pass a binary, matrix data file at the command line and load it into the matrix: > > PetscInitialize(&argc,&args,(char*)0,help); > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > > /* converted mtx to dat file*/ > ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); > CHKERRQ(ierr); > > if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat file with -f"); > > /* Load matrices */ > ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); > ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); > ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); > ierr = MatSetFromOptions(A);CHKERRQ(ierr); > > Nothing above loads a matrix. Do you also call MatLoad()? > > Matt > > Thanks, > Steena > > On 15 March 2016 at 08:58, Matthew Knepley wrote: > On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro wrote: > Thank you, Dave. > > Matt: I understand the inconsistency but MatMult with non divisible block sizes (here, 2) does not throw any errors and fail, when MatSetSize is commented out. Implying that 1139905 global size does work with block size 2. > > If you comment out MatSetSize(), how does it know what size the Mat is? > > Matt > > On 15 March 2016 at 00:12, Dave May wrote: > > On 15 March 2016 at 04:46, Matthew Knepley wrote: > On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro wrote: > Hello, > > I am having difficulty getting MatSetSize to work prior to using MatMult. > > For matrix A with rows=cols=1,139,905 and block size = 2, > > It is inconsistent to have a row/col size that is not divisible by the block size. > > > To be honest, I don't think the error message being thrown clearly indicates what the actual problem is (hence the email from Steena). What about > > "Cannot change/reset row sizes to 400000 local 1139906 global after previously setting them to 400000 local 1139905 global. 
Local and global sizes must be divisible by the block size" > > > Matt > > rank 0 gets 400000 rows and rank 1 739905 rows, like so: > > /*Matrix setup*/ > > ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); > ierr = MatCreate(PETSC_COMM_WORLD,&A); > ierr = MatSetFromOptions(A); > ierr = MatSetType(A,MATBAIJ); > ierr = MatSetBlockSize(A,2); > > /*Unequal row assignment*/ > > if (!rank) { > ierr = MatSetSizes(A, 400000, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); > } > else { > ierr = MatSetSizes(A, 739905, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); > } > > MatMult (A,x,y); > > /************************************/ > > Error message: > > 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this object type > Cannot change/reset row sizes to 400000 local 1139906 global after previously setting them to 400000 local 1139905 global > > [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 local 1139906 global after previously setting them to 739905 local 1139905 global > > -Without messing with row assignment, MatMult works fine on this matrix for block size = 2, presumably because an extra padded row is automatically added to facilitate blocking. > > -The above code snippet works well for block size = 1. > > Is it possible to do unequal row distribution while using blocking? > > Thank you for any advice. > > -Steena > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > From steena.hpc at gmail.com Wed Mar 23 19:11:53 2016 From: steena.hpc at gmail.com (Steena Monteiro) Date: Wed, 23 Mar 2016 17:11:53 -0700 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: Thanks, Barry. The block size compatibility was a probe to investigate an error arising when trying to assign unequal numbers of rows across two MPI ranks. To re-visit the initial question, what is going wrong when I try to divide rows unequally across MPI ranks using MatSetSize? 
For matrix A with rows=cols=1,139,905 and block size = 2, rank 0 gets 400000 rows and rank 1, 739905 rows if (!rank) { ierr = MatSetSizes(A, 400000, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); } else { ierr = MatSetSizes(A, 739905, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); } MatMult (A,x,y); /************************************/ Error message: [1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this object type Cannot change/reset row sizes to 400000 local 1139906 global after previously setting them to 400000 local 1139905 global [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 local 1139906 global after previously setting them to 739905 local 1139905 global On 23 March 2016 at 15:46, Barry Smith wrote: > > The baij and sbaij MatLoad() do automatically pad the matrix with > rows/columns of the identity to make it divisible by the block size. This > is why you are seeing what you are seeing. > > Barry > > Is this a good idea? Maybe not > > > > On Mar 22, 2016, at 8:59 PM, Steena Monteiro > wrote: > > > > > > > > > > So, are you saying that > > > > 1) You have a matrix with odd total dimension > > > > 2) You set the block size of the initial matrix to 2 > > > > 3) You load the matrix > > > > and there is no error? Can you make a simple example with a matrix of > size 5? > > I can put in the relevant error checking. > > > > > > > > Hi Matt, > > > > Thank you for the suggestion. > > > > I used cage3, a 5x5 matrix from the UFL collection, converted it to > binary, and tested the code for block sizes 1 through 7. I added printfs > inside all the MatMult_SeqBAIJ_*s in baij2.c and also logged some counts > (blocked rows and blocks). The counts make sense if the matrix is being > padded somewhere to accommodate for block sizes that are not completely > divisible by matrix dimensions. > > > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat > -matload_block_size 2 > > Inside SeqBAIJ_2 > > > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat > -matload_block_size 3 > > Inside MatMult_SeqBAIJ_3 > > ... > > ... > > > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat > -matload_block_size 7 > > Inside MatMult_SeqBAIJ_7 > > > > Table for different block sizes listing corresponding number of blocked > rows and number of blocks inside the rows for cage3. > > > > > > Block size No. of blocked rows No. of nnz blocks in each blocked > row. > > 1 5 5,3,3,4,4 > > 2 3 3,3,3 > > 3 2 2,2 > > 4 2 2,2 > > 5 1 1 > > 6 1 1 > > 7 1 1 > > > > I am attaching cage3.dat and cage3.mtx. 
> > > > Thanks, > > Steena > > > > > > > > > > > > > > ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr); > > > > > > ierr = VecSetRandom(x,NULL); CHKERRQ(ierr); > > ierr = VecSet(y,zero); CHKERRQ(ierr); > > ierr = MatMult(A,x,y); CHKERRQ(ierr); > > > > > > ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); > > ierr = MatDestroy(&A);CHKERRQ(ierr); > > ierr = VecDestroy(&x);CHKERRQ(ierr); > > ierr = VecDestroy(&y);CHKERRQ(ierr); > > > > Thanks, > > Steena > > > > > > On 15 March 2016 at 09:15, Matthew Knepley wrote: > > On Tue, Mar 15, 2016 at 11:04 AM, Steena Monteiro > wrote: > > I pass a binary, matrix data file at the command line and load it into > the matrix: > > > > PetscInitialize(&argc,&args,(char*)0,help); > > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > > > > /* converted mtx to dat file*/ > > ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); > > CHKERRQ(ierr); > > > > if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat > file with -f"); > > > > /* Load matrices */ > > ierr = > PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); > > ierr = > PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); > > ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); > > ierr = MatSetFromOptions(A);CHKERRQ(ierr); > > > > Nothing above loads a matrix. Do you also call MatLoad()? > > > > Matt > > > > Thanks, > > Steena > > > > On 15 March 2016 at 08:58, Matthew Knepley wrote: > > On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro > wrote: > > Thank you, Dave. > > > > Matt: I understand the inconsistency but MatMult with non divisible > block sizes (here, 2) does not throw any errors and fail, when MatSetSize > is commented out. Implying that 1139905 global size does work with block > size 2. > > > > If you comment out MatSetSize(), how does it know what size the Mat is? > > > > Matt > > > > On 15 March 2016 at 00:12, Dave May wrote: > > > > On 15 March 2016 at 04:46, Matthew Knepley wrote: > > On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro > wrote: > > Hello, > > > > I am having difficulty getting MatSetSize to work prior to using MatMult. > > > > For matrix A with rows=cols=1,139,905 and block size = 2, > > > > It is inconsistent to have a row/col size that is not divisible by the > block size. > > > > > > To be honest, I don't think the error message being thrown clearly > indicates what the actual problem is (hence the email from Steena). What > about > > > > "Cannot change/reset row sizes to 400000 local 1139906 global after > previously setting them to 400000 local 1139905 global. 
Local and global > sizes must be divisible by the block size" > > > > > > Matt > > > > rank 0 gets 400000 rows and rank 1 739905 rows, like so: > > > > /*Matrix setup*/ > > > > ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); > > ierr = MatCreate(PETSC_COMM_WORLD,&A); > > ierr = MatSetFromOptions(A); > > ierr = MatSetType(A,MATBAIJ); > > ierr = MatSetBlockSize(A,2); > > > > /*Unequal row assignment*/ > > > > if (!rank) { > > ierr = MatSetSizes(A, 400000, PETSC_DECIDE, > 1139905,1139905);CHKERRQ(ierr); > > } > > else { > > ierr = MatSetSizes(A, 739905, PETSC_DECIDE, > 1139905,1139905);CHKERRQ(ierr); > > } > > > > MatMult (A,x,y); > > > > /************************************/ > > > > Error message: > > > > 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this > object type > > Cannot change/reset row sizes to 400000 local 1139906 global after > previously setting them to 400000 local 1139905 global > > > > [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 > local 1139906 global after previously setting them to 739905 local 1139905 > global > > > > -Without messing with row assignment, MatMult works fine on this matrix > for block size = 2, presumably because an extra padded row is automatically > added to facilitate blocking. > > > > -The above code snippet works well for block size = 1. > > > > Is it possible to do unequal row distribution while using blocking? > > > > Thank you for any advice. > > > > -Steena > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 23 19:19:19 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 Mar 2016 19:19:19 -0500 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: On Wed, Mar 23, 2016 at 7:11 PM, Steena Monteiro wrote: > Thanks, Barry. The block size compatibility was a probe to investigate an > error arising when trying to assign unequal numbers of rows across two MPI > ranks. > > To re-visit the initial question, what is going wrong when I try to divide > rows unequally across MPI ranks using MatSetSize? > We definitely require that the blocksize divide the local size. This looks like a problem with our checks. 
Matt > For matrix A with rows=cols=1,139,905 and block size = 2, > > rank 0 gets 400000 rows and rank 1, 739905 rows > > if (!rank) { > > ierr = MatSetSizes(A, 400000, PETSC_DECIDE, > 1139905,1139905);CHKERRQ(ierr); > > } > > else { > > ierr = MatSetSizes(A, 739905, PETSC_DECIDE, > 1139905,1139905);CHKERRQ(ierr); > > } > > MatMult (A,x,y); > > > /************************************/ > > Error message: > > [1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this > object type > > Cannot change/reset row sizes to 400000 local 1139906 global after > previously setting them to 400000 local 1139905 global > > [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 > local 1139906 global after previously setting them to 739905 local 1139905 > global > > > > > > On 23 March 2016 at 15:46, Barry Smith wrote: > >> >> The baij and sbaij MatLoad() do automatically pad the matrix with >> rows/columns of the identity to make it divisible by the block size. This >> is why you are seeing what you are seeing. >> >> Barry >> >> Is this a good idea? Maybe not >> >> >> > On Mar 22, 2016, at 8:59 PM, Steena Monteiro >> wrote: >> > >> > >> > >> > >> > So, are you saying that >> > >> > 1) You have a matrix with odd total dimension >> > >> > 2) You set the block size of the initial matrix to 2 >> > >> > 3) You load the matrix >> > >> > and there is no error? Can you make a simple example with a matrix of >> size 5? >> > I can put in the relevant error checking. >> > >> > >> > >> > Hi Matt, >> > >> > Thank you for the suggestion. >> > >> > I used cage3, a 5x5 matrix from the UFL collection, converted it to >> binary, and tested the code for block sizes 1 through 7. I added printfs >> inside all the MatMult_SeqBAIJ_*s in baij2.c and also logged some counts >> (blocked rows and blocks). The counts make sense if the matrix is being >> padded somewhere to accommodate for block sizes that are not completely >> divisible by matrix dimensions. >> > >> > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat >> -matload_block_size 2 >> > Inside SeqBAIJ_2 >> > >> > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat >> -matload_block_size 3 >> > Inside MatMult_SeqBAIJ_3 >> > ... >> > ... >> > >> > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat >> -matload_block_size 7 >> > Inside MatMult_SeqBAIJ_7 >> > >> > Table for different block sizes listing corresponding number of blocked >> rows and number of blocks inside the rows for cage3. >> > >> > >> > Block size No. of blocked rows No. of nnz blocks in each blocked >> row. >> > 1 5 5,3,3,4,4 >> > 2 3 3,3,3 >> > 3 2 2,2 >> > 4 2 2,2 >> > 5 1 1 >> > 6 1 1 >> > 7 1 1 >> > >> > I am attaching cage3.dat and cage3.mtx. 
>> > >> > Thanks, >> > Steena >> > >> > >> > >> > >> > >> > >> > ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr); >> > >> > >> > ierr = VecSetRandom(x,NULL); CHKERRQ(ierr); >> > ierr = VecSet(y,zero); CHKERRQ(ierr); >> > ierr = MatMult(A,x,y); CHKERRQ(ierr); >> > >> > >> > ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); >> > ierr = MatDestroy(&A);CHKERRQ(ierr); >> > ierr = VecDestroy(&x);CHKERRQ(ierr); >> > ierr = VecDestroy(&y);CHKERRQ(ierr); >> > >> > Thanks, >> > Steena >> > >> > >> > On 15 March 2016 at 09:15, Matthew Knepley wrote: >> > On Tue, Mar 15, 2016 at 11:04 AM, Steena Monteiro >> wrote: >> > I pass a binary, matrix data file at the command line and load it into >> the matrix: >> > >> > PetscInitialize(&argc,&args,(char*)0,help); >> > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); >> > >> > /* converted mtx to dat file*/ >> > ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); >> > CHKERRQ(ierr); >> > >> > if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat >> file with -f"); >> > >> > /* Load matrices */ >> > ierr = >> PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); >> > ierr = >> PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); >> > ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); >> > ierr = MatSetFromOptions(A);CHKERRQ(ierr); >> > >> > Nothing above loads a matrix. Do you also call MatLoad()? >> > >> > Matt >> > >> > Thanks, >> > Steena >> > >> > On 15 March 2016 at 08:58, Matthew Knepley wrote: >> > On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro >> wrote: >> > Thank you, Dave. >> > >> > Matt: I understand the inconsistency but MatMult with non divisible >> block sizes (here, 2) does not throw any errors and fail, when MatSetSize >> is commented out. Implying that 1139905 global size does work with block >> size 2. >> > >> > If you comment out MatSetSize(), how does it know what size the Mat is? >> > >> > Matt >> > >> > On 15 March 2016 at 00:12, Dave May wrote: >> > >> > On 15 March 2016 at 04:46, Matthew Knepley wrote: >> > On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro >> wrote: >> > Hello, >> > >> > I am having difficulty getting MatSetSize to work prior to using >> MatMult. >> > >> > For matrix A with rows=cols=1,139,905 and block size = 2, >> > >> > It is inconsistent to have a row/col size that is not divisible by the >> block size. >> > >> > >> > To be honest, I don't think the error message being thrown clearly >> indicates what the actual problem is (hence the email from Steena). What >> about >> > >> > "Cannot change/reset row sizes to 400000 local 1139906 global after >> previously setting them to 400000 local 1139905 global. 
Local and global >> sizes must be divisible by the block size" >> > >> > >> > Matt >> > >> > rank 0 gets 400000 rows and rank 1 739905 rows, like so: >> > >> > /*Matrix setup*/ >> > >> > ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); >> > ierr = MatCreate(PETSC_COMM_WORLD,&A); >> > ierr = MatSetFromOptions(A); >> > ierr = MatSetType(A,MATBAIJ); >> > ierr = MatSetBlockSize(A,2); >> > >> > /*Unequal row assignment*/ >> > >> > if (!rank) { >> > ierr = MatSetSizes(A, 400000, PETSC_DECIDE, >> 1139905,1139905);CHKERRQ(ierr); >> > } >> > else { >> > ierr = MatSetSizes(A, 739905, PETSC_DECIDE, >> 1139905,1139905);CHKERRQ(ierr); >> > } >> > >> > MatMult (A,x,y); >> > >> > /************************************/ >> > >> > Error message: >> > >> > 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this >> object type >> > Cannot change/reset row sizes to 400000 local 1139906 global after >> previously setting them to 400000 local 1139905 global >> > >> > [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 >> local 1139906 global after previously setting them to 739905 local 1139905 >> global >> > >> > -Without messing with row assignment, MatMult works fine on this >> matrix for block size = 2, presumably because an extra padded row is >> automatically added to facilitate blocking. >> > >> > -The above code snippet works well for block size = 1. >> > >> > Is it possible to do unequal row distribution while using blocking? >> > >> > Thank you for any advice. >> > >> > -Steena >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Mar 23 19:32:28 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 23 Mar 2016 19:32:28 -0500 Subject: [petsc-users] MatSetSizes with blocked matrix In-Reply-To: References: Message-ID: <099D52BA-1077-44E5-8959-2708EC2F096F@mcs.anl.gov> BAIJ and SBAIJ will only pad matrices on a MatLoad() if you have not already set the sizes. See the source code for MatLoad_MPIBAIJ(). Barry > On Mar 23, 2016, at 7:11 PM, Steena Monteiro wrote: > > Thanks, Barry. The block size compatibility was a probe to investigate an error arising when trying to assign unequal numbers of rows across two MPI ranks. > > To re-visit the initial question, what is going wrong when I try to divide rows unequally across MPI ranks using MatSetSize? 
> > For matrix A with rows=cols=1,139,905 and block size = 2, > rank 0 gets 400000 rows and rank 1, 739905 rows > > if (!rank) { > > ierr = MatSetSizes(A, 400000, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); > > } > > else { > > ierr = MatSetSizes(A, 739905, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); > > } > > MatMult (A,x,y); > > > > /************************************/ > > Error message: > > [1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this object type > > Cannot change/reset row sizes to 400000 local 1139906 global after previously setting them to 400000 local 1139905 global > > [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 local 1139906 global after previously setting them to 739905 local 1139905 global > > > > > > > On 23 March 2016 at 15:46, Barry Smith wrote: > > The baij and sbaij MatLoad() do automatically pad the matrix with rows/columns of the identity to make it divisible by the block size. This is why you are seeing what you are seeing. > > Barry > > Is this a good idea? Maybe not > > > > On Mar 22, 2016, at 8:59 PM, Steena Monteiro wrote: > > > > > > > > > > So, are you saying that > > > > 1) You have a matrix with odd total dimension > > > > 2) You set the block size of the initial matrix to 2 > > > > 3) You load the matrix > > > > and there is no error? Can you make a simple example with a matrix of size 5? > > I can put in the relevant error checking. > > > > > > > > Hi Matt, > > > > Thank you for the suggestion. > > > > I used cage3, a 5x5 matrix from the UFL collection, converted it to binary, and tested the code for block sizes 1 through 7. I added printfs inside all the MatMult_SeqBAIJ_*s in baij2.c and also logged some counts (blocked rows and blocks). The counts make sense if the matrix is being padded somewhere to accommodate for block sizes that are not completely divisible by matrix dimensions. > > > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 2 > > Inside SeqBAIJ_2 > > > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 3 > > Inside MatMult_SeqBAIJ_3 > > ... > > ... > > > > surface86 at monteiro:./rvector-petsctrain-seqbaij -fin cage3.dat -matload_block_size 7 > > Inside MatMult_SeqBAIJ_7 > > > > Table for different block sizes listing corresponding number of blocked rows and number of blocks inside the rows for cage3. > > > > > > Block size No. of blocked rows No. of nnz blocks in each blocked row. > > 1 5 5,3,3,4,4 > > 2 3 3,3,3 > > 3 2 2,2 > > 4 2 2,2 > > 5 1 1 > > 6 1 1 > > 7 1 1 > > > > I am attaching cage3.dat and cage3.mtx. 
> > > > Thanks, > > Steena > > > > > > > > > > > > > > ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr); > > > > > > ierr = VecSetRandom(x,NULL); CHKERRQ(ierr); > > ierr = VecSet(y,zero); CHKERRQ(ierr); > > ierr = MatMult(A,x,y); CHKERRQ(ierr); > > > > > > ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); > > ierr = MatDestroy(&A);CHKERRQ(ierr); > > ierr = VecDestroy(&x);CHKERRQ(ierr); > > ierr = VecDestroy(&y);CHKERRQ(ierr); > > > > Thanks, > > Steena > > > > > > On 15 March 2016 at 09:15, Matthew Knepley wrote: > > On Tue, Mar 15, 2016 at 11:04 AM, Steena Monteiro wrote: > > I pass a binary, matrix data file at the command line and load it into the matrix: > > > > PetscInitialize(&argc,&args,(char*)0,help); > > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > > > > /* converted mtx to dat file*/ > > ierr = PetscOptionsGetString(NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg); > > CHKERRQ(ierr); > > > > if (!flg) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"specify matrix dat file with -f"); > > > > /* Load matrices */ > > ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); > > ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr); > > ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); > > ierr = MatSetFromOptions(A);CHKERRQ(ierr); > > > > Nothing above loads a matrix. Do you also call MatLoad()? > > > > Matt > > > > Thanks, > > Steena > > > > On 15 March 2016 at 08:58, Matthew Knepley wrote: > > On Tue, Mar 15, 2016 at 10:54 AM, Steena Monteiro wrote: > > Thank you, Dave. > > > > Matt: I understand the inconsistency but MatMult with non divisible block sizes (here, 2) does not throw any errors and fail, when MatSetSize is commented out. Implying that 1139905 global size does work with block size 2. > > > > If you comment out MatSetSize(), how does it know what size the Mat is? > > > > Matt > > > > On 15 March 2016 at 00:12, Dave May wrote: > > > > On 15 March 2016 at 04:46, Matthew Knepley wrote: > > On Mon, Mar 14, 2016 at 10:05 PM, Steena Monteiro wrote: > > Hello, > > > > I am having difficulty getting MatSetSize to work prior to using MatMult. > > > > For matrix A with rows=cols=1,139,905 and block size = 2, > > > > It is inconsistent to have a row/col size that is not divisible by the block size. > > > > > > To be honest, I don't think the error message being thrown clearly indicates what the actual problem is (hence the email from Steena). What about > > > > "Cannot change/reset row sizes to 400000 local 1139906 global after previously setting them to 400000 local 1139905 global. 
Local and global sizes must be divisible by the block size" > > > > > > Matt > > > > rank 0 gets 400000 rows and rank 1 739905 rows, like so: > > > > /*Matrix setup*/ > > > > ierr=PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd); > > ierr = MatCreate(PETSC_COMM_WORLD,&A); > > ierr = MatSetFromOptions(A); > > ierr = MatSetType(A,MATBAIJ); > > ierr = MatSetBlockSize(A,2); > > > > /*Unequal row assignment*/ > > > > if (!rank) { > > ierr = MatSetSizes(A, 400000, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); > > } > > else { > > ierr = MatSetSizes(A, 739905, PETSC_DECIDE, 1139905,1139905);CHKERRQ(ierr); > > } > > > > MatMult (A,x,y); > > > > /************************************/ > > > > Error message: > > > > 1]PETSC ERROR: [0]PETSC ERROR: No support for this operation for this object type > > Cannot change/reset row sizes to 400000 local 1139906 global after previously setting them to 400000 local 1139905 global > > > > [1]PETSC ERROR: [0]PETSC ERROR: Cannot change/reset row sizes to 739905 local 1139906 global after previously setting them to 739905 local 1139905 global > > > > -Without messing with row assignment, MatMult works fine on this matrix for block size = 2, presumably because an extra padded row is automatically added to facilitate blocking. > > > > -The above code snippet works well for block size = 1. > > > > Is it possible to do unequal row distribution while using blocking? > > > > Thank you for any advice. > > > > -Steena > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > From justin.c.droba at nasa.gov Thu Mar 24 09:48:52 2016 From: justin.c.droba at nasa.gov (Justin Droba (JSC-EG3) [JETS]) Date: Thu, 24 Mar 2016 09:48:52 -0500 Subject: [petsc-users] ASM Index Sets [Repost with Correction] In-Reply-To: References: <56F2A86F.6040605@nasa.gov> <56F2AA47.1050902@nasa.gov> Message-ID: <56F3FE54.3000901@nasa.gov> Barry, Thank you for your response. I'm running a larger problem than the simple example in my initial message. But here is what happens: (1) One block for each processor, with commands > PCASMSetLocalSubdomains(preconditioner,local_sdom,is,NULL); > PCASMSetOverlap(preconditioner,1); where is = overlapping domain assigned to this processor, then it works nicely. (2) One block for each processor, with commands > PCASMSetLocalSubdomains(preconditioner,local_sdom,is,is_local); > PCASMSetOverlap(preconditioner,0); where is = overlapping domain assigned to this processor and is_local the disjoint domain assigned to this processor, I get the incorrect answer. (I can remove that second statement and I get the same answer.) (3) More than one block for each processor, I get a seg fault with either set of commands above. 
Best regards, Justin On 3/23/16 10:29 AM, Barry Smith wrote: > Seems ok. You'll need to be more specific about "isn't quite working." > > Barry > >> On Mar 23, 2016, at 9:37 AM, Justin Droba (JSC-EG3) [JETS] wrote: >> >> Dear all, >> >> Very sorry to flood the mailing list, one final post because the image will not come through if you're on digest or using plain text. >> >> Thank you for maintaining this wonderful software and this user mailing list! >> >> I am attempting to use an additive Schwartz preconditioner with multiple blocks per processor. I've been able to get it to work with one block per processor but not been successful with multiple. I believe I am not understanding correctly the construction of index sets for this case. >> >> Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: >> >> >> >> https://farm2.staticflickr.com/1633/25912726941_428d61ae87_o.png >> >> >> I want to put Omega_1 and Omega_2 on processor 1 and the other two on the second processor. Do I make my index sets on processor 1 >> >> IS = { {1,2,3,5,6,7,9,10,11}, {2,3,4,6,7,8,10,11,12} } >> IS_LOCAL = { {1,2,5,6}, {3,4,7,8} } >> >> and similarly for the other processor? I tried this and it isn't quite working out. Do I misunderstand something? >> >> Thank you! >> >> Best regards, >> Justin >> >> -- >> --------------------------------------------- >> Justin Droba, PhD >> Applied Aeroscience and CFD Branch (EG3) >> NASA Lyndon B. Johnson Space Center >> JETS/Jacobs Technology, Inc. and HX5, LLC >> >> Office: Building 16, Room 142 >> Phone: 281-483-1451 >> --------------------------------------------- >> >> >> >> -- --------------------------------------------- Justin Droba, PhD Applied Aeroscience and CFD Branch (EG3) NASA Lyndon B. Johnson Space Center JETS/Jacobs Technology, Inc. and HX5, LLC *Office:* Building 16, Room 142 *Phone*: 281-483-1451 --------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 24 10:01:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 24 Mar 2016 10:01:01 -0500 Subject: [petsc-users] ASM Index Sets [Repost with Correction] In-Reply-To: <56F3FE54.3000901@nasa.gov> References: <56F2A86F.6040605@nasa.gov> <56F2AA47.1050902@nasa.gov> <56F3FE54.3000901@nasa.gov> Message-ID: On Thu, Mar 24, 2016 at 9:48 AM, Justin Droba (JSC-EG3) [JETS] < justin.c.droba at nasa.gov> wrote: > Barry, > > Thank you for your response. > > I'm running a larger problem than the simple example in my initial > message. But here is what happens: > (1) One block for each processor, with commands > > PCASMSetLocalSubdomains(preconditioner,local_sdom,is,NULL); > PCASMSetOverlap(preconditioner,1); > > where is = overlapping domain assigned to this processor, then it works > nicely. > Then you must not be doing RASM, and the second statement has no effect. > (2) One block for each processor, with commands > > PCASMSetLocalSubdomains(preconditioner,local_sdom,is,is_local); > PCASMSetOverlap(preconditioner,0); > > where is = overlapping domain assigned to this processor and is_local the > disjoint domain assigned to this processor, I get the incorrect answer. (I > can remove that second statement and I get the same answer.) > How do you know what the correct answer is? The second statement has no effect. > (3) More than one block for each processor, I get a seg fault with either > set of commands above. > Are you providing an array of length local_sdom? 
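A rough sketch of what such arrays might look like for the two-blocks-per-rank layout of the 4x4 example earlier in the thread (this is not the poster's code; the node numbers are copied from that example, and whether they must first be converted to 0-based global dof indices is an assumption to verify for the real problem):

PetscInt n_local_blocks  = 2;                          /* must match the length of both IS arrays below */
PetscInt idx_overlap1[9] = {1,2,3,5,6,7,9,10,11};      /* Omega_1 with overlap */
PetscInt idx_overlap2[9] = {2,3,4,6,7,8,10,11,12};     /* Omega_2 with overlap */
PetscInt idx_inner1[4]   = {1,2,5,6};                  /* Omega_1, non-overlapping part */
PetscInt idx_inner2[4]   = {3,4,7,8};                  /* Omega_2, non-overlapping part */
IS       is[2], is_local[2];

ierr = ISCreateGeneral(PETSC_COMM_SELF,9,idx_overlap1,PETSC_COPY_VALUES,&is[0]);CHKERRQ(ierr);
ierr = ISCreateGeneral(PETSC_COMM_SELF,9,idx_overlap2,PETSC_COPY_VALUES,&is[1]);CHKERRQ(ierr);
ierr = ISCreateGeneral(PETSC_COMM_SELF,4,idx_inner1,PETSC_COPY_VALUES,&is_local[0]);CHKERRQ(ierr);
ierr = ISCreateGeneral(PETSC_COMM_SELF,4,idx_inner2,PETSC_COPY_VALUES,&is_local[1]);CHKERRQ(ierr);
ierr = PCASMSetLocalSubdomains(pc,n_local_blocks,is,is_local);CHKERRQ(ierr);

The point of the sketch is that both is and is_local are arrays of length n_local_blocks; passing a shorter array (or a single IS) while requesting more than one block per process would be consistent with the seg fault reported above.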
Matt > Best regards, > Justin > > > On 3/23/16 10:29 AM, Barry Smith wrote: > > Seems ok. You'll need to be more specific about "isn't quite working." > > Barry > > > On Mar 23, 2016, at 9:37 AM, Justin Droba (JSC-EG3) [JETS] wrote: > > Dear all, > > Very sorry to flood the mailing list, one final post because the image will not come through if you're on digest or using plain text. > > Thank you for maintaining this wonderful software and this user mailing list! > > I am attempting to use an additive Schwartz preconditioner with multiple blocks per processor. I've been able to get it to work with one block per processor but not been successful with multiple. I believe I am not understanding correctly the construction of index sets for this case. > > Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: > > > https://farm2.staticflickr.com/1633/25912726941_428d61ae87_o.png > > > I want to put Omega_1 and Omega_2 on processor 1 and the other two on the second processor. Do I make my index sets on processor 1 > > IS = { {1,2,3,5,6,7,9,10,11}, {2,3,4,6,7,8,10,11,12} } > IS_LOCAL = { {1,2,5,6}, {3,4,7,8} } > > and similarly for the other processor? I tried this and it isn't quite working out. Do I misunderstand something? > > Thank you! > > Best regards, > Justin > > -- > --------------------------------------------- > Justin Droba, PhD > Applied Aeroscience and CFD Branch (EG3) > NASA Lyndon B. Johnson Space Center > JETS/Jacobs Technology, Inc. and HX5, LLC > > Office: Building 16, Room 142 > Phone: 281-483-1451 > --------------------------------------------- > > > > > > > -- > --------------------------------------------- > Justin Droba, PhD > Applied Aeroscience and CFD Branch (EG3) > NASA Lyndon B. Johnson Space Center > JETS/Jacobs Technology, Inc. and HX5, LLC > > *Office:* Building 16, Room 142 > *Phone*: 281-483-1451 > --------------------------------------------- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From A.T.T.McRae at bath.ac.uk Thu Mar 24 10:18:22 2016 From: A.T.T.McRae at bath.ac.uk (Andrew McRae) Date: Thu, 24 Mar 2016 15:18:22 +0000 Subject: [petsc-users] SNES + linesearch hackery? Message-ID: I have a finite element discretisation of the following nonlinear equation: m*(phi_xx * phi_yy - phi_xy^2) = const, solving for phi. Unfortunately, the function m depends on phi in a complicated way -- let's assume I need to call my own function to handle this. I'm using PETSc's SNES in Python via petsc4py, within the wider environment of the software Firedrake. Currently I'm hacking in the m update (and various output diagnostics) by writing a Python function "fakemonitor" and calling snes.setMonitor(fakemonitor). This allows me to update m each nonlinear iteration. While this is better than nothing, there's still some problems: if I use e.g. snes_linesearch_type: "l2", the fnorms for lambda = 1.0, 0.5 and 0.0 are calculated without updating m, and so the step length taken is (seemingly) far from optimal. I tried adding a damping parameter, but all this does is change the lambdas used to generate the quadratic fit; it doesn't actually make the step length smaller. Is there some cleaner way to do what I want, perhaps by intercepting the fnorm calculation to update m, rather than abusing a custom monitor routine? 
Thanks, Andrew -------------- next part -------------- An HTML attachment was scrubbed... URL: From justin.c.droba at nasa.gov Thu Mar 24 10:31:32 2016 From: justin.c.droba at nasa.gov (Justin Droba (JSC-EG3) [JETS]) Date: Thu, 24 Mar 2016 10:31:32 -0500 Subject: [petsc-users] ASM Index Sets [Repost with Correction] In-Reply-To: References: <56F2A86F.6040605@nasa.gov> <56F2AA47.1050902@nasa.gov> <56F3FE54.3000901@nasa.gov> Message-ID: <56F40854.9070508@nasa.gov> Matt and Barry, Thank you for your help! On 3/24/16 10:01 AM, Matthew Knepley wrote: > On Thu, Mar 24, 2016 at 9:48 AM, Justin Droba (JSC-EG3) [JETS] > wrote: > > Barry, > > Thank you for your response. > > I'm running a larger problem than the simple example in my initial > message. But here is what happens: > (1) One block for each processor, with commands >> PCASMSetLocalSubdomains(preconditioner,local_sdom,is,NULL); >> PCASMSetOverlap(preconditioner,1); > where is = overlapping domain assigned to this processor, then it > works nicely. > > > Then you must not be doing RASM, and the second statement has no effect. Well, this would certainly explain why this case works. > > (2) One block for each processor, with commands >> PCASMSetLocalSubdomains(preconditioner,local_sdom,is,is_local); >> PCASMSetOverlap(preconditioner,0); > where is = overlapping domain assigned to this processor and > is_local the disjoint domain assigned to this processor, I get the > incorrect answer. (I can remove that second statement and I get > the same answer.) > > > How do you know what the correct answer is? The second statement has > no effect. The system solves for weights for a radial basis interpolation. I use these weights to compute the interpolated values and in this case, I am way off. I will look through my domains again. I'll also see if I can get your previous suggestion of using pc_asm_blocks. I can just set that in-code with PetscOptionsSetValue("-pc_asm_blocks","25"), correct? > (3) More than one block for each processor, I get a seg fault with > either set of commands above. > > > Are you providing an array of length local_sdom? local_sdom should be the number of domains for the processor, right? If I have 25 blocks and 5 processors (distributed evenly), it should be 5. For one block per processor, it's 1. Best regards, Justin > > Matt > > Best regards, > Justin > > > On 3/23/16 10:29 AM, Barry Smith wrote: >> Seems ok. You'll need to be more specific about "isn't quite working." >> >> Barry >> >>> On Mar 23, 2016, at 9:37 AM, Justin Droba (JSC-EG3) [JETS] wrote: >>> >>> Dear all, >>> >>> Very sorry to flood the mailing list, one final post because the image will not come through if you're on digest or using plain text. >>> >>> Thank you for maintaining this wonderful software and this user mailing list! >>> >>> I am attempting to use an additive Schwartz preconditioner with multiple blocks per processor. I've been able to get it to work with one block per processor but not been successful with multiple. I believe I am not understanding correctly the construction of index sets for this case. >>> >>> Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: >>> >>> >>> >>> https://farm2.staticflickr.com/1633/25912726941_428d61ae87_o.png >>> >>> >>> I want to put Omega_1 and Omega_2 on processor 1 and the other two on the second processor. 
Do I make my index sets on processor 1 >>> >>> IS = { {1,2,3,5,6,7,9,10,11}, {2,3,4,6,7,8,10,11,12} } >>> IS_LOCAL = { {1,2,5,6}, {3,4,7,8} } >>> >>> and similarly for the other processor? I tried this and it isn't quite working out. Do I misunderstand something? >>> >>> Thank you! >>> >>> Best regards, >>> Justin >>> >>> -- >>> --------------------------------------------- >>> Justin Droba, PhD >>> Applied Aeroscience and CFD Branch (EG3) >>> NASA Lyndon B. Johnson Space Center >>> JETS/Jacobs Technology, Inc. and HX5, LLC >>> >>> Office: Building 16, Room 142 >>> Phone: 281-483-1451 >>> --------------------------------------------- >>> >>> >>> >>> > > -- > --------------------------------------------- > Justin Droba, PhD > Applied Aeroscience and CFD Branch (EG3) > NASA Lyndon B. Johnson Space Center > JETS/Jacobs Technology, Inc. and HX5, LLC > > *Office:* Building 16, Room 142 > *Phone*: 281-483-1451 > --------------------------------------------- > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -- --------------------------------------------- Justin Droba, PhD Applied Aeroscience and CFD Branch (EG3) NASA Lyndon B. Johnson Space Center JETS/Jacobs Technology, Inc. and HX5, LLC *Office:* Building 16, Room 142 *Phone*: 281-483-1451 --------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Mar 24 12:39:39 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 24 Mar 2016 12:39:39 -0500 Subject: [petsc-users] SNES + linesearch hackery? In-Reply-To: References: Message-ID: <196348E7-D29E-4DEB-ADD9-945F16F751BE@mcs.anl.gov> > On Mar 24, 2016, at 10:18 AM, Andrew McRae wrote: > > I have a finite element discretisation of the following nonlinear equation: > > m*(phi_xx * phi_yy - phi_xy^2) = const, > > solving for phi. Unfortunately, the function m depends on phi in a complicated way -- let's assume I need to call my own function to handle this. Andrew So you are actually solving m(phi)*(phi_xx * phi_yy - phi_xy^2) - const = 0 with finite elements for phi? What are you providing for a Jacobian? > > I'm using PETSc's SNES in Python via petsc4py, within the wider environment of the software Firedrake. > > Currently I'm hacking in the m update (and various output diagnostics) by writing a Python function "fakemonitor" and calling snes.setMonitor(fakemonitor). This allows me to update m each nonlinear iteration. Hmm, I don't understand this. It sounds like you are passing (phi_xx * phi_yy - phi_xy^2) or something to SNES as the SNESFormFunction()? Why is this? Why not pass the entire function to SNES? Barry > > While this is better than nothing, there's still some problems: if I use e.g. snes_linesearch_type: "l2", the fnorms for lambda = 1.0, 0.5 and 0.0 are calculated without updating m, and so the step length taken is (seemingly) far from optimal. I tried adding a damping parameter, but all this does is change the lambdas used to generate the quadratic fit; it doesn't actually make the step length smaller. > > Is there some cleaner way to do what I want, perhaps by intercepting the fnorm calculation to update m, rather than abusing a custom monitor routine? 
> > Thanks, > Andrew From knepley at gmail.com Thu Mar 24 13:07:06 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 24 Mar 2016 13:07:06 -0500 Subject: [petsc-users] ASM Index Sets [Repost with Correction] In-Reply-To: <56F40854.9070508@nasa.gov> References: <56F2A86F.6040605@nasa.gov> <56F2AA47.1050902@nasa.gov> <56F3FE54.3000901@nasa.gov> <56F40854.9070508@nasa.gov> Message-ID: On Thu, Mar 24, 2016 at 10:31 AM, Justin Droba (JSC-EG3) [JETS] < justin.c.droba at nasa.gov> wrote: > Matt and Barry, > > Thank you for your help! > > On 3/24/16 10:01 AM, Matthew Knepley wrote: > > On Thu, Mar 24, 2016 at 9:48 AM, Justin Droba (JSC-EG3) [JETS] < > justin.c.droba at nasa.gov> wrote: > >> Barry, >> >> Thank you for your response. >> >> I'm running a larger problem than the simple example in my initial >> message. But here is what happens: >> (1) One block for each processor, with commands >> >> PCASMSetLocalSubdomains(preconditioner,local_sdom,is,NULL); >> PCASMSetOverlap(preconditioner,1); >> >> where is = overlapping domain assigned to this processor, then it works >> nicely. >> > > Then you must not be doing RASM, and the second statement has no effect. > > > Well, this would certainly explain why this case works. > > > >> (2) One block for each processor, with commands >> >> PCASMSetLocalSubdomains(preconditioner,local_sdom,is,is_local); >> PCASMSetOverlap(preconditioner,0); >> >> where is = overlapping domain assigned to this processor and is_local the >> disjoint domain assigned to this processor, I get the incorrect answer. (I >> can remove that second statement and I get the same answer.) >> > > How do you know what the correct answer is? The second statement has no > effect. > > The system solves for weights for a radial basis interpolation. I use > these weights to compute the interpolated values and in this case, I am way > off. > > I will look through my domains again. I'll also see if I can get your > previous suggestion of using pc_asm_blocks. I can just set that in-code > with PetscOptionsSetValue("-pc_asm_blocks","25"), correct? > Yes. > > > (3) More than one block for each processor, I get a seg fault with either >> set of commands above. >> > > Are you providing an array of length local_sdom? > > local_sdom should be the number of domains for the processor, right? If I > have 25 blocks and 5 processors (distributed evenly), it should be 5. For > one block per processor, it's 1. > Yes. You could also look at what Rio Yokota did: https://github.com/barbagroup/petrbf Matt > Best regards, > Justin > > > Matt > > >> Best regards, >> Justin >> >> >> On 3/23/16 10:29 AM, Barry Smith wrote: >> >> Seems ok. You'll need to be more specific about "isn't quite working." >> >> Barry >> >> >> On Mar 23, 2016, at 9:37 AM, Justin Droba (JSC-EG3) [JETS] wrote: >> >> Dear all, >> >> Very sorry to flood the mailing list, one final post because the image will not come through if you're on digest or using plain text. >> >> Thank you for maintaining this wonderful software and this user mailing list! >> >> I am attempting to use an additive Schwartz preconditioner with multiple blocks per processor. I've been able to get it to work with one block per processor but not been successful with multiple. I believe I am not understanding correctly the construction of index sets for this case. 
>> >> Let's consider a simple example: 16 nodes on a square with 4 blocks of 4x4: >> >> >> https://farm2.staticflickr.com/1633/25912726941_428d61ae87_o.png >> >> >> I want to put Omega_1 and Omega_2 on processor 1 and the other two on the second processor. Do I make my index sets on processor 1 >> >> IS = { {1,2,3,5,6,7,9,10,11}, {2,3,4,6,7,8,10,11,12} } >> IS_LOCAL = { {1,2,5,6}, {3,4,7,8} } >> >> and similarly for the other processor? I tried this and it isn't quite working out. Do I misunderstand something? >> >> Thank you! >> >> Best regards, >> Justin >> >> -- >> --------------------------------------------- >> Justin Droba, PhD >> Applied Aeroscience and CFD Branch (EG3) >> NASA Lyndon B. Johnson Space Center >> JETS/Jacobs Technology, Inc. and HX5, LLC >> >> Office: Building 16, Room 142 >> Phone: 281-483-1451 >> --------------------------------------------- >> >> >> >> >> >> >> -- >> --------------------------------------------- >> Justin Droba, PhD >> Applied Aeroscience and CFD Branch (EG3) >> NASA Lyndon B. Johnson Space Center >> JETS/Jacobs Technology, Inc. and HX5, LLC >> >> *Office:* Building 16, Room 142 >> *Phone*: 281-483-1451 >> --------------------------------------------- >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- > --------------------------------------------- > Justin Droba, PhD > Applied Aeroscience and CFD Branch (EG3) > NASA Lyndon B. Johnson Space Center > JETS/Jacobs Technology, Inc. and HX5, LLC > > *Office:* Building 16, Room 142 > *Phone*: 281-483-1451 > --------------------------------------------- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From A.T.T.McRae at bath.ac.uk Thu Mar 24 14:41:17 2016 From: A.T.T.McRae at bath.ac.uk (Andrew McRae) Date: Thu, 24 Mar 2016 19:41:17 +0000 Subject: [petsc-users] SNES + linesearch hackery? In-Reply-To: <368e94bddd204e699f120016143ff9e6@exch07.campus.bath.ac.uk> References: <368e94bddd204e699f120016143ff9e6@exch07.campus.bath.ac.uk> Message-ID: Apologies, in the end it seems this was more of a Firedrake question: with the help of Lawrence Mitchell, I now believe I should simply intercept SNESFormFunction(). On 24 March 2016 at 17:39, Barry Smith wrote: > > > On Mar 24, 2016, at 10:18 AM, Andrew McRae > wrote: > > > > I have a finite element discretisation of the following nonlinear > equation: > > > > m*(phi_xx * phi_yy - phi_xy^2) = const, > > > > solving for phi. Unfortunately, the function m depends on phi in a > complicated way -- let's assume I need to call my own function to handle > this. > > Andrew > > So you are actually solving > > m(phi)*(phi_xx * phi_yy - phi_xy^2) - const = 0 > > with finite elements for phi? > > > What are you providing for a Jacobian? > The Jacobian I give treats m as being independent of phi, so just whatever you get from linearising det(Hessian(phi)). > > > > > > I'm using PETSc's SNES in Python via petsc4py, within the wider > environment of the software Firedrake. > > > > Currently I'm hacking in the m update (and various output diagnostics) > by writing a Python function "fakemonitor" and calling > snes.setMonitor(fakemonitor). This allows me to update m each nonlinear > iteration. 
> > Hmm, I don't understand this. It sounds like you are passing (phi_xx * > phi_yy - phi_xy^2) or something to SNES as the SNESFormFunction()? Why is > this? Why not pass the entire function to SNES? > I was passing in m(phi^n)(phi_xx * phi_yy - phi_xy^2) - const, i.e., m was effectively frozen from the last nonlinear iteration. As stated above, I think it's as simple as arranging for m to be updated whenever SNESFormFunction() is called, which involves hacking Firedrake code but not PETSc code. Thanks, Andrew > > Barry > > > > > While this is better than nothing, there's still some problems: if I use > e.g. snes_linesearch_type: "l2", the fnorms for lambda = 1.0, 0.5 and 0.0 > are calculated without updating m, and so the step length taken is > (seemingly) far from optimal. I tried adding a damping parameter, but all > this does is change the lambdas used to generate the quadratic fit; it > doesn't actually make the step length smaller. > > > > Is there some cleaner way to do what I want, perhaps by intercepting the > fnorm calculation to update m, rather than abusing a custom monitor routine? > > > > Thanks, > > Andrew > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Mar 24 14:49:30 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 24 Mar 2016 14:49:30 -0500 Subject: [petsc-users] SNES + linesearch hackery? In-Reply-To: References: <368e94bddd204e699f120016143ff9e6@exch07.campus.bath.ac.uk> Message-ID: > On Mar 24, 2016, at 2:41 PM, Andrew McRae wrote: > > Apologies, in the end it seems this was more of a Firedrake question: with the help of Lawrence Mitchell, I now believe I should simply intercept SNESFormFunction(). > > On 24 March 2016 at 17:39, Barry Smith wrote: > > > On Mar 24, 2016, at 10:18 AM, Andrew McRae wrote: > > > > I have a finite element discretisation of the following nonlinear equation: > > > > m*(phi_xx * phi_yy - phi_xy^2) = const, > > > > solving for phi. Unfortunately, the function m depends on phi in a complicated way -- let's assume I need to call my own function to handle this. > > Andrew > > So you are actually solving > > m(phi)*(phi_xx * phi_yy - phi_xy^2) - const = 0 > > with finite elements for phi? > > > What are you providing for a Jacobian? > > The Jacobian I give treats m as being independent of phi, so just whatever you get from linearising det(Hessian(phi)). Ahh, a Picard iteration :-) > > > > > > > I'm using PETSc's SNES in Python via petsc4py, within the wider environment of the software Firedrake. > > > > Currently I'm hacking in the m update (and various output diagnostics) by writing a Python function "fakemonitor" and calling snes.setMonitor(fakemonitor). This allows me to update m each nonlinear iteration. > > Hmm, I don't understand this. It sounds like you are passing (phi_xx * phi_yy - phi_xy^2) or something to SNES as the SNESFormFunction()? Why is this? Why not pass the entire function to SNES? > > I was passing in m(phi^n)(phi_xx * phi_yy - phi_xy^2) - const, i.e., m was effectively frozen from the last nonlinear iteration. As stated above, I think it's as simple as arranging for m to be updated whenever SNESFormFunction() is called, which involves hacking Firedrake code but not PETSc code. > > Thanks, > Andrew > > > Barry > > > > > While this is better than nothing, there's still some problems: if I use e.g. 
snes_linesearch_type: "l2", the fnorms for lambda = 1.0, 0.5 and 0.0 are calculated without updating m, and so the step length taken is (seemingly) far from optimal. I tried adding a damping parameter, but all this does is change the lambdas used to generate the quadratic fit; it doesn't actually make the step length smaller. > > > > Is there some cleaner way to do what I want, perhaps by intercepting the fnorm calculation to update m, rather than abusing a custom monitor routine? > > > > Thanks, > > Andrew > > From A.T.T.McRae at bath.ac.uk Thu Mar 24 15:06:49 2016 From: A.T.T.McRae at bath.ac.uk (Andrew McRae) Date: Thu, 24 Mar 2016 20:06:49 +0000 Subject: [petsc-users] SNES + linesearch hackery? In-Reply-To: References: <368e94bddd204e699f120016143ff9e6@exch07.campus.bath.ac.uk> Message-ID: On 24 March 2016 at 19:49, Barry Smith wrote: > > > On Mar 24, 2016, at 2:41 PM, Andrew McRae > wrote: > > > > Apologies, in the end it seems this was more of a Firedrake question: > with the help of Lawrence Mitchell, I now believe I should simply intercept > SNESFormFunction(). > > > > On 24 March 2016 at 17:39, Barry Smith wrote: > > > > > On Mar 24, 2016, at 10:18 AM, Andrew McRae > wrote: > > > > > > I have a finite element discretisation of the following nonlinear > equation: > > > > > > m*(phi_xx * phi_yy - phi_xy^2) = const, > > > > > > solving for phi. Unfortunately, the function m depends on phi in a > complicated way -- let's assume I need to call my own function to handle > this. > > > > Andrew > > > > So you are actually solving > > > > m(phi)*(phi_xx * phi_yy - phi_xy^2) - const = 0 > > > > with finite elements for phi? > > > > > > What are you providing for a Jacobian? > > > > The Jacobian I give treats m as being independent of phi, so just > whatever you get from linearising det(Hessian(phi)). > > Ahh, a Picard iteration :-) > Not quite, I think. Ah, that should have read "so just m*(whatever you get from linearising...)", and I was updating m between nonlinear iterations :) > > > > > > > > > > > > > > I'm using PETSc's SNES in Python via petsc4py, within the wider > environment of the software Firedrake. > > > > > > Currently I'm hacking in the m update (and various output diagnostics) > by writing a Python function "fakemonitor" and calling > snes.setMonitor(fakemonitor). This allows me to update m each nonlinear > iteration. > > > > Hmm, I don't understand this. It sounds like you are passing (phi_xx > * phi_yy - phi_xy^2) or something to SNES as the SNESFormFunction()? Why is > this? Why not pass the entire function to SNES? > > > > I was passing in m(phi^n)(phi_xx * phi_yy - phi_xy^2) - const, i.e., m > was effectively frozen from the last nonlinear iteration. As stated above, > I think it's as simple as arranging for m to be updated whenever > SNESFormFunction() is called, which involves hacking Firedrake code but not > PETSc code. > > > > Thanks, > > Andrew > > > > > > Barry > > > > > > > > While this is better than nothing, there's still some problems: if I > use e.g. snes_linesearch_type: "l2", the fnorms for lambda = 1.0, 0.5 and > 0.0 are calculated without updating m, and so the step length taken is > (seemingly) far from optimal. I tried adding a damping parameter, but all > this does is change the lambdas used to generate the quadratic fit; it > doesn't actually make the step length smaller. > > > > > > Is there some cleaner way to do what I want, perhaps by intercepting > the fnorm calculation to update m, rather than abusing a custom monitor > routine? 
> > > > > > Thanks, > > > Andrew > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 24 15:11:37 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 24 Mar 2016 15:11:37 -0500 Subject: [petsc-users] SNES + linesearch hackery? In-Reply-To: References: <368e94bddd204e699f120016143ff9e6@exch07.campus.bath.ac.uk> Message-ID: On Thu, Mar 24, 2016 at 3:06 PM, Andrew McRae wrote: > On 24 March 2016 at 19:49, Barry Smith wrote: > >> >> > On Mar 24, 2016, at 2:41 PM, Andrew McRae >> wrote: >> > >> > Apologies, in the end it seems this was more of a Firedrake question: >> with the help of Lawrence Mitchell, I now believe I should simply intercept >> SNESFormFunction(). >> > >> > On 24 March 2016 at 17:39, Barry Smith wrote: >> > >> > > On Mar 24, 2016, at 10:18 AM, Andrew McRae >> wrote: >> > > >> > > I have a finite element discretisation of the following nonlinear >> equation: >> > > >> > > m*(phi_xx * phi_yy - phi_xy^2) = const, >> > > >> > > solving for phi. Unfortunately, the function m depends on phi in a >> complicated way -- let's assume I need to call my own function to handle >> this. >> > >> > Andrew >> > >> > So you are actually solving >> > >> > m(phi)*(phi_xx * phi_yy - phi_xy^2) - const = 0 >> > >> > with finite elements for phi? >> > >> > >> > What are you providing for a Jacobian? >> > >> > The Jacobian I give treats m as being independent of phi, so just >> whatever you get from linearising det(Hessian(phi)). >> >> Ahh, a Picard iteration :-) >> > > Not quite, I think. > > Ah, that should have read "so just m*(whatever you get from > linearising...)", and I was updating m between nonlinear iterations :) > I think Barry is right. You can look at it this way. You froze a portion of your system, took the Jacobian of the rest, and used that for the step, then updated the frozen part. That is what lots of people call a Picard step. Matt > >> >> > >> > >> > >> > > >> > > I'm using PETSc's SNES in Python via petsc4py, within the wider >> environment of the software Firedrake. >> > > >> > > Currently I'm hacking in the m update (and various output >> diagnostics) by writing a Python function "fakemonitor" and calling >> snes.setMonitor(fakemonitor). This allows me to update m each nonlinear >> iteration. >> > >> > Hmm, I don't understand this. It sounds like you are passing >> (phi_xx * phi_yy - phi_xy^2) or something to SNES as the >> SNESFormFunction()? Why is this? Why not pass the entire function to SNES? >> > >> > I was passing in m(phi^n)(phi_xx * phi_yy - phi_xy^2) - const, i.e., m >> was effectively frozen from the last nonlinear iteration. As stated above, >> I think it's as simple as arranging for m to be updated whenever >> SNESFormFunction() is called, which involves hacking Firedrake code but not >> PETSc code. >> > >> > Thanks, >> > Andrew >> > >> > >> > Barry >> > >> > > >> > > While this is better than nothing, there's still some problems: if I >> use e.g. snes_linesearch_type: "l2", the fnorms for lambda = 1.0, 0.5 and >> 0.0 are calculated without updating m, and so the step length taken is >> (seemingly) far from optimal. I tried adding a damping parameter, but all >> this does is change the lambdas used to generate the quadratic fit; it >> doesn't actually make the step length smaller. >> > > >> > > Is there some cleaner way to do what I want, perhaps by intercepting >> the fnorm calculation to update m, rather than abusing a custom monitor >> routine? 
>> > > >> > > Thanks, >> > > Andrew >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From A.T.T.McRae at bath.ac.uk Thu Mar 24 16:21:47 2016 From: A.T.T.McRae at bath.ac.uk (Andrew McRae) Date: Thu, 24 Mar 2016 21:21:47 +0000 Subject: [petsc-users] SNES + linesearch hackery? In-Reply-To: <995c51e3c86a4e089995ca0ae2ec1107@exch07.campus.bath.ac.uk> References: <368e94bddd204e699f120016143ff9e6@exch07.campus.bath.ac.uk> <995c51e3c86a4e089995ca0ae2ec1107@exch07.campus.bath.ac.uk> Message-ID: On 24 March 2016 at 20:11, Matthew Knepley wrote: > On Thu, Mar 24, 2016 at 3:06 PM, Andrew McRae > wrote: > >> On 24 March 2016 at 19:49, Barry Smith wrote: >> >>> >>> > On Mar 24, 2016, at 2:41 PM, Andrew McRae >>> wrote: >>> > >>> > Apologies, in the end it seems this was more of a Firedrake question: >>> with the help of Lawrence Mitchell, I now believe I should simply intercept >>> SNESFormFunction(). >>> > >>> > On 24 March 2016 at 17:39, Barry Smith wrote: >>> > >>> > > On Mar 24, 2016, at 10:18 AM, Andrew McRae >>> wrote: >>> > > >>> > > I have a finite element discretisation of the following nonlinear >>> equation: >>> > > >>> > > m*(phi_xx * phi_yy - phi_xy^2) = const, >>> > > >>> > > solving for phi. Unfortunately, the function m depends on phi in a >>> complicated way -- let's assume I need to call my own function to handle >>> this. >>> > >>> > Andrew >>> > >>> > So you are actually solving >>> > >>> > m(phi)*(phi_xx * phi_yy - phi_xy^2) - const = 0 >>> > >>> > with finite elements for phi? >>> > >>> > >>> > What are you providing for a Jacobian? >>> > >>> > The Jacobian I give treats m as being independent of phi, so just >>> whatever you get from linearising det(Hessian(phi)). >>> >>> Ahh, a Picard iteration :-) >>> >> >> Not quite, I think. >> >> Ah, that should have read "so just m*(whatever you get from >> linearising...)", and I was updating m between nonlinear iterations :) >> > > I think Barry is right. You can look at it this way. You froze a portion > of your system, took the Jacobian of the rest, and > used that for the step, then updated the frozen part. That is what lots of > people call a Picard step. > > Matt > I see, thanks. > > >> >>> >>> > >>> > >>> > >>> > > >>> > > I'm using PETSc's SNES in Python via petsc4py, within the wider >>> environment of the software Firedrake. >>> > > >>> > > Currently I'm hacking in the m update (and various output >>> diagnostics) by writing a Python function "fakemonitor" and calling >>> snes.setMonitor(fakemonitor). This allows me to update m each nonlinear >>> iteration. >>> > >>> > Hmm, I don't understand this. It sounds like you are passing >>> (phi_xx * phi_yy - phi_xy^2) or something to SNES as the >>> SNESFormFunction()? Why is this? Why not pass the entire function to SNES? >>> > >>> > I was passing in m(phi^n)(phi_xx * phi_yy - phi_xy^2) - const, i.e., m >>> was effectively frozen from the last nonlinear iteration. As stated above, >>> I think it's as simple as arranging for m to be updated whenever >>> SNESFormFunction() is called, which involves hacking Firedrake code but not >>> PETSc code. >>> > >>> > Thanks, >>> > Andrew >>> > >>> > >>> > Barry >>> > >>> > > >>> > > While this is better than nothing, there's still some problems: if I >>> use e.g. 
snes_linesearch_type: "l2", the fnorms for lambda = 1.0, 0.5 and >>> 0.0 are calculated without updating m, and so the step length taken is >>> (seemingly) far from optimal. I tried adding a damping parameter, but all >>> this does is change the lambdas used to generate the quadratic fit; it >>> doesn't actually make the step length smaller. >>> > > >>> > > Is there some cleaner way to do what I want, perhaps by intercepting >>> the fnorm calculation to update m, rather than abusing a custom monitor >>> routine? >>> > > >>> > > Thanks, >>> > > Andrew >>> > >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysjosh.lo at gmail.com Fri Mar 25 23:21:14 2016 From: ysjosh.lo at gmail.com (Josh Lo) Date: Fri, 25 Mar 2016 23:21:14 -0500 Subject: [petsc-users] need more argument in FormFuction in SNESSetFunction Message-ID: Hi, I am trying to SNES with Fortran. In FormFunction in SNESSetFuction, i need to call other subroutine and many array to calculate F(x), but seems like SNESSetFunction only allows one user-defined array as input argument, say, "dummy" in the next line. CALL SNESSetFunction(snes,r,FormFunction,dummy,ierr) How can I pass more than one array through SNESSetFunction to FormFunction? I am a beginner for PETSc, so the question maybe naive. Thanks, Lo -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Fri Mar 25 23:43:14 2016 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Sat, 26 Mar 2016 13:43:14 +0900 Subject: [petsc-users] need more argument in FormFuction in SNESSetFunction In-Reply-To: References: Message-ID: Hi, If you want to use other arrays, you can either put them in modules and then "use" the module at the very top of the subroutine, or create a new type which can contain anything (scalars, integers, arrays but also more complex PETSc objects), and put an instance of this type in the place of your dummy array. For instance module types type levelcontext #include "petsc/finclude/petscdmdef.h" DM :: da PetscScalar :: MyArray(10) end type levelcontext end module types and then in your main you can do use types type(levelcontext) :: ctx Finally you call SNESSetFunction as CALL SNESSetFunction(snes,r,FormFunction,ctx,ierr) Inside SNESSetFunction you can then access your array or your DMDA as ctx%MyArray, ctx%da etc etc. Also, in principle I think you should set the context ctx of SNES with call SNESSetApplicationContext(snes,ctx,ierr) Best Timoth?e 2016-03-26 13:21 GMT+09:00 Josh Lo : > Hi, > > I am trying to SNES with Fortran. > In FormFunction in SNESSetFuction, i need to call other subroutine and > many array to calculate F(x), but seems like SNESSetFunction only allows > one user-defined array as input argument, say, "dummy" in the next line. > > CALL SNESSetFunction(snes,r,FormFunction,dummy,ierr) > > How can I pass more than one array through SNESSetFunction to FormFunction? > > I am a beginner for PETSc, so the question maybe naive. > > Thanks, > Lo > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bikash at umich.edu Sun Mar 27 15:30:51 2016 From: bikash at umich.edu (Bikash Kanungo) Date: Sun, 27 Mar 2016 16:30:51 -0400 Subject: [petsc-users] Real data structure with complex build Message-ID: Hi, Is there a way to define the Mat(s) and Vec(s) to be real while working with a complex build of PETSc? I'm asking because part of my problem is real while the other part is complex. SO I'll be able to gain some performance and reduction in memory if I can specify the PETSc data structures in the real half of my code to be real. Regards, Bikash -- Bikash S. Kanungo PhD Student Computational Materials Physics Group Mechanical Engineering University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Mar 27 15:39:13 2016 From: jed at jedbrown.org (Jed Brown) Date: Sun, 27 Mar 2016 21:39:13 +0100 Subject: [petsc-users] Real data structure with complex build In-Reply-To: References: Message-ID: <87d1qfveoe.fsf@jedbrown.org> Bikash Kanungo writes: > Hi, > > Is there a way to define the Mat(s) and Vec(s) to be real while working > with a complex build of PETSc? No, there is no supported way to do this. > I'm asking because part of my problem is real while the other part is > complex. SO I'll be able to gain some performance and reduction in > memory if I can specify the PETSc data structures in the real half of > my code to be real. Is the real part bigger and more difficult to solve than the complex part? If not, then don't worry about it; there would be little performance to gain anyway. If yes, then it goes on our list of apps that could stand to benefit from supporting this (a difficult proposition). -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bikash at umich.edu Sun Mar 27 15:45:51 2016 From: bikash at umich.edu (Bikash Kanungo) Date: Sun, 27 Mar 2016 16:45:51 -0400 Subject: [petsc-users] Real data structure with complex build In-Reply-To: <87d1qfveoe.fsf@jedbrown.org> References: <87d1qfveoe.fsf@jedbrown.org> Message-ID: Hi Jed, Thanks for the info. Yes the real part is the dominant part. It will help me a lot if such support is provided in PETSc. Regards, Bikash On Sun, Mar 27, 2016 at 4:39 PM, Jed Brown wrote: > Bikash Kanungo writes: > > > Hi, > > > > Is there a way to define the Mat(s) and Vec(s) to be real while working > > with a complex build of PETSc? > > No, there is no supported way to do this. > > > I'm asking because part of my problem is real while the other part is > > complex. SO I'll be able to gain some performance and reduction in > > memory if I can specify the PETSc data structures in the real half of > > my code to be real. > > Is the real part bigger and more difficult to solve than the complex > part? If not, then don't worry about it; there would be little > performance to gain anyway. If yes, then it goes on our list of apps > that could stand to benefit from supporting this (a difficult proposition). > -- Bikash S. Kanungo PhD Student Computational Materials Physics Group Mechanical Engineering University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Tue Mar 29 07:12:53 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 29 Mar 2016 13:12:53 +0100 Subject: [petsc-users] question about the PetscFVLeastSquaresPseudoInverseSVD In-Reply-To: <56F0B79D.8030408@gmail.com> References: <56F0B79D.8030408@gmail.com> Message-ID: <87wpolscsa.fsf@jedbrown.org> Rongliang Chen writes: > ----------------------------------- > Initialize the matrix A (is a 3 x 3 matrix): > 1.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 > 0.000000 0.000000 1.000000 > > Initialize the matrix B: > 1.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 > 0.000000 0.000000 1.000000 > > The output of the SVD based least square: > 0.500000 0.000000 -0.500000 > 0.500000 0.000000 -0.500000 > 0.000000 0.000000 1.000000 So the above looks like a problem. Let's see how that matrix is constructed. > ierr = PetscPrintf(PETSC_COMM_WORLD,"Initialize the matrix A (is a %d x %d matrix):\n", n, m); > /* initialize to identity */ > for (j=0; j for (i=0; i if (i == j) { > A[i + j*m] = 1.0; > }else{ > A[i + j*m] = 0.0; > } > PetscPrintf(PETSC_COMM_WORLD,"%f ", A[i + j*m]); It's a packed 3x3 matrix (lda=3). > ierr = PetscBLASIntCast(mstride,&lda);CHKERRQ(ierr); And yet mstride=4, so your matrix is not packed correctly. If you're not familiar with BLAS-style packing with lda, please read the documentation. > LAPACKgelss_(&M,&N,&nrhs,A,&lda,Brhs,&ldb, (PetscReal *) tau,&rcond,&irank,tmpwork,&ldwork,&info); If you use A[i + j*mstride] when filling in the entries above, the program outputs the following. maxNumFaces = 4, worksize = 75 ----------------------------------- Initialize the matrix A (is a 3 x 4 matrix): 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 ----------------------------------- Initialize the matrix A (is a 3 x 4 matrix): 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 ----------------------------------- Initialize the matrix A (is a 3 x 3 matrix): 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 ----------------------------------- Initialize the matrix A (is a 3 x 3 matrix): 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 ----------------------------------- Initialize the matrix A (is a 3 x 2 matrix): 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 Initialize the matrix B: 
1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 ----------------------------------- Initialize the matrix A (is a 3 x 2 matrix): 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 Initialize the matrix B: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 The output of the SVD based least square: 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From rongliang.chan at gmail.com Wed Mar 30 10:41:50 2016 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Wed, 30 Mar 2016 23:41:50 +0800 Subject: [petsc-users] question about the PetscFVLeastSquaresPseudoInverseSVD In-Reply-To: <87wpolscsa.fsf@jedbrown.org> References: <56F0B79D.8030408@gmail.com> <87wpolscsa.fsf@jedbrown.org> Message-ID: <56FBF3BE.3000902@gmail.com> Hi Jed, Many many thanks for your help! Best regards, Rongliang On 03/29/2016 08:12 PM, Jed Brown wrote: > Rongliang Chen writes: > >> ----------------------------------- >> Initialize the matrix A (is a 3 x 3 matrix): >> 1.000000 0.000000 0.000000 >> 0.000000 1.000000 0.000000 >> 0.000000 0.000000 1.000000 >> >> Initialize the matrix B: >> 1.000000 0.000000 0.000000 >> 0.000000 1.000000 0.000000 >> 0.000000 0.000000 1.000000 >> >> The output of the SVD based least square: >> 0.500000 0.000000 -0.500000 >> 0.500000 0.000000 -0.500000 >> 0.000000 0.000000 1.000000 > So the above looks like a problem. Let's see how that matrix is > constructed. > >> ierr = PetscPrintf(PETSC_COMM_WORLD,"Initialize the matrix A (is a %d x %d matrix):\n", n, m); >> /* initialize to identity */ >> for (j=0; j> for (i=0; i> if (i == j) { >> A[i + j*m] = 1.0; >> }else{ >> A[i + j*m] = 0.0; >> } >> PetscPrintf(PETSC_COMM_WORLD,"%f ", A[i + j*m]); > It's a packed 3x3 matrix (lda=3). > >> ierr = PetscBLASIntCast(mstride,&lda);CHKERRQ(ierr); > And yet mstride=4, so your matrix is not packed correctly. If you're > not familiar with BLAS-style packing with lda, please read the > documentation. > >> LAPACKgelss_(&M,&N,&nrhs,A,&lda,Brhs,&ldb, (PetscReal *) tau,&rcond,&irank,tmpwork,&ldwork,&info); > If you use A[i + j*mstride] when filling in the entries above, the > program outputs the following. 
> > maxNumFaces = 4, worksize = 75 > ----------------------------------- > Initialize the matrix A (is a 3 x 4 matrix): > 1.000000 0.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 0.000000 > 0.000000 0.000000 1.000000 0.000000 > > Initialize the matrix B: > 1.000000 0.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 0.000000 > 0.000000 0.000000 1.000000 0.000000 > 0.000000 0.000000 0.000000 1.000000 > > The output of the SVD based least square: > 1.000000 0.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 0.000000 > 0.000000 0.000000 1.000000 0.000000 > > ----------------------------------- > Initialize the matrix A (is a 3 x 4 matrix): > 1.000000 0.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 0.000000 > 0.000000 0.000000 1.000000 0.000000 > > Initialize the matrix B: > 1.000000 0.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 0.000000 > 0.000000 0.000000 1.000000 0.000000 > 0.000000 0.000000 0.000000 1.000000 > > The output of the SVD based least square: > 1.000000 0.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 0.000000 > 0.000000 0.000000 1.000000 0.000000 > > ----------------------------------- > Initialize the matrix A (is a 3 x 3 matrix): > 1.000000 0.000000 0.000000 > 0.000000 0.000000 1.000000 > 0.000000 0.000000 0.000000 > > Initialize the matrix B: > 1.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 > 0.000000 0.000000 1.000000 > > The output of the SVD based least square: > 1.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 > 0.000000 0.000000 1.000000 > > ----------------------------------- > Initialize the matrix A (is a 3 x 3 matrix): > 1.000000 0.000000 0.000000 > 0.000000 0.000000 1.000000 > 0.000000 0.000000 0.000000 > > Initialize the matrix B: > 1.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 > 0.000000 0.000000 1.000000 > > The output of the SVD based least square: > 1.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 > 0.000000 0.000000 1.000000 > > ----------------------------------- > Initialize the matrix A (is a 3 x 2 matrix): > 1.000000 0.000000 > 0.000000 0.000000 > 0.000000 1.000000 > > Initialize the matrix B: > 1.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 > 0.000000 0.000000 1.000000 > > The output of the SVD based least square: > 1.000000 0.000000 > 0.000000 1.000000 > 0.000000 0.000000 > > ----------------------------------- > Initialize the matrix A (is a 3 x 2 matrix): > 1.000000 0.000000 > 0.000000 0.000000 > 0.000000 1.000000 > > Initialize the matrix B: > 1.000000 0.000000 0.000000 > 0.000000 1.000000 0.000000 > 0.000000 0.000000 1.000000 > > The output of the SVD based least square: > 1.000000 0.000000 > 0.000000 1.000000 > 0.000000 0.000000 > From cp226 at duke.edu Wed Mar 30 12:54:27 2016 From: cp226 at duke.edu (Christian Peco Regales, Ph.D.) Date: Wed, 30 Mar 2016 17:54:27 +0000 Subject: [petsc-users] TS prestep function Message-ID: I have started to use the TS environment to solve a diffusion problem with XFEM in which, at the beginning of every step, a number of values have to be recomputed in order to properly fill the Jacobian (e.g. change some quadrature weights). However, I see that the the function is expected with the format func(TS ts), with no possibility of getting a user context to perform the computations I need. Is there a way to get around that? Thanks! Christian _______________________________ Christian Peco Regales, Ph.D. 
Postdoctoral Research Associate Civil and Environmental Engineering Pratt School of Engineering Duke University 2407 CIEMAS (office) Durham, NC 27708, USA email: christian.peco at duke.edu web: http://www.christianpeco.com _______________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Mar 30 13:38:32 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 30 Mar 2016 20:38:32 +0200 Subject: [petsc-users] TS prestep function In-Reply-To: References: Message-ID: On 30 March 2016 at 19:54, Christian Peco Regales, Ph.D. wrote: > I have started to use the TS environment to solve a diffusion problem with > XFEM in which, at the beginning of every step, a number of values have to > be recomputed in order to properly fill the Jacobian (e.g. change some > quadrature weights). However, I see that the the function is expected with > the format func(TS ts), with no possibility of getting a user context to > perform the computations I need. > There are a number of ways to get around this: [1] Set an application context on the TS http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetApplicationContext.html and retrieve via http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSGetApplicationContext.html [2] It sounds like your information is related to geometry, so you could bundle it into a DM and use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetDM.html [3] The "nastiest" way is to use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscContainerCreate.html#PetscContainerCreate and then compose the PetscContainer object with the TS object via http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscObjectCompose.html Thanks, Dave > Is there a way to get around that? Thanks! > > > Christian > > > _______________________________ > Christian Peco Regales, Ph.D. > Postdoctoral Research Associate > Civil and Environmental Engineering > Pratt School of Engineering > Duke University > 2407 CIEMAS (office) > Durham, NC 27708, USA > email: christian.peco at duke.edu > web: http://www.christianpeco.com > _______________________________ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Wed Mar 30 13:40:11 2016 From: hongzhang at anl.gov (Hong Zhang) Date: Wed, 30 Mar 2016 13:40:11 -0500 Subject: [petsc-users] TS prestep function In-Reply-To: References: Message-ID: <2EE7224E-D9C7-4F9A-BE93-EABE8E8D0BB3@anl.gov> You might want to call TSGetApplicatonContext() in your prestep function. Hong > On Mar 30, 2016, at 12:54 PM, Christian Peco Regales, Ph.D. wrote: > > I have started to use the TS environment to solve a diffusion problem with XFEM in which, at the beginning of every step, a number of values have to be recomputed in order to properly fill the Jacobian (e.g. change some quadrature weights). However, I see that the the function is expected with the format func(TS ts), with no possibility of getting a user context to perform the computations I need. > > Is there a way to get around that? Thanks! > > Christian > > _______________________________ > Christian Peco Regales, Ph.D. 
> Postdoctoral Research Associate > Civil and Environmental Engineering > Pratt School of Engineering > Duke University > 2407 CIEMAS (office) > Durham, NC 27708, USA > email: christian.peco at duke.edu > web: http://www.christianpeco.com > _______________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From cp226 at duke.edu Wed Mar 30 14:13:06 2016 From: cp226 at duke.edu (Christian Peco Regales, Ph.D.) Date: Wed, 30 Mar 2016 19:13:06 +0000 Subject: [petsc-users] TS prestep function In-Reply-To: References: , Message-ID: Dave, Hong, thank you. Setting and getting the context from TS is definitely the way to go, as I have to compute an exchange of particle forces and geometry information. Working fine now. Christian _______________________________ Christian Peco Regales, Ph.D. Postdoctoral Research Associate Civil and Environmental Engineering Pratt School of Engineering Duke University 2407 CIEMAS (office) Durham, NC 27708, USA email: christian.peco at duke.edu web: http://www.christianpeco.com _______________________________ ________________________________ From: Dave May Sent: Wednesday, March 30, 2016 2:38 PM To: Christian Peco Regales, Ph.D. Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS prestep function On 30 March 2016 at 19:54, Christian Peco Regales, Ph.D. > wrote: I have started to use the TS environment to solve a diffusion problem with XFEM in which, at the beginning of every step, a number of values have to be recomputed in order to properly fill the Jacobian (e.g. change some quadrature weights). However, I see that the the function is expected with the format func(TS ts), with no possibility of getting a user context to perform the computations I need. There are a number of ways to get around this: [1] Set an application context on the TS http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetApplicationContext.html and retrieve via http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSGetApplicationContext.html [2] It sounds like your information is related to geometry, so you could bundle it into a DM and use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetDM.html [3] The "nastiest" way is to use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscContainerCreate.html#PetscContainerCreate and then compose the PetscContainer object with the TS object via http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscObjectCompose.html Thanks, Dave Is there a way to get around that? Thanks! Christian _______________________________ Christian Peco Regales, Ph.D. Postdoctoral Research Associate Civil and Environmental Engineering Pratt School of Engineering Duke University 2407 CIEMAS (office) Durham, NC 27708, USA email: christian.peco at duke.edu web: http://www.christianpeco.com _______________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Mar 30 16:18:46 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 30 Mar 2016 16:18:46 -0500 Subject: [petsc-users] DIVERGED_NANORINF for CG/Bjacobi for transient diffusion Message-ID: Hi all, I am getting this error: Linear solve did not converge due to DIVERGED_NANORINF when solving an FEM for transient diffusion using hexahedron elements. The problem is highly heterogeneous (i.e., dispersion tensor with varying cell-centered velocity) and normally CG/Bjacobi has done well for me. 
If I did CG/Jacobi, my solver would require nearly 7000 iterations, but would get the "expected" solution. What does the above error mean? And what could I do to address it? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 30 16:23:05 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 30 Mar 2016 16:23:05 -0500 Subject: [petsc-users] DIVERGED_NANORINF for CG/Bjacobi for transient diffusion In-Reply-To: References: Message-ID: On Wed, Mar 30, 2016 at 4:18 PM, Justin Chang wrote: > Hi all, > > I am getting this error: > > Linear solve did not converge due to DIVERGED_NANORINF > > when solving an FEM for transient diffusion using hexahedron elements. The > problem is highly heterogeneous (i.e., dispersion tensor with varying > cell-centered velocity) and normally CG/Bjacobi has done well for me. If I > did CG/Jacobi, my solver would require nearly 7000 iterations, but would > get the "expected" solution. > > What does the above error mean? And what could I do to address it? > Almost certainly ILU craps out. Try using LU as the subsolver instead. Matt > Thanks, > Justin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Mar 30 16:26:17 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 30 Mar 2016 16:26:17 -0500 Subject: [petsc-users] DIVERGED_NANORINF for CG/Bjacobi for transient diffusion In-Reply-To: References: Message-ID: Matt, So do I use: "-pc_type bjacobi -sub_pc_type lu" or this: "-sub_pc_type lu" Thanks, Justin On Wed, Mar 30, 2016 at 4:23 PM, Matthew Knepley wrote: > On Wed, Mar 30, 2016 at 4:18 PM, Justin Chang wrote: > >> Hi all, >> >> I am getting this error: >> >> Linear solve did not converge due to DIVERGED_NANORINF >> >> when solving an FEM for transient diffusion using hexahedron elements. >> The problem is highly heterogeneous (i.e., dispersion tensor with varying >> cell-centered velocity) and normally CG/Bjacobi has done well for me. If I >> did CG/Jacobi, my solver would require nearly 7000 iterations, but would >> get the "expected" solution. >> >> What does the above error mean? And what could I do to address it? >> > > Almost certainly ILU craps out. Try using LU as the subsolver instead. > > Matt > > >> Thanks, >> Justin >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 30 16:27:20 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 30 Mar 2016 16:27:20 -0500 Subject: [petsc-users] DIVERGED_NANORINF for CG/Bjacobi for transient diffusion In-Reply-To: References: Message-ID: On Wed, Mar 30, 2016 at 4:26 PM, Justin Chang wrote: > Matt, > > So do I use: > > "-pc_type bjacobi -sub_pc_type lu" > The above. 
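[Aside: collected into one run line, in the style of the run examples earlier in this archive, the recommendation reads roughly as follows; the executable name is hypothetical, and -ksp_converged_reason is added here only so the outcome of the solve is printed.

$ ./myfem -ksp_type cg -pc_type bjacobi -sub_pc_type lu -ksp_converged_reason

End of aside.]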
Matt > or this: > > "-sub_pc_type lu" > > Thanks, > Justin > > On Wed, Mar 30, 2016 at 4:23 PM, Matthew Knepley > wrote: > >> On Wed, Mar 30, 2016 at 4:18 PM, Justin Chang >> wrote: >> >>> Hi all, >>> >>> I am getting this error: >>> >>> Linear solve did not converge due to DIVERGED_NANORINF >>> >>> when solving an FEM for transient diffusion using hexahedron elements. >>> The problem is highly heterogeneous (i.e., dispersion tensor with varying >>> cell-centered velocity) and normally CG/Bjacobi has done well for me. If I >>> did CG/Jacobi, my solver would require nearly 7000 iterations, but would >>> get the "expected" solution. >>> >>> What does the above error mean? And what could I do to address it? >>> >> >> Almost certainly ILU craps out. Try using LU as the subsolver instead. >> >> Matt >> >> >>> Thanks, >>> Justin >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Mar 30 18:57:27 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 30 Mar 2016 19:57:27 -0400 Subject: [petsc-users] DIVERGED_NANORINF for CG/Bjacobi for transient diffusion In-Reply-To: References: Message-ID: <0D14639F-4675-495D-AB6E-B0F61B583733@mcs.anl.gov> > On Mar 30, 2016, at 5:18 PM, Justin Chang wrote: > > Hi all, > > I am getting this error: > > Linear solve did not converge due to DIVERGED_NANORINF > > when solving an FEM for transient diffusion using hexahedron elements. The problem is highly heterogeneous (i.e., dispersion tensor with varying cell-centered velocity) and normally CG/Bjacobi has done well for me. If I did CG/Jacobi, my solver would require nearly 7000 iterations, but would get the "expected" solution. > > What does the above error mean? And what could I do to address it? It is sometimes useful in this case to run with -ksp_error_if_not_converged either with or without -start_in_debugger this gives you an exact stack trace of where the problem was originally detected. For example if Matt is right then it might show an error in MatLUFactorNumeric_SeqAIJ Barry > > Thanks, > Justin From cxchhu at gmail.com Wed Mar 30 21:53:16 2016 From: cxchhu at gmail.com (Cindy Chen) Date: Wed, 30 Mar 2016 22:53:16 -0400 Subject: [petsc-users] Petsc Installation error in OS X Message-ID: Hi, I try to install PETSc-3.5.4 in my macbook (version 10.11.3) but failed. Best, Xiaocui -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2655520 bytes Desc: not available URL: From knepley at gmail.com Wed Mar 30 22:14:16 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 30 Mar 2016 22:14:16 -0500 Subject: [petsc-users] Petsc Installation error in OS X In-Reply-To: References: Message-ID: On Wed, Mar 30, 2016 at 9:53 PM, Cindy Chen wrote: > Hi, > > I try to install PETSc-3.5.4 in my macbook (version 10.11.3) but failed. > This is a Metis bug. I believe we have fixed it in the latest 'master', which you could install or wait for the next release. 
Thanks, Matt > Best, > Xiaocui > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL:
From balay at mcs.anl.gov Wed Mar 30 22:17:09 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 30 Mar 2016 22:17:09 -0500 Subject: [petsc-users] Petsc Installation error in OS X In-Reply-To: References: Message-ID: >>>>> ld: section __DATA/__thread_bss extends beyond end of file, file 'CMakeFiles/metis.dir/__/GKlib/error.c.o' for architecture x86_64 clang: error: linker command failed with exit code 1 (use -v to see invocation) <<<<<< We suspect it's an xcode 7.3 bug. You can work around it with the following patch to metis https://bitbucket.org/petsc/pkg-metis/commits/c27a7c30921dc587ae07286774ea2735ad21bb80 BTW: We recommend upgrading to the currently supported release [3.6] if possible. Satish On Wed, 30 Mar 2016, Cindy Chen wrote: > Hi, > > I try to install PETSc-3.5.4 in my macbook (version 10.11.3) but failed. > > Best, > Xiaocui >
From balay at mcs.anl.gov Wed Mar 30 22:20:55 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 30 Mar 2016 22:20:55 -0500 Subject: [petsc-users] Petsc Installation error in OS X In-Reply-To: References: Message-ID: Another alternative might be to use brew gcc ./configure CC=gcc-5 CXX=g++-5 FC=gfortran ..... Satish On Wed, 30 Mar 2016, Satish Balay wrote: > >>>>> > ld: section __DATA/__thread_bss extends beyond end of file, file 'CMakeFiles/metis.dir/__/GKlib/error.c.o' for architecture x86_64 > clang: error: linker command failed with exit code 1 (use -v to see invocation) > <<<<<< > > We suspect it's an xcode 7.3 bug. You can work around it with the following > patch to metis > https://bitbucket.org/petsc/pkg-metis/commits/c27a7c30921dc587ae07286774ea2735ad21bb80 > > BTW: We recommend upgrading to the currently supported release [3.6] if possible. > > Satish > > > On Wed, 30 Mar 2016, Cindy Chen wrote: > > > Hi, > > > > I try to install PETSc-3.5.4 in my macbook (version 10.11.3) but failed. > > > > Best, > > Xiaocui > > > >
From michael.afanasiev at erdw.ethz.ch Thu Mar 31 10:56:42 2016 From: michael.afanasiev at erdw.ethz.ch (Afanasiev Michael) Date: Thu, 31 Mar 2016 15:56:42 +0000 Subject: [petsc-users] DMPlex - Section. Message-ID: <0E580009-A06F-445A-AEBA-976DC94F03DD@erdw.ethz.ch> Hi, I'd like to define vectors on a subset of points defined by a DM (created by DMPlex). For example, if I'm modelling wave propagation in a model with both solid and fluid regions, I'd like to define ux, uy, uz vectors in the solid part, and p in the acoustic part. Basically, just a varying number of dofs per integration point. Looking through the documentation I've found a pretty comprehensive discussion of setting up the PetscSection object. Is the inroad here to define a custom PetscSection on a subset of elements, faces, edges, vertices, etc.? Or is there a better way to approach this problem? Or is this functionality supported at all? We're pretty familiar with PLEX by now, just looking for ways to define vectors on subsets of the whole domain. Thanks, Mike. -- Michael Afanasiev Ph.D.
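[Aside: a minimal sketch of a PetscSection that carries a different number of dofs per vertex, which is what the question above asks for and what Matt's reply below relies on. It assumes a DMPlex dm, puts unknowns only at vertices, and uses a hypothetical PointIsInSolid() test to separate solid from fluid points; DMSetDefaultSection() is the name used in the PETSc releases of this era.

  PetscSection   s;
  PetscInt       pStart, pEnd, vStart, vEnd, v;
  PetscErrorCode ierr;

  ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dm),&s);CHKERRQ(ierr);
  ierr = DMPlexGetChart(dm,&pStart,&pEnd);CHKERRQ(ierr);
  ierr = PetscSectionSetChart(s,pStart,pEnd);CHKERRQ(ierr);        /* dof counts default to 0 on every point */
  ierr = DMPlexGetDepthStratum(dm,0,&vStart,&vEnd);CHKERRQ(ierr);  /* depth 0 = vertices */
  for (v = vStart; v < vEnd; ++v) {
    /* 3 dofs (ux,uy,uz) on solid vertices, 1 dof (p) on fluid vertices */
    ierr = PetscSectionSetDof(s,v,PointIsInSolid(dm,v) ? 3 : 1);CHKERRQ(ierr);
  }
  ierr = PetscSectionSetUp(s);CHKERRQ(ierr);
  ierr = DMSetDefaultSection(dm,s);CHKERRQ(ierr);
  ierr = PetscSectionDestroy(&s);CHKERRQ(ierr);
  /* DMCreateGlobalVector(dm,&u) now returns a Vec laid out this way */

End of aside.]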
Candidate Computational Seismology Institut für Geophysik ETH Zürich Sonneggstrasse 5, NO H 39.2 CH 8092 Zürich michael.afanasiev at erdw.ethz.ch
From knepley at gmail.com Thu Mar 31 11:02:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 31 Mar 2016 11:02:32 -0500 Subject: Re: [petsc-users] DMPlex - Section. In-Reply-To: <0E580009-A06F-445A-AEBA-976DC94F03DD@erdw.ethz.ch> References: <0E580009-A06F-445A-AEBA-976DC94F03DD@erdw.ethz.ch> Message-ID: On Thu, Mar 31, 2016 at 10:56 AM, Afanasiev Michael < michael.afanasiev at erdw.ethz.ch> wrote: > Hi, > > I'd like to define vectors on a subset of points defined by a DM (created > by DMPlex). For example, if I'm modelling wave propagation in a model with > both solid and fluid regions, I'd like to define ux, uy, uz vectors in the > solid part, and p in the acoustic part. Basically, just a varying number of > dofs per integration point. Looking through the documentation I've found a > pretty comprehensive discussion of setting up the PetscSection object. Is > the inroad here to define a custom PetscSection on a subset of elements, > faces, edges, vertices, etc.? Or is there a better way to approach this > problem? Or is this functionality supported at all? We're pretty familiar > with PLEX by now, just looking for ways to define vectors on subsets of the > whole domain. > You can certainly have 0 dofs on any mesh point, so defining a vector on part of the domain is easy. This is enough for your case I think. If you wanted to do this for many very small pieces, then in order for PetscSection to be efficient, you would have to consecutively number the points on those pieces, which seems too hard. For this case, I would make separate subDMs. Matt > Thanks, > Mike. > -- > Michael Afanasiev > Ph.D. Candidate > Computational Seismology > Institut für Geophysik > ETH Zürich > > Sonneggstrasse 5, NO H 39.2 > CH 8092 Zürich > michael.afanasiev at erdw.ethz.ch > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL:
From overholt at capesim.com Thu Mar 31 11:03:30 2016 From: overholt at capesim.com (Matthew Overholt) Date: Thu, 31 Mar 2016 12:03:30 -0400 Subject: [petsc-users] MATSBAIJ set up Message-ID: <005501d18b66$df96ebb0$9ec4c310$@capesim.com> Hi, I am just getting started with PETSc and am having difficulty with setting up my MATSBAIJ matrix. I'm adapting the ksp/ex23.c example to my 3D FEM calculation; in my case the equation to solve is K*x = b. For serial execution, the following works fine and gives the correct answer. ierr = MatCreate(PETSC_COMM_WORLD,&K);CHKERRQ(ierr); ierr = MatSetSizes(K,vlocal,vlocal,neqns,neqns);CHKERRQ(ierr); ierr = MatSetType(K,MATSBAIJ);CHKERRQ(ierr); // symmetric, block, sparse ierr = MatSetOption(K,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr); // K is symmetric, positive-definite, sparse ierr = MatSetOption(K,MAT_IGNORE_LOWER_TRIANGULAR,PETSC_TRUE);CHKERRQ(ierr); // so only top tri is needed ierr = MatSetUp(K);CHKERRQ(ierr); where vlocal is the result from the call to VecGetLocalSize(x,&vlocal) for the solution vector (obviously the full size in the serial case). However, for parallel execution the above crashes on an 11 SEGV Segmentation Violation Error on entry to the function MatSetOption_MPISBAIJ(), according to TotalView. So instead I have been trying the following for parallel.
PetscInt blockSize = 1; // use a block size of 1 since K is NOT block symmetric PetscInt diagNZ = 5; // # of non-zeros per row in upper diagonal portion of local submatrix PetscInt offdiagNZ = 8; // max # of non-zeros per row in off-diagonal portion of local submatrix ierr = MatCreateSBAIJ(PETSC_COMM_WORLD,blockSize,vlocal,vlocal,neqns,neqns,diagNZ,N ULL,offdiagNZ,NULL,&K);CHKERRQ(ierr); ierr = MatSetOption(K,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr); // K is symmetric, positive-definite, sparse ierr = MatSetOption(K,MAT_IGNORE_LOWER_TRIANGULAR,PETSC_TRUE);CHKERRQ(ierr); // so only top tri is needed ierr = MatSetUp(K);CHKERRQ(ierr); However, this fails during the process of setting matrix values (MatSetValue()) with the following error: [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: new nonzero at (0,26) caused a malloc . (Note that in this case the matrix size is 96x96 split over 2 processors (vlocal = 48).) If someone would please point me to the correct way to set up the (MATSBAIJ) matrix for a perfectly symmetric, positive-definite, sparse system, I'd appreciate it. Thanks, Matt Overholt CapeSym, Inc. --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 31 11:11:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 31 Mar 2016 11:11:32 -0500 Subject: [petsc-users] MATSBAIJ set up In-Reply-To: <005501d18b66$df96ebb0$9ec4c310$@capesim.com> References: <005501d18b66$df96ebb0$9ec4c310$@capesim.com> Message-ID: On Thu, Mar 31, 2016 at 11:03 AM, Matthew Overholt wrote: > Hi, > > > > I am just getting started with PETSc and am having difficulty with setting > up my MATSBAIJ matrix. I?m adapting the ksp/ex23.c example to my 3D FEM > calculation; in my case the equation to solve is K*x = b. > > > > For serial execution, the following works fine and gives the correct > answer. > > ierr = MatCreate(PETSC_COMM_WORLD,&K);CHKERRQ(ierr); > > ierr = MatSetSizes(K,vlocal,vlocal,neqns,neqns);CHKERRQ(ierr); > > ierr = MatSetType(K,MATSBAIJ);CHKERRQ(ierr); // > symmetric, block, sparse > > ierr = MatSetOption(K,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr); // K is > symmetric, positive-definite, sparse > > ierr = > MatSetOption(K,MAT_IGNORE_LOWER_TRIANGULAR,PETSC_TRUE);CHKERRQ(ierr); // > so only top tri is needed > > ierr = MatSetUp(K);CHKERRQ(ierr); > > where vlocal is the result from the call to VecGetLocalSize(x,&vlocal) for > the solution vector (obviously the full size in the serial case). > > However, for parallel execution the above crashes on a 11 SEGV > Segmentation Violation Error on entry to the function > MatSetOption_MPISBAIJ(), according to TotalView. > > > > So instead I have been trying the following for parallel. 
> > PetscInt blockSize = 1; // use a block size of 1 since K is NOT > block symmetric > > PetscInt diagNZ = 5; // # of non-zeros per row in upper > diagonal portion of local submatrix > > PetscInt offdiagNZ = 8; // max # of non-zeros per row in > off-diagonal portion of local submatrix > > ierr = > MatCreateSBAIJ(PETSC_COMM_WORLD,blockSize,vlocal,vlocal,neqns,neqns,diagNZ,NULL,offdiagNZ,NULL,&K);CHKERRQ(ierr); > > ierr = MatSetOption(K,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr); // K is > symmetric, positive-definite, sparse > > ierr = > MatSetOption(K,MAT_IGNORE_LOWER_TRIANGULAR,PETSC_TRUE);CHKERRQ(ierr); // > so only top tri is needed > > ierr = MatSetUp(K);CHKERRQ(ierr); > > However, this fails during the process of setting matrix values ( > MatSetValue()) with the following error: > > [0]PETSC ERROR: Argument out of range > > [0]PETSC ERROR: new nonzero at (0,26) caused a malloc > > ? > > (Note that in this case the matrix size is 96x96 split over 2 processors > (vlocal = 48).) > > > > If someone would please point me to the correct way to set up the > (MATSBAIJ) matrix for a perfectly symmetric, positive-definite, sparse > system, I?d appreciate it. > I would start by making AIJ work in parallel. The switch to SBAIJ is then fairly easy. The error appears to say that you have incorrectly allocated the number of nonzeros in the diagonal block. Matt > Thanks, > > Matt Overholt > > CapeSym, Inc. > > > > > Virus-free. > www.avast.com > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL:
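[Aside: a minimal sketch of the AIJ-first route suggested above, reusing the variable names from this thread (K, vlocal, neqns). The per-row counts handed to the preallocation routines are illustrative only; each must be an upper bound on the nonzeros per row in the diagonal and off-diagonal blocks of the local rows, and an undercount is exactly what produces the "new nonzero ... caused a malloc" error quoted earlier.

  ierr = MatCreate(PETSC_COMM_WORLD,&K);CHKERRQ(ierr);
  ierr = MatSetSizes(K,vlocal,vlocal,neqns,neqns);CHKERRQ(ierr);
  ierr = MatSetType(K,MATAIJ);CHKERRQ(ierr);                        /* SeqAIJ in serial, MPIAIJ in parallel */
  ierr = MatSeqAIJSetPreallocation(K,13,NULL);CHKERRQ(ierr);        /* illustrative per-row bound (serial) */
  ierr = MatMPIAIJSetPreallocation(K,13,NULL,8,NULL);CHKERRQ(ierr); /* diag-block / off-diag-block bounds */
  ierr = MatSetOption(K,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr);
  /* while tuning the counts, this turns the malloc error into (slow) automatic reallocation:
     ierr = MatSetOption(K,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr); */

Once the AIJ version assembles and solves cleanly in parallel, switching the type back to MATSBAIJ and supplying only the upper-triangular counts, as in the original post, is a small change.

End of aside.]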