[petsc-users] Performance question using seq Mat and Vec

Barry Smith bsmith at mcs.anl.gov
Thu Jul 31 16:31:02 CDT 2014


On Jul 31, 2014, at 3:24 PM, Brian Yang <jyang29 at uh.edu> wrote:

> Hi all,
> 
> Here's a summary of the problem:
> 
> I have src and rec, two 3D images of the same size, say Z x X x Y.
> 
> We call one (Z, X) slice a panel, so there are Y panels for both src and rec. Both hold complex numbers.
> 
> For example, for the first panel (always process the same panel) of src and rec:
> 
> Take the first panel of src as our A (20x20),
> take the first column of first panel of rec as our b (20x1),
> solve the linear system and get x (20x1),
> go to the next column of the first panel of rec and repeat until this panel is finished,
> assemble all the solution x column by column (20x20).
> 
> After finishing the first panel of src and rec, go to the next one and repeat.
> 
> 
> I hope I explained my problem well. I used a SeqDense matrix for A and a Seq vector for b.
> 
> Here's the flow:
> 
> - start
> - the Y panels are divided among the nodes, so each node gets a subset of them
> - each node reads in its own part of the src and rec images
> - on each node, take a panel of src and rec
> - create the Mat and Vec and fill them
> - create the KSP and solve with lsqr
> - get the solution
> - destroy all the PETSc objects A, b, x (destroying the KSP gives me an error here!)

    If you do not destroy the KSP, then the other objects won’t actually get destroyed (since the KSP reference counts them). You need to determine why the KSP cannot be destroyed. What is the error message? Have you tried running with valgrind?
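
The reference-counting point can be illustrated with a toy model. This is a minimal sketch in plain Python, not PETSc code: the class names `Counted` and `Solver` and the function `process_panels` are hypothetical, and the sketch assumes only what the reply states, namely that the solver holds a reference to each object it is given, so those objects are released only once the solver itself is destroyed.

```python
class Counted:
    """Toy stand-in for a reference-counted library object (not PETSc)."""
    live = 0  # number of objects whose storage has not been released

    def __init__(self, name):
        self.name = name
        self.refs = 1  # the creator holds the first reference
        Counted.live += 1

    def retain(self):
        self.refs += 1

    def destroy(self):
        # Drop one reference; release storage only when none remain.
        self.refs -= 1
        if self.refs == 0:
            self.release()

    def release(self):
        Counted.live -= 1


class Solver(Counted):
    """Toy solver that keeps a reference to everything it is handed."""

    def __init__(self):
        super().__init__("solver")
        self.held = []

    def attach(self, *objs):
        for obj in objs:
            obj.retain()
            self.held.append(obj)

    def release(self):
        # Releasing the solver drops the references it was holding.
        for obj in self.held:
            obj.destroy()
        self.held = []
        super().release()


def process_panels(n_panels, destroy_solver):
    """Return how many objects are still alive after n_panels iterations."""
    Counted.live = 0
    for _ in range(n_panels):
        a, b, x = Counted("A"), Counted("b"), Counted("x")
        solver = Solver()
        solver.attach(a, b, x)
        # ... the solve would happen here ...
        for obj in (a, b, x):
            obj.destroy()          # only drops the creator's reference
        if destroy_solver:
            solver.destroy()       # now A, b and x are really released
    return Counted.live


leaked = process_panels(38, destroy_solver=False)
clean = process_panels(38, destroy_solver=True)
print(leaked, clean)  # prints "152 0"
```

In this model, skipping the solver's destroy call leaves four live objects behind per panel, while destroying it leaves none. Whether such a leak explains the growing solve times reported in this thread is not established here; the sketch only shows why destroying A, b and x alone frees nothing.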


> - repeat for the next panel
> 
> 
> Here are the timings (in seconds) from node 2 (chosen at random):
> 
>                            entire time for this panel            solving time
> 
>  processing panel           1 time=  3.2995000E-02 solver=  3.0995002E-02
>  processing panel           2 time=  3.5994001E-02 solver=  3.4995001E-02
>  processing panel           3 time=  3.9994001E-02 solver=  3.8994007E-02
>  processing panel           4 time=  4.4993997E-02 solver=  4.3993995E-02
>  processing panel           5 time=  4.8991993E-02 solver=  4.6992987E-02
>  processing panel           6 time=  5.4991007E-02 solver=  5.3991005E-02
>  processing panel           7 time=  5.8990985E-02 solver=  5.7990998E-02
>  processing panel           8 time=  6.3990027E-02 solver=  6.1990023E-02
>  processing panel           9 time=  6.8989992E-02 solver=  6.6990018E-02
>  processing panel          10 time=  7.3989004E-02 solver=  7.1989000E-02
>  processing panel          11 time=  7.7987969E-02 solver=  7.6987982E-02
>  processing panel          12 time=  8.1988037E-02 solver=  7.9988003E-02
>  processing panel          13 time=  8.8985980E-02 solver=  8.6987019E-02
>  processing panel          14 time=  9.4985008E-02 solver=  9.2984974E-02
>  processing panel          15 time=  0.1009850     solver=  9.8985016E-02
>  processing panel          16 time=  0.1119831     solver=  0.1099830    
>  processing panel          17 time=  0.1269809     solver=  0.1239820    
>  processing panel          18 time=  0.1469780     solver=  0.1439790    
>  processing panel          19 time=  0.1709731     solver=  0.1669741    
>  processing panel          20 time=  0.1909720     solver=  0.1869720    
>  processing panel          21 time=  0.2019690     solver=  0.1979700    
>  processing panel          22 time=  0.2239659     solver=  0.2199659    
>  processing panel          23 time=  0.2369640     solver=  0.2319648    
>  processing panel          24 time=  0.2499621     solver=  0.2449629    
>  processing panel          25 time=  0.2709589     solver=  0.2659600    
>  processing panel          26 time=  0.2869561     solver=  0.2829571    
>  processing panel          27 time=  0.3129530     solver=  0.3059540    
>  processing panel          28 time=  0.3389480     solver=  0.3329499    
>  processing panel          29 time=  0.3719430     solver=  0.3649440    
>  processing panel          30 time=  0.3949399     solver=  0.3879409    
>  processing panel          31 time=  0.4249353     solver=  0.4169374    
>  processing panel          32 time=  0.4549308     solver=  0.4469318    
>  processing panel          33 time=  0.4859262     solver=  0.4759283    
>  processing panel          34 time=  0.5119228     solver=  0.5019240    
>  processing panel          35 time=  0.5449171     solver=  0.5349178    
>  processing panel          36 time=  0.5689130     solver=  0.5579152    
>  processing panel          37 time=  0.5959096     solver=  0.5849104    
>  processing panel          38 time=  0.6199055     solver=  0.6079073
> 
> You can see that the solve time keeps increasing from panel to panel. The panel number here is the local one. If I start solving from panel 40 (chosen at random):
> 
>  processing panel          40 time=  5.5992007E-02 solver=  5.1991999E-02
>  processing panel          41 time=  9.1986001E-02 solver=  9.0986013E-02
>  processing panel          42 time=  0.1309800     solver=  0.1299810    
>  processing panel          43 time=  0.1719730     solver=  0.1709740    
>  processing panel          44 time=  0.2119681     solver=  0.2109680    
>  processing panel          45 time=  0.2529620     solver=  0.2519621    
>  processing panel          46 time=  0.2919550     solver=  0.2909551    
>  processing panel          47 time=  0.3319499     solver=  0.3309500    
>  processing panel          48 time=  0.3719430     solver=  0.3709428    
>  processing panel          49 time=  0.4129372     solver=  0.4109371    
>  processing panel          50 time=  0.4529319     solver=  0.4509320    
>  processing panel          51 time=  0.4929240     solver=  0.4909239    
>  processing panel          52 time=  0.5339203     solver=  0.5319204    
>  processing panel          53 time=  0.5779119     solver=  0.5759130    
>  processing panel          54 time=  0.6199059     solver=  0.6179061    
>  processing panel          55 time=  0.6648979     solver=  0.6628990    
>  processing panel          56 time=  0.7248902     solver=  0.7218900    
>  processing panel          57 time=  0.7938790     solver=  0.7908792    
>  processing panel          58 time=  0.8728676     solver=  0.8698678    
>  processing panel          59 time=  0.9778509     solver=  0.9748516    
>  processing panel          60 time=   1.125830     solver=   1.122829    
>  processing panel          61 time=   1.273806     solver=   1.268806    
>  processing panel          62 time=   1.448780     solver=   1.444779    
>  processing panel          63 time=   1.647749     solver=   1.643749    
>  processing panel          64 time=   1.901712     solver=   1.896712    
>  processing panel          65 time=   2.143673     solver=   2.138674    
>  processing panel          66 time=   2.437630     solver=   2.431629    
>  processing panel          67 time=   2.744583     solver=   2.736586    
>  processing panel          68 time=   3.041536     solver=   3.035538
> 
> The trend is the same: the time starts out very small and then keeps increasing.
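
The trend in the second listing can be quantified by differencing consecutive solver times. The sketch below uses the solver values for local panels 40 through 50 exactly as logged; the variable names are illustrative only.

```python
# Solver times in seconds, copied from the log for local panels 40..50.
solver_times = [
    5.1991999e-02, 9.0986013e-02, 0.1299810, 0.1709740, 0.2109680,
    0.2519621, 0.2909551, 0.3309500, 0.3709428, 0.4109371, 0.4509320,
]

# Extra time each panel costs compared with the panel before it.
increments = [
    later - earlier
    for earlier, later in zip(solver_times, solver_times[1:])
]
mean_increment = sum(increments) / len(increments)

for panel, delta in zip(range(41, 51), increments):
    print(f"panel {panel}: +{delta:.4f} s")
print(f"mean increment: {mean_increment:.4f} s per panel")
```

Over this range each panel costs roughly 0.04 s more than the previous one, so the per-panel cost grows about linearly with the number of panels already processed. That pattern is consistent with some state accumulating from panel to panel, although the listing alone does not identify the cause.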
> 
> 
> Since I have thousands of panels for src and rec, the execution time becomes unbearable as the run goes on.
> So I am wondering whether I am using the right method, or whether there is a memory issue?
> 
> Thanks.


