[petsc-users] Performance question using seq Mat and Vec
Matthew Knepley
knepley at gmail.com
Thu Jul 31 15:35:06 CDT 2014
On Thu, Jul 31, 2014 at 3:24 PM, Brian Yang <jyang29 at uh.edu> wrote:
> Hi all,
>
> Here's a summary of the problem.
>
> I have src and rec, which are 3D images of the same size, say (Z, X, Y).
>
> We call one (Z, X) slice a panel, so there are Y panels for both src and
> rec. BTW, they hold complex numbers.
>
> For example, for the *first* panel (always process the same panel) of src
> and rec:
>
> Take the first panel of src as our A (20x20),
> take the first column of the first panel of rec as our b (20x1),
> solve the linear system to get x (20x1),
> go to the next column of the first panel of rec until this panel is finished,
> assemble all the solutions x column by column (20x20).
>
This is a fine conceptual explanation of the algorithm; however, I do not
think you want to implement it this way. Since you are solving all these
panels independently, you can just construct the block matrix, with each
panel as a block, and solve it all at once (they clearly fit into memory).
This might not be optimal for multiple rhs.

If the matrices really are dense and you have multiple rhs, then you should
look at using Elemental. We have an interface to it, although I am not sure
we have hooked up the multiple rhs solves.
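
To illustrate why multiple right-hand sides matter for a dense panel, here is a minimal sketch in plain Python (not PETSc or Elemental code). It assumes each panel A is square and nonsingular, factors it once with LU and partial pivoting, and then reuses the factors for every column of rec. The names lu_factor, lu_solve and solve_panel are hypothetical helpers made up for this sketch. If a least-squares solve is really needed, as the use of lsqr suggests it might be, a QR factorization would play the same role.

```python
def lu_factor(a):
    """LU-factor a square matrix with partial pivoting.

    Returns (lu, perm): lu holds L (unit diagonal, below the diagonal)
    and U (on and above the diagonal); row k of lu corresponds to
    row perm[k] of the original matrix.
    """
    n = len(a)
    lu = [list(row) for row in a]  # copy so the caller's matrix is untouched
    perm = list(range(n))
    for k in range(n):
        # Use the entry of largest magnitude in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using the factors returned by lu_factor."""
    n = len(lu)
    # Forward substitution on the permuted right-hand side.
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    # Back substitution with U.
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def solve_panel(a, rhs_columns):
    """Factor the panel once, then solve for every right-hand-side column."""
    lu, perm = lu_factor(a)
    return [lu_solve(lu, perm, b) for b in rhs_columns]


# Small complex example: one 2x2 "panel" and two right-hand-side columns.
panel = [[2 + 1j, 1 + 0j], [1 + 0j, 3 - 1j]]
columns = [[1 + 0j, 0j], [0j, 1j]]
solutions = solve_panel(panel, columns)
```

The O(n^3) factorization is paid once per panel, and each extra column costs only the O(n^2) substitutions. Restarting an iterative solve from scratch for every column does not share any work in this way.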
> After finishing the first panel of src and rec, go to next... repeat.
>
>
> I hope I explained my problem well. I used a SeqDense matrix for A and a
> Seq vector for b.
>
> Here's the flow,
>
> - start
> - the Y panels are divided among the nodes; each node gets part of them
> - each node reads in its own part of the src and rec images
> - on each node, take a panel of src and rec
>
> *- create Mat and Vec, fill them*
>
> *- create KSP and solve by lsqr*
>
> *- get the solution*
> *- destroy all the PETSc objects A, b, x (destroying the KSP gives me an
> error here!)*
> - repeat for the next panel
>
>
> Here's the time (seconds) output from node 2 (random choice):
>
> *entire time for this panel* (time), *solving time* (solver)
>
>> processing panel 1 time= 3.2995000E-02 solver= 3.0995002E-02
>> processing panel 2 time= 3.5994001E-02 solver= 3.4995001E-02
>> processing panel 3 time= 3.9994001E-02 solver= 3.8994007E-02
>> processing panel 4 time= 4.4993997E-02 solver= 4.3993995E-02
>> processing panel 5 time= 4.8991993E-02 solver= 4.6992987E-02
>> processing panel 6 time= 5.4991007E-02 solver= 5.3991005E-02
>> processing panel 7 time= 5.8990985E-02 solver= 5.7990998E-02
>> processing panel 8 time= 6.3990027E-02 solver= 6.1990023E-02
>> processing panel 9 time= 6.8989992E-02 solver= 6.6990018E-02
>> processing panel 10 time= 7.3989004E-02 solver= 7.1989000E-02
>> processing panel 11 time= 7.7987969E-02 solver= 7.6987982E-02
>> processing panel 12 time= 8.1988037E-02 solver= 7.9988003E-02
>> processing panel 13 time= 8.8985980E-02 solver= 8.6987019E-02
>> processing panel 14 time= 9.4985008E-02 solver= 9.2984974E-02
>> processing panel 15 time= 0.1009850 solver= 9.8985016E-02
>> processing panel 16 time= 0.1119831 solver= 0.1099830
>> processing panel 17 time= 0.1269809 solver= 0.1239820
>> processing panel 18 time= 0.1469780 solver= 0.1439790
>> processing panel 19 time= 0.1709731 solver= 0.1669741
>> processing panel 20 time= 0.1909720 solver= 0.1869720
>> processing panel 21 time= 0.2019690 solver= 0.1979700
>> processing panel 22 time= 0.2239659 solver= 0.2199659
>> processing panel 23 time= 0.2369640 solver= 0.2319648
>> processing panel 24 time= 0.2499621 solver= 0.2449629
>> processing panel 25 time= 0.2709589 solver= 0.2659600
>> processing panel 26 time= 0.2869561 solver= 0.2829571
>> processing panel 27 time= 0.3129530 solver= 0.3059540
>> processing panel 28 time= 0.3389480 solver= 0.3329499
>> processing panel 29 time= 0.3719430 solver= 0.3649440
>> processing panel 30 time= 0.3949399 solver= 0.3879409
>> processing panel 31 time= 0.4249353 solver= 0.4169374
>> processing panel 32 time= 0.4549308 solver= 0.4469318
>> processing panel 33 time= 0.4859262 solver= 0.4759283
>> processing panel 34 time= 0.5119228 solver= 0.5019240
>> processing panel 35 time= 0.5449171 solver= 0.5349178
>> processing panel 36 time= 0.5689130 solver= 0.5579152
>> processing panel 37 time= 0.5959096 solver= 0.5849104
>> processing panel 38 time= 0.6199055 solver= 0.6079073
>>
>
> You can see that the time for solving a panel keeps increasing.
> The panel number here is the local one. If I start solving from panel 40
> (random choice):
>
It certainly looks like you have a growing memory footprint. It is likely
to have happened when you extracted/replaced parts of the matrix, which I
think is unnecessary, as I said above.

  Thanks,

     Matt

> processing panel 40 time= 5.5992007E-02 solver= 5.1991999E-02
>> processing panel 41 time= 9.1986001E-02 solver= 9.0986013E-02
>> processing panel 42 time= 0.1309800 solver= 0.1299810
>> processing panel 43 time= 0.1719730 solver= 0.1709740
>> processing panel 44 time= 0.2119681 solver= 0.2109680
>> processing panel 45 time= 0.2529620 solver= 0.2519621
>> processing panel 46 time= 0.2919550 solver= 0.2909551
>> processing panel 47 time= 0.3319499 solver= 0.3309500
>> processing panel 48 time= 0.3719430 solver= 0.3709428
>> processing panel 49 time= 0.4129372 solver= 0.4109371
>> processing panel 50 time= 0.4529319 solver= 0.4509320
>> processing panel 51 time= 0.4929240 solver= 0.4909239
>> processing panel 52 time= 0.5339203 solver= 0.5319204
>> processing panel 53 time= 0.5779119 solver= 0.5759130
>> processing panel 54 time= 0.6199059 solver= 0.6179061
>> processing panel 55 time= 0.6648979 solver= 0.6628990
>> processing panel 56 time= 0.7248902 solver= 0.7218900
>> processing panel 57 time= 0.7938790 solver= 0.7908792
>> processing panel 58 time= 0.8728676 solver= 0.8698678
>> processing panel 59 time= 0.9778509 solver= 0.9748516
>> processing panel 60 time= 1.125830 solver= 1.122829
>> processing panel 61 time= 1.273806 solver= 1.268806
>> processing panel 62 time= 1.448780 solver= 1.444779
>> processing panel 63 time= 1.647749 solver= 1.643749
>> processing panel 64 time= 1.901712 solver= 1.896712
>> processing panel 65 time= 2.143673 solver= 2.138674
>> processing panel 66 time= 2.437630 solver= 2.431629
>> processing panel 67 time= 2.744583 solver= 2.736586
>> processing panel 68 time= 3.041536 solver= 3.035538
>>
>
> The trend is the same: the time starts out very small and keeps
> increasing.
>
>
> Since I have thousands of panels for src and rec, the execution time
> becomes unbearable as the run goes on.
> So I am wondering whether I used the right method, or whether there is a
> memory issue?
>
> Thanks.
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener