[petsc-users] Performance question using seq Mat and Vec
Brian Yang
brianyang1106 at gmail.com
Thu Jul 31 15:58:15 CDT 2014
I will look into the block matrix approach. Thanks, Matt.
On Thu, Jul 31, 2014 at 3:35 PM, Matthew Knepley <knepley at gmail.com> wrote:
> On Thu, Jul 31, 2014 at 3:24 PM, Brian Yang <jyang29 at uh.edu> wrote:
>
>> Hi all,
>>
>> Here's a summary of the problem.
>>
>> I have src and rec, two 3D images of the same size, say (Z, X, Y).
>>
>> We call one (Z, X) slice a panel, so there are Y panels in both src and
>> rec. Both hold complex numbers.
>>
>> For example, for the *first* panel (always process the same panel) of
>> src and rec:
>>
>> take the first panel of src as our A (20x20),
>> take the first column of the first panel of rec as our b (20x1),
>> solve the linear system to get x (20x1),
>> move to the next column of the first panel of rec, until the panel is done,
>> assemble all the solutions x column by column (20x20).
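
[Illustration] The per-panel procedure described above solves many right-hand sides against the same A. A minimal sketch of that idea follows, in plain Python with no PETSc. It assumes A is square and nonsingular and swaps in a direct LU factorization with partial pivoting rather than the LSQR solver used in the original post. The names `lu_factor`, `lu_solve` and `solve_panel` are hypothetical helpers, not PETSc functions.

```python
def lu_factor(a):
    """Factor a square matrix in place on a copy; return (lu, piv)."""
    n = len(a)
    lu = [row[:] for row in a]
    piv = list(range(n))  # piv[i] = original row now stored at position i
    for k in range(n):
        # Partial pivoting: pick the largest-magnitude entry in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if abs(lu[p][k]) == 0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu, piv, b):
    """Solve A x = b using a factorization returned by lu_factor."""
    n = len(lu)
    y = [b[piv[i]] for i in range(n)]      # apply the row permutation
    for i in range(n):                     # forward substitution (unit L)
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):           # back substitution (U)
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y


def solve_panel(src_panel, rec_panel):
    """Factor src_panel once, then solve for every column of rec_panel."""
    n = len(src_panel)
    ncols = len(rec_panel[0])
    lu, piv = lu_factor(src_panel)         # done once per panel
    x = [[0j] * ncols for _ in range(n)]
    for c in range(ncols):
        col = [rec_panel[r][c] for r in range(n)]
        sol = lu_solve(lu, piv, col)
        for r in range(n):
            x[r][c] = sol[r]               # assemble column by column
    return x


# Tiny 2x2 complex example standing in for a 20x20 panel.
src = [[2 + 1j, 1 + 0j], [1 + 0j, 3 - 1j]]
rec = [[1 + 0j, 0 + 2j], [0 + 0j, 1 - 1j]]
x = solve_panel(src, rec)
print(x)
```

The point of the sketch is only that the expensive step (the factorization) happens once per panel, while each additional column costs two triangular solves.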
>>
>
> This is a fine conceptual explanation of the algorithm; however, I do not
> think you want to implement it this way. Since you are solving all these
> panels independently, you can just construct the block matrix, with each
> panel as a block, and solve it all at once (they clearly fit into memory).
> This might not be optimal for multiple rhs.
>
> If the matrices really are dense and you have multiple rhs, then you should
> look at using Elemental. We have an interface to it, although I am not sure
> we have hooked up the multiple rhs solves.
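
[Illustration] The block-matrix suggestion can be pictured as placing each panel on the diagonal of one larger matrix, so that a single system covers all panels. The sketch below, in plain Python with no PETSc, only checks that equivalence on toy data. `block_diag` and `matvec` are hypothetical helper names, and the 2x2 panels are made-up stand-ins for the 20x20 panels in the post.

```python
def block_diag(panels):
    """Place square panels along the diagonal of one larger matrix."""
    total = sum(len(p) for p in panels)
    big = [[0j] * total for _ in range(total)]
    offset = 0
    for p in panels:
        n = len(p)
        for i in range(n):
            for j in range(n):
                big[offset + i][offset + j] = p[i][j]
        offset += n
    return big


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


# Two toy panels and a known solution vector for each.
a1 = [[2 + 1j, 1 + 0j], [0 + 0j, 1 - 1j]]
a2 = [[1 + 0j, 0 + 3j], [2 + 0j, 1 + 1j]]
x1 = [1 + 1j, 2 + 0j]
x2 = [0 - 1j, 1 + 0j]

# Right-hand sides of the two independent systems.
b1 = matvec(a1, x1)
b2 = matvec(a2, x2)

# One block-diagonal system with stacked unknowns and right-hand sides.
big = block_diag([a1, a2])
stacked_b = matvec(big, x1 + x2)

# The stacked product matches the per-panel products entry by entry.
print(all(abs(u - v) < 1e-12 for u, v in zip(stacked_b, b1 + b2)))
```

Because the off-diagonal blocks are zero, the panels remain uncoupled: solving the large system returns the same per-panel solutions, just stacked.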
>
>
>> After finishing the first panel of src and rec, go to next... repeat.
>>
>>
>> I hope I explained my problem well. I used a SeqDense matrix for A and a
>> Seq vector for b.
>>
>> Here's the flow:
>>
>> - start
>> - all the nodes share the Y panels; each node gets part of them
>> - each node reads in its own part of the src and rec images
>> - on each node, take a panel of src and rec
>> *- create Mat and Vec, fill them*
>> *- create KSP and solve with LSQR*
>> *- get the solution*
>> *- destroy all the PETSc objects A, b, x (destroying the KSP gives me
>> an error here!)*
>> - repeat for the next panel
>>
>>
>> Here's the time (seconds) output from node 2 (random choice):
>>
>> (time = entire time for this panel, solver = solving time)
>>
>>> processing panel 1 time= 3.2995000E-02 solver= 3.0995002E-02
>>> processing panel 2 time= 3.5994001E-02 solver= 3.4995001E-02
>>> processing panel 3 time= 3.9994001E-02 solver= 3.8994007E-02
>>> processing panel 4 time= 4.4993997E-02 solver= 4.3993995E-02
>>> processing panel 5 time= 4.8991993E-02 solver= 4.6992987E-02
>>> processing panel 6 time= 5.4991007E-02 solver= 5.3991005E-02
>>> processing panel 7 time= 5.8990985E-02 solver= 5.7990998E-02
>>> processing panel 8 time= 6.3990027E-02 solver= 6.1990023E-02
>>> processing panel 9 time= 6.8989992E-02 solver= 6.6990018E-02
>>> processing panel 10 time= 7.3989004E-02 solver= 7.1989000E-02
>>> processing panel 11 time= 7.7987969E-02 solver= 7.6987982E-02
>>> processing panel 12 time= 8.1988037E-02 solver= 7.9988003E-02
>>> processing panel 13 time= 8.8985980E-02 solver= 8.6987019E-02
>>> processing panel 14 time= 9.4985008E-02 solver= 9.2984974E-02
>>> processing panel 15 time= 0.1009850 solver= 9.8985016E-02
>>> processing panel 16 time= 0.1119831 solver= 0.1099830
>>> processing panel 17 time= 0.1269809 solver= 0.1239820
>>> processing panel 18 time= 0.1469780 solver= 0.1439790
>>> processing panel 19 time= 0.1709731 solver= 0.1669741
>>> processing panel 20 time= 0.1909720 solver= 0.1869720
>>> processing panel 21 time= 0.2019690 solver= 0.1979700
>>> processing panel 22 time= 0.2239659 solver= 0.2199659
>>> processing panel 23 time= 0.2369640 solver= 0.2319648
>>> processing panel 24 time= 0.2499621 solver= 0.2449629
>>> processing panel 25 time= 0.2709589 solver= 0.2659600
>>> processing panel 26 time= 0.2869561 solver= 0.2829571
>>> processing panel 27 time= 0.3129530 solver= 0.3059540
>>> processing panel 28 time= 0.3389480 solver= 0.3329499
>>> processing panel 29 time= 0.3719430 solver= 0.3649440
>>> processing panel 30 time= 0.3949399 solver= 0.3879409
>>> processing panel 31 time= 0.4249353 solver= 0.4169374
>>> processing panel 32 time= 0.4549308 solver= 0.4469318
>>> processing panel 33 time= 0.4859262 solver= 0.4759283
>>> processing panel 34 time= 0.5119228 solver= 0.5019240
>>> processing panel 35 time= 0.5449171 solver= 0.5349178
>>> processing panel 36 time= 0.5689130 solver= 0.5579152
>>> processing panel 37 time= 0.5959096 solver= 0.5849104
>>> processing panel 38 time= 0.6199055 solver= 0.6079073
>>>
>>
>> You can see that the time for solving each panel keeps increasing. The
>> panel number here is the local one. If I start solving from panel 40
>> (random choice):
>>
>
> It certainly looks like you have a growing memory footprint. It is likely
> to have happened when you extracted/replaced parts of the matrix, which I
> think is unnecessary as I said above.
>
> Thanks,
>
> Matt
>
>
>>> processing panel 40 time= 5.5992007E-02 solver= 5.1991999E-02
>>> processing panel 41 time= 9.1986001E-02 solver= 9.0986013E-02
>>> processing panel 42 time= 0.1309800 solver= 0.1299810
>>> processing panel 43 time= 0.1719730 solver= 0.1709740
>>> processing panel 44 time= 0.2119681 solver= 0.2109680
>>> processing panel 45 time= 0.2529620 solver= 0.2519621
>>> processing panel 46 time= 0.2919550 solver= 0.2909551
>>> processing panel 47 time= 0.3319499 solver= 0.3309500
>>> processing panel 48 time= 0.3719430 solver= 0.3709428
>>> processing panel 49 time= 0.4129372 solver= 0.4109371
>>> processing panel 50 time= 0.4529319 solver= 0.4509320
>>> processing panel 51 time= 0.4929240 solver= 0.4909239
>>> processing panel 52 time= 0.5339203 solver= 0.5319204
>>> processing panel 53 time= 0.5779119 solver= 0.5759130
>>> processing panel 54 time= 0.6199059 solver= 0.6179061
>>> processing panel 55 time= 0.6648979 solver= 0.6628990
>>> processing panel 56 time= 0.7248902 solver= 0.7218900
>>> processing panel 57 time= 0.7938790 solver= 0.7908792
>>> processing panel 58 time= 0.8728676 solver= 0.8698678
>>> processing panel 59 time= 0.9778509 solver= 0.9748516
>>> processing panel 60 time= 1.125830 solver= 1.122829
>>> processing panel 61 time= 1.273806 solver= 1.268806
>>> processing panel 62 time= 1.448780 solver= 1.444779
>>> processing panel 63 time= 1.647749 solver= 1.643749
>>> processing panel 64 time= 1.901712 solver= 1.896712
>>> processing panel 65 time= 2.143673 solver= 2.138674
>>> processing panel 66 time= 2.437630 solver= 2.431629
>>> processing panel 67 time= 2.744583 solver= 2.736586
>>> processing panel 68 time= 3.041536 solver= 3.035538
>>>
>>
>> The trend is the same: the time starts out small and then keeps
>> increasing.
>>
>>
>> Since I have thousands of panels for src and rec, the execution time
>> becomes unbearable as the run goes on.
>> So I am wondering whether I am using the right method, or whether there
>> is a memory issue?
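
[Illustration] The trend can be quantified directly from the numbers quoted in this message. The sketch below takes the solver times reported for panels 40 to 50 and computes the increase from one panel to the next. The variable names are hypothetical, and the arithmetic only describes the growth; it does not identify its cause.

```python
# Solver times (seconds) for panels 40..50, copied from the output above.
solver_times = [
    5.1991999e-02, 9.0986013e-02, 0.1299810, 0.1709740, 0.2109680,
    0.2519621, 0.2909551, 0.3309500, 0.3709428, 0.4109371, 0.4509320,
]

# Increase in solve time from each panel to the next.
increments = [b - a for a, b in zip(solver_times, solver_times[1:])]
mean_increment = sum(increments) / len(increments)

for panel, inc in zip(range(41, 51), increments):
    print(f"panel {panel}: +{inc:.4f} s")
print(f"mean increment: {mean_increment:.4f} s per panel")
```

Over this range every solve takes roughly 0.04 s longer than the previous one, so the cost of a solve depends on how many panels have already been processed, not on the panel itself. That pattern is consistent with something accumulating between iterations.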
>>
>> Thanks.
>>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
--
Brian Yang
U of Houston