[petsc-users] Very slow SVD with SLEPC
Rakesh Halder
rhalder at umich.edu
Tue Nov 17 01:31:12 CST 2020
And this output is from the small matrix log:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="performance_xml2html.xsl"?>
<root>
<!-- PETSc Performance Summary: -->
<petscroot>
<runspecification desc="Run Specification">
<executable desc="Executable">simpleROMFoam</executable>
<architecture desc="Architecture">real-opt</architecture>
<hostname desc="Host">pmultigrid</hostname>
<nprocesses desc="Number of processes">1</nprocesses>
<user desc="Run by user">rhalder</user>
<date desc="Started at">Mon Nov 16 20:40:01 2020</date>
<petscrelease desc="Petsc Release">Petsc Release Version 3.14.1, Nov 03, 2020 </petscrelease>
</runspecification>
<globalperformance desc="Global performance">
<time desc="Time (sec)">
<max>2.030551e+02</max>
<maxrank desc="rank at which max was found">0</maxrank>
<ratio>1.000000</ratio>
<average>2.030551e+02</average>
</time>
<objects desc="Objects">
<max>5.300000e+01</max>
<maxrank desc="rank at which max was found">0</maxrank>
<ratio>1.000000</ratio>
<average>5.300000e+01</average>
</objects>
<mflop desc="MFlop">
<max>0.000000e+00</max>
<maxrank desc="rank at which max was found">0</maxrank>
<ratio>0.000000</ratio>
<average>0.000000e+00</average>
<total>0.000000e+00</total>
</mflop>
<mflops desc="MFlop/sec">
<max>0.000000e+00</max>
<maxrank desc="rank at which max was found">0</maxrank>
<ratio>0.000000</ratio>
<average>0.000000e+00</average>
<total>0.000000e+00</total>
</mflops>
<messagetransfers desc="MPI Message Transfers">
<max>0.000000e+00</max>
<maxrank desc="rank at which max was found">0</maxrank>
<ratio>0.000000</ratio>
<average>0.000000e+00</average>
<total>0.000000e+00</total>
</messagetransfers>
<messagevolume desc="MPI Message Volume (MiB)">
<max>0.000000e+00</max>
<maxrank desc="rank at which max was found">0</maxrank>
<ratio>0.000000</ratio>
<average>0.000000e+00</average>
<total>0.000000e+00</total>
</messagevolume>
<reductions desc="MPI Reductions">
<max>0.000000e+00</max>
<maxrank desc="rank at which max was found">0</maxrank>
<ratio>0.000000</ratio>
</reductions>
</globalperformance>
<timertree desc="Timings tree">
<totaltime>203.055134</totaltime>
<timethreshold>0.010000</timethreshold>
<event>
<name>MatConvert</name>
<time>
<value>0.0297699</value>
</time>
<events>
<event>
<name>self</name>
<time>
<value>0.029759</value>
</time>
</event>
</events>
</event>
<event>
<name>SVDSolve</name>
<time>
<value>0.0242731</value>
</time>
<events>
<event>
<name>self</name>
<time>
<value>0.0181869</value>
</time>
</event>
</events>
</event>
<event>
<name>MatView</name>
<time>
<value>0.0138235</value>
</time>
</event>
</timertree>
<selftimertable desc="Self-timings">
<totaltime>203.055134</totaltime>
<event>
<name>MatConvert</name>
<time>
<value>0.0324545</value>
</time>
</event>
<event>
<name>SVDSolve</name>
<time>
<value>0.0181869</value>
</time>
</event>
<event>
<name>MatView</name>
<time>
<value>0.0138235</value>
</time>
</event>
</selftimertable>
</petscroot>
</root>
On Tue, Nov 17, 2020 at 2:30 AM Rakesh Halder <rhalder at umich.edu> wrote:
> The following is from the large matrix log:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-stylesheet type="text/xsl" href="performance_xml2html.xsl"?>
> <root>
> <!-- PETSc Performance Summary: -->
> <petscroot>
> <runspecification desc="Run Specification">
> <executable desc="Executable">simpleROMFoam</executable>
> <architecture desc="Architecture">real-opt</architecture>
> <hostname desc="Host">pmultigrid</hostname>
> <nprocesses desc="Number of processes">1</nprocesses>
> <user desc="Run by user">rhalder</user>
> <date desc="Started at">Mon Nov 16 20:25:52 2020</date>
> <petscrelease desc="Petsc Release">Petsc Release Version 3.14.1, Nov 03, 2020 </petscrelease>
> </runspecification>
> <globalperformance desc="Global performance">
> <time desc="Time (sec)">
> <max>1.299397e+03</max>
> <maxrank desc="rank at which max was found">0</maxrank>
> <ratio>1.000000</ratio>
> <average>1.299397e+03</average>
> </time>
> <objects desc="Objects">
> <max>9.100000e+01</max>
> <maxrank desc="rank at which max was found">0</maxrank>
> <ratio>1.000000</ratio>
> <average>9.100000e+01</average>
> </objects>
> <mflop desc="MFlop">
> <max>0.000000e+00</max>
> <maxrank desc="rank at which max was found">0</maxrank>
> <ratio>0.000000</ratio>
> <average>0.000000e+00</average>
> <total>0.000000e+00</total>
> </mflop>
> <mflops desc="MFlop/sec">
> <max>0.000000e+00</max>
> <maxrank desc="rank at which max was found">0</maxrank>
> <ratio>0.000000</ratio>
> <average>0.000000e+00</average>
> <total>0.000000e+00</total>
> </mflops>
> <messagetransfers desc="MPI Message Transfers">
> <max>0.000000e+00</max>
> <maxrank desc="rank at which max was found">0</maxrank>
> <ratio>0.000000</ratio>
> <average>0.000000e+00</average>
> <total>0.000000e+00</total>
> </messagetransfers>
> <messagevolume desc="MPI Message Volume (MiB)">
> <max>0.000000e+00</max>
> <maxrank desc="rank at which max was found">0</maxrank>
> <ratio>0.000000</ratio>
> <average>0.000000e+00</average>
> <total>0.000000e+00</total>
> </messagevolume>
> <reductions desc="MPI Reductions">
> <max>0.000000e+00</max>
> <maxrank desc="rank at which max was found">0</maxrank>
> <ratio>0.000000</ratio>
> </reductions>
> </globalperformance>
> <timertree desc="Timings tree">
> <totaltime>1299.397478</totaltime>
> <timethreshold>0.010000</timethreshold>
> <event>
> <name>SVDSolve</name>
> <time>
> <value>75.5819</value>
> </time>
> <events>
> <event>
> <name>self</name>
> <time>
> <value>75.3134</value>
> </time>
> </event>
> <event>
> <name>MatConvert</name>
> <time>
> <value>0.165386</value>
> </time>
> <ncalls>
> <value>3.</value>
> </ncalls>
> <events>
> <event>
> <name>self</name>
> <time>
> <value>0.165386</value>
> </time>
> </event>
> </events>
> </event>
> <event>
> <name>SVDSetUp</name>
> <time>
> <value>0.102518</value>
> </time>
> <events>
> <event>
> <name>self</name>
> <time>
> <value>0.0601394</value>
> </time>
> </event>
> <event>
> <name>VecSet</name>
> <time>
> <value>0.0423783</value>
> </time>
> <ncalls>
> <value>4.</value>
> </ncalls>
> </event>
> </events>
> </event>
> </events>
> </event>
> <event>
> <name>MatConvert</name>
> <time>
> <value>0.575872</value>
> </time>
> <events>
> <event>
> <name>self</name>
> <time>
> <value>0.575869</value>
> </time>
> </event>
> </events>
> </event>
> <event>
> <name>MatView</name>
> <time>
> <value>0.424561</value>
> </time>
> </event>
> <event>
> <name>BVCopy</name>
> <time>
> <value>0.0288127</value>
> </time>
> <ncalls>
> <value>2000.</value>
> </ncalls>
> <events>
> <event>
> <name>VecCopy</name>
> <time>
> <value>0.0284472</value>
> </time>
> </event>
> </events>
> </event>
> <event>
> <name>MatAssemblyEnd</name>
> <time>
> <value>0.0128941</value>
> </time>
> </event>
> </timertree>
> <selftimertable desc="Self-timings">
> <totaltime>1299.397478</totaltime>
> <event>
> <name>SVDSolve</name>
> <time>
> <value>75.3134</value>
> </time>
> </event>
> <event>
> <name>MatConvert</name>
> <time>
> <value>0.741256</value>
> </time>
> </event>
> <event>
> <name>MatView</name>
> <time>
> <value>0.424561</value>
> </time>
> </event>
> <event>
> <name>SVDSetUp</name>
> <time>
> <value>0.0601394</value>
> </time>
> </event>
> <event>
> <name>VecSet</name>
> <time>
> <value>0.0424012</value>
> </time>
> </event>
> <event>
> <name>VecCopy</name>
> <time>
> <value>0.0284472</value>
> </time>
> </event>
> <event>
> <name>MatAssemblyEnd</name>
> <time>
> <value>0.0128944</value>
> </time>
> </event>
> </selftimertable>
> </petscroot>
> </root>
>
>
> On Tue, Nov 17, 2020 at 2:28 AM Jose E. Roman <jroman at dsic.upv.es> wrote:
>
>> I cannot visualize the XML files. Please send the information in plain
>> text.
>> Jose
>>
>>
>> > El 17 nov 2020, a las 5:33, Rakesh Halder <rhalder at umich.edu> escribió:
>> >
>> > Hi Jose,
>> >
>> > I attached two XML logs of two different SVD calculations where N ~=
>> 140,000; first a small N x 5 matrix, and then a large N x 1000 matrix. The
>> global timing starts before the SVD calculations. The small matrix
>> calculation happens very quickly in total (less than a second), while the
>> larger one takes around 1,000 seconds. The "largeMat.xml" file shows that
>> SVDSolve takes around 75 seconds, but when I time it myself by outputting
>> the time difference to the console, it shows that it takes around 1,000
>> seconds, and I'm not sure where this mismatch is coming from.
>> >
>> > This is using the scaLAPACK SVD solver on a single processor, and I
>> call MatConvert to convert my matrix to the MATSCALAPACK format.
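>> >
>> > A minimal sketch of that conversion, plus manual wall-clock timing around
>> > SVDSolve(), might look as follows (names like A are placeholders; this
>> > assumes SLEPc 3.14, where the single-matrix SVDSetOperator() is used):
>> >
>> > #include <slepcsvd.h>
>> > #include <petsctime.h>
>> >
>> > /* A is an already-assembled N x n matrix (placeholder) */
>> > PetscErrorCode SolveWithScaLAPACK(Mat A)
>> > {
>> >   PetscErrorCode ierr;
>> >   Mat            Asc;
>> >   SVD            svd;
>> >   PetscLogDouble t0, t1;
>> >
>> >   /* convert to the ScaLAPACK storage format expected by the solver */
>> >   ierr = MatConvert(A, MATSCALAPACK, MAT_INITIAL_MATRIX, &Asc);CHKERRQ(ierr);
>> >
>> >   ierr = SVDCreate(PetscObjectComm((PetscObject)A), &svd);CHKERRQ(ierr);
>> >   ierr = SVDSetOperator(svd, Asc);CHKERRQ(ierr);
>> >   ierr = SVDSetType(svd, SVDSCALAPACK);CHKERRQ(ierr);
>> >   ierr = SVDSetFromOptions(svd);CHKERRQ(ierr);
>> >
>> >   /* time only the SVDSolve() call, to compare against the -log_view numbers */
>> >   ierr = PetscTime(&t0);CHKERRQ(ierr);
>> >   ierr = SVDSolve(svd);CHKERRQ(ierr);
>> >   ierr = PetscTime(&t1);CHKERRQ(ierr);
>> >   ierr = PetscPrintf(PETSC_COMM_WORLD, "SVDSolve wall time: %g s\n",
>> >                      (double)(t1 - t0));CHKERRQ(ierr);
>> >
>> >   ierr = SVDDestroy(&svd);CHKERRQ(ierr);
>> >   ierr = MatDestroy(&Asc);CHKERRQ(ierr);
>> >   return 0;
>> > }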
>> >
>> > Thanks,
>> >
>> > Rakesh
>> >
>> > On Mon, Nov 16, 2020 at 2:45 AM Jose E. Roman <jroman at dsic.upv.es>
>> wrote:
>> > For Cross and TRLanczos, make sure that the matrix is stored in DENSE
>> format, not in the default AIJ format. On the other hand, these solvers
>> build the transpose matrix explicitly, which is bad for dense matrices in
>> parallel. Try using SVDSetImplicitTranspose(); this will also save memory.
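>> >
>> > A minimal sketch of both suggestions (A and svd are placeholders for an
>> > assembled matrix and an existing SVD context):
>> >
>> >   Mat            Adense;
>> >   PetscErrorCode ierr;
>> >
>> >   /* store the operator as a dense matrix rather than the default AIJ */
>> >   ierr = MatConvert(A, MATDENSE, MAT_INITIAL_MATRIX, &Adense);CHKERRQ(ierr);
>> >   ierr = SVDSetOperator(svd, Adense);CHKERRQ(ierr);
>> >   ierr = SVDSetType(svd, SVDTRLANCZOS);CHKERRQ(ierr);   /* or SVDCROSS */
>> >   /* work with A^T implicitly instead of building it, saving memory */
>> >   ierr = SVDSetImplicitTranspose(svd, PETSC_TRUE);CHKERRQ(ierr);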
>> >
>> > For SCALAPACK, it is better if the matrix is passed in the MATSCALAPACK
>> format already, otherwise the solver must convert it internally. Still, the
>> matrix of singular vectors must be converted after computation.
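>> >
>> > A sketch of creating the matrix in MATSCALAPACK format from the start, so
>> > that no internal conversion is needed (this assumes the MatCreateScaLAPACK()
>> > convenience constructor with default block sizes; N, n and the fill step are
>> > placeholders):
>> >
>> >   Mat            A;
>> >   PetscErrorCode ierr;
>> >
>> >   ierr = MatCreateScaLAPACK(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
>> >                             N, n, 0, 0, &A);CHKERRQ(ierr);
>> >   /* ... set entries with MatSetValues() ... */
>> >   ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>> >   ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);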
>> >
>> > In any case, performance questions should include information from
>> -log_view so that we have a better idea of what is going on.
>> >
>> > Jose
>> >
>> >
>> > > El 16 nov 2020, a las 6:04, Rakesh Halder <rhalder at umich.edu>
>> escribió:
>> > >
>> > > Hi Jose,
>> > >
>> > > I'm only interested in part of the singular triplets, so those
>> algorithms work for me. I tried using ScaLAPACK and it gives similar
>> performance to Lanczos and Cross, so it's still very slow... I'm still
>> having memory issues with LAPACK, and Elemental is giving me an error
>> message indicating that the operation isn't supported for rectangular
>> matrices.
>> > >
>> > > With regard to scaLAPACK or any other solver, I'm wondering if
>> there are some settings to use with the SVD object to ensure optimal
>> performance.
>> > >
>> > > Thanks,
>> > >
>> > > Rakesh
>> > >
>> > > On Sun, Nov 15, 2020 at 2:59 PM Jose E. Roman <jroman at dsic.upv.es>
>> wrote:
>> > > Rakesh,
>> > >
>> > > The solvers you mention are not intended for computing the full SVD,
>> only part of the singular triplets. In the latest version (3.14) there are
>> now solvers that wrap external packages for parallel dense computations:
>> ScaLAPACK and Elemental.
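>> > >
>> > > For the iterative solvers mentioned above, a sketch of the kind of settings
>> > > that matter when only part of the singular triplets is needed (the values
>> > > are placeholders; the dense SCALAPACK/LAPACK solvers compute the full
>> > > decomposition regardless):
>> > >
>> > >   ierr = SVDSetWhichSingularTriplets(svd, SVD_LARGEST);CHKERRQ(ierr);
>> > >   /* nsv = number of singular triplets wanted; ncv, mpd left at defaults */
>> > >   ierr = SVDSetDimensions(svd, 20, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
>> > >   ierr = SVDSetTolerances(svd, 1e-8, PETSC_DEFAULT);CHKERRQ(ierr);  /* tol, maxits */
>> > >   ierr = SVDSetFromOptions(svd);CHKERRQ(ierr);  /* honors -svd_nsv, -svd_type, etc. */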
>> > >
>> > > Jose
>> > >
>> > >
>> > > > El 15 nov 2020, a las 20:48, Matthew Knepley <knepley at gmail.com>
>> escribió:
>> > > >
>> > > > On Sun, Nov 15, 2020 at 2:18 PM Rakesh Halder <rhalder at umich.edu>
>> wrote:
>> > > > Hi all,
>> > > >
>> > > > A program I'm writing involves calculating the SVD of a large,
>> dense N by n matrix (N ~= 150,000, n ~= 10,000). I've used the different SVD
>> solvers available through SLEPc, including the cross product, Lanczos, and the
>> method available through the LAPACK library. The cross product and Lanczos
>> methods take a very long time to compute the SVD (around 7-8 hours on one
>> processor) while the solver using the LAPACK library runs out of memory. If
>> I write this matrix to a file and solve the SVD using MATLAB or python
>> (numPy) it takes around 10 minutes. I'm wondering if there's a much cheaper
>> way to solve the SVD.
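>> > > >
>> > > > One way to write the matrix out for such an external comparison could be
>> > > > PETSc's binary viewer (the file name below is a placeholder; the file can be
>> > > > read back with the PetscBinaryIO Python module or the MATLAB scripts shipped
>> > > > with PETSc):
>> > > >
>> > > >   PetscViewer    viewer;
>> > > >   PetscErrorCode ierr;
>> > > >
>> > > >   ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "snapshots.dat",
>> > > >                                FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
>> > > >   ierr = MatView(A, viewer);CHKERRQ(ierr);
>> > > >   ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);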
>> > > >
>> > > > This seems suspicious, since I know numpy just calls LAPACK, and I
>> am fairly sure that Matlab does as well. Do the machines that you
>> > > > are running on have different amounts of RAM?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Matt
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Rakesh
>> > > >
>> > > >
>> > > > --
>> > > > What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> > > > -- Norbert Wiener
>> > > >
>> > > > https://www.cse.buffalo.edu/~knepley/
>> > >
>> >
>> > <largeMat.xml><smallMat.xml>
>>
>>