<div dir="auto">The following is from the large matrix log: </div><div dir="auto"><br></div><div dir="auto"><div dir="auto"><?xml version="1.0" encoding="UTF-8"?></div><div dir="auto"><?xml-stylesheet type="text/xsl" href="performance_xml2html.xsl"?></div><div dir="auto"><root></div><div dir="auto"><!-- PETSc Performance Summary: --></div><div dir="auto"> <petscroot></div><div dir="auto"> <runspecification desc="Run Specification"></div><div dir="auto"> <executable desc="Executable">simpleROMFoam</executable></div><div dir="auto"> <architecture desc="Architecture">real-opt</architecture></div><div dir="auto"> <hostname desc="Host">pmultigrid</hostname></div><div dir="auto"> <nprocesses desc="Number of processes">1</nprocesses></div><div dir="auto"> <user desc="Run by user">rhalder</user></div><div dir="auto"> <date desc="Started at">Mon Nov 16 20:25:52 2020</date></div><div dir="auto"> <petscrelease desc="Petsc Release">Petsc Release Version 3.14.1, Nov 03, 2020 </petscrelease></div><div dir="auto"> </runspecification></div><div dir="auto"> <globalperformance desc="Global performance"></div><div dir="auto"> <time desc="Time (sec)"></div><div dir="auto"> <max>1.299397e+03</max></div><div dir="auto"> <maxrank desc="rank at which max was found">0</maxrank></div><div dir="auto"> <ratio>1.000000</ratio></div><div dir="auto"> <average>1.299397e+03</average></div><div dir="auto"> </time></div><div dir="auto"> <objects desc="Objects"></div><div dir="auto"> <max>9.100000e+01</max></div><div dir="auto"> <maxrank desc="rank at which max was found">0</maxrank></div><div dir="auto"> <ratio>1.000000</ratio></div><div dir="auto"> <average>9.100000e+01</average></div><div dir="auto"> </objects></div><div dir="auto"> <mflop desc="MFlop"></div><div dir="auto"> <max>0.000000e+00</max></div><div dir="auto"> <maxrank desc="rank at which max was found">0</maxrank></div><div dir="auto"> <ratio>0.000000</ratio></div><div dir="auto"> <average>0.000000e+00</average></div><div dir="auto"> 
<total>0.000000e+00</total></div><div dir="auto"> </mflop></div><div dir="auto"> <mflops desc="MFlop/sec"></div><div dir="auto"> <max>0.000000e+00</max></div><div dir="auto"> <maxrank desc="rank at which max was found">0</maxrank></div><div dir="auto"> <ratio>0.000000</ratio></div><div dir="auto"> <average>0.000000e+00</average></div><div dir="auto"> <total>0.000000e+00</total></div><div dir="auto"> </mflops></div><div dir="auto"> <messagetransfers desc="MPI Message Transfers"></div><div dir="auto"> <max>0.000000e+00</max></div><div dir="auto"> <maxrank desc="rank at which max was found">0</maxrank></div><div dir="auto"> <ratio>0.000000</ratio></div><div dir="auto"> <average>0.000000e+00</average></div><div dir="auto"> <total>0.000000e+00</total></div><div dir="auto"> </messagetransfers></div><div dir="auto"> <messagevolume desc="MPI Message Volume (MiB)"></div><div dir="auto"> <max>0.000000e+00</max></div><div dir="auto"> <maxrank desc="rank at which max was found">0</maxrank></div><div dir="auto"> <ratio>0.000000</ratio></div><div dir="auto"> <average>0.000000e+00</average></div><div dir="auto"> <total>0.000000e+00</total></div><div dir="auto"> </messagevolume></div><div dir="auto"> <reductions desc="MPI Reductions"></div><div dir="auto"> <max>0.000000e+00</max></div><div dir="auto"> <maxrank desc="rank at which max was found">0</maxrank></div><div dir="auto"> <ratio>0.000000</ratio></div><div dir="auto"> </reductions></div><div dir="auto"> </globalperformance></div><div dir="auto"> <timertree desc="Timings tree"></div><div dir="auto"> <totaltime>1299.397478</totaltime></div><div dir="auto"> <timethreshold>0.010000</timethreshold></div><div dir="auto"> <event></div><div dir="auto"> <name>SVDSolve</name></div><div dir="auto"> <time></div><div dir="auto"> <value>75.5819</value></div><div dir="auto"> </time></div><div dir="auto"> <events></div><div dir="auto"> <event></div><div dir="auto"> <name>self</name></div><div dir="auto"> <time></div><div dir="auto"> 
<value>75.3134</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>MatConvert</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.165386</value></div><div dir="auto"> </time></div><div dir="auto"> <ncalls></div><div dir="auto"> <value>3.</value></div><div dir="auto"> </ncalls></div><div dir="auto"> <events></div><div dir="auto"> <event></div><div dir="auto"> <name>self</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.165386</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> </events></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>SVDSetUp</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.102518</value></div><div dir="auto"> </time></div><div dir="auto"> <events></div><div dir="auto"> <event></div><div dir="auto"> <name>self</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0601394</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>VecSet</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0423783</value></div><div dir="auto"> </time></div><div dir="auto"> <ncalls></div><div dir="auto"> <value>4.</value></div><div dir="auto"> </ncalls></div><div dir="auto"> </event></div><div dir="auto"> </events></div><div dir="auto"> </event></div><div dir="auto"> </events></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>MatConvert</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.575872</value></div><div dir="auto"> </time></div><div dir="auto"> <events></div><div dir="auto"> <event></div><div dir="auto"> <name>self</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.575869</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> </events></div><div dir="auto"> 
</event></div><div dir="auto"> <event></div><div dir="auto"> <name>MatView</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.424561</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>BVCopy</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0288127</value></div><div dir="auto"> </time></div><div dir="auto"> <ncalls></div><div dir="auto"> <value>2000.</value></div><div dir="auto"> </ncalls></div><div dir="auto"> <events></div><div dir="auto"> <event></div><div dir="auto"> <name>VecCopy</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0284472</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> </events></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>MatAssemblyEnd</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0128941</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> </timertree></div><div dir="auto"> <selftimertable desc="Self-timings"></div><div dir="auto"> <totaltime>1299.397478</totaltime></div><div dir="auto"> <event></div><div dir="auto"> <name>SVDSolve</name></div><div dir="auto"> <time></div><div dir="auto"> <value>75.3134</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>MatConvert</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.741256</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>MatView</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.424561</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>SVDSetUp</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0601394</value></div><div dir="auto"> </time></div><div 
dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>VecSet</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0424012</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>VecCopy</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0284472</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> <event></div><div dir="auto"> <name>MatAssemblyEnd</name></div><div dir="auto"> <time></div><div dir="auto"> <value>0.0128944</value></div><div dir="auto"> </time></div><div dir="auto"> </event></div><div dir="auto"> </selftimertable></div><div dir="auto"> </petscroot></div><div dir="auto"></root></div><div dir="auto"><br></div></div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Nov 17, 2020 at 2:28 AM Jose E. Roman <<a href="mailto:jroman@dsic.upv.es">jroman@dsic.upv.es</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I cannot visualize the XML files. Please send the information in plain text.<br>
Jose<br>
<br>
<br>
> On 17 Nov 2020, at 5:33, Rakesh Halder <<a href="mailto:rhalder@umich.edu" target="_blank">rhalder@umich.edu</a>> wrote:<br>
> <br>
> Hi Jose,<br>
> <br>
> I attached two XML logs of two different SVD calculations where N ~= 140,000; first a small N x 5 matrix, and then a large N x 1000 matrix. The global timing starts before the SVD calculations. The small matrix calculation finishes very quickly (less than a second in total), while the larger one takes around 1,000 seconds. The "largeMat.xml" file shows that SVDSolve takes around 75 seconds, but when I time it myself by printing the time difference to the console, it shows around 1,000 seconds, and I'm not sure where this mismatch is coming from.<br>
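<br>
As a rough cross-check of the mismatch described above, the self-timings in the large-matrix log can be summed and compared with the reported total time. This is only a sketch: the values are copied by hand from the log, the helper name <code>unaccounted_time</code> is made up for illustration, and events below the log's 0.01 s threshold are not listed, so they are ignored here.<br>
<pre>
```python
# Self-timings (seconds) copied from the "Self-timings" table of the large-matrix log.
self_times = {
    "SVDSolve": 75.3134,
    "MatConvert": 0.741256,
    "MatView": 0.424561,
    "SVDSetUp": 0.0601394,
    "VecSet": 0.0424012,
    "VecCopy": 0.0284472,
    "MatAssemblyEnd": 0.0128944,
}
total_time = 1299.397478  # "totaltime" reported in the same log


def unaccounted_time(total, events):
    """Return the wall-clock time that is not covered by any logged event."""
    return total - sum(events.values())


logged = sum(self_times.values())
gap = unaccounted_time(total_time, self_times)
print(f"inside logged events:  {logged:.1f} s")
print(f"outside logged events: {gap:.1f} s ({100 * gap / total_time:.0f}% of total)")
```
</pre>
With these numbers, most of the run time falls outside every logged PETSc/SLEPc event. That alone does not say where the time goes; it may be spent before the SVD starts, or in code that is not instrumented.<br>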
> <br>
> This is using the ScaLAPACK SVD solver on a single processor, and I call MatConvert to convert my matrix to the MATSCALAPACK format.<br>
> <br>
> Thanks,<br>
> <br>
> Rakesh<br>
> <br>
> On Mon, Nov 16, 2020 at 2:45 AM Jose E. Roman <<a href="mailto:jroman@dsic.upv.es" target="_blank">jroman@dsic.upv.es</a>> wrote:<br>
> For Cross and TRLanczos, make sure that the matrix is stored in DENSE format, not in the default AIJ format. On the other hand, these solvers build the transpose matrix explicitly, which is bad for dense matrices in parallel. Try using SVDSetImplicitTranspose(); this will also save memory.<br>
> <br>
> For SCALAPACK, it is better if the matrix is passed in the MATSCALAPACK format already, otherwise the solver must convert it internally. Still, the matrix of singular vectors must be converted after computation.<br>
> <br>
> In any case, performance questions should include information from -log_view so that we have a better idea of what is going on.<br>
> <br>
> Jose<br>
> <br>
> <br>
> > On 16 Nov 2020, at 6:04, Rakesh Halder <<a href="mailto:rhalder@umich.edu" target="_blank">rhalder@umich.edu</a>> wrote:<br>
> > <br>
> > Hi Jose,<br>
> > <br>
> > I'm only interested in part of the singular triplets, so those algorithms work for me. I tried using ScaLAPACK and it gives similar performance to Lanczos and Cross, so it's still very slow. I'm still having memory issues with LAPACK, and Elemental is giving me an error message indicating that the operation isn't supported for rectangular matrices.<br>
> > <br>
> > With regard to ScaLAPACK or any other solver, I'm wondering if there are some settings to use with the SVD object to ensure optimal performance.<br>
> > <br>
> > Thanks,<br>
> > <br>
> > Rakesh<br>
> > <br>
> > On Sun, Nov 15, 2020 at 2:59 PM Jose E. Roman <<a href="mailto:jroman@dsic.upv.es" target="_blank">jroman@dsic.upv.es</a>> wrote:<br>
> > Rakesh,<br>
> > <br>
> > The solvers you mention are not intended for computing the full SVD, only part of the singular triplets. In the latest version (3.14) there are now solvers that wrap external packages for parallel dense computations: ScaLAPACK and Elemental.<br>
> > <br>
> > Jose<br>
> > <br>
> > <br>
> > > On 15 Nov 2020, at 20:48, Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>> wrote:<br>
> > > <br>
> > > On Sun, Nov 15, 2020 at 2:18 PM Rakesh Halder <<a href="mailto:rhalder@umich.edu" target="_blank">rhalder@umich.edu</a>> wrote:<br>
> > > Hi all,<br>
> > > <br>
> > > A program I'm writing involves calculating the SVD of a large, dense N by n matrix (N ~= 150,000, n ~= 10,000). I've used the different SVD solvers available through SLEPc, including the cross product, Lanczos, and the method available through the LAPACK library. The cross product and Lanczos methods take a very long time to compute the SVD (around 7-8 hours on one processor), while the solver using the LAPACK library runs out of memory. If I write this matrix to a file and compute the SVD using MATLAB or Python (NumPy), it takes around 10 minutes. I'm wondering if there's a much cheaper way to compute the SVD.<br>
> > > <br>
> > > This seems suspicious, since I know NumPy just calls LAPACK, and I am fairly sure that MATLAB does as well. Do the machines that you<br>
> > > are running on have different amounts of RAM?<br>
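<br>
To put the RAM question in numbers, here is a back-of-the-envelope estimate of the storage for a dense SVD at the sizes quoted above. It is only a sketch: it assumes real double-precision entries (8 bytes each) and a thin SVD, and it ignores LAPACK workspace and any extra copies the solver makes. The helper name <code>dense_gib</code> is made up for illustration.<br>
<pre>
```python
BYTES_PER_DOUBLE = 8  # assumption: real double-precision entries


def dense_gib(rows, cols, bytes_per_entry=BYTES_PER_DOUBLE):
    """Memory needed to store a dense rows x cols matrix, in GiB."""
    return rows * cols * bytes_per_entry / 2**30


N, n = 150_000, 10_000  # approximate sizes quoted in the original message

a_gib = dense_gib(N, n)  # the matrix itself
u_gib = dense_gib(N, n)  # thin left singular vectors, same shape as the matrix
v_gib = dense_gib(n, n)  # right singular vectors
total_gib = a_gib + u_gib + v_gib

print(f"matrix A: {a_gib:.1f} GiB")
print(f"thin U:   {u_gib:.1f} GiB")
print(f"V:        {v_gib:.2f} GiB")
print(f"total:    {total_gib:.1f} GiB (before any workspace)")
```
</pre>
Under these assumptions the matrix alone needs about 11 GiB, and the matrix plus both sets of singular vectors about 23 GiB, so whether a dense solver fits depends heavily on how much RAM the machine has.<br>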
> > > <br>
> > > Thanks,<br>
> > > <br>
> > > Matt<br>
> > > <br>
> > > Thanks,<br>
> > > <br>
> > > Rakesh<br>
> > > <br>
> > > <br>
> > > -- <br>
> > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
> > > -- Norbert Wiener<br>
> > > <br>
> > > <a href="https://www.cse.buffalo.edu/~knepley/" rel="noreferrer" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br>
> > <br>
> <br>
> <largeMat.xml><smallMat.xml><br>
<br>
</blockquote></div></div>