[petsc-users] SLEPc EPSGD: too much time in single iteration

Wed Jun 15 03:09:01 CDT 2022

Are you comparing two different codes on two different machines, or is it the same machine? The two runs also use different numbers of processes and different solver options.

If it is the same machine, the performance seems very different:

Matrix A:
Average time for MPI_Barrier(): 1.90986e-05
Average time for zero size MPI_Send(): 3.44587e-06

Matrix B:
Average time for MPI_Barrier(): 0.0578456
Average time for zero size MPI_Send(): 0.00358668

The reductions (VecReduceComm) take 2.1629e-01 s and 2.4972e+01 s, respectively. That is a difference of two orders of magnitude.
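
To put numbers on that gap, here is a small sketch in Python (not taken from the logs themselves; the variable and function names are only illustrative) that computes the B/A ratio for each of the figures quoted above. It suggests the MPI baseline timings differ by roughly three orders of magnitude and the reductions by roughly two.

```python
import math

# Timings in seconds, copied from the two -log_view outputs quoted above.
# The dictionary and function names are illustrative, not PETSc identifiers.
timings = {
    "MPI_Barrier": {"A": 1.90986e-05, "B": 0.0578456},
    "zero-size MPI_Send": {"A": 3.44587e-06, "B": 0.00358668},
    "VecReduceComm": {"A": 2.1629e-01, "B": 2.4972e+01},
}


def slowdown(entry):
    """Return how many times slower run B is than run A for one event."""
    return entry["B"] / entry["A"]


for name, entry in timings.items():
    r = slowdown(entry)
    # log10 of the ratio gives the gap in orders of magnitude.
    print(f"{name}: B/A = {r:.1f} ({math.log10(r):.1f} orders of magnitude)")
```

If the two runs really share a machine, ratios this large in the zero-size message and barrier timings point at the communication setup rather than at the eigensolver.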

Jose

> On 15 Jun 2022, at 8:58, Runfeng Jin <jsfaraway at gmail.com> wrote:
> 
> Sorry, I forgot the attachment.
> 
> Runfeng Jin
> 
> On Wed, 15 Jun 2022 at 14:56, Runfeng Jin <jsfaraway at gmail.com> wrote:
> Hi! You are right! I tried SLEPc and PETSc builds without debugging, and the solve time for matrix B dropped to 99 s. It is still higher than for matrix A (8 s). As before, the attachments are the log views, now from the no-debug builds:
>    file 1: log of the matrix A solve. This is the larger matrix (900,000 x 900,000), but it is solved quickly (8 s);
>    file 2: log of the matrix B solve. This is the smaller matrix (2,547 x 2,547), but it is solved much more slowly (99 s).
> 
> Comparing these two files, the strange behavior is still there:
> 1) Matrix A has more basis vectors (375) than B (189), but A spent less time in BVCreate (0.6 s) than B (32 s);
> 2) Matrix A spent less time in EPSSetUp (0.015 s) than B (0.9 s);
> 3) In the debug build, matrix B's storage is distributed much more unevenly among the processes (memory max/min ratio 4365) than A's (memory max/min ratio 1.113), while the other metrics look more balanced. The no-debug build does not print any memory information.
> 
> The significant differences I can identify are: 1) B uses preallocation; 2) A's matrix elements are computed on the CPU, while B's matrix elements are computed on the GPU, then transferred to the CPU and solved by PETSc on the CPU.
> 
> Is this a normal result? I mean, can a matrix with fewer nonzero elements and a smaller dimension take more EPSSolve time? Is this due to the structure of the matrix? If so, is there any way to speed up the solve?
> 
> Or is this abnormal, and something that should be fixed in some way?
> Thank you!
> 
> Runfeng Jin
>   
> 
> On Sun, 12 Jun 2022 at 16:08, Jose E. Roman <jroman at dsic.upv.es> wrote:
> Please always respond to the list.
> 
> Pay attention to the warnings in the log:
> 
>       ##########################################################
>       #                                                        #
>       #                       WARNING!!!                       #
>       #                                                        #
>       #   This code was compiled with a debugging option.      #
>       #   To get timing results run ./configure                #
>       #   using --with-debugging=no, the performance will      #
>       #   be generally two or three times faster.              #
>       #                                                        #
>       ##########################################################
> 
> With the debugging option the times are not trustworthy, so I suggest repeating the analysis with an optimized build.
> 
> Jose
> 
> 
> > On 12 Jun 2022, at 5:41, Runfeng Jin <jsfaraway at gmail.com> wrote:
> > 
> > Hello!
> >  I compared the log views of the two matrix solves and found some strange things. The attached files are the log views:
> >    file 1: log of the matrix A solve. This is the larger matrix (900,000 x 900,000), but it is solved quickly (30 s);
> >    file 2: log of the matrix B solve. This is a smaller matrix (2,547 x 2,547), but it is solved much more slowly (1244 s). It is slightly different from the matrix B mentioned in the initial email, which was also solved much more slowly; I use this one for quicker tests.
> > 
> > Comparing these two files, I found the following:
> > 1) Matrix A has more basis vectors (375) than B (189), but A spent less time in BVCreate (0.349 s) than B (296 s);
> > 2) Matrix A spent less time in EPSSetUp (0.031 s) than B (10.709 s);
> > 3) Matrix B's storage is distributed much more unevenly among the processes (memory max/min ratio 4365) than A's (memory max/min ratio 1.113), while the other metrics look more balanced.
> > 
> > I don't do preallocation for A, and it is distributed across processes by PETSc. For B, during preallocation I use PetscSplitOwnership to decide which part belongs to the local process, and B is also distributed by PETSc when the matrix values are computed.
> > 
> > - Does this mean that, for matrix B, too many nonzero elements are stored on a single process, and that this is why solving the eigenproblem takes so much longer? If so, are there better ways to distribute the matrix among processes?
> > - Or are there other reasons for this difference in run time?
> > 
> > I look forward to your reply, thank you!
> > 
> > Runfeng Jin
> > 
> > 
> > 
> > On Sat, 11 Jun 2022 at 20:33, Runfeng Jin <jsfaraway at gmail.com> wrote:
> > Hello!
> > I have tried using PETSC_DEFAULT for eps_ncv, but it still takes a long time. Is there anything else I can do? The attachment is the log when PETSC_DEFAULT is used for eps_ncv.
> > 
> > Thank you !
> > 
> > Runfeng Jin
> > 
> > On Fri, 10 Jun 2022 at 20:50, Jose E. Roman <jroman at dsic.upv.es> wrote:
> > The value -eps_ncv 5000 is huge.
> > Better let SLEPc use the default value.
> > 
> > Jose
> > 
> > 
> > > On 10 Jun 2022, at 14:24, Jin Runfeng <jsfaraway at gmail.com> wrote:
> > > 
> > > Hello!
> > >  I want to compute the 3 smallest eigenvalues, and the attachment is the log view output. I can see that EPSSolve accounts for most of the time, but I cannot see why it takes so long. Can you tell anything from it?
> > > 
> > > Thank you !
> > > 
> > > Runfeng Jin
> > > 
> > > On Jun 4 2022, at 1:37 am, Jose E. Roman <jroman at dsic.upv.es> wrote:
> > > Convergence depends on the distribution of the eigenvalues you want to compute. On the other hand, the cost also depends on the time it takes to build the preconditioner. Use -log_view to see the cost of the different steps of the computation.
> > > 
> > > Jose
> > > 
> > > 
> > > > On 3 Jun 2022, at 18:50, jsfaraway <jsfaraway at gmail.com> wrote:
> > > >
> > > > Hello!
> > > >
> > > > I am trying to use EPSGD to compute the smallest eigenvalue of a matrix, and I found something strange. There are two matrices, A (900000 x 900000) and B (90000 x 90000). Solving A takes 371 iterations and only 30.83 s, while solving B takes 22 iterations and 38885 s! What could be the reason for this? Or what can I do to find the reason?
> > > >
> > > > I use "-eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real".
> > > > One difference I can see is that matrix B has many small values, with absolute value less than 10^-6. Could this be the reason?
> > > >
> > > > Thank you!
> > > >
> > > > Runfeng Jin
> > > <log_view.txt>
> > 
> > <File2_lower-But-Smaller-Matrix.txt><File1_fatesr-But-Larger-MATRIX.txt>
> 
> <file2_nodebug_MatrixB.txt><file1_nodebug_MatrixA.txt>