[petsc-users] SLEPc eigensolver that uses minimal memory and finds ALL eigenvalues of a real symmetric sparse matrix in reasonable time

Shitij Bhargava shitij.cse at gmail.com
Thu Aug 11 02:23:05 CDT 2011


Thank you both for your replies.

Actually, Barry, you are right. On eight CPUs, as it turns out, it does not
take as long as I imagined. For a 9600x9600 matrix, solving for the largest
half of the eigenvalues took about 300 minutes. Solving for the smallest half
would have taken much longer (I had to kill that run after 600 minutes). This
is still much, much longer than the LAPACK method, which calculates all the
eigenvalues at once in about 90 minutes, but at the cost of a lot of memory
that cannot even be distributed -- which is unacceptable. (Also, I suppose you
probably didn't know that I have to calculate ALL the eigenvalues.) Still, I
was expecting that it wouldn't complete even in days, which is why I said it
would take a very long time. I forgot to take into account that it was now
running nearly 8 times faster. (In one earlier instance, running the same
program on a single process without MPI, I had to kill it after it had run for
over 13 hours without completing!)
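
(For reference, the solver settings being compared above look roughly like
this. It is only a sketch in C, not my actual Fortran code: the matrix A is
assumed to be the already-assembled 9600x9600 symmetric MPIAIJ matrix, and
nev = 4800 simply stands in for "half the eigenvalues".)

    #include <slepceps.h>

    Mat      A;          /* assumed: created, preallocated and assembled elsewhere */
    EPS      eps;
    PetscInt nev = 4800; /* illustrative value for "half the eigenvalues" */

    EPSCreate(PETSC_COMM_WORLD, &eps);
    EPSSetOperators(eps, A, PETSC_NULL);            /* standard (not generalized) problem */
    EPSSetProblemType(eps, EPS_HEP);                /* real symmetric -> Hermitian type */
    EPSSetType(eps, EPSKRYLOVSCHUR);                /* the default solver anyway */
    EPSSetWhichEigenpairs(eps, EPS_LARGEST_REAL);   /* or EPS_SMALLEST_REAL for the other half */
    EPSSetDimensions(eps, nev, PETSC_DECIDE, PETSC_DECIDE);  /* nev, ncv, mpd */
    EPSSolve(eps);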

And yes, I am creating all the objects with PETSC_COMM_WORLD only. Also, top
shows me eight processes running at nearly 100% CPU during the call to
EPSSolve(). (I suppose that verifies what I am saying? Though I am not
sure...)
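
(Concretely, "all the objects with PETSC_COMM_WORLD" means something like the
following C sketch, with placeholder sizes rather than the real code:)

    Mat A;
    EPS eps;

    /* Both the matrix and the solver live on PETSC_COMM_WORLD, so every rank
       started by mpirun takes part in EPSSolve().  Using PETSC_COMM_SELF here
       would instead give each rank its own private, redundant copy. */
    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 9600, 9600);
    MatSetType(A, MATMPIAIJ);
    /* ... preallocation, value insertion and assembly go here ... */

    EPSCreate(PETSC_COMM_WORLD, &eps);
    EPSSetOperators(eps, A, PETSC_NULL);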

I have done the preallocation carefully and verified that that part doesn't
take any extra time. So when I referred to "time" above, I meant the time
taken by the eigensolver only (I was already satisfied with the time taken
for matrix generation and assembly).
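
(The preallocation I mean is of this form, sketched in C; d_nnz and o_nnz are
placeholders for whatever the actual sparsity structure dictates:)

    /* Assumed: for each local row i, d_nnz[i] and o_nnz[i] give the number of
       nonzeros inside and outside this rank's diagonal block.  Exact counts
       mean no mallocs happen during MatSetValues(), which is what keeps the
       assembly cheap. */
    PetscInt *d_nnz, *o_nnz;
    /* ... allocate and fill d_nnz[] and o_nnz[] from the known structure ... */
    MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz);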

At the moment each process is not generating only its own part of the matrix.
I am doing things in a rather lousy way right now: each process generates all
the rows of the matrix but inserts only the rows that fall into its ownership
range (see the sketch below). As I said, though, matrix generation and
assembly are not the time bottleneck in the program; the eigensolver is. I am
satisfied with the time matrix generation and assembly take, although I
understand that ideally the generation itself should also be done in parallel.
My time for this summer training has ended, so I'll probably fix that later.
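
(The insertion pattern I am describing is roughly the following; computerow()
is a hypothetical placeholder for however the rows are actually produced:)

    PetscInt rstart, rend, i;

    /* Every rank runs the same generation loop over all 9600 rows, but only
       hands the rows in [rstart, rend) to MatSetValues().  Rows set by a rank
       that does not own them would otherwise be stashed and communicated
       during assembly. */
    MatGetOwnershipRange(A, &rstart, &rend);
    for (i = 0; i < 9600; i++) {
      PetscInt    ncols, *cols;
      PetscScalar *vals;
      computerow(i, &ncols, &cols, &vals);   /* hypothetical row generator */
      if (i >= rstart && i < rend) {
        MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);
      }
    }
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);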

The amount of work I have been able to accomplish in this training has been
good because of your help. I might well have stayed stuck on some error
otherwise.


Thank you very much once again !!

Shitij

On 10 August 2011 02:02, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
> On Aug 9, 2011, at 2:54 AM, Shitij Bhargava wrote:
>
> > Thanks Jose, Barry.
> >
> > I tried what you said, but that gives me an error:
> >
> > [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> > [0]PETSC ERROR: Argument out of range!
> > [0]PETSC ERROR: Can only get local values, trying 9!
> >
> >  This is probably because here I am trying to insert all the rows of the
> matrix through process 0, but process 0 doesn't own all the rows.
> >
> > In any case, this seems very "unnatural", so I am using MPIAIJ the right
> way as you said, assembling the MPIAIJ matrix in parallel instead of
> only on one process. I have done that, actually, and am running the code on
> the cluster right now. It's going to take a long, long time to finish,
>
>
>    It shouldn't take a long time to finish. Are you sure you are creating
> all the objects with the PETSC_COMM_WORLD and not PETSC_COMM_SELF? Have you
> done the correct matrix preallocation
> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly?
>   Is each process generating just its part of the matrix?
>
>  Barry
>
>
> > > so I can't confirm some of my doubts, which I am asking below:
> >
> > > 1. If I run the code with 1 process, and say it takes M memory (peak)
> while solving for eigenvalues, then when I run it with N processes, each
> will take nearly M/N memory (peak) (probably a little more), right? And for
> doing this, I don't have to use any special MPI stuff... the fact that I am
> using MPIAIJ, and building the EPS object from it, and then calling
> EPSSolve() is enough? I mean, EPSSolve() internally distributes memory and
> computational effort automatically when I use MPIAIJ and run the code with
> many processes, right?
> > This confusion is there because when I use top while running the code
> with 8 processes, each of them showed nearly 250 MB initially, but each
> has grown to use 270 MB in about 70 minutes. I understand that the
> Krylov-Schur method is such that memory requirements increase slowly, but
> the peak on any process will be less than if I ran only one process,
> right? (Even though their memory requirements are growing, they will grow
> to only about M/N, right?)
> >
> > Actually, the fact that in this case each of the processes creates its own
> EPS context, initializes it itself, and then calls EPSSolve() itself without
> any "interaction" with the other processes makes me wonder whether they really
> are working together, or just individually (I would have verified this myself,
> but the program would take far too much time, and I know I would have to kill
> it sooner or later)... or is the fact that they initialize their own EPS
> context with THEIR part of the MPIAIJ matrix enough to make them "cooperate
> and work together"? (I think this is what Barry meant in that last post, but
> I am not too sure.)
> >
> > I am not too comfortable with the MPI way of thinking right now, which is
> probably why I have this confusion.
> >
> > Anyway, I can't thank you guys enough. I would have been scrounging
> through the documentation again and again to no avail if you had not helped
> me the way you did. The responses were always prompt, always to the point
> (even though my questions were sometimes not, probably because I didn't
> completely understand the problems I was facing... but you always knew what
> I was asking) and very clear. At this moment I don't know much about
> PETSc/SLEPc myself, but I will be sure to contribute back to this list when
> I do. I have nothing but sincere gratitude for you guys.
> >
> >
> > Thank you very much !
> >
> > Shitij
> >
> >
> > On 9 August 2011 00:58, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> > On Aug 8, 2011, at 2:14 AM, Shitij Bhargava wrote:
> >
> > > Thank you Jed. That was indeed the problem. I installed a separate MPI
> for PETSc/SLEPc, but was running my program with a default, already
> installed one.
> > >
> > > Now, I have a different question. What I want to do is this:
> > >
> > > 1. Only 1 process, say root, calculates the matrix in SeqAIJ format
> > > 2. Then root creates the EPS context, eps, and initializes it and sets
> parameters, problem type, etc. properly
> > > 3. After this, the root process broadcasts this eps object to the other
> processes
> > > 4. I use EPSSolve to solve for eigenvalues (all processes together in
> cooperation, resulting in memory distribution)
> > > 5. I get the results from root
> >
> >   We do have an undocumented routine MatDistribute_MPIAIJ(MPI_Comm
> comm,Mat gmat,PetscInt m,MatReuse reuse,Mat *inmat) in
> src/mat/impls/aij/mpi/mpiaij.c that will take a SeqAIJ matrix and distribute
> it over a larger MPI communicator.
> >
> >   Note that you cannot create the EPS context etc. on the root process
> and then broadcast the object, but once the matrix is distributed you can
> simply create the EPS context etc. on the parallel communicator where the
> matrix is and run with that.
> >
> >   Barry
> >
> > >
> > > Is this possible? I am not able to broadcast the EPS object, because
> it is not an MPI datatype. Is there any PETSc/SLEPc function for this? I am
> avoiding using MPIAIJ because that would mean making many changes in the
> existing code, including the numerous write(*,*) statements (I would have to
> convert them to PetscPrintf in Fortran or something like that).
> > > So I want a single process to handle matrix generation and assembly,
> but want to solve the eigenproblem in parallel on different processes.
> Running the subroutine EPSSolve in parallel, and hence distributing memory,
> is the only reason why I want to use MPI.
> > >
> > > Thanks a lot !!
> > >
> > > Shitij
> > >
> > > On 8 August 2011 11:05, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> > > On Mon, Aug 8, 2011 at 00:29, Shitij Bhargava <shitij.cse at gmail.com>
> wrote:
> > > I ran it with:
> > >
> > > mpirun -np 2 ./slepcEigenMPI -eps_monitor
> > >
> > > I didn't do exactly what you said, because the matrix generation part in
> the actual program is quite time-consuming itself, but I assume what I am
> doing is equivalent to what you meant? Also, I set MPD to PETSC_DECIDE,
> because I didn't know what value to use for this matrix dimension.
> > >
> > > This is the output I get: (part of the output)
> > > MATRIX ASSMEBLY DONE !!!!!!!!
> > >
> > > MATRIX ASSMEBLY DONE !!!!!!!!
> > >
> > >   1 EPS nconv=98 first unconverged value (error) 1490.88
> (1.73958730e-05)
> > >   1 EPS nconv=98 first unconverged value (error) 1490.88
> (1.73958730e-05)
> > >   2 EPS nconv=282 first unconverged value (error) 3.04636e-27
> (2.49532175e-04)
> > >   2 EPS nconv=282 first unconverged value (error) 3.04636e-27
> (2.49532175e-04)
> > >
> > > The most likely case is that you have more than one MPI implementation
> installed and that you are running with a different implementation than you
> built with. Compare the outputs:
> > >
> > > $ ldd ./slepcEigenMPI
> > > $ which mpirun
> > >
> >
> >
>
>