[petsc-users] SLEPc eigensolver that uses minimal memory and finds ALL eigenvalues of a real symmetric sparse matrix in reasonable time

Barry Smith bsmith at mcs.anl.gov
Tue Aug 9 15:32:33 CDT 2011


On Aug 9, 2011, at 2:54 AM, Shitij Bhargava wrote:

> Thanks Jose, Barry.
> 
> I tried what you said, but that gives me an error:
> 
> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
> [0]PETSC ERROR: Argument out of range!
> [0]PETSC ERROR: Can only get local values, trying 9!
> 
> This is probably because I am trying to insert all rows of the matrix through process 0, but process 0 doesn't own all the rows.
> 
> In any case, this seems very "unnatural", so I am using MPIAIJ the right way as you said, assembling the MPIAIJ matrix in parallel instead of on only one process. I have done that, and am running the code on the cluster right now. It's going to take a long, long time to finish,


   It shouldn't take a long time to finish. Are you sure you are creating all the objects with PETSC_COMM_WORLD and not PETSC_COMM_SELF? Have you done the correct matrix preallocation (see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly)? Is each process generating just its part of the matrix?

  Barry
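
To make "each process generating just its part of the matrix" concrete, here is a minimal Python sketch of how rows might be divided among processes. It assumes the even split PETSc uses by default when local sizes are left as PETSC_DECIDE (the first N mod P processes get one extra row). In a real program the range should be queried with MatGetOwnershipRange() rather than recomputed. The function names and the 10-row, 2-process sizes are made up for illustration.

```python
def local_row_count(n_rows, n_procs, rank):
    """Rows owned by `rank` when n_rows are split as evenly as possible.

    Assumption: the first (n_rows % n_procs) ranks get one extra row.
    """
    return n_rows // n_procs + (1 if n_rows % n_procs > rank else 0)


def ownership_range(n_rows, n_procs, rank):
    """Half-open global row range [start, end) owned by `rank`."""
    start = sum(local_row_count(n_rows, n_procs, r) for r in range(rank))
    return start, start + local_row_count(n_rows, n_procs, rank)


def owner_of(row, n_rows, n_procs):
    """Rank whose ownership range contains the global row index `row`."""
    for rank in range(n_procs):
        start, end = ownership_range(n_rows, n_procs, rank)
        if start <= row < end:
            return rank
    raise IndexError("row index outside the matrix")


if __name__ == "__main__":
    n_rows, n_procs = 10, 2  # hypothetical sizes, not taken from the thread
    for rank in range(n_procs):
        start, end = ownership_range(n_rows, n_procs, rank)
        # Each process would compute and insert only rows start..end-1.
        print("rank", rank, "generates rows", list(range(start, end)))
    print("row 9 is owned by rank", owner_of(9, n_rows, n_procs))
```

With this layout, row 9 belongs to process 1, so a loop on process 0 over all rows would mostly handle rows owned elsewhere. Off-process insertions have to be communicated at assembly time, which is slower than having each process loop over only its own [start, end) range.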


> so I can't confirm some of my doubts, which I am asking below:
> 
> 1. If I run the code with 1 process and it takes M memory (peak) while solving for eigenvalues, then when I run it with N processes, each will take nearly M/N memory (peak), probably a little more, right? And to get this, I don't have to use any special MPI calls: using MPIAIJ, building the EPS object from it, and then calling EPSSolve() is enough? I mean, EPSSolve() internally distributes memory and computation automatically when I use MPIAIJ and run the code with many processes, right?
> This confusion arises because when I use top while running the code with 8 processes, each of them showed nearly 250 MB initially, but each has grown to 270 MB in about 70 minutes. I understand that with the krylovschur method memory requirements increase slowly, but the peak on any process will be less than if I ran only one process, right? (Even though their memory requirements are growing, they will grow only to about M/N, right?)
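
A rough way to reason about the M/N question is to separate storage that is split across processes from storage that every process holds in full. The Python sketch below does that arithmetic under stated assumptions: the matrix is stored in compressed sparse row form with 8-byte values and 4-byte indices, the solver keeps ncv basis vectors of length n distributed by rows, and the small dense projected problem of order ncv is assumed to be replicated on every process. Work arrays, MPI buffers, and everything else are ignored, and the function name and sample sizes are hypothetical, so treat the numbers as an order-of-magnitude guide only.

```python
def per_process_bytes(n, nnz, ncv, n_procs, scalar_bytes=8, index_bytes=4):
    """Rough peak memory per process for a Krylov-type eigensolve.

    Assumptions (not taken from the thread): CSR matrix storage, ncv basis
    vectors of length n split by rows, and an ncv-by-ncv dense projected
    matrix that is held in full by every process.
    """
    matrix = nnz * (scalar_bytes + index_bytes) + (n + 1) * index_bytes
    basis = n * ncv * scalar_bytes
    projected = ncv * ncv * scalar_bytes
    # Distributed storage shrinks with n_procs; replicated storage does not.
    return (matrix + basis) / n_procs + projected


if __name__ == "__main__":
    n, nnz = 1000, 5000  # hypothetical matrix size and nonzero count
    for ncv in (100, 1000):  # small subspace vs. "all eigenvalues"
        one = per_process_bytes(n, nnz, ncv, 1)
        eight = per_process_bytes(n, nnz, ncv, 8)
        print("ncv =", ncv, "1 process:", one, "bytes; 8 processes:", eight,
              "bytes each; ideal M/N:", one / 8)
```

Under these assumptions the per-process peak is close to M/N only while ncv is small compared with n. When all eigenvalues are requested, ncv is of order n, and the replicated ncv-by-ncv term no longer shrinks as processes are added.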
> 
> Actually, the fact that each process creates its own EPS context, initializes it itself, and then calls EPSSolve() itself without any "interaction" with the other processes makes me wonder whether they are really working together or just individually. (I would have verified this myself, but the program will take far too much time, and I know I would have to kill it sooner or later.) Or is the fact that they initialize their own EPS context with THEIR part of the MPI enough to make them "cooperate and work together"? (I think this is what Barry meant in his last post, but I am not too sure.)
> 
> I am not too comfortable with the MPI way of thinking right now, probably this is why I have this confusion.
> 
> Anyway, I can't thank you guys enough. I would have been scrounging through the documentation again and again to no avail if you had not helped me the way you did. The responses were always prompt, always to the point (even though my questions sometimes were not, probably because I didn't completely understand the problems I was facing, but you always knew what I was asking), and very clear. At this moment I don't know much about PETSc/SLEPc myself, but I will be sure to contribute back to this list when I do. I have nothing but sincere gratitude for you guys.
> 
> 
> Thank you very much!
> 
> Shitij
> 
> 
> On 9 August 2011 00:58, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
> On Aug 8, 2011, at 2:14 AM, Shitij Bhargava wrote:
> 
> > Thank you Jed. That was indeed the problem. I installed a separate MPI for PETSc/SLEPc, but was running my program with a default, already installed one.
> >
> > Now, I have a different question. What I want to do is this:
> >
> > 1. Only 1 process, say root, calculates the matrix in SeqAIJ format
> > 2. Then root creates the EPS context, eps, and initializes it, sets parameters, problem type, etc. properly
> > 3. After this the root process broadcasts this eps object to other processes
> > 4. I use EPSSolve to solve for eigenvalues (all processes together, in cooperation, resulting in memory distribution)
> > 5. I get the results from root
> 
>   We do have an undocumented routine MatDistribute_MPIAIJ(MPI_Comm comm,Mat gmat,PetscInt m,MatReuse reuse,Mat *inmat) in src/mat/impls/aij/mpi/mpiaij.c that will take a SeqAIJ matrix and distribute it over a larger MPI communicator.
> 
>   Note that you cannot create the EPS context etc. on the root process and then broadcast the object, but once the matrix is distributed you can simply create the EPS context etc. on the parallel communicator where the matrix lives and run with that.
> 
>   Barry
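
As an illustration of what distributing a sequential AIJ matrix amounts to, the Python sketch below cuts a compressed sparse row (CSR) matrix held on one process into the contiguous row block that a given process would keep. This is not the implementation of MatDistribute_MPIAIJ, only a picture of the data movement involved. The function name and the small 4x4 tridiagonal matrix are hypothetical.

```python
def split_csr_rows(indptr, indices, data, start, end):
    """Extract rows [start, end) of a CSR matrix as a new CSR triple.

    Column indices stay global; only the row pointers are shifted so that
    the local block starts at offset 0.
    """
    lo, hi = indptr[start], indptr[end]
    local_indptr = [p - lo for p in indptr[start:end + 1]]
    return local_indptr, indices[lo:hi], data[lo:hi]


if __name__ == "__main__":
    # 4x4 tridiagonal matrix with 2 on the diagonal and -1 off the diagonal.
    indptr = [0, 2, 5, 8, 10]
    indices = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
    data = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]

    # Two processes, two rows each: the root would send block k to rank k.
    for rank, (start, end) in enumerate([(0, 2), (2, 4)]):
        print("rank", rank, split_csr_rows(indptr, indices, data, start, end))
```

Once each process holds its own row block, the EPS context is created on the parallel communicator by all processes together, as described above, rather than being built on the root and sent.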
> 
> >
> > Is this possible? I am not able to broadcast the EPS object, because it is not an MPI_Datatype. Is there any PETSc/SLEPc function for this? I am avoiding MPIAIJ because that would mean making many changes to the existing code, including the numerous write(*,*) statements (I would have to convert them to PetscPrint in FORTRAN or something like that).
> > So I want a single process to handle matrix generation and assembly, but I want to solve the eigenproblem in parallel across different processes. Running EPSSolve in parallel, and hence distributing memory, is the only reason I want to use MPI.
> >
> > Thanks a lot !!
> >
> > Shitij
> >
> > On 8 August 2011 11:05, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> > On Mon, Aug 8, 2011 at 00:29, Shitij Bhargava <shitij.cse at gmail.com> wrote:
> > I ran it with:
> >
> > mpirun -np 2 ./slepcEigenMPI -eps_monitor
> >
> > I didn't do exactly what you said, because the matrix generation part in the actual program is quite time-consuming itself. But I assume what I am doing is equivalent to what you meant? Also, I set MPD to PETSC_DECIDE, because I didn't know what value to use for this matrix dimension.
> >
> > This is the output I get: (part of the output)
> > MATRIX ASSMEBLY DONE !!!!!!!!
> >
> > MATRIX ASSMEBLY DONE !!!!!!!!
> >
> >   1 EPS nconv=98 first unconverged value (error) 1490.88 (1.73958730e-05)
> >   1 EPS nconv=98 first unconverged value (error) 1490.88 (1.73958730e-05)
> >   2 EPS nconv=282 first unconverged value (error) 3.04636e-27 (2.49532175e-04)
> >   2 EPS nconv=282 first unconverged value (error) 3.04636e-27 (2.49532175e-04)
> >
> > The most likely case is that you have more than one MPI implementation installed and that you are running with a different implementation than you built with. Compare the outputs:
> >
> > $ ldd ./slepcEigenMPI
> > $ which mpirun
> >
> 
> 


