Thank you both for your replies.<br><br>Actually, Barry, you are right. On eight CPUs it does not take as long as I had imagined. For a 9600x9600 matrix, solving for the largest half of the eigenvalues took about 300 minutes. Solving for the smallest half would have taken much longer (I had to kill that run at a total time of 600 minutes). This is still much longer than the LAPACK method, which calculates all the eigenvalues at once in about 90 minutes, but at the cost of a lot of memory that can't even be distributed, which is unacceptable. (Also, I suppose you probably didn't know that I have to calculate ALL the eigenvalues.) Still, I was expecting that it wouldn't complete even in "days", which is why I said it would take a very long time. I forgot to take into account the fact that it was now running nearly 8 times faster. (In one earlier instance, when I was running the same program on one process only, without MPI, I had to kill it because it had run for over 13 hours and still hadn't completed!)<br>
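<br>To put a rough number on the memory point, here is a back-of-the-envelope sketch. It assumes the LAPACK route stores the full matrix densely in real double precision (8 bytes per entry); the function name is only for illustration, and workspace or eigenvector storage would come on top of this:<br>
<pre>
```python
def dense_matrix_mib(n, bytes_per_entry=8):
    """Memory in MiB for one dense n-by-n matrix (8 bytes = real double)."""
    return n * n * bytes_per_entry / 2**20


n = 9600
full = dense_matrix_mib(n)
# One dense copy has to sit in the memory of a single process.
print(f"one dense copy: {full:.1f} MiB")
# Hypothetical even split over 8 processes, for comparison only.
print(f"ideal share on 8 processes: {full / 8:.1f} MiB")
```
</pre>
So a single dense copy is roughly 700 MiB on one process, before counting any workspace, which is the part that cannot be spread over the cluster.<br>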
<br>And yes, I am creating all the objects with PETSC_COMM_WORLD only. Also, the top command shows eight processes running at nearly 100% CPU during the call to EPSSolve(). (I suppose that verifies what I am saying? Though I am not sure...)<br>
<br>I have done the preallocation carefully, and verified that that part doesn't take any extra time. So when I referred to "time" here, I was talking about the time taken by the eigensolver only (because I was already satisfied with the time taken for matrix generation and assembly).<br>
<br>At the moment, each process is not "generating" only its part of the matrix. I am doing things in a very lousy way right now: each process generates all the rows of the matrix, but inserts only the rows that fall into its ownership range. As I said, though, matrix generation and assembly is not the time bottleneck in the program; the eigensolver is. I am satisfied with the time matrix generation and assembly is taking, although I understand that ideally matrix generation should also be done in parallel. My time for this summer training has ended, so I'll probably fix that later. <br>
<br>The amount of work that I have been able to accomplish in this training has been good because of your help. I might otherwise have been stuck on some error.<br><br><br>Thank you very much once again!<br><br>Shitij<br>
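<br>For when I get to that fix, a minimal sketch of the idea in plain Python (not PETSc code). It assumes the default even split of rows over processes; in the real program the range would come from MatGetOwnershipRange() rather than from this formula, and make_row is a hypothetical stand-in for my row-generation routine:<br>
<pre>
```python
def ownership_range(n_rows, n_procs, rank):
    """Half-open row range [start, end) owned by `rank`.

    Assumed layout: rows are split as evenly as possible, and the
    first (n_rows % n_procs) ranks get one extra row each.
    """
    base, extra = divmod(n_rows, n_procs)
    start = rank * base + min(rank, extra)
    end = (rank + 1) * base + min(rank + 1, extra)
    return start, end


def local_rows(n_rows, n_procs, rank, make_row):
    """Generate only the rows this rank owns, instead of all rows."""
    start, end = ownership_range(n_rows, n_procs, rank)
    return {i: make_row(i) for i in range(start, end)}


# 9600 rows on 8 processes: every rank owns a block of 1200 rows.
for rank in range(8):
    print(rank, ownership_range(9600, 8, rank))
```
</pre>
With this, each process would call the row generator 1200 times instead of 9600 times, and the inserted values would be the same as now.<br>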
<br><div class="gmail_quote">On 10 August 2011 02:02, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><br>
On Aug 9, 2011, at 2:54 AM, Shitij Bhargava wrote:<br>
<br>
> Thanks Jose, Barry.<br>
><br>
> I tried what you said, but that gives me an error:<br>
><br>
> [0]PETSC ERROR: --------------------- Error Message ------------------------------------<br>
> [0]PETSC ERROR: Argument out of range!<br>
> [0]PETSC ERROR: Can only get local values, trying 9!<br>
><br>
> This is probably because here I am trying to insert all rows of the matrix through process 0, but process 0 doesn't own all the rows.<br>
><br>
> In any case, this seems very "unnatural", so I am using MPIAIJ the right way as you said, where I assemble the MPIAIJ matrix in parallel instead of only on one process. I have done that actually, and am running the code on the cluster right now. It's going to take a long, long time to finish,<br>
<br>
<br>
</div> It shouldn't take a long time to finish. Are you sure you are creating all the objects with the PETSC_COMM_WORLD and not PETSC_COMM_SELF? Have you done the correct matrix preallocation <a href="http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly" target="_blank">http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly</a>? Is each process generating just its part of the matrix?<br>
<font color="#888888"><br>
Barry<br>
</font><div><div></div><div><br>
<br>
> so I can't confirm some of my doubts, which I am asking below:<br>
><br>
> 1. If I run the code with 1 process, and say it takes M memory (peak) while solving for eigenvalues, then when I run it with N processes, each will take nearly M/N memory (peak) (probably a little more), right? And for doing this, I don't have to use any special MPI stuff: the fact that I am using MPIAIJ, building the EPS object from it, and then calling EPSSolve() is enough? I mean, EPSSolve() is internally in some way distributing memory and computational effort automatically when I use MPIAIJ and run the code with many processes, right?<br>
> This confusion is there because when I use top while running the code with 8 processes, each of them showed nearly 250 MB initially, but each has grown to use 270 MB in about 70 minutes. I understand that the krylovschur method is such that memory requirements increase slowly, but the peak on any process will be less (than if I ran only one process), right? (Even though their memory requirements are growing, they will grow to some M/N only, right?)<br>
><br>
> Actually, the fact that in this case each of the processes creates its own EPS context, initializes it itself, and then calls EPSSolve() itself without any "interaction" with the other processes makes me wonder if they really are working together, or just individually (I would have verified this myself, but the program will take way too much time, and I know I would have to kill it sooner or later). Or is the fact that they initialize their own EPS context with THEIR part of the MPI enough to make them "cooperate and work together"? (I think this is what Barry meant in that last post, but I am not too sure.)<br>
><br>
> I am not too comfortable with the MPI way of thinking right now, which is probably why I have this confusion.<br>
><br>
> Anyway, I can't thank you guys enough. I would have been scrounging through documentation again and again to no avail if you guys had not helped me the way you did. The responses were always prompt, always to the point (even though my questions were sometimes not, probably because I didn't completely understand the problems I was facing, but you always knew what I was asking) and very clear. At this moment, I don't know much about PETSc/SLEPc myself, but I will be sure to contribute back to this list when I do. I have nothing but sincere gratitude for you guys.<br>
><br>
><br>
> Thank you very much !<br>
><br>
> Shitij<br>
><br>
><br>
> On 9 August 2011 00:58, Barry Smith <<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>> wrote:<br>
><br>
> On Aug 8, 2011, at 2:14 AM, Shitij Bhargava wrote:<br>
><br>
> > Thank you Jed. That was indeed the problem. I installed a separate MPI for PETSc/SLEPc, but was running my program with a default, already installed one.<br>
> ><br>
> > Now, I have a different question. What I want to do is this:<br>
> ><br>
> > 1. Only 1 process, say root, calculates the matrix in SeqAIJ format<br>
> > 2. Then root creates the EPS context, eps, and initializes it, sets parameters, problem type, etc. properly<br>
> > 3. After this the root process broadcasts this eps object to other processes<br>
> > 4. I use EPSSolve to solve for eigenvalues (all process together in cooperation resulting in memory distribution)<br>
> > 5. I get the results from root<br>
><br>
> We do have an undocumented routine MatDistribute_MPIAIJ(MPI_Comm comm,Mat gmat,PetscInt m,MatReuse reuse,Mat *inmat) in src/mat/impls/aij/mpi/mpiaij.c that will take a SeqAIJ matrix and distribute it over a larger MPI communicator.<br>
><br>
> Note that you cannot create the EPS context etc. on the root process and then broadcast the object, but once the matrix is distributed you can simply create the EPS context etc. on the parallel communicator where the matrix is and run with that.<br>
><br>
> Barry<br>
><br>
> ><br>
> > Is this possible? I am not able to broadcast the EPS object, because it is not an MPI_Datatype. Is there any PETSc/SLEPc function for this? I am avoiding using MPIAIJ because that will mean making many changes in the existing code, including the numerous write(*,*) statements (I would have to convert them to PetscPrint in FORTRAN or something like that).<br>
> > So I want a single process to handle matrix generation and assembly, but want to solve the eigenproblem in parallel on different processes. Running the subroutine EPSSolve in parallel, and hence distributing memory, is the only reason why I want to use MPI.<br>
> ><br>
> > Thanks a lot !!<br>
> ><br>
> > Shitij<br>
> ><br>
> > On 8 August 2011 11:05, Jed Brown <<a href="mailto:jedbrown@mcs.anl.gov" target="_blank">jedbrown@mcs.anl.gov</a>> wrote:<br>
> > On Mon, Aug 8, 2011 at 00:29, Shitij Bhargava <<a href="mailto:shitij.cse@gmail.com" target="_blank">shitij.cse@gmail.com</a>> wrote:<br>
> > I ran it with:<br>
> ><br>
> > mpirun -np 2 ./slepcEigenMPI -eps_monitor<br>
> ><br>
> > I didn't do exactly what you said, because the matrix generation part in the actual program is quite time consuming itself. But I assume what I am doing is equivalent to what you meant to do? Also, I set MPD to PETSC_DECIDE, because I didn't know what value to use for this matrix dimension.<br>
> ><br>
> > This is the output I get: (part of the output)<br>
> > MATRIX ASSMEBLY DONE !!!!!!!!<br>
> ><br>
> > MATRIX ASSMEBLY DONE !!!!!!!!<br>
> ><br>
> > 1 EPS nconv=98 first unconverged value (error) 1490.88 (1.73958730e-05)<br>
> > 1 EPS nconv=98 first unconverged value (error) 1490.88 (1.73958730e-05)<br>
> > 2 EPS nconv=282 first unconverged value (error) 3.04636e-27 (2.49532175e-04)<br>
> > 2 EPS nconv=282 first unconverged value (error) 3.04636e-27 (2.49532175e-04)<br>
> ><br>
> > The most likely case is that you have more than one MPI implementation installed and that you are running with a different implementation than you built with. Compare the outputs:<br>
> ><br>
> > $ ldd ./slepcEigenMPI<br>
> > $ which mpirun<br>
> ><br>
><br>
><br>
<br>
</div></div></blockquote></div><br>