From shriram at ualberta.ca Sun Jun 1 18:00:26 2014 From: shriram at ualberta.ca (Shriram Srinivasan) Date: Sun, 01 Jun 2014 17:00:26 -0600 Subject: [petsc-users] set a Vec as row or column of a Mat Message-ID: <538BB08A.2010500@ualberta.ca> Hi, I have a petsc Vec object ( say length 3). I want to use its entries as the column of a Petsc Mat (say 3x3). Is there an obvious way to do this ? I could only think of first copying the entries of the Vec into an array with VecGetValues, and then use it to set the values of the column of the Mat. In case it is relevant, I am doing a sweep of a direction splitting method over a square, and I want to store the solution (vector) of each one dimensional problem in a Mat. Thanks, Shriram From jed at jedbrown.org Sun Jun 1 18:10:51 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 02 Jun 2014 01:10:51 +0200 Subject: [petsc-users] set a Vec as row or column of a Mat In-Reply-To: <538BB08A.2010500@ualberta.ca> References: <538BB08A.2010500@ualberta.ca> Message-ID: <87oaycql8k.fsf@jedbrown.org> Shriram Srinivasan writes: > Hi, > I have a petsc Vec object ( say length 3). I want to use its entries as > the column of a Petsc Mat (say 3x3). PETSc Mat and Vec are not really meant for such small sizes. Some people like template libraries like Eigen if they have lots of complicated expressions involving tiny matrices/tensors, otherwise it's common to write the small loops or even unroll (you want the compiler to perform aggressive transformations on sizes this small). If you were thinking of much larger values for "3", the MatDense format uses column-oriented storage so you can use VecGetArrayRead(), then copy those columns into the contiguous storage used by MatDense. > Is there an obvious way to do this ? > I could only think of first copying the entries of the Vec into an > array with VecGetValues, and then use it to set the values of the column > of the Mat. > In case it is relevant, I am doing a sweep of a direction splitting > method over a square, and I want to store the solution (vector) of each > one dimensional problem in a Mat. > > Thanks, > Shriram -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sun Jun 1 20:18:07 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 1 Jun 2014 20:18:07 -0500 Subject: [petsc-users] set a Vec as row or column of a Mat In-Reply-To: <538BB08A.2010500@ualberta.ca> References: <538BB08A.2010500@ualberta.ca> Message-ID: <8172365C-EBF3-41C5-8E75-81CC841D2A2E@mcs.anl.gov> Do you intend to do this in parallel or just sequential? I assume the ?solution of each one dimensional problem? involves a tridiagonal (or wider if you several degrees of freedom per point or a wider stencil) linear solve? Are all the linear solves the same or is each one different (nonlinear function of coefficients?)? Perhaps if you provide more details on the overall problem you are solving we may have more suggestions. A PETSc Mat is really intended as a linear operator, not a way to store 2d arrays, (unlike MATLAB, which is structured for both roles). I would use a DMDACreate2d() and then ?fish out? 
the rows into a Vec with VecPlaceArray() requires no copies of data and then the columns with VecStrideGather() (and put back with VecStrideScatter) requires copying the entries (but there is no avoiding the copy in this case since the columns are not adjacent entries in Vec obtained from DMCreateGlobalVector(). Barry On Jun 1, 2014, at 6:00 PM, Shriram Srinivasan wrote: > Hi, > I have a petsc Vec object ( say length 3). I want to use its entries as the column of a Petsc Mat (say 3x3). > Is there an obvious way to do this ? > I could only think of first copying the entries of the Vec into an array with VecGetValues, and then use it to set the values of the column of the Mat. > In case it is relevant, I am doing a sweep of a direction splitting method over a square, and I want to store the solution (vector) of each one dimensional problem in a Mat. > > Thanks, > Shriram From zonexo at gmail.com Mon Jun 2 00:45:53 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 02 Jun 2014 13:45:53 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> <5379A433.5000401@gmail.com> Message-ID: <538C0F91.7060008@gmail.com> Dear all, May I know if there is any problem compiling / building / running the file I emailed. We can work together if there is any. Thank you Yours sincerely, TAY wee-beng On 20/5/2014 1:43 AM, Barry Smith wrote: > On May 19, 2014, at 1:26 AM, TAY wee-beng wrote: > >> On 19/5/2014 11:36 AM, Barry Smith wrote: >>> On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: >>> >>>> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>>>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>>>> Hi Barry, >>>>> >>>>> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >>>>> >>>>> Yes it works with Intel. Is this using optimization? >>>> Hi Matt, >>>> >>>> I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? >>> No. Does it run clean under valgrind? >> Hi, >> >> Do you mean the debug or optimized version? > Both. > >> Thanks. >>>>> Matt >>>>> >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>>>> Please send you current code. So we may compile and run it. >>>>> >>>>> Barry >>>>> >>>>> >>>>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>>>> >>>>> Hi, >>>>> >>>>> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >>>>> >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>>>> Please send the entire code. 
If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >>>>> >>>>> Barry >>>>> >>>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>>> >>>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>> >>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>> >>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>> Hmm, >>>>> >>>>> Interface DMDAVecGetArrayF90 >>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>> USE_DM_HIDE >>>>> DM_HIDE da1 >>>>> VEC_HIDE v >>>>> PetscScalar,pointer :: d1(:,:,:) >>>>> PetscErrorCode ierr >>>>> End Subroutine >>>>> >>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>> Hi, >>>>> >>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>> >>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>> >>>>> Also, supposed I call: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> u_array .... >>>>> >>>>> v_array .... etc >>>>> >>>>> Now to restore the array, does it matter the sequence they are restored? >>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>> >>>>> Hi, >>>>> >>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>> >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> u_array = 0.d0 >>>>> >>>>> v_array = 0.d0 >>>>> >>>>> w_array = 0.d0 >>>>> >>>>> p_array = 0.d0 >>>>> >>>>> >>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>> >>>>> We do this is a bunch of examples. 
Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>> Hi Matt, >>>>> >>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>> >>>>> It already has DMDAVecGetArray(). Just run it. >>>>> Hi, >>>>> >>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>> >>>>> No the global/local difference should not matter. >>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>> >>>>> DMGetLocalVector() >>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>>> >>>>> If so, when should I call them? >>>>> >>>>> You just need a local vector from somewhere. >>>>> Hi, >>>>> >>>>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>>>> >>>>> Thanks. >>>>> Hi, >>>>> >>>>> I insert part of my error region code into ex11f90: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> u_array = 0.d0 >>>>> v_array = 0.d0 >>>>> w_array = 0.d0 >>>>> p_array = 0.d0 >>>>> >>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>>> >>>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>>> >>>>> module solve >>>>> <- add include file? >>>>> subroutine RRK >>>>> <- add include file? >>>>> end subroutine RRK >>>>> >>>>> end module solve >>>>> >>>>> So where should the include files (#include ) be placed? >>>>> >>>>> After the module or inside the subroutine? >>>>> >>>>> Thanks. >>>>> Matt >>>>> Thanks. >>>>> Matt >>>>> Thanks. >>>>> Matt >>>>> Thanks >>>>> >>>>> Regards. >>>>> Matt >>>>> As in w, then v and u? >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> thanks >>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>> Hi, >>>>> >>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>> >>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>> >>>>> >>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. 
Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>> >>>>> Barry >>>>> >>>>> Thanks. >>>>> Barry >>>>> >>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>> >>>>> Hi, >>>>> >>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>> >>>>> However, by re-writing my code, I found out a few things: >>>>> >>>>> 1. if I write my code this way: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> u_array = .... >>>>> >>>>> v_array = .... >>>>> >>>>> w_array = .... >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> The code runs fine. >>>>> >>>>> 2. if I write my code this way: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>> >>>>> where the subroutine is: >>>>> >>>>> subroutine uvw_array_change(u,v,w) >>>>> >>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>> >>>>> u ... >>>>> v... >>>>> w ... >>>>> >>>>> end subroutine uvw_array_change. >>>>> >>>>> The above will give an error at : >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> So they are now in reversed order. Now it works. >>>>> >>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>> >>>>> subroutine uvw_array_change(u,v,w) >>>>> >>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> u ... >>>>> v... >>>>> w ... >>>>> >>>>> end subroutine uvw_array_change. >>>>> >>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". 
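A minimal C sketch of the access rule Barry describes, assuming a 3-D DMDA with one degree of freedom per node; the same rule applies to the F90 pointers, and the function and variable names here are only illustrative:

    #include <petscdmda.h>

    /* Zero a global vector obtained from a 3-D DMDA, addressing entries with
       the global index ranges reported by DMDAGetCorners(). */
    static PetscErrorCode ZeroDMDAVec(DM da, Vec U)
    {
      PetscErrorCode ierr;
      PetscInt       i, j, k, xs, ys, zs, xm, ym, zm;
      PetscScalar    ***u;

      PetscFunctionBegin;
      ierr = DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm);CHKERRQ(ierr);
      ierr = DMDAVecGetArray(da, U, &u);CHKERRQ(ierr);
      for (k = zs; k < zs + zm; k++)
        for (j = ys; j < ys + ym; j++)
          for (i = xs; i < xs + xm; i++)
            u[k][j][i] = 0.0;            /* global indices, not 1-based local ones */
      ierr = DMDAVecRestoreArray(da, U, &u);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

For a local (ghosted) vector the index ranges would come from DMDAGetGhostCorners() instead.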
>>>>> >>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>> >>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>> >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>> >>>>> >>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>> >>>>> Hi Barry, >>>>> >>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>> >>>>> I have attached my code. >>>>> >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>> Please send the code that creates da_w and the declarations of w_array >>>>> >>>>> Barry >>>>> >>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>> >>>>> wrote: >>>>> >>>>> >>>>> Hi Barry, >>>>> >>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>> >>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>> >>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>> >>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>> >>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>> -------------------------------------------------------------------------- >>>>> An MPI process has executed an operation involving a call to the >>>>> "fork()" system call to create a child process. Open MPI is currently >>>>> operating in a condition that could result in memory corruption or >>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>> data corruption. The use of fork() (or system() or other calls that >>>>> create child processes) is strongly discouraged. >>>>> >>>>> The process that invoked fork was: >>>>> >>>>> Local host: n12-76 (PID 20235) >>>>> MPI_COMM_WORLD rank: 2 >>>>> >>>>> If you are *absolutely sure* that your application will successfully >>>>> and correctly survive a call to fork(), you may disable this warning >>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>> -------------------------------------------------------------------------- >>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>> >>>>> .... 
>>>>> >>>>> 1 >>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>> [1]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>> [1]PETSC ERROR: to get more information on the crash. >>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>> [3]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>> [3]PETSC ERROR: to get more information on the crash. >>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>> >>>>> ... >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>> >>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>> >>>>> Barry >>>>> >>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>> >>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>> >>>>> >>>>> >>>>> wrote: >>>>> >>>>> >>>>> >>>>> Hi, >>>>> >>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> -- >>>>> Thank you. 
>>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener From mrosso at uci.edu Mon Jun 2 00:46:56 2014 From: mrosso at uci.edu (Michele Rosso) Date: Sun, 01 Jun 2014 22:46:56 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> Message-ID: <538C0FD0.7080300@uci.edu> Mark, I tried to reset every-time: the number of iterations is now constant during the whole simulation! I tried GMG instead of AMG as well: it works in this case too, so the trick was to reset the ksp object each time. As you predicted, each solve takes longer since the ksp has to be setup again. I noticed that the time increase is larger than 2x, particularly for large grids. I need to optimize the solve now, maybe by resetting only when needed. Could you help me with that please? Thanks, Michele On 05/28/2014 07:54 PM, Mark Adams wrote: > > > > On Mon, May 26, 2014 at 12:20 PM, Michele Rosso > wrote: > > Mark, > > thank you for your input and sorry my late reply: I saw your email > only now. > By setting up the solver each time step you mean re-defining the > KSP context every time? > > > THe simplest thing is to just delete the object and create it again. > THere are "reset" methods that do the same thing semantically but it > is probably just easier to destroy the KSP object and recreate it and > redo your setup code. > > Why should this help? > > > AMG methods optimized for a particular operator but "stale" setup data > often work well on problems that evolve, at least for a while, and it > saves a lot of time to not redo the "setup" every time. How often you > should "refresh" the setup data is problem dependant and the > application needs to control that. There are some hooks to fine tune > how much setup data is recomputed each solve, but we are just trying > to see if redoing the setup every time helps. If this fixes the > problem then we can think about cost. If it does not fix the problem > then it is more serious. > > I will definitely try that as well as the hypre solution and > report back. > Again, thank you. 
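A minimal C sketch of the "reset only when needed" strategy discussed in this thread, assuming one KSP reused across time steps; reset_interval and its_max are illustrative names rather than PETSc options, and KSPSetOperators() is shown with the petsc-3.4 calling sequence in use at the time of this thread:

    #include <petscksp.h>

    /* Refresh the preconditioner setup either on a fixed schedule or when the
       iteration count of the previous solve grows too large. */
    static PetscErrorCode SolveSteps(KSP ksp, Mat A, Vec b, Vec x, PetscInt nsteps)
    {
      PetscErrorCode ierr;
      PetscInt       step, its;
      const PetscInt reset_interval = 50, its_max = 30;

      PetscFunctionBegin;
      for (step = 0; step < nsteps; step++) {
        /* ... update the entries of A and b for this time step ... */
        ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
        ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
        ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
        if ((step + 1) % reset_interval == 0 || its > its_max) {
          ierr = KSPReset(ksp);CHKERRQ(ierr);  /* or destroy and re-create the KSP */
        }
      }
      PetscFunctionReturn(0);
    }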
> > Michele > > > On 05/22/2014 09:34 AM, Mark Adams wrote: >> If the solver is degrading as the coefficients change, and I >> would assume get more nasty, you can try deleting the solver at >> each time step. This will be about 2x more expensive, because it >> does the setup each solve, but it might fix your problem. >> >> You also might try: >> >> -pc_type hypre >> -pc_hypre_type boomeramg >> >> >> >> >> On Mon, May 19, 2014 at 6:49 PM, Jed Brown > > wrote: >> >> Michele Rosso > writes: >> >> > Jed, >> > >> > thank you very much! >> > I will try with ///-mg_levels_ksp_type chebyshev >> -mg_levels_pc_type >> > sor/ and report back. >> > Yes, I removed the nullspace from both the system matrix >> and the rhs. >> > Is there a way to have something similar to Dendy's >> multigrid or the >> > deflated conjugate gradient method with PETSc? >> >> Dendy's MG needs geometry. The algorithm to produce the >> interpolation >> operators is not terribly complicated so it could be done, >> though DMDA >> support for cell-centered is a somewhat awkward. "Deflated >> CG" can mean >> lots of things so you'll have to be more precise. (Most >> everything in >> the "deflation" world has a clear analogue in the MG world, >> but the >> deflation community doesn't have a precise language to talk >> about their >> methods so you always have to read the paper carefully to >> find out if >> it's completely standard or if there is something new.) >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Jun 2 01:29:31 2014 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 1 Jun 2014 23:29:31 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <87zji0xef6.fsf@jedbrown.org> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> <87zji0xef6.fsf@jedbrown.org> Message-ID: > > > Mark, if PCReset (via KSPReset) does not produce the same behavior as > destroying the KSP and recreating it, it is a bug. I think this is the > case, but if it's not, it needs to be fixed. > Yes it should work. I've just never used it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Jun 2 01:34:50 2014 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 1 Jun 2014 23:34:50 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <538C0FD0.7080300@uci.edu> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> <538C0FD0.7080300@uci.edu> Message-ID: On Sun, Jun 1, 2014 at 10:46 PM, Michele Rosso wrote: > Mark, > > I tried to reset every-time: the number of iterations is now constant > during the whole simulation! > I tried GMG instead of AMG as well: it works in this case too, so the > trick was to reset the ksp object each time. > As you predicted, each solve takes longer since the ksp has to be setup > again. I noticed that the time increase is larger than 2x, particularly for > large grids. > I need to optimize the solve now, maybe by resetting only when needed. > Could you help me with that please? > This logic that you put in your code. You can setup a variable for a frequency of resetting, if (mod(++count,reset_factor)==0) solver. 
KSPReset(ksp); // or whatever. You could check the iteration count after a solve and if it is too high. > > Thanks, > Michele > > > On 05/28/2014 07:54 PM, Mark Adams wrote: > > > > > On Mon, May 26, 2014 at 12:20 PM, Michele Rosso wrote: > >> Mark, >> >> thank you for your input and sorry my late reply: I saw your email only >> now. >> By setting up the solver each time step you mean re-defining the KSP >> context every time? >> > > THe simplest thing is to just delete the object and create it again. > THere are "reset" methods that do the same thing semantically but it is > probably just easier to destroy the KSP object and recreate it and redo > your setup code. > > >> Why should this help? >> > > AMG methods optimized for a particular operator but "stale" setup data > often work well on problems that evolve, at least for a while, and it saves > a lot of time to not redo the "setup" every time. How often you should > "refresh" the setup data is problem dependant and the application needs to > control that. There are some hooks to fine tune how much setup data is > recomputed each solve, but we are just trying to see if redoing the setup > every time helps. If this fixes the problem then we can think about cost. > If it does not fix the problem then it is more serious. > > >> I will definitely try that as well as the hypre solution and report back. >> Again, thank you. >> >> Michele >> >> >> On 05/22/2014 09:34 AM, Mark Adams wrote: >> >> If the solver is degrading as the coefficients change, and I would assume >> get more nasty, you can try deleting the solver at each time step. This >> will be about 2x more expensive, because it does the setup each solve, but >> it might fix your problem. >> >> You also might try: >> >> -pc_type hypre >> -pc_hypre_type boomeramg >> >> >> >> >> On Mon, May 19, 2014 at 6:49 PM, Jed Brown wrote: >> >>> Michele Rosso writes: >>> >>> > Jed, >>> > >>> > thank you very much! >>> > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type >>> > sor/ and report back. >>> > Yes, I removed the nullspace from both the system matrix and the rhs. >>> > Is there a way to have something similar to Dendy's multigrid or the >>> > deflated conjugate gradient method with PETSc? >>> >>> Dendy's MG needs geometry. The algorithm to produce the interpolation >>> operators is not terribly complicated so it could be done, though DMDA >>> support for cell-centered is a somewhat awkward. "Deflated CG" can mean >>> lots of things so you'll have to be more precise. (Most everything in >>> the "deflation" world has a clear analogue in the MG world, but the >>> deflation community doesn't have a precise language to talk about their >>> methods so you always have to read the paper carefully to find out if >>> it's completely standard or if there is something new.) >>> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Mon Jun 2 02:13:12 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Mon, 2 Jun 2014 17:13:12 +1000 Subject: [petsc-users] Use MPIAIJ, but receive help information of seqaij Message-ID: Hi, all Recently I'm trying to improve the performance of my program, so I add -help option to see options available. I think I use MPIAIJ type of MAT, but I receive information of SEQAIJ. Is it means actually I am using a sequential matrix type? If not, how could I get information of MPIAIJ to improve performance? 
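One quick way to check what is actually being used at run time (an illustrative sketch, not taken from the original post) is to print the size of the communicator the matrix lives on together with its concrete type after assembly:

    PetscErrorCode ierr;
    PetscMPIInt    nranks;
    MatType        type;

    ierr = MPI_Comm_size(PetscObjectComm((PetscObject)A), &nranks);CHKERRQ(ierr);
    ierr = MatGetType(A, &type);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "Mat on %d rank(s), type %s\n", nranks, type);CHKERRQ(ierr);

If the program was launched on a single process, the size reported here will be 1, which matches the explanation later in the thread that a MATMPIAIJ matrix run on one process is handled internally like MATSEQAIJ.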
Code of generating mate MatCreate(PETSC_COMM_WORLD,&A); MatSetType(A,MATMPIAIJ); MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); Code of getting information char common_options[] = "-help -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2"; ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); Thank you very much Guoxi -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Mon Jun 2 02:54:42 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 2 Jun 2014 07:54:42 +0000 Subject: [petsc-users] MatNestGetISs in fortran In-Reply-To: <64c4658aeb7441abbe20e4aa252554a2@MAR190N1.marin.local> References: <64c4658aeb7441abbe20e4aa252554a2@MAR190N1.marin.local> Message-ID: Just a reminder. Could you please add fortran support for MatNestGetISs? dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl ________________________________________ From: Klaij, Christiaan Sent: Tuesday, May 27, 2014 3:47 PM To: petsc-users at mcs.anl.gov Subject: MatNestGetISs in fortran I'm trying to use MatNestGetISs in a fortran program but it seems to be missing from the fortran include file (PETSc 3.4). From hgbk2008 at gmail.com Mon Jun 2 06:04:39 2014 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Mon, 02 Jun 2014 13:04:39 +0200 Subject: [petsc-users] row scale the matrix In-Reply-To: <78644A10-9A8F-452F-8306-B445B5C3D60E@mcs.anl.gov> References: <53836582.6010804@gmail.com> <78644A10-9A8F-452F-8306-B445B5C3D60E@mcs.anl.gov> Message-ID: <538C5A47.3030503@gmail.com> That's right. It does exactly what I want. BR Bui On 05/26/2014 08:44 PM, Barry Smith wrote: > Why not MatDiagonalScale()? The left diagonal matrix l scales each row i of the matrix by l[i,i] so it seems to do exactly what you want. > > Barry > > On May 26, 2014, at 11:02 AM, Hoang Giang Bui wrote: > >> Hi >> >> My matrix contains some overshoot entries in the diagonal and I want to row scale by a factor that I defined. How can I do that with petsc ? (I don't want to use MatDiagonalScale instead, I also don't want to create a diagonal matrix and left multiply to the system.) >> >> BR >> Bui >> From romain.veltz at inria.fr Mon Jun 2 07:04:15 2014 From: romain.veltz at inria.fr (Veltz Romain) Date: Mon, 2 Jun 2014 14:04:15 +0200 Subject: [petsc-users] Calling trilinos function from PETSC Message-ID: Dear user, I am wondering whether one can call Trilinos LOCA functions from PETSc and if anybody has ever done it. Thank you for your help, Bests, Veltz Romain Neuromathcomp Project Team Inria Sophia Antipolis M?diterran?e 2004 Route des Lucioles-BP 93 FR-06902 Sophia Antipolis http://www-sop.inria.fr/members/Romain.Veltz/public_html/Home.html -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lu_qin_2000 at yahoo.com Mon Jun 2 09:13:21 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Mon, 2 Jun 2014 07:13:21 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> Message-ID: <1401718401.36233.YahooMailNeo@web160202.mail.bf1.yahoo.com> Will the speedup measured by the?streams benchmark?the upper limit of speedup of a parallel program? I.e., suppose there is a program with ideal linear speedup (=2 for np=2 if running in a perfect machine for parallelism), if it runs in your laptop, the maximum speedup would be 1.44 with np=2? ? Thanks, Qin ________________________________ From: Barry Smith To: Qin Lu Cc: petsc-users Sent: Thursday, May 29, 2014 5:46 PM Subject: Re: [petsc-users] About parallel performance ? For the parallel case a perfect machine would have twice the memory bandwidth when using 2 cores as opposed to 1 core. For yours it is almost exactly the same. The issue is not with the MPI or software. It depends on how many memory sockets there are and how they are shared by the various cores. As I said the initial memory bandwidth for one core 21,682. gigabytes per second is good so it is a very good sequential machine. ? Here are the results on my laptop Number of MPI processes 1 Process 0 Barrys-MacBook-Pro.local Function? ? ? Rate (MB/s) Copy:? ? ? ? 7928.7346 Scale:? ? ? 8271.5103 Add:? ? ? ? 11017.0430 Triad:? ? ? 10843.9018 Number of MPI processes 2 Process 0 Barrys-MacBook-Pro.local Process 1 Barrys-MacBook-Pro.local Function? ? ? Rate (MB/s) Copy:? ? ? 13513.0365 Scale:? ? ? 13516.7086 Add:? ? ? ? 15455.3952 Triad:? ? ? 15562.0822 ------------------------------------------------ np? speedup 1 1.0 2 1.44 Note that the memory bandwidth is much lower than your machine but there is an increase in speedup from one to two cores because one core cannot utilize all the memory bandwidth. But even with two cores my laptop will be slower on PETSc then one core on your machine. Here is the performance on a workstation we have that has multiple CPUs and multiple memory sockets Number of MPI processes 1 Process 0 es Function? ? ? Rate (MB/s) Copy:? ? ? 13077.8260 Scale:? ? ? 12867.1966 Add:? ? ? ? 14637.6757 Triad:? ? ? 14414.4478 Number of MPI processes 2 Process 0 es Process 1 es Function? ? ? Rate (MB/s) Copy:? ? ? 22663.3116 Scale:? ? ? 22102.5495 Add:? ? ? ? 25768.1550 Triad:? ? ? 26076.0410 Number of MPI processes 3 Process 0 es Process 1 es Process 2 es Function? ? ? Rate (MB/s) Copy:? ? ? 27501.7610 Scale:? ? ? 26971.2183 Add:? ? ? ? 30433.3276 Triad:? ? ? 31302.9396 Number of MPI processes 4 Process 0 es Process 1 es Process 2 es Process 3 es Function? ? ? Rate (MB/s) Copy:? ? ? 29302.3183 Scale:? ? ? 30165.5295 Add:? ? ? ? 34577.3458 Triad:? ? ? 35195.8067 ------------------------------------------------ np? 
speedup 1 1.0 2 1.81 3 2.17 4 2.44 Note that one core has a lower memory bandwidth than your machine but as I add more cores the memory bandwidth increases by a factor of 2.4 There is nothing wrong with your machine, it is just not suitable to run sparse linear algebra on multiple cores for it. ? Barry On May 29, 2014, at 5:15 PM, Qin Lu wrote: > Barry, >? > How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? >? > The machine has very new Intel chips and is very for serial run. What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2) that was not built correctly? > Many thanks, > Qin >? > ----- Original Message ----- > From: Barry Smith > To: Qin Lu ; petsc-users > Cc: > Sent: Thursday, May 29, 2014 4:54 PM > Subject: Re: [petsc-users] About parallel performance > > >? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. > >? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. > >? ? Barry > > > > > On May 29, 2014, at 4:37 PM, Qin Lu wrote: > >> Barry, >> >> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >> >> ================= >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >> Number of MPI processes 1 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? 21682.9932 >> Scale:? ? ? 21637.5509 >> Add:? ? ? ? 21583.0395 >> Triad:? ? ? 21504.6563 >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >> Number of MPI processes 2 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? 21369.6976 >> Scale:? ? ? 21632.3203 >> Add:? ? ? ? 22203.7107 >> Triad:? ? ? 22305.1841 >> ======================= >> >> Thanks a lot, >> Qin >> >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 4:17 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> >>? ? You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >> >> cd? src/benchmarks/streams/ >> >> make MPIVersion >> >> mpiexec -n 1 ./MPIVersion >> >> mpiexec -n 2 ./MPIVersion >> >>? ? and send all the results >> >>? ? Barry >> >> >> >> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >> >>> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>>? >>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>> >>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? 
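For reference, the logging pattern referred to here (PetscLogEventRegister() with a PetscLogEventBegin/End() pair around the matrix assembly) looks roughly like the following C fragment; the event name is a placeholder and the timed body is whatever computes and inserts the matrix entries:

    PetscErrorCode ierr;
    PetscLogEvent  USER_MatAssembly;   /* illustrative event handle */

    ierr = PetscLogEventRegister("UserMatAssembly", MAT_CLASSID, &USER_MatAssembly);CHKERRQ(ierr);

    ierr = PetscLogEventBegin(USER_MatAssembly, 0, 0, 0, 0);CHKERRQ(ierr);
    /* ... compute entries, MatSetValues(), MatAssemblyBegin/End() ... */
    ierr = PetscLogEventEnd(USER_MatAssembly, 0, 0, 0, 0);CHKERRQ(ierr);

The registered event then shows up as its own line in the -log_summary output, which makes it easy to see how much of the total time goes into assembly as opposed to KSPSolve.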
>>> >>> Many thanks, >>> Qin >>> >>> ----- Original Message ----- >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 2:12 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>>? ? ? You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>> >>>? ? ? Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>> >>>? ? ? Barry >>> >>> >>> >>> >>> >>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>> >>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>>? >>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>> My questions are: >>>>? >>>> 1. what is the bottle neck of the parallel run according to the summary? >>>> 2. Do you have any suggestions to improve the parallel performance? >>>>? >>>> Thanks a lot for your suggestions! >>>>? >>>> Regards, >>>> Qin? ? ? ? ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Jun 2 09:25:50 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Mon, 2 Jun 2014 09:25:50 -0500 Subject: [petsc-users] Use MPIAIJ, but receive help information of seqaij In-Reply-To: <6fa0c3c0155d479fb39cb97ea37d4562@NAGURSKI.anl.gov> References: <6fa0c3c0155d479fb39cb97ea37d4562@NAGURSKI.anl.gov> Message-ID: Run your code with option "-eps_view -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2" and send us entire output of '-eps_view'. Hong On Mon, Jun 2, 2014 at 2:13 AM, ??? wrote: > Hi, all > > Recently I'm trying to improve the performance of my program, so I add -help > option to see options available. I think I use MPIAIJ type of MAT, but I > receive information of SEQAIJ. > > Is it means actually I am using a sequential matrix type? > If not, how could I get information of MPIAIJ to improve performance? 
> > Code of generating mate > MatCreate(PETSC_COMM_WORLD,&A); > MatSetType(A,MATMPIAIJ); > MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); > MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); > > Code of getting information > char common_options[] = "-help -st_ksp_type preonly -st_pc_type lu > -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_28 2 > -mat_mumps_icntl_29 2"; > > ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); > > Thank you very much > Guoxi From rupp at iue.tuwien.ac.at Mon Jun 2 10:10:27 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Mon, 2 Jun 2014 17:10:27 +0200 Subject: [petsc-users] Use MPIAIJ, but receive help information of seqaij In-Reply-To: References: Message-ID: <538C93E3.60705@iue.tuwien.ac.at> Hi Guoxi, do you run your code in parallel using mpirun? If you run the code in serial, then MATMPIAIJ is internally handled like MATSEQAIJ, which might explain the behavior you are seeing. Best regards, Karli On 06/02/2014 09:13 AM, ??? wrote: > Hi, all > > Recently I'm trying to improve the performance of my program, so I add > -help option to see options available. I think I use MPIAIJ type of MAT, > but I receive information of SEQAIJ. > > Is it means actually I am using a sequential matrix type? > If not, how could I get information of MPIAIJ to improve performance? > > Code of generating mate > MatCreate(PETSC_COMM_WORLD,&A); > MatSetType(A,MATMPIAIJ); > MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); > MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); > > Code of getting information > char common_options[] = "-help -st_ksp_type preonly -st_pc_type lu > -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_28 2 > -mat_mumps_icntl_29 2"; > > ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); > > Thank you very much > Guoxi From bsmith at mcs.anl.gov Mon Jun 2 11:07:09 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 2 Jun 2014 11:07:09 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401718401.36233.YahooMailNeo@web160202.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> <1401718401.36233.YahooMailNeo@web160202.mail.bf1.yahoo.com> Message-ID: On Jun 2, 2014, at 9:13 AM, Qin Lu wrote: > Will the speedup measured by the streams benchmark the upper limit of speedup of a parallel program? I.e., suppose there is a program with ideal linear speedup (=2 for np=2 if running in a perfect machine for parallelism), if it runs in your laptop, the maximum speedup would be 1.44 with np=2? It depends on the computation being run. For PETSc solvers it is generally a pretty good measure of the upper bound. > > Thanks, > Qin > > From: Barry Smith > To: Qin Lu > Cc: petsc-users > Sent: Thursday, May 29, 2014 5:46 PM > Subject: Re: [petsc-users] About parallel performance > > > For the parallel case a perfect machine would have twice the memory bandwidth when using 2 cores as opposed to 1 core. For yours it is almost exactly the same. The issue is not with the MPI or software. It depends on how many memory sockets there are and how they are shared by the various cores. As I said the initial memory bandwidth for one core 21,682. 
gigabytes per second is good so it is a very good sequential machine. > > Here are the results on my laptop > > Number of MPI processes 1 > Process 0 Barrys-MacBook-Pro.local > Function Rate (MB/s) > Copy: 7928.7346 > Scale: 8271.5103 > Add: 11017.0430 > Triad: 10843.9018 > Number of MPI processes 2 > Process 0 Barrys-MacBook-Pro.local > Process 1 Barrys-MacBook-Pro.local > Function Rate (MB/s) > Copy: 13513.0365 > Scale: 13516.7086 > Add: 15455.3952 > Triad: 15562.0822 > ------------------------------------------------ > np speedup > 1 1.0 > 2 1.44 > > > Note that the memory bandwidth is much lower than your machine but there is an increase in speedup from one to two cores because one core cannot utilize all the memory bandwidth. But even with two cores my laptop will be slower on PETSc then one core on your machine. > > Here is the performance on a workstation we have that has multiple CPUs and multiple memory sockets > > Number of MPI processes 1 > Process 0 es > Function Rate (MB/s) > Copy: 13077.8260 > Scale: 12867.1966 > Add: 14637.6757 > Triad: 14414.4478 > Number of MPI processes 2 > Process 0 es > Process 1 es > Function Rate (MB/s) > Copy: 22663.3116 > Scale: 22102.5495 > Add: 25768.1550 > Triad: 26076.0410 > Number of MPI processes 3 > Process 0 es > Process 1 es > Process 2 es > Function Rate (MB/s) > Copy: 27501.7610 > Scale: 26971.2183 > Add: 30433.3276 > Triad: 31302.9396 > Number of MPI processes 4 > Process 0 es > Process 1 es > Process 2 es > Process 3 es > Function Rate (MB/s) > Copy: 29302.3183 > Scale: 30165.5295 > Add: 34577.3458 > Triad: 35195.8067 > ------------------------------------------------ > np speedup > 1 1.0 > 2 1.81 > 3 2.17 > 4 2.44 > > Note that one core has a lower memory bandwidth than your machine but as I add more cores the memory bandwidth increases by a factor of 2.4 > > There is nothing wrong with your machine, it is just not suitable to run sparse linear algebra on multiple cores for it. > > Barry > > > > > On May 29, 2014, at 5:15 PM, Qin Lu wrote: > > > Barry, > > > > How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? > > > > The machine has very new Intel chips and is very for serial run. What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2) that was not built correctly? > > Many thanks, > > Qin > > > > ----- Original Message ----- > > From: Barry Smith > > To: Qin Lu ; petsc-users > > Cc: > > Sent: Thursday, May 29, 2014 4:54 PM > > Subject: Re: [petsc-users] About parallel performance > > > > > > In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. > > > > But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. > > > > Barry > > > > > > > > > > On May 29, 2014, at 4:37 PM, Qin Lu wrote: > > > >> Barry, > >> > >> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? 
I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): > >> > >> ================= > >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion > >> Number of MPI processes 1 > >> Function Rate (MB/s) > >> Copy: 21682.9932 > >> Scale: 21637.5509 > >> Add: 21583.0395 > >> Triad: 21504.6563 > >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion > >> Number of MPI processes 2 > >> Function Rate (MB/s) > >> Copy: 21369.6976 > >> Scale: 21632.3203 > >> Add: 22203.7107 > >> Triad: 22305.1841 > >> ======================= > >> > >> Thanks a lot, > >> Qin > >> > >> From: Barry Smith > >> To: Qin Lu > >> Cc: "petsc-users at mcs.anl.gov" > >> Sent: Thursday, May 29, 2014 4:17 PM > >> Subject: Re: [petsc-users] About parallel performance > >> > >> > >> > >> You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can > >> > >> cd src/benchmarks/streams/ > >> > >> make MPIVersion > >> > >> mpiexec -n 1 ./MPIVersion > >> > >> mpiexec -n 2 ./MPIVersion > >> > >> and send all the results > >> > >> Barry > >> > >> > >> > >> On May 29, 2014, at 4:06 PM, Qin Lu wrote: > >> > >>> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. > >>> > >>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). > >>> > >>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? > >>> > >>> Many thanks, > >>> Qin > >>> > >>> ----- Original Message ----- > >>> From: Barry Smith > >>> To: Qin Lu > >>> Cc: "petsc-users at mcs.anl.gov" > >>> Sent: Thursday, May 29, 2014 2:12 PM > >>> Subject: Re: [petsc-users] About parallel performance > >>> > >>> > >>> You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). > >>> > >>> Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. > >>> > >>> Barry > >>> > >>> > >>> > >>> > >>> > >>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: > >>> > >>>> Hello, > >>>> > >>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). 
> >>>> > >>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. > >>>> > >>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. > >>>> My questions are: > >>>> > >>>> 1. what is the bottle neck of the parallel run according to the summary? > >>>> 2. Do you have any suggestions to improve the parallel performance? > >>>> > >>>> Thanks a lot for your suggestions! > >>>> > >>>> Regards, > >>>> Qin > > From jed at jedbrown.org Mon Jun 2 11:28:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 02 Jun 2014 18:28:21 +0200 Subject: [petsc-users] Calling trilinos function from PETSC In-Reply-To: References: Message-ID: <874n03qnru.fsf@jedbrown.org> Veltz Romain writes: > Dear user, > > I am wondering whether one can call Trilinos LOCA functions from PETSc and if anybody has ever done it. PETSc does not provide an interface to LOCA, but you're welcome to use LOCA in the same application that uses PETSc for other things. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From hus003 at ucsd.edu Thu Jun 5 11:26:40 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Thu, 5 Jun 2014 16:26:40 +0000 Subject: [petsc-users] questions on KSP solver and preconditioning Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B7952@XMAIL-MBX-BH1.AD.UCSD.EDU> Hello, I'm reading ex19.c from examples under directory snes. The version I'm using is 3.3. In the code, a user defined function NonlinearGS is implemented, and is registered thru the command ierr = SNESSetGS(snes, NonlinearGS, (void*)&user);CHKERRQ(ierr); I'm wondering if this only serves as a preconditioner? Because I read from the output that the default ksp solver is gmres. I'm also wondering what shall I do if I want to make the user defined NonlinearGS the actual solver instead of the preconditioner? >From the discussion a few days ago, I knew that by specifying option -ksp_type gmres or -ksp_type bcgs, you use gmres or bcgs as your solvers, where then can I find a database for the ksp solvers? Best, Hui -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Jun 5 12:36:27 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 05 Jun 2014 19:36:27 +0200 Subject: [petsc-users] questions on KSP solver and preconditioning In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B7952@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B7952@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87vbsfi7hg.fsf@jedbrown.org> "Sun, Hui" writes: > Hello, I'm reading ex19.c from examples under directory snes. The version I'm using is 3.3. In the code, a user defined function NonlinearGS is implemented, and is registered thru the command > ierr = SNESSetGS(snes, NonlinearGS, (void*)&user);CHKERRQ(ierr); > > I'm wondering if this only serves as a preconditioner? Because I read > from the output that the default ksp solver is gmres. I'm also > wondering what shall I do if I want to make the user defined > NonlinearGS the actual solver instead of the preconditioner? NonlinearGS is primarily used as a smoother in nonlinear multigrid (FAS). 
It is not used by the default configuration. > From the discussion a few days ago, I knew that by specifying option > -ksp_type gmres or -ksp_type bcgs, you use gmres or bcgs as your > solvers, where then can I find a database for the ksp solvers? ./ex19 -help | grep -A5 ksp_type this will output a list of available KSPs. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From danyang.su at gmail.com Thu Jun 5 12:58:34 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 05 Jun 2014 10:58:34 -0700 Subject: [petsc-users] Running problem with pc_type hypre In-Reply-To: <53866CFC.1080401@gmail.com> References: <53865BE2.2060807@gmail.com> <53866CFC.1080401@gmail.com> Message-ID: <5390AFCA.3020802@gmail.com> Hi All, I recompiled the hypre library with the same compiler and intel mkl, then the error is gone. Thanks, Danyang On 28/05/2014 4:10 PM, Danyang Su wrote: > Hi Barry, > > I need further check on it. Running this executable file on another > machine results into mkl_intel_thread.dll missing error. I am not sure > at present if the mkl_intel_thread.dll version causes this problem. > > Thanks, > > Danyang > > On 28/05/2014 4:01 PM, Barry Smith wrote: >> Some possibilities: >> >> Are you sure that the hypre was compiled with exactly the same MPI >> as the that used to build PETSc? >> >> On May 28, 2014, at 4:57 PM, Danyang Su wrote: >> >>> Hi All, >>> >>> I am testing my codes under windows with PETSc V3.4.4. >>> >>> When running with option -pc_type hypre using 1 processor, the >>> program exactly uses 6 processors (my computer is 6 processors 12 >>> threads) >> 6 threads? or 6 processes? It should not be possible for it to >> use more processes then what you start the program with. >> >> hypre can be configured to use OpenMP thread parallelism PLUS >> MPI parallelism. Was it configured/compiled for that? If so you want >> to turn that off, >> configure and compile hypre before linking to PETSc so it does not >> use OpenMP. >> >> Are you sure you don?t have a bunch of zombie MPI processes >> running from previous jobs that crashed. They suck up CPU but are not >> involved in the current MPI run. Reboot the machine to get rid of >> them all. >> >> Barry >> >>> and the program crashed after many timesteps. The error information >>> is as follows: >>> >>> job aborted: >>> [ranks] message >>> >>> [0] fatal error >>> Fatal error in MPI_Comm_create: Internal MPI error!, error stack: >>> MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, >>> group=0xc80300f2, new_comm=0x000000001EA6DD30) failed >>> MPI_Comm_create(524).......: >>> MPIR_Comm_create_intra(209): >>> MPIR_Get_contextid(253)....: Too many communicators >>> >>> When running with option -pc_type hypre using 2 processors or more, >>> the program exactly uses all the threads, making the system >>> seriously overburden and the program runs very slowly. >>> >>> When running without -pc_type hypre, the program works fine without >>> any problem. >>> >>> Does anybody have the same problem in windows. 
>>> >>> Thanks and regards, >>> >>> Danyang > From hus003 at ucsd.edu Thu Jun 5 13:31:20 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Thu, 5 Jun 2014 18:31:20 +0000 Subject: [petsc-users] questions on KSP solver and preconditioning In-Reply-To: <87vbsfi7hg.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B7952@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87vbsfi7hg.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B7991@XMAIL-MBX-BH1.AD.UCSD.EDU> >From the output of the command ./ex19 -help | grep -A5 ksp_type, I can see the possible Krylov method options: cg groppcg pipecg cgne nash stcg gltr richardson chebyshev gmres tcqmr bcgs ibcgs fbcgs fbcgsr bcgsl cgs tfqmr cr pipecr lsqr preonly qcg bicg fgmres minres symmlq lgmres lcd gcr pgmres specest dgmres (KSPSetType) The multigrid method does not belong to the Krylov methods so it is not there. I'm wondering which option should I specify if I want to use multigrid and hence the NonlinearGS smoother? ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Thursday, June 05, 2014 10:36 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] questions on KSP solver and preconditioning "Sun, Hui" writes: > Hello, I'm reading ex19.c from examples under directory snes. The version I'm using is 3.3. In the code, a user defined function NonlinearGS is implemented, and is registered thru the command > ierr = SNESSetGS(snes, NonlinearGS, (void*)&user);CHKERRQ(ierr); > > I'm wondering if this only serves as a preconditioner? Because I read > from the output that the default ksp solver is gmres. I'm also > wondering what shall I do if I want to make the user defined > NonlinearGS the actual solver instead of the preconditioner? NonlinearGS is primarily used as a smoother in nonlinear multigrid (FAS). It is not used by the default configuration. > From the discussion a few days ago, I knew that by specifying option > -ksp_type gmres or -ksp_type bcgs, you use gmres or bcgs as your > solvers, where then can I find a database for the ksp solvers? ./ex19 -help | grep -A5 ksp_type this will output a list of available KSPs. From jed at jedbrown.org Thu Jun 5 14:18:20 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 05 Jun 2014 21:18:20 +0200 Subject: [petsc-users] questions on KSP solver and preconditioning In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B7991@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B7952@XMAIL-MBX-BH1.AD.UCSD.EDU> <87vbsfi7hg.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7991@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87mwdri2rn.fsf@jedbrown.org> "Sun, Hui" writes: > The multigrid method does not belong to the Krylov methods so it is > not there. I'm wondering which option should I specify if I want to > use multigrid and hence the NonlinearGS smoother? -snes_type fas (or nonlinear preconditioning). I'm afraid you're going to have to read examples and documentation to learn how to operate the nonlinear solvers. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Thu Jun 5 18:04:24 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 5 Jun 2014 18:04:24 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <538C0F91.7060008@gmail.com> References: <534C9A2C.5060404@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> <5379A433.5000401@gmail.com> <538C0F91.7060008@gmail.com> Message-ID: On April 23 you sent us the message Hi, Ya that's better. I've been trying to remove the unnecessary codes (also some copyright issues with the university) and to present only the code which shows the error. But strangely, by restructuring and removing all the redundant stuffs, the error is no longer there. I'm now trying to add back and re-construct my code and see if it appears. I'll let you people know if there's update. Then on May 12 you sent Hi, I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. I have never received a complete code that I could compile and run to reproduce your problem. Please send the code that compiles for you but crashes so we may reproduce the problem. Barry On Jun 2, 2014, at 12:45 AM, TAY wee-beng wrote: > Dear all, > > May I know if there is any problem compiling / building / running the file I emailed. We can work together if there is any. > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 20/5/2014 1:43 AM, Barry Smith wrote: >> On May 19, 2014, at 1:26 AM, TAY wee-beng wrote: >> >>> On 19/5/2014 11:36 AM, Barry Smith wrote: >>>> On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: >>>> >>>>> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>>>>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>>>>> Hi Barry, >>>>>> >>>>>> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >>>>>> >>>>>> Yes it works with Intel. Is this using optimization? >>>>> Hi Matt, >>>>> >>>>> I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? >>>> No. Does it run clean under valgrind? >>> Hi, >>> >>> Do you mean the debug or optimized version? >> Both. >> >>> Thanks. >>>>>> Matt >>>>>> Thank you >>>>>> >>>>>> Yours sincerely, >>>>>> >>>>>> TAY wee-beng >>>>>> >>>>>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>>>>> Please send you current code. So we may compile and run it. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. 
gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Yours sincerely, >>>>>> >>>>>> TAY wee-beng >>>>>> >>>>>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>>>>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >>>>>> >>>>>> Barry >>>>>> >>>>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>>>> >>>>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>>> >>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>>> >>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>> Hmm, >>>>>> >>>>>> Interface DMDAVecGetArrayF90 >>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>> USE_DM_HIDE >>>>>> DM_HIDE da1 >>>>>> VEC_HIDE v >>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>> PetscErrorCode ierr >>>>>> End Subroutine >>>>>> >>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>> Hi, >>>>>> >>>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>>> >>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>>> >>>>>> Also, supposed I call: >>>>>> >>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> u_array .... >>>>>> >>>>>> v_array .... etc >>>>>> >>>>>> Now to restore the array, does it matter the sequence they are restored? >>>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>>> >>>>>> Hi, >>>>>> >>>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. 
In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>>> >>>>>> >>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>> >>>>>> u_array = 0.d0 >>>>>> >>>>>> v_array = 0.d0 >>>>>> >>>>>> w_array = 0.d0 >>>>>> >>>>>> p_array = 0.d0 >>>>>> >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>> >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>>> >>>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>> Hi Matt, >>>>>> >>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>> >>>>>> It already has DMDAVecGetArray(). Just run it. >>>>>> Hi, >>>>>> >>>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>>> >>>>>> No the global/local difference should not matter. >>>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>>> >>>>>> DMGetLocalVector() >>>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>>>> >>>>>> If so, when should I call them? >>>>>> >>>>>> You just need a local vector from somewhere. >>>>>> Hi, >>>>>> >>>>>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>>>>> >>>>>> Thanks. >>>>>> Hi, >>>>>> >>>>>> I insert part of my error region code into ex11f90: >>>>>> >>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>> >>>>>> u_array = 0.d0 >>>>>> v_array = 0.d0 >>>>>> w_array = 0.d0 >>>>>> p_array = 0.d0 >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>>>> >>>>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>>>> >>>>>> module solve >>>>>> <- add include file? >>>>>> subroutine RRK >>>>>> <- add include file? >>>>>> end subroutine RRK >>>>>> >>>>>> end module solve >>>>>> >>>>>> So where should the include files (#include ) be placed? >>>>>> >>>>>> After the module or inside the subroutine? >>>>>> >>>>>> Thanks. >>>>>> Matt >>>>>> Thanks. >>>>>> Matt >>>>>> Thanks. >>>>>> Matt >>>>>> Thanks >>>>>> >>>>>> Regards. 
>>>>>> Matt >>>>>> As in w, then v and u? >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> thanks >>>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>>> Hi, >>>>>> >>>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>> >>>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>>> >>>>>> >>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>>> >>>>>> Barry >>>>>> >>>>>> Thanks. >>>>>> Barry >>>>>> >>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>>> >>>>>> However, by re-writing my code, I found out a few things: >>>>>> >>>>>> 1. if I write my code this way: >>>>>> >>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> u_array = .... >>>>>> >>>>>> v_array = .... >>>>>> >>>>>> w_array = .... >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> The code runs fine. >>>>>> >>>>>> 2. if I write my code this way: >>>>>> >>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>> >>>>>> where the subroutine is: >>>>>> >>>>>> subroutine uvw_array_change(u,v,w) >>>>>> >>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>> >>>>>> u ... >>>>>> v... >>>>>> w ... >>>>>> >>>>>> end subroutine uvw_array_change. >>>>>> >>>>>> The above will give an error at : >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> 3. 
Same as above, except I change the order of the last 3 lines to: >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> So they are now in reversed order. Now it works. >>>>>> >>>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>>> >>>>>> subroutine uvw_array_change(u,v,w) >>>>>> >>>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>> >>>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>> >>>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>> >>>>>> u ... >>>>>> v... >>>>>> w ... >>>>>> >>>>>> end subroutine uvw_array_change. >>>>>> >>>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>>>> >>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>>> >>>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Yours sincerely, >>>>>> >>>>>> TAY wee-beng >>>>>> >>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> >>>>>> >>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>>> >>>>>> Hi Barry, >>>>>> >>>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>>> >>>>>> I have attached my code. >>>>>> >>>>>> Thank you >>>>>> >>>>>> Yours sincerely, >>>>>> >>>>>> TAY wee-beng >>>>>> >>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>> Please send the code that creates da_w and the declarations of w_array >>>>>> >>>>>> Barry >>>>>> >>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>> >>>>>> wrote: >>>>>> >>>>>> >>>>>> Hi Barry, >>>>>> >>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>> >>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>> >>>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>>> >>>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>>> >>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>> -------------------------------------------------------------------------- >>>>>> An MPI process has executed an operation involving a call to the >>>>>> "fork()" system call to create a child process. Open MPI is currently >>>>>> operating in a condition that could result in memory corruption or >>>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>>> data corruption. The use of fork() (or system() or other calls that >>>>>> create child processes) is strongly discouraged. 
>>>>>> >>>>>> The process that invoked fork was: >>>>>> >>>>>> Local host: n12-76 (PID 20235) >>>>>> MPI_COMM_WORLD rank: 2 >>>>>> >>>>>> If you are *absolutely sure* that your application will successfully >>>>>> and correctly survive a call to fork(), you may disable this warning >>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>>> -------------------------------------------------------------------------- >>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>>> >>>>>> .... >>>>>> >>>>>> 1 >>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>> [1]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>> [1]PETSC ERROR: to get more information on the crash. >>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>> [3]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>> >>>>>> ... >>>>>> Thank you. >>>>>> >>>>>> Yours sincerely, >>>>>> >>>>>> TAY wee-beng >>>>>> >>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>> >>>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>>> >>>>>> Barry >>>>>> >>>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>>> >>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>> >>>>>> >>>>>> >>>>>> wrote: >>>>>> >>>>>> >>>>>> >>>>>> Hi, >>>>>> >>>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". 
Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>>>> >>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> -- >>>>>> Thank you. >>>>>> >>>>>> Yours sincerely, >>>>>> >>>>>> TAY wee-beng >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener > From tlk0812 at hotmail.com Sat Jun 7 01:40:06 2014 From: tlk0812 at hotmail.com (LikunTan) Date: Sat, 7 Jun 2014 14:40:06 +0800 Subject: [petsc-users] about VecGetArray() Message-ID: Hello, I defined the partition of Vector, which is not stored contiguously. Here is a part of my code. The total number of nodes is NODE*DOF. Before executing this following code, I defined LEN and an array NODENP[] to store the number of nodes and the list of nodes in each processor. I accessed the element using aM[node*DOF+j] and aM[i*DOF+j], but none of them gave me the correct answer. Your help is well appreciated. **************************************************************VecCreate(PETSC_COMM_WORLD, &M);VecSetSizes(M, LEN*DOF, NODE*DOF); VecGetArray(M, &aM); for(int i=0; i From knepley at gmail.com Sat Jun 7 07:53:11 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 7 Jun 2014 07:53:11 -0500 Subject: [petsc-users] about VecGetArray() In-Reply-To: References: Message-ID: On Sat, Jun 7, 2014 at 1:40 AM, LikunTan wrote: > Hello, > > I defined the partition of Vector, which is not stored contiguously. Here > is a part of my code. The total number of nodes is NODE*DOF. 
Before > executing this following code, I defined LEN and an array NODENP[] to store > the number of nodes and the list of nodes in each processor. I accessed the > element using aM[node*DOF+j] and aM[i*DOF+j], but none of them gave me the > correct answer. Your help is well appreciated. > We cannot know what you mean by "the correct answer". Vec is just a 1D array of memory. You decide on the organization inside. Matt > ************************************************************** > VecCreate(PETSC_COMM_WORLD, &M); > VecSetSizes(M, LEN*DOF, NODE*DOF); > > VecGetArray(M, &aM); > > for(int i=0; i { > node=NODENP[i]; > for(int j=0; j { > aM[node*DOF+j] or aM[i*DOF+j] ? //accessing the elements of M > > } > } > VecRestoreArray(M, &aM); > ********************************************************* > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jun 7 10:45:58 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 7 Jun 2014 10:45:58 -0500 Subject: [petsc-users] about VecGetArray() In-Reply-To: References: Message-ID: It looks like you are trying to access the entire vector on all processes. You cannot do this. VecGetArray() only gives you access to the local values. If you need to access the entire vector on each process (which is not scalable or a good idea) you can use VecScatterCreateToAll() or VecScatterCreateToZero() then scatter the vector then use VecGetArray() on the now new sequential vector. Barry On Jun 7, 2014, at 1:40 AM, LikunTan wrote: > Hello, > > I defined the partition of Vector, which is not stored contiguously. Here is a part of my code. The total number of nodes is NODE*DOF. Before executing this following code, I defined LEN and an array NODENP[] to store the number of nodes and the list of nodes in each processor. I accessed the element using aM[node*DOF+j] and aM[i*DOF+j], but none of them gave me the correct answer. Your help is well appreciated. > > ************************************************************** > VecCreate(PETSC_COMM_WORLD, &M); > VecSetSizes(M, LEN*DOF, NODE*DOF); > > VecGetArray(M, &aM); > > for(int i=0; i { > node=NODENP[i]; > for(int j=0; j { > aM[node*DOF+j] or aM[i*DOF+j] ? //accessing the elements of M > > } > } > VecRestoreArray(M, &aM); > ********************************************************* From tlk0812 at hotmail.com Sat Jun 7 13:02:08 2014 From: tlk0812 at hotmail.com (LikunTan) Date: Sun, 8 Jun 2014 02:02:08 +0800 Subject: [petsc-users] about VecGetArray() In-Reply-To: References: , Message-ID: Hello, Thank you for your reply. Sorry I did not make it clear. I added the part where I set values for M before calling VecGetArray(). Below is a more complete code. I am not sure how to access the elements consistently by using aM. 
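A minimal C sketch of the scatter-to-all approach Barry describes above, for the case where every process really does need to read entries it does not own; Mseq and ctx are placeholder names, and as noted above this duplicates the whole vector on every process, so it does not scale:

VecScatter        ctx;
Vec               Mseq;                    /* sequential copy of M on every process */
const PetscScalar *aM;
PetscErrorCode    ierr;

ierr = VecScatterCreateToAll(M, &ctx, &Mseq);CHKERRQ(ierr);
ierr = VecScatterBegin(ctx, M, Mseq, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
ierr = VecScatterEnd(ctx, M, Mseq, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);

ierr = VecGetArrayRead(Mseq, &aM);CHKERRQ(ierr);
/* aM[node*DOF+j] is now valid for any global node index */
ierr = VecRestoreArrayRead(Mseq, &aM);CHKERRQ(ierr);

ierr = VecScatterDestroy(&ctx);CHKERRQ(ierr);
ierr = VecDestroy(&Mseq);CHKERRQ(ierr);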
************************************************************** VecCreate(PETSC_COMM_WORLD, &M); VecSetSizes(M, LEN*DOF, NODE*DOF); //setting LEN and NODENP[] in the function get_nodes_process//the stored data are not contiguousMPI_Comm_rank(PETSC_COMM_WORLD, &rank);MPI_Comm_size(PETSC_COMM_WORLD, &size);get_nodes_process(NODE, size, rank) //set the values of Mfor(int i=0; i Subject: Re: [petsc-users] about VecGetArray() > From: bsmith at mcs.anl.gov > Date: Sat, 7 Jun 2014 10:45:58 -0500 > CC: petsc-users at mcs.anl.gov > To: tlk0812 at hotmail.com > > > It looks like you are trying to access the entire vector on all processes. You cannot do this. VecGetArray() only gives you access to the local values. If you need to access the entire vector on each process (which is not scalable or a good idea) you can use VecScatterCreateToAll() or VecScatterCreateToZero() then scatter the vector then use VecGetArray() on the now new sequential vector. > > Barry > > On Jun 7, 2014, at 1:40 AM, LikunTan wrote: > > > Hello, > > > > I defined the partition of Vector, which is not stored contiguously. Here is a part of my code. The total number of nodes is NODE*DOF. Before executing this following code, I defined LEN and an array NODENP[] to store the number of nodes and the list of nodes in each processor. I accessed the element using aM[node*DOF+j] and aM[i*DOF+j], but none of them gave me the correct answer. Your help is well appreciated. > > > > ************************************************************** > > VecCreate(PETSC_COMM_WORLD, &M); > > VecSetSizes(M, LEN*DOF, NODE*DOF); > > > > VecGetArray(M, &aM); > > > > for(int i=0; i > { > > node=NODENP[i]; > > for(int j=0; j > { > > aM[node*DOF+j] or aM[i*DOF+j] ? //accessing the elements of M > > > > } > > } > > VecRestoreArray(M, &aM); > > ********************************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jun 7 13:16:23 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 7 Jun 2014 13:16:23 -0500 Subject: [petsc-users] about VecGetArray() In-Reply-To: References: Message-ID: On Sat, Jun 7, 2014 at 1:02 PM, LikunTan wrote: > Hello, > > Thank you for your reply. Sorry I did not make it clear. I added the part > where I set values for M before calling VecGetArray(). Below is a more > complete code. I am not sure how to access the elements consistently by > using aM. > As Barry pointed out, this cannot work in parallel. A Vec stores contiguous chunks of memory in process order. VecGetArray() has access to the local chunk, starting from the process offset. This is detalied in the manual. Random access in parallel is not efficiently achievable. Thanks, Matt > ************************************************************** > VecCreate(PETSC_COMM_WORLD, &M); > VecSetSizes(M, LEN*DOF, NODE*DOF); > > //setting LEN and NODENP[] in the function get_nodes_process > //the stored data are not contiguous > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > MPI_Comm_size(PETSC_COMM_WORLD, &size); > get_nodes_process(NODE, size, rank) > > //set the values of M > for(int i=0; i { > node=NODENP[i]; > for(int j=0; j { > col[j]=node*DOF+j; > val[j]=1.0*j; > } > VecSetValues(M, DOF, col, val, INSERT_VALUES); > } > VecAssemblyBegin(M); > VecAssemblyEnd(M); > > //change values of M > VecGetArray(M, &aM); > for(int i=0; i { > node=NODENP[i]; > for(int j=0; j { > aM[node*DOF+j] or aM[i*DOF+j] ? 
//accessing the elements of M > } > } > VecRestoreArray(M, &aM); > ********************************************************* > > best, > > > > Subject: Re: [petsc-users] about VecGetArray() > > From: bsmith at mcs.anl.gov > > Date: Sat, 7 Jun 2014 10:45:58 -0500 > > CC: petsc-users at mcs.anl.gov > > To: tlk0812 at hotmail.com > > > > > > It looks like you are trying to access the entire vector on all > processes. You cannot do this. VecGetArray() only gives you access to the > local values. If you need to access the entire vector on each process > (which is not scalable or a good idea) you can use VecScatterCreateToAll() > or VecScatterCreateToZero() then scatter the vector then use VecGetArray() > on the now new sequential vector. > > > > Barry > > > > On Jun 7, 2014, at 1:40 AM, LikunTan wrote: > > > > > Hello, > > > > > > I defined the partition of Vector, which is not stored contiguously. > Here is a part of my code. The total number of nodes is NODE*DOF. Before > executing this following code, I defined LEN and an array NODENP[] to store > the number of nodes and the list of nodes in each processor. I accessed the > element using aM[node*DOF+j] and aM[i*DOF+j], but none of them gave me the > correct answer. Your help is well appreciated. > > > > > > ************************************************************** > > > VecCreate(PETSC_COMM_WORLD, &M); > > > VecSetSizes(M, LEN*DOF, NODE*DOF); > > > > > > VecGetArray(M, &aM); > > > > > > for(int i=0; i > > { > > > node=NODENP[i]; > > > for(int j=0; j > > { > > > aM[node*DOF+j] or aM[i*DOF+j] ? //accessing the elements of M > > > > > > } > > > } > > > VecRestoreArray(M, &aM); > > > ********************************************************* > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ztdepyahoo at 163.com Sun Jun 8 00:32:53 2014 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Sun, 8 Jun 2014 13:32:53 +0800 (CST) Subject: [petsc-users] How to run a parallel code with mpiexec -np 1 Message-ID: <3820daad.8fe4.14679f7131b.Coremail.ztdepyahoo@163.com> Dear friends: I write a parallel code with Petsc, but it does not work with mpiexec -np 1 . since Petsc automatically allocate the vec and mat according to the number of cpu, why it does not work. and how to modify the parallel code. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Jun 8 01:33:46 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 08 Jun 2014 08:33:46 +0200 Subject: [petsc-users] How to run a parallel code with mpiexec -np 1 In-Reply-To: <3820daad.8fe4.14679f7131b.Coremail.ztdepyahoo@163.com> References: <3820daad.8fe4.14679f7131b.Coremail.ztdepyahoo@163.com> Message-ID: <87ioobdi5x.fsf@jedbrown.org> ??? writes: > Dear friends: > I write a parallel code with Petsc, but it does not work with mpiexec -np 1 . > since Petsc automatically allocate the vec and mat according to the number of cpu, why it does not work. You can (should) write parallel code that behaves correctly on one process. It should be no extra work and the examples do this. -------------- next part -------------- A non-text attachment was scrubbed... 
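The usual way to get the behaviour Jed describes is to let PETSc pick the local sizes and to loop only over the locally owned rows, so the same code runs unchanged under mpiexec -np 1 or -np 16. A rough C sketch, with n a placeholder global size and the actual matrix entries omitted:

Mat            A;
PetscInt       n = 100, Istart, Iend, i;
PetscErrorCode ierr;

ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);   /* PETSc chooses the local sizes */
ierr = MatSetFromOptions(A);CHKERRQ(ierr);
ierr = MatSetUp(A);CHKERRQ(ierr);

ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
for (i = Istart; i < Iend; i++) {
  /* set the entries of row i with MatSetValues() */
}
ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

On one process Istart is 0 and Iend equals n, so the serial case needs nothing special; this is essentially the pattern used in src/ksp/ksp/examples/tutorials/ex2.c mentioned earlier in this list.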
Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From hzhang at mcs.anl.gov Mon Jun 9 09:46:05 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Mon, 9 Jun 2014 09:46:05 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: <5ea3bb79ad084c4cacf1c150cb6d7fd9@LUCKMAN.anl.gov> References: <6ebee9bb5c5942d69d1f546025e9660f@LUCKMAN.anl.gov> <5ea3bb79ad084c4cacf1c150cb6d7fd9@LUCKMAN.anl.gov> Message-ID: M Asghar : > Many thanks for the quick reply! I can see calls to MatMumpsGetInfog in > ex52.c. > > * This is in PETSc's dev copy if I'm not mistaken - will this make it into > the next PETSc release? > * Will/does this have a Fortran equivalent? The Fortran equivalent is added to petsc master branch and will be released soon. See ex52f.F Hong > > On Thu, May 29, 2014 at 4:43 PM, Hong Zhang wrote: >> >> Asghar: >> > Is it possible to access the contents of MUMPS array INFOG (and INFO, >> > RINFOG >> > etc) via the PETSc interface? >> >> Yes. Use the latest petsc (master branch). >> See petsc/src/ksp/ksp/examples/tutorials/ex52.c >> >> Hong >> > >> > I am working with SLEPc and am using MUMPS for the factorisation. I >> > would >> > like to access the contents of their INFOG array within our code >> > particularly when an error occurs in order to determine whether any >> > remedial >> > action can be taken. The error code returned from PETSc is useful; any >> > additional information from MUMPS that can be accessed from within ones >> > code >> > would be very helpful also. >> > >> > Many thanks in advance. >> > >> > M Asghar >> > > > From shriram at ualberta.ca Mon Jun 9 11:51:54 2014 From: shriram at ualberta.ca (Shriram Srinivasan) Date: Mon, 09 Jun 2014 10:51:54 -0600 Subject: [petsc-users] Can I use TS routines for operator split formulation Message-ID: <5395E62A.70707@ualberta.ca> Hi, I am working with the (discretised) PDE: (u* - u_prev) + (tau )A u* = f1 (u - u*) + (tau)B u = f2 Here A and B are constant matrices which have been assembled, u_prev is solution at previous time level, tau is the time step and u is the solution at current time level. It appears to me I cannot rewrite this in a form required by the TS module. So my question(s) are: 1) Am I missing something here, or is there a way to cast this into the framework of TS 2) If there isn?t a way, am I better off doing the time stepping myself ? Thanks, Shriram From knepley at gmail.com Mon Jun 9 12:09:48 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 9 Jun 2014 12:09:48 -0500 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: <5395E62A.70707@ualberta.ca> References: <5395E62A.70707@ualberta.ca> Message-ID: On Mon, Jun 9, 2014 at 11:51 AM, Shriram Srinivasan wrote: > Hi, > I am working with the (discretised) PDE: > (u* - u_prev) + (tau )A u* = f1 > (u - u*) + (tau)B u = f2 > Here A and B are constant matrices which have been assembled, u_prev is > solution at previous time level, tau is the time step and u is the solution > at current time level. > > It appears to me I cannot rewrite this in a form required by the TS > module. So my question(s) are: > > 1) Am I missing something here, or is there a way to cast this into the > framework of TS > It seems like above you have already chosen a time discretization, in that you have time steps appearing. 
The idea with TS is to begin with the continuum form, in the simplest case u_t = G(u, t) and in the implicit form F(u_t, u, t) = 0 and let PETSc choose the time discretization (since there are many multistep methods). It is likely that you could reproduce the method you have above by choosing one of the existing TS methods. Does this make sense? Thanks, Matt > 2) If there isn?t a way, am I better off doing the time stepping myself ? > > Thanks, > Shriram > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From shriram at ualberta.ca Mon Jun 9 12:55:59 2014 From: shriram at ualberta.ca (Shriram Srinivasan) Date: Mon, 09 Jun 2014 11:55:59 -0600 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: References: <5395E62A.70707@ualberta.ca> Message-ID: <5395F52F.3040309@ualberta.ca> /It seems like above you have already chosen a time discretization, in that you have time steps appearing. The/ // /idea with TS is to begin with the continuum form, in the simplest case/ / / / u_t = G(u, t)/ / / /and in the implicit form/ / / / F(u_t, u, t) = 0/ / / /and let PETSc choose the time discretization (since there are many multistep methods). It is likely that/ /you could reproduce the method you have above by choosing one of the existing TS methods. Does this/ // /make sense?/ Yes, I have tried that. I have perhaps not been clear with my question. The discretization employed is simply Backward Euler. The problem I see with trying to use TS is that my scheme uses u* as a kind of predictor. I can write rewrite (u* - u_prev) + (tau )A u* = f1 as u*_t + A u* = f1; I apply backward Euler on this to find u* after one time step. But the problem is the next part (u - u*) + (tau)B u = f2. This I can rewrite as u_t + B u = f2; But when I apply backward euler, I want u_t = (u - u*)/tau. This breaks the pattern for use of the TS module, it seems to me. I would like to know if I am correct in my assesment. Can I still use TS profitably, or do I need to implement my own time stepper---that was my question. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlk0812 at hotmail.com Mon Jun 9 12:59:30 2014 From: tlk0812 at hotmail.com (Likun Tan) Date: Mon, 9 Jun 2014 10:59:30 -0700 Subject: [petsc-users] Scattering PetscScalar Message-ID: Dear Petsc developers, I defined PetscScalar *len to store the number of nodes in each processor, MPI_Comm_rank(PETSC_COMM_WPRLD, rank); len[rank]=NumNode; but I want the elements in len to be seen in all the processors. Is there any function in Petsc being able to do this job? Many thanks, Likun From prbrune at gmail.com Mon Jun 9 13:02:48 2014 From: prbrune at gmail.com (Peter Brune) Date: Mon, 9 Jun 2014 13:02:48 -0500 Subject: [petsc-users] Scattering PetscScalar In-Reply-To: References: Message-ID: If you have created a vector with the desired PETSc layout, you may use VecGetOwnershipRanges() http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecGetOwnershipRanges.html to get the ownership ranges for each processor indexable by rank, but with type PetscInt instead of PetscScalar. It's generally discouraged to use types such as PetscScalar to denote index or length quantities. 
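A small C sketch of the VecGetOwnershipRanges() suggestion above: v is a placeholder vector whose local size is the NumNode from the question, and it needs a type (VecSetFromOptions() or VecSetType()) before its layout is set up. The returned array has one entry per process plus a final entry equal to the global size, so every process can read off every other process's count without any extra communication:

Vec            v;
const PetscInt *ranges;
PetscMPIInt    size, r;
PetscErrorCode ierr;

ierr = VecCreate(PETSC_COMM_WORLD, &v);CHKERRQ(ierr);
ierr = VecSetSizes(v, NumNode, PETSC_DECIDE);CHKERRQ(ierr);
ierr = VecSetFromOptions(v);CHKERRQ(ierr);

ierr = MPI_Comm_size(PETSC_COMM_WORLD, &size);CHKERRQ(ierr);
ierr = VecGetOwnershipRanges(v, &ranges);CHKERRQ(ierr);
for (r = 0; r < size; r++) {
  /* ranges[r+1] - ranges[r] is the number of entries owned by process r */
}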
- Peter On Mon, Jun 9, 2014 at 12:59 PM, Likun Tan wrote: > Dear Petsc developers, > > I defined PetscScalar *len to store the number of nodes in each processor, > MPI_Comm_rank(PETSC_COMM_WPRLD, rank); > len[rank]=NumNode; > > but I want the elements in len to be seen in all the processors. Is there > any function in Petsc being able to do this job? > > Many thanks, > Likun -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jun 9 13:13:32 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 9 Jun 2014 13:13:32 -0500 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: <5395F52F.3040309@ualberta.ca> References: <5395E62A.70707@ualberta.ca> <5395F52F.3040309@ualberta.ca> Message-ID: On Mon, Jun 9, 2014 at 12:55 PM, Shriram Srinivasan wrote: > *It seems like above you have already chosen a time discretization, in > that you have time steps appearing. The* > *idea with TS is to begin with the continuum form, in the simplest case* > > * u_t = G(u, t)* > > *and in the implicit form* > > * F(u_t, u, t) = 0* > > *and let PETSc choose the time discretization (since there are many > multistep methods). It is likely that* > *you could reproduce the method you have above by choosing one of the > existing TS methods. Does this* > *make sense?* > > > Yes, I have tried that. I have perhaps not been clear with my question. > The discretization employed is simply Backward Euler. The problem I see > with trying to use TS is that my scheme uses u* as a kind of predictor. > > I can write rewrite (u* - u_prev) + (tau )A u* = f1 as > u*_t + A u* = f1; I apply backward Euler on this to find u* after one > time step. > > But the problem is the next part (u - u*) + (tau)B u = f2. > This I can rewrite as u_t + B u = f2; But when I apply backward euler, I > want u_t = (u - u*)/tau. > > This breaks the pattern for use of the TS module, it seems to me. I would > like to know if I am correct in my assesment. > Can I still use TS profitably, or do I need to implement my own time > stepper---that was my question. > I am not sure if you are using something like this, http://mathworld.wolfram.com/Predictor-CorrectorMethods.html, but TS does not have an implementation of that. As noted on the page, these have been largely supplanted by RK methods, which we do support. If you want to go back to the original continuum formulation, I think TS is usable. However, for the method as formulated above, I think you are right that TS is not suitable. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jun 9 13:28:42 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 9 Jun 2014 13:28:42 -0500 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: <5395E62A.70707@ualberta.ca> References: <5395E62A.70707@ualberta.ca> Message-ID: <5B15C980-8C39-4C0D-B0C6-9860B35ACCC5@mcs.anl.gov> What is the ODE you are solving? Barry On Jun 9, 2014, at 11:51 AM, Shriram Srinivasan wrote: > Hi, > I am working with the (discretised) PDE: > (u* - u_prev) + (tau )A u* = f1 > (u - u*) + (tau)B u = f2 > Here A and B are constant matrices which have been assembled, u_prev is solution at previous time level, tau is the time step and u is the solution at current time level. 
> > It appears to me I cannot rewrite this in a form required by the TS module. So my question(s) are: > > 1) Am I missing something here, or is there a way to cast this into the framework of TS > 2) If there isn?t a way, am I better off doing the time stepping myself ? > > Thanks, > Shriram From shriram at ualberta.ca Mon Jun 9 13:37:47 2014 From: shriram at ualberta.ca (Shriram Srinivasan) Date: Mon, 09 Jun 2014 12:37:47 -0600 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: <5B15C980-8C39-4C0D-B0C6-9860B35ACCC5@mcs.anl.gov> References: <5395E62A.70707@ualberta.ca> <5B15C980-8C39-4C0D-B0C6-9860B35ACCC5@mcs.anl.gov> Message-ID: <5395FEFB.9000307@ualberta.ca> It isnt an ODE actually. Its the unsteady diffusion equation u_t + div(k grad(u)) = f. The matrices A and B represent discretisations of the fluxes. I am using operator splitting to advance the solution in two stages in every time step: First compute u*, then use it to compute u. Shriram On 14-06-09 12:28 PM, Barry Smith wrote: > What is the ODE you are solving? > > Barr > > On Jun 9, 2014, at 11:51 AM, Shriram Srinivasan wrote: > >> Hi, >> I am working with the (discretised) PDE: >> (u* - u_prev) + (tau )A u* = f1 >> (u - u*) + (tau)B u = f2 >> Here A and B are constant matrices which have been assembled, u_prev is solution at previous time level, tau is the time step and u is the solution at current time level. >> >> It appears to me I cannot rewrite this in a form required by the TS module. So my question(s) are: >> >> 1) Am I missing something here, or is there a way to cast this into the framework of TS >> 2) If there isn?t a way, am I better off doing the time stepping myself ? >> >> Thanks, >> Shriram From alexeftimiades at gmail.com Mon Jun 9 13:45:45 2014 From: alexeftimiades at gmail.com (Alex Eftimiades) Date: Mon, 09 Jun 2014 14:45:45 -0400 Subject: [petsc-users] Getting Eigenvectors from a SLEPc run (in python) Message-ID: <539600D9.7020203@gmail.com> Hi I am trying solve a sparse generalized Hermitian matrix eigenvalue problem. I am trying to use slepc4py to get both the eigenvalues and eigenvectors, but I am unable to get the eigenvectors. Instead, I keep getting None. Do you have any example code that can do this? I copied the code I have tried below. Thanks, Alex Eftimiades xr, tmp = A.getVecs() xi, tmp = A.getVecs() # Setup the eigensolver E = SLEPc.EPS().create() E.setOperators(A,M) E.setDimensions(50, PETSc.DECIDE) E.setWhichEigenpairs("SM") E.solve() vals = [] vecs = [] for i in range(E.getConverged()): val = E.getEigenpair(i, xr, xi) vecr, veci = E.getEigenvector(i, xr, xi) vals.append(val) vecs.append(complex(vecr, veci)) vals = asarray(vals) vecs = asarray(vecs).T From bsmith at mcs.anl.gov Mon Jun 9 13:47:38 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 9 Jun 2014 13:47:38 -0500 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: <5395FEFB.9000307@ualberta.ca> References: <5395E62A.70707@ualberta.ca> <5B15C980-8C39-4C0D-B0C6-9860B35ACCC5@mcs.anl.gov> <5395FEFB.9000307@ualberta.ca> Message-ID: <2EFA1736-7B1F-42FB-9CDF-57B2485079CF@mcs.anl.gov> On Jun 9, 2014, at 1:37 PM, Shriram Srinivasan wrote: > It isnt an ODE actually. Its the unsteady diffusion equation u_t + div(k grad(u)) = f. The matrices A and B represent discretisations of the fluxes. I am using operator splitting to advance the solution in two stages in every time step: > First compute u*, then use it to compute u. Understood. 
But if you can define/describe the exact operator split method (i.e. what defines A and B) then we can see how it could possibly be handled within TS. Our goal is, when possible, to make TS flexible enough to support some operator split methods; we can only do this with specific examples. Thanks Barry > > Shriram > > > > On 14-06-09 12:28 PM, Barry Smith wrote: >> What is the ODE you are solving? >> >> Barry >> >> On Jun 9, 2014, at 11:51 AM, Shriram Srinivasan wrote: >> >>> Hi, >>> I am working with the (discretised) PDE: >>> (u* - u_prev) + (tau )A u* = f1 >>> (u - u*) + (tau)B u = f2 >>> Here A and B are constant matrices which have been assembled, u_prev is solution at previous time level, tau is the time step and u is the solution at current time level. >>> >>> It appears to me I cannot rewrite this in a form required by the TS module. So my question(s) are: >>> >>> 1) Am I missing something here, or is there a way to cast this into the framework of TS >>> 2) If there isn't a way, am I better off doing the time stepping myself? >>> >>> Thanks, >>> Shriram > From knepley at gmail.com Mon Jun 9 13:49:02 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 9 Jun 2014 13:49:02 -0500 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: <5395FEFB.9000307@ualberta.ca> References: <5395E62A.70707@ualberta.ca> <5B15C980-8C39-4C0D-B0C6-9860B35ACCC5@mcs.anl.gov> <5395FEFB.9000307@ualberta.ca> Message-ID: On Mon, Jun 9, 2014 at 1:37 PM, Shriram Srinivasan wrote: > It isn't an ODE actually. It's the unsteady diffusion equation u_t + div(k > grad(u)) = f. The matrices A and B represent discretisations of the fluxes. > I am using operator splitting to advance the solution in two stages in > every time step: > First compute u*, then use it to compute u. > For a purely explicit method, this would map to u_t = G(u, t) where G(u, t) = f - div(k grad u) which you could set using http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetRHSFunction.html If you want to use implicit methods, then you can set F(u_t, u, t) = u_t + div(k grad u) - f using http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetIFunction.html TS ex25 is a reaction-diffusion example, and is gone over in these slides http://www.mcs.anl.gov/petsc/documentation/tutorials/BuffaloTutorial.pdf starting at slide 169 Thanks, Matt > Shriram > > > > On 14-06-09 12:28 PM, Barry Smith wrote: > >> What is the ODE you are solving? >> >> Barry >> >> On Jun 9, 2014, at 11:51 AM, Shriram Srinivasan >> wrote: >> >> Hi, >>> I am working with the (discretised) PDE: >>> (u* - u_prev) + (tau )A u* = f1 >>> (u - u*) + (tau)B u = f2 >>> Here A and B are constant matrices which have been assembled, u_prev is >>> solution at previous time level, tau is the time step and u is the solution >>> at current time level. >>> >>> It appears to me I cannot rewrite this in a form required by the TS >>> module. So my question(s) are: >>> >>> 1) Am I missing something here, or is there a way to cast this into the >>> framework of TS >>> 2) If there isn't a way, am I better off doing the time stepping myself? >>> >>> Thanks, >>> Shriram >>> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: 
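A minimal sketch of the implicit form described above, written with petsc4py (the C routines TSSetIFunction/TSSetIJacobian that Matt links to behave the same way). This is an illustration, not code from the thread: A stands for an already-assembled Mat discretizing div(k grad u), and f, u0, tau and nsteps are placeholder names.

from petsc4py import PETSc

def make_ifunction(A, f):
    def ifunction(ts, t, u, udot, F):
        A.mult(u, F)         # F <- A u
        F.axpy(1.0, udot)    # F <- u_t + A u
        F.axpy(-1.0, f)      # F <- u_t + A u - f
    return ifunction

def make_ijacobian(A):
    def ijacobian(ts, t, u, udot, shift, J, P):
        A.copy(P)            # dF/du + shift*dF/du_t = A + shift*I
        P.shift(shift)
        if J != P:
            J.assemble()
        return True          # nonzero pattern unchanged between calls
    return ijacobian

def run_backward_euler(A, f, u0, tau, nsteps):
    ts = PETSc.TS().create(comm=A.getComm())
    ts.setIFunction(make_ifunction(A, f), A.createVecLeft())
    ts.setIJacobian(make_ijacobian(A), A.duplicate(copy=True))
    ts.setType(PETSc.TS.Type.BEULER)   # backward Euler, as discussed in the thread
    ts.setTimeStep(tau)
    ts.setMaxTime(nsteps * tau)
    ts.setMaxSteps(nsteps)
    ts.setFromOptions()                # allow -ts_type, -ksp_type, -pc_type overrides
    u = u0.copy()
    ts.solve(u)
    return u

Letting TS own the time loop in this way is what makes it possible to switch to other multistep or RK schemes purely from the command line.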
From jroman at dsic.upv.es Mon Jun 9 13:58:32 2014 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 9 Jun 2014 20:58:32 +0200 Subject: [petsc-users] Getting Eigenvectors from a SLEPc run (in python) In-Reply-To: <539600D9.7020203@gmail.com> References: <539600D9.7020203@gmail.com> Message-ID: <959DD8ED-6DCD-4B56-947F-376293D5C8F2@dsic.upv.es> On 09/06/2014, at 20:45, Alex Eftimiades wrote: > Hi > > I am trying to solve a sparse generalized Hermitian matrix eigenvalue problem. I am trying to use slepc4py to get both the eigenvalues and eigenvectors, but I am unable to get the eigenvectors. Instead, I keep getting None. Do you have any example code that can do this? I copied the code I have tried below. > > Thanks, > Alex Eftimiades > > xr, tmp = A.getVecs() > xi, tmp = A.getVecs() > > # Setup the eigensolver > E = SLEPc.EPS().create() > E.setOperators(A,M) > E.setDimensions(50, PETSc.DECIDE) > > E.setWhichEigenpairs("SM") > > E.solve() > > vals = [] > vecs = [] > for i in range(E.getConverged()): > val = E.getEigenpair(i, xr, xi) > vecr, veci = E.getEigenvector(i, xr, xi) > vals.append(val) > vecs.append(complex(vecr, veci)) > > vals = asarray(vals) > vecs = asarray(vecs).T Eigenvectors are not obtained as return values. Instead, the arguments of getEigenvector() are mutable, so after the call they contain the eigenvector. Jose From alexeftimiades at gmail.com Mon Jun 9 15:33:44 2014 From: alexeftimiades at gmail.com (Alex Eftimiades) Date: Mon, 09 Jun 2014 16:33:44 -0400 Subject: [petsc-users] Getting Eigenvectors from a SLEPc run (in python) In-Reply-To: <959DD8ED-6DCD-4B56-947F-376293D5C8F2@dsic.upv.es> References: <539600D9.7020203@gmail.com> <959DD8ED-6DCD-4B56-947F-376293D5C8F2@dsic.upv.es> Message-ID: <53961A28.6010804@gmail.com> Thanks. For anyone who reads this, the correct code looks like this: E.solve() vals = [] vecs = [] for i in range(E.getConverged()): val = E.getEigenpair(i, xr, xi) vals.append(val) vecs = [complex(xr0, xi0) for xr0, xi0 in zip(xr.getArray(), xi.getArray())] vals = asarray(vals) vecs = asarray(vecs).T On 06/09/2014 02:58 PM, Jose E. Roman wrote: > On 09/06/2014, at 20:45, Alex Eftimiades wrote: > >> Hi >> >> I am trying to solve a sparse generalized Hermitian matrix eigenvalue problem. I am trying to use slepc4py to get both the eigenvalues and eigenvectors, but I am unable to get the eigenvectors. Instead, I keep getting None. Do you have any example code that can do this? I copied the code I have tried below. >> >> Thanks, >> Alex Eftimiades >> >> xr, tmp = A.getVecs() >> xi, tmp = A.getVecs() >> >> # Setup the eigensolver >> E = SLEPc.EPS().create() >> E.setOperators(A,M) >> E.setDimensions(50, PETSc.DECIDE) >> >> E.setWhichEigenpairs("SM") >> >> E.solve() >> >> vals = [] >> vecs = [] >> for i in range(E.getConverged()): >> val = E.getEigenpair(i, xr, xi) >> vecr, veci = E.getEigenvector(i, xr, xi) >> vals.append(val) >> vecs.append(complex(vecr, veci)) >> >> vals = asarray(vals) >> vecs = asarray(vecs).T > Eigenvectors are not obtained as return values. Instead, the arguments of getEigenvector() are mutable, so after the call they contain the eigenvector. > > Jose > 
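To make Jose's point concrete: getEigenpair() (and getEigenvector()) fill the Vec arguments that are passed in and reuse them on every call, so the data has to be copied out before the next iteration. A short slepc4py sketch, assuming the E, xr and xi objects set up in the messages above (in parallel, getArray() returns only the locally owned part of each vector):

import numpy as np

def collect_eigenpairs(E, xr, xi):
    vals = []
    vecs = []
    for i in range(E.getConverged()):
        val = E.getEigenpair(i, xr, xi)   # eigenvalue returned; xr, xi filled in place
        vals.append(val)
        # copy the data out, since xr and xi are overwritten on the next call
        vecs.append(xr.getArray().copy() + 1j * xi.getArray().copy())
    return np.asarray(vals), np.asarray(vecs).T   # columns are the eigenvectors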
From alexeftimiades at gmail.com Mon Jun 9 15:43:31 2014 From: alexeftimiades at gmail.com (Alex Eftimiades) Date: Mon, 09 Jun 2014 16:43:31 -0400 Subject: [petsc-users] slepc4py: How to create a matrix in serial then find its eigenvalues in parallel? Message-ID: <53961C73.1030300@gmail.com> I have a routine that creates a matrix in parallel by spawning subprocesses. This is to say it is run from a serial python interpreter, but spawns subprocesses so that it works in parallel. The result is a scipy csr_matrix. What would be the best way to use slepc4py to solve for eigenvalues and eigenvectors in parallel? I tried a rather awkward method (described below), but I suspect there is a better way to go about doing this. I have currently tried saving the csr data to a file, reading it with a new python interpreter spawned with mpiexec, and writing the resulting eigenvalues and eigenvectors to a file. However, something is going wrong as I read in the file from the python interpreter spawned with mpiexec. Thanks, Alex Eftimiades From shriram at ualberta.ca Mon Jun 9 15:54:53 2014 From: shriram at ualberta.ca (Shriram Srinivasan) Date: Mon, 09 Jun 2014 14:54:53 -0600 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: <2EFA1736-7B1F-42FB-9CDF-57B2485079CF@mcs.anl.gov> References: <5395E62A.70707@ualberta.ca> <5B15C980-8C39-4C0D-B0C6-9860B35ACCC5@mcs.anl.gov> <5395FEFB.9000307@ualberta.ca> <2EFA1736-7B1F-42FB-9CDF-57B2485079CF@mcs.anl.gov> Message-ID: <53961F1D.90505@ualberta.ca> "Understood. But if you can define/describe the exact operator split method (i.e. what defines A and B) then we can see how it could possibly be handled within TS. Our goal is, when possible, to make TS flexible enough to support some operator split methods; we can only do this with specific examples. Thanks Barry" Au = (k grad(u))_x is a discretization of the flux in the x direction, while B is for flux in the y direction. These matrices are assembled by solving local elliptic problems in each cell following a multiscale finite volume method. To begin, the operator split is done as follows: u_t + Au = f ( considering only x dir fluxes) as (u* - u_n)/tau + A u* = f and solve for u* then u_t + B u = f (considering only y dir fluxes) as (u - u*)/tau + B u = f and solve for u at current level. Since the TS module can't be used, I shall have to do the time marching by writing my own code. Perhaps I should be reusing the KSP solvers, since the sparsity pattern etc. will be unchanged at each time step. Thanks, Shriram -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jun 9 15:56:46 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 9 Jun 2014 15:56:46 -0500 Subject: [petsc-users] slepc4py: How to create a matrix in serial then find its eigenvalues in parallel? In-Reply-To: <53961C73.1030300@gmail.com> References: <53961C73.1030300@gmail.com> Message-ID: On Mon, Jun 9, 2014 at 3:43 PM, Alex Eftimiades wrote: > I have a routine that creates a matrix in parallel by spawning > subprocesses. This is to say it is run from a serial python interpreter, but > spawns subprocesses so that it works in parallel. The result is a scipy > csr_matrix. > > What would be the best way to use slepc4py to solve for eigenvalues and > eigenvectors in parallel? I tried a rather awkward method (described > below), but I suspect there is a better way to go about doing this. > > I have currently tried saving the csr data to a file, reading it with a > new python interpreter spawned with mpiexec, and writing the resulting > eigenvalues and eigenvectors to a file. However, something is going wrong > as I read in the file from the python interpreter spawned with mpiexec. > I suggest using the PetscBinaryViewer to read and write the file. This way you can save it in serial, but read it in parallel. Thanks, Matt > Thanks, > Alex Eftimiades > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: 
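A sketch of what that binary-viewer workflow could look like with petsc4py/slepc4py. It is an illustration, not code from the thread: the file name matrix.dat and the scipy matrix name csr are placeholders, and the CSR index arrays may need to be cast to PETSc's integer type.

# --- serial script: convert the scipy csr_matrix and write it to disk ---
from petsc4py import PETSc

A = PETSc.Mat().createAIJ(size=csr.shape,
                          csr=(csr.indptr, csr.indices, csr.data),
                          comm=PETSc.COMM_SELF)
viewer = PETSc.Viewer().createBinary('matrix.dat', 'w', comm=PETSc.COMM_SELF)
A.view(viewer)
viewer.destroy()

# --- parallel script, launched with mpiexec: load the matrix and solve ---
from petsc4py import PETSc
from slepc4py import SLEPc

viewer = PETSc.Viewer().createBinary('matrix.dat', 'r', comm=PETSc.COMM_WORLD)
A = PETSc.Mat().load(viewer)     # distributed across the MPI processes
E = SLEPc.EPS().create(comm=PETSc.COMM_WORLD)
E.setOperators(A)                # a mass matrix M can be written and loaded the same way
E.setFromOptions()
E.solve()

The same viewer mechanism can be used to write the computed eigenvalues and eigenvectors back out, which avoids hand-rolling a file format.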
From jed at jedbrown.org Mon Jun 9 16:04:01 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 09 Jun 2014 23:04:01 +0200 Subject: [petsc-users] Can I use TS routines for operator split formulation In-Reply-To: <53961F1D.90505@ualberta.ca> References: <5395E62A.70707@ualberta.ca> <5B15C980-8C39-4C0D-B0C6-9860B35ACCC5@mcs.anl.gov> <5395FEFB.9000307@ualberta.ca> <2EFA1736-7B1F-42FB-9CDF-57B2485079CF@mcs.anl.gov> <53961F1D.90505@ualberta.ca> Message-ID: <87tx7t7q2m.fsf@jedbrown.org> Shriram Srinivasan writes: > Au = (k grad(u))_x is a discretization of the flux in the x direction, > while B is for flux in the y direction. These matrices are assembled by > solving local elliptic problems in each cell following a multiscale > finite volume method. > To begin, the operator split is done as follows: > > u_t + Au = f ( considering only x dir fluxes) as (u* - u_n)/tau + A u* > = f and solve for u* > > then u_t + B u = f (considering only y dir fluxes) as (u - u*)/tau + B u > = f and solve for u at current level. This is ADI "alternating direction implicit" and tends not to be very accurate, but more importantly, is a pretty terrible parallel algorithm that isn't used much any more. Unless you are implementing this for nostalgia purposes, I recommend using multigrid (algebraic or geometric) to solve the coupled problem. This will be fast and eliminate the splitting error. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From michele.rosso84 at gmail.com Mon Jun 9 22:56:13 2014 From: michele.rosso84 at gmail.com (Michele Rosso) Date: Mon, 09 Jun 2014 20:56:13 -0700 Subject: [petsc-users] Problem in MPI communicator Message-ID: <539681DD.9030700@gmail.com> Hi, I am trying to re-number the MPI ranks in order to have the domain decomposition obtained from DMDACreate3D() match the default decomposition provided by MPI_Cart_create(). I followed the method described in the FAQ: call PetscInitialize(PETSC_NULL_CHARACTER,ierr) call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator problem' x = rank / (pz*py); y = mod(rank,(pz*py))/pz z = mod(mod(rank,pz*py),pz) newrank = z*py*px + y*px + x; call mpi_comm_split(PETSC_COMM_WORLD, 1, newrank, newcomm, ierr) PETSC_COMM_WORLD = newcomm I tried to run my code (it works fine with the standard PETSc decomposition) with the new decomposition but I received the error message; I attached the full output. I ran with only one processor to test the setup and I commented all the lines where I actually insert/get data into/from the PETSc arrays. Could you please help me fix this? 
Thanks, Michele -------------- next part -------------- [0] petscinitialize_(): (Fortran):PETSc successfully started: procs 1 [0] PetscGetHostName(): Rejecting domainname, likely is NIS nid27514.(none) [0] petscinitialize_(): Running on machine: nid27514 [0] PetscCommDuplicate(): Duplicating a communicator -2080374779 -2080374778 max tags = 4194303 [0] PetscCommDuplicate(): Using internal PETSc communicator -2080374779 -2080374778 [0] PetscCommDuplicate(): Using internal PETSc communicator -2080374779 -2080374778 [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374777 max tags = 4194303 [0] VecScatterCreate(): Sequential vector scatter with block indices [0] VecScatterCreate(): Sequential vector scatter with block indices Processor [0] M 128 N 32 P 32 m 1 n 1 p 1 w 1 s 1 X range of indices: 0 128, Y range of indices: 0 32, Z range of indices: 0 32 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 131072 X 131072; storage space: 0 unneeded,917504 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 7 [0] Mat_CheckInode(): Found 131072 nodes out of 131072 rows. Not using Inode routines [0] PetscCommDuplicate(): Using internal PETSc communicator -2080374779 -2080374778 [0] KSPSetNormType(): Warning: setting KSPNormType to skip computing the norm KSP convergence test is implicitly set to KSPSkipConverged [0] KSPSetNormType(): Warning: setting KSPNormType to skip computing the norm KSP convergence test is implicitly set to KSPSkipConverged [0] KSPSetNormType(): Warning: setting KSPNormType to skip computing the norm KSP convergence test is implicitly set to KSPSkipConverged [0] KSPSetNormType(): Warning: setting KSPNormType to skip computing the norm KSP convergence test is implicitly set to KSPSkipConverged [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [0] PetscCommDuplicate(): Using internal PETSc communicator -2080374779 -2080374778 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Null Object: Parameter # 2! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./hit on a interlagos-64idx-gnu-dbg named nid27514 by mrosso Mon Jun 9 21:54:56 2014 [0]PETSC ERROR: Libraries linked from /u/sciteam/mrosso/LIBS/petsc-3.4.4/interlagos-64idx-gnu-dbg/lib [0]PETSC ERROR: Configure run at Mon May 26 17:26:25 2014 [0]PETSC ERROR: Configure options --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/gfortran64/lib -lacml" --with-x="0 " --with-debugging="1 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-dynamic-loading="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --download-hypre=1 PETSC_ARCH=interlagos-64idx-gnu-dbg [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: DMGlobalToLocalBegin_DA() line 17 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dagtol.c [0]PETSC ERROR: DMGlobalToLocalBegin() line 1626 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Null Object: Parameter # 2! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./hit on a interlagos-64idx-gnu-dbg named nid27514 by mrosso Mon Jun 9 21:54:56 2014 [0]PETSC ERROR: Libraries linked from /u/sciteam/mrosso/LIBS/petsc-3.4.4/interlagos-64idx-gnu-dbg/lib [0]PETSC ERROR: Configure run at Mon May 26 17:26:25 2014 [0]PETSC ERROR: Configure options --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/gfortran64/lib -lacml" --with-x="0 " --with-debugging="1 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-dynamic-loading="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --download-hypre=1 PETSC_ARCH=interlagos-64idx-gnu-dbg [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: DMGlobalToLocalEnd_DA() line 33 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dagtol.c [0]PETSC ERROR: DMGlobalToLocalEnd() line 1669 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [0] PetscFinalize(): PetscFinalize() called [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm -2080374778 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -1006632960 max tags = 4194303 #PETSc Option Table entries: -dm_view -info -ksp_converged_reason -ksp_monitor_true_residual -malloc_dump -mg_coarse_pc_factor_mat_solver_package superlu_dist -mg_coarse_pc_type lu -mg_levels_ksp_max_it 1 -mg_levels_ksp_type richardson -options_left -pc_mg_galerkin -pc_mg_levels 4 -pc_type mg #End of PETSc Option Table entries [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm -1006632960 [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator embedded in a user MPI_Comm -1006632960 [0] Petsc_DelComm_Outer(): User MPI_Comm 1140850688 is being freed after removing reference from inner PETSc comm to this outer comm [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -1006632960 [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm -1006632960 [0] Petsc_DelThreadComm(): Deleting thread communicator data in an MPI_Comm -1006632960 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -1006632960 There are 4 unused database options. 
They are: Option left: name:-mg_coarse_pc_factor_mat_solver_package value: superlu_dist Option left: name:-mg_coarse_pc_type value: lu Option left: name:-mg_levels_ksp_max_it value: 1 Option left: name:-mg_levels_ksp_type value: richardson [0]Total space allocated 7959408 bytes [ 0]288 bytes PetscObjectListAdd() line 119 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/olist.c [0] PetscObjectListAdd() line 119 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/olist.c [0] PetscObjectCompose_Petsc() line 640 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] PetscObjectCompose() line 724 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] VecSetDM() line 182 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]80 bytes PetscObjectComposedDataIncreaseReal() line 170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/state.c [0] PetscObjectComposedDataIncreaseReal() line 170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/state.c [0] VecSet() line 564 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/rvector.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]80 bytes PetscObjectComposedDataIncreaseReal() line 168 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/state.c [0] PetscObjectComposedDataIncreaseReal() line 168 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/state.c [0] VecSet() line 564 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/rvector.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]16 bytes PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c 
[0] PetscObjectChangeTypeName() line 134 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/pname.c [0] VecCreate_Seq_Private() line 1244 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]32 bytes VecCreate_Seq_Private() line 1245 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] VecCreate_Seq_Private() line 1245 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]1048576 bytes VecCreate_Seq() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecCreate_Seq() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]528 bytes VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]64 bytes VecCreate() line 39 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]1008 bytes VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]16 bytes PetscLayoutSetUp() line 156 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/pmap.c [0] PetscLayoutSetUp() line 156 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/pmap.c [0] VecCreate_Seq_Private() line 1244 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [ 0]96 bytes PetscLayoutCreate() line 53 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/pmap.c [0] PetscLayoutCreate() line 53 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/pmap.c [0] VecCreate() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [ 0]1202240 bytes ISLocalToGlobalMappingCreate() line 235 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreate() line 235 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreateIS() line 128 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreateIS() line 128 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] 
DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreateIS() line 128 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]576 bytes ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreateIS() line 128 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1202240 bytes DMSetUp_DA_3D() line 1360 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA_3D() line 1360 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1196032 bytes VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1196032 bytes VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]80 bytes VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]80 bytes VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]640 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]224 bytes DMSetUp_DA_3D() line 729 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA_3D() line 729 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1048576 bytes VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1048576 bytes VecScatterCreate() 
line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]80 bytes VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]80 bytes VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]640 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes PetscCommDuplicate() line 151 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscCommDuplicate() line 151 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] VecCreate() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] VecCreateSeqWithArray() line 1303 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes DMSetUp_DA() line 22 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp_DA() line 22 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes DMSetUp_DA() line 20 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp_DA() line 20 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes DMDASetOwnershipRanges() line 576 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDASetOwnershipRanges() line 576 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes DMDASetOwnershipRanges() line 568 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDASetOwnershipRanges() line 568 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes DMDASetOwnershipRanges() line 560 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDASetOwnershipRanges() line 560 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] PetscObjectChangeTypeName() line 134 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/pname.c [0] DMSetType() line 2393 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] DMCreate_DA() line 279 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMSetType() line 2393 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]736 bytes DMCreate_DA() line 281 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMCreate_DA() line 281 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMSetType() line 2393 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]96 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]704 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]96 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]704 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes PetscThreadCommReductionCreate() line 448 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 448 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() 
line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]512 bytes PetscThreadCommReductionCreate() line 440 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 440 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]256 bytes PetscThreadCommReductionCreate() line 436 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 436 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1280 bytes PetscThreadCommReductionCreate() line 435 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 435 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes PetscThreadCommReductionCreate() line 432 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 432 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]128 bytes PetscThreadCommWorldInitialize() line 1241 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1241 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]2560 bytes PetscThreadCommWorldInitialize() line 1240 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1240 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes PetscThreadCommWorldInitialize() line 1232 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1232 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes PetscThreadCommSetAffinities() line 423 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommSetAffinities() line 423 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]48 bytes PetscThreadCommCreate() line 150 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommCreate() line 150 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]336 bytes PetscThreadCommCreate() line 146 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommCreate() line 146 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] 
PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes PetscCommDuplicate() line 151 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscCommDuplicate() line 151 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]240 bytes DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]4112 bytes DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c Application 4892110 resources: utime ~4s, stime ~0s, Rss ~88192, inblocks ~26785, outblocks ~93626 From mrosso at uci.edu Mon Jun 9 22:56:21 2014 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 09 Jun 2014 20:56:21 -0700 Subject: [petsc-users] Problem in MPI communicator Message-ID: <539681E5.9050007@uci.edu> Hi, I am trying to re-number the mpi ranks in order to have the domain decomposition obtained from DMDACreate3D() match the default decomposition provided by MPI_Cart_create(). I followed the method described in the FAQ: call PetscInitialize(PETSC_NULL_CHARACTER,ierr) call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator problem' x = rank / (pz*py); y = mod(rank,(pz*py))/pz z = mod(mod(rank,pz*py),pz) newrank = z*py*px + y*px + x; call mpi_comm_split(PETSC_COMM_WORLD, 1, newrank, newcomm, ierr) PETSC_COMM_WORLD = newcomm I tried to run my code (it works fine with the standard PETSc decomposition) with the new decomposition but I received the error message; I attached the full output. I run with only one processor to test the setup and I commented all the lines where I actually insert/get data into/from the PETSc arrays. Could you please help fixing this? 
Thanks, Michele -------------- next part -------------- [0] petscinitialize_(): (Fortran):PETSc successfully started: procs 1 [0] PetscGetHostName(): Rejecting domainname, likely is NIS nid27514.(none) [0] petscinitialize_(): Running on machine: nid27514 [0] PetscCommDuplicate(): Duplicating a communicator -2080374779 -2080374778 max tags = 4194303 [0] PetscCommDuplicate(): Using internal PETSc communicator -2080374779 -2080374778 [0] PetscCommDuplicate(): Using internal PETSc communicator -2080374779 -2080374778 [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374777 max tags = 4194303 [0] VecScatterCreate(): Sequential vector scatter with block indices [0] VecScatterCreate(): Sequential vector scatter with block indices Processor [0] M 128 N 32 P 32 m 1 n 1 p 1 w 1 s 1 X range of indices: 0 128, Y range of indices: 0 32, Z range of indices: 0 32 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 131072 X 131072; storage space: 0 unneeded,917504 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 7 [0] Mat_CheckInode(): Found 131072 nodes out of 131072 rows. Not using Inode routines [0] PetscCommDuplicate(): Using internal PETSc communicator -2080374779 -2080374778 [0] KSPSetNormType(): Warning: setting KSPNormType to skip computing the norm KSP convergence test is implicitly set to KSPSkipConverged [0] KSPSetNormType(): Warning: setting KSPNormType to skip computing the norm KSP convergence test is implicitly set to KSPSkipConverged [0] KSPSetNormType(): Warning: setting KSPNormType to skip computing the norm KSP convergence test is implicitly set to KSPSkipConverged [0] KSPSetNormType(): Warning: setting KSPNormType to skip computing the norm KSP convergence test is implicitly set to KSPSkipConverged [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [0] PetscCommDuplicate(): Using internal PETSc communicator -2080374779 -2080374778 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Null Object: Parameter # 2! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./hit on a interlagos-64idx-gnu-dbg named nid27514 by mrosso Mon Jun 9 21:54:56 2014 [0]PETSC ERROR: Libraries linked from /u/sciteam/mrosso/LIBS/petsc-3.4.4/interlagos-64idx-gnu-dbg/lib [0]PETSC ERROR: Configure run at Mon May 26 17:26:25 2014 [0]PETSC ERROR: Configure options --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/gfortran64/lib -lacml" --with-x="0 " --with-debugging="1 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-dynamic-loading="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --download-hypre=1 PETSC_ARCH=interlagos-64idx-gnu-dbg [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: DMGlobalToLocalBegin_DA() line 17 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dagtol.c [0]PETSC ERROR: DMGlobalToLocalBegin() line 1626 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Null Object: Parameter # 2! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./hit on a interlagos-64idx-gnu-dbg named nid27514 by mrosso Mon Jun 9 21:54:56 2014 [0]PETSC ERROR: Libraries linked from /u/sciteam/mrosso/LIBS/petsc-3.4.4/interlagos-64idx-gnu-dbg/lib [0]PETSC ERROR: Configure run at Mon May 26 17:26:25 2014 [0]PETSC ERROR: Configure options --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/gfortran64/lib -lacml" --with-x="0 " --with-debugging="1 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-dynamic-loading="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --download-hypre=1 PETSC_ARCH=interlagos-64idx-gnu-dbg [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: DMGlobalToLocalEnd_DA() line 33 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dagtol.c [0]PETSC ERROR: DMGlobalToLocalEnd() line 1669 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [0] PetscFinalize(): PetscFinalize() called [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm -2080374778 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -1006632960 max tags = 4194303 #PETSc Option Table entries: -dm_view -info -ksp_converged_reason -ksp_monitor_true_residual -malloc_dump -mg_coarse_pc_factor_mat_solver_package superlu_dist -mg_coarse_pc_type lu -mg_levels_ksp_max_it 1 -mg_levels_ksp_type richardson -options_left -pc_mg_galerkin -pc_mg_levels 4 -pc_type mg #End of PETSc Option Table entries [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm -1006632960 [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator embedded in a user MPI_Comm -1006632960 [0] Petsc_DelComm_Outer(): User MPI_Comm 1140850688 is being freed after removing reference from inner PETSc comm to this outer comm [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -1006632960 [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm -1006632960 [0] Petsc_DelThreadComm(): Deleting thread communicator data in an MPI_Comm -1006632960 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -1006632960 There are 4 unused database options. 
They are: Option left: name:-mg_coarse_pc_factor_mat_solver_package value: superlu_dist Option left: name:-mg_coarse_pc_type value: lu Option left: name:-mg_levels_ksp_max_it value: 1 Option left: name:-mg_levels_ksp_type value: richardson [0]Total space allocated 7959408 bytes [ 0]288 bytes PetscObjectListAdd() line 119 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/olist.c [0] PetscObjectListAdd() line 119 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/olist.c [0] PetscObjectCompose_Petsc() line 640 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] PetscObjectCompose() line 724 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] VecSetDM() line 182 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]80 bytes PetscObjectComposedDataIncreaseReal() line 170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/state.c [0] PetscObjectComposedDataIncreaseReal() line 170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/state.c [0] VecSet() line 564 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/rvector.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]80 bytes PetscObjectComposedDataIncreaseReal() line 168 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/state.c [0] PetscObjectComposedDataIncreaseReal() line 168 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/state.c [0] VecSet() line 564 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/rvector.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]16 bytes PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c 
[0] PetscObjectChangeTypeName() line 134 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/pname.c [0] VecCreate_Seq_Private() line 1244 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]32 bytes VecCreate_Seq_Private() line 1245 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] VecCreate_Seq_Private() line 1245 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]1048576 bytes VecCreate_Seq() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecCreate_Seq() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]528 bytes VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]64 bytes VecCreate() line 39 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]1008 bytes VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] VecCreate() line 39 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] VecDuplicate_MPI_DA() line 16 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] VecDuplicate() line 510 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vector.c [ 0]16 bytes PetscLayoutSetUp() line 156 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/pmap.c [0] PetscLayoutSetUp() line 156 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/pmap.c [0] VecCreate_Seq_Private() line 1244 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] VecCreate_Seq() line 34 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec3.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] VecCreate_Standard() line 262 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/mpi/pbvec.c [0] VecSetType() line 38 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/vecreg.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [ 0]96 bytes PetscLayoutCreate() line 53 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/pmap.c [0] PetscLayoutCreate() line 53 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/pmap.c [0] VecCreate() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] DMCreateGlobalVector_DA() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dadist.c [0] DMCreateGlobalVector() line 597 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [ 0]1202240 bytes ISLocalToGlobalMappingCreate() line 235 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreate() line 235 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreateIS() line 128 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreateIS() line 128 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] 
DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreateIS() line 128 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]576 bytes ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreate() line 227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] ISLocalToGlobalMappingCreateIS() line 128 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/utils/isltog.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1202240 bytes DMSetUp_DA_3D() line 1360 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA_3D() line 1360 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1196032 bytes VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1196032 bytes VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]80 bytes VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]80 bytes VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]640 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]224 bytes DMSetUp_DA_3D() line 729 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA_3D() line 729 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1048576 bytes VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1048576 bytes VecScatterCreate() 
line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1171 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]80 bytes VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]80 bytes VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 1170 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]640 bytes VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] VecScatterCreate() line 938 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/utils/vscat.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes PetscCommDuplicate() line 151 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscCommDuplicate() line 151 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] VecCreate() line 32 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/interface/veccreate.c [0] VecCreateSeqWithArray() line 1303 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/vec/impls/seq/bvec2.c [0] DMSetUp_DA_3D() line 205 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [0] DMSetUp_DA() line 15 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes DMSetUp_DA() line 22 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp_DA() line 22 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes DMSetUp_DA() line 20 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp_DA() line 20 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dareg.c [0] DMSetUp() line 474 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes DMDASetOwnershipRanges() line 576 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDASetOwnershipRanges() line 576 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes DMDASetOwnershipRanges() line 568 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDASetOwnershipRanges() line 568 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes DMDASetOwnershipRanges() line 560 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDASetOwnershipRanges() line 560 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] PetscObjectChangeTypeName() line 134 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/pname.c [0] DMSetType() line 2393 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] PetscStrallocpy() line 188 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/utils/str.c [0] DMCreate_DA() line 279 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMSetType() line 2393 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]736 bytes DMCreate_DA() line 281 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMCreate_DA() line 281 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMSetType() line 2393 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]96 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]704 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]96 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]704 bytes PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] PetscSFCreate() line 43 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/vec/is/sf/interface/sf.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes PetscThreadCommReductionCreate() line 448 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 448 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() 
line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]512 bytes PetscThreadCommReductionCreate() line 440 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 440 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]256 bytes PetscThreadCommReductionCreate() line 436 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 436 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]1280 bytes PetscThreadCommReductionCreate() line 435 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 435 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes PetscThreadCommReductionCreate() line 432 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommReductionCreate() line 432 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcommred.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]128 bytes PetscThreadCommWorldInitialize() line 1241 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1241 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]2560 bytes PetscThreadCommWorldInitialize() line 1240 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1240 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes PetscThreadCommWorldInitialize() line 1232 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1232 in 
/u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]16 bytes PetscThreadCommSetAffinities() line 423 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommSetAffinities() line 423 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]48 bytes PetscThreadCommCreate() line 150 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommCreate() line 150 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]336 bytes PetscThreadCommCreate() line 146 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommCreate() line 146 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscThreadCommWorldInitialize() line 1227 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscGetThreadCommWorld() line 80 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] PetscCommGetThreadComm() line 114 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/threadcomm/interface/threadcomm.c [0] 
PetscCommDuplicate() line 139 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]32 bytes PetscCommDuplicate() line 151 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscCommDuplicate() line 151 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/tagm.c [0] PetscHeaderCreate_Private() line 31 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/sys/objects/inherit.c [0] DMCreate() line 72 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]240 bytes DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]64 bytes DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c [ 0]4112 bytes DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMCreate() line 81 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/interface/dm.c [0] DMDACreate() line 390 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/dacreate.c [0] DMDACreate3d() line 1456 in /u/sciteam/mrosso/LIBS/petsc-3.4.4/src/dm/impls/da/da3.c Application 4892110 resources: utime ~4s, stime ~0s, Rss ~88192, inblocks ~26785, outblocks ~93626 From bsmith at mcs.anl.gov Mon Jun 9 23:06:18 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 9 Jun 2014 23:06:18 -0500 Subject: [petsc-users] Problem in MPI communicator In-Reply-To: <539681DD.9030700@gmail.com> References: <539681DD.9030700@gmail.com> Message-ID: This may not be the only problem but you can never ever ever change PETSC_COMM_WORLD after PetscInitialize(). You need to 1) call MPI_Init() your self first then 2) do your renumbering using MPI_COMM_WORLD, not PETSC_COMM_WORLD 3) set PETSC_COMM_WORLD to your new comm 4) call PetscInitialize(). Let us know how it goes, if you still get failure send the entire .F file so we can run it to track down the issue. Barry On Jun 9, 2014, at 10:56 PM, Michele Rosso wrote: > Hi, > > I am trying to re-number the mpi ranks in order to have the domain decomposition obtained from DMDACreate3D() match the default decomposition provided by MPI_Cart_create(). 
I followed the method described in the FAQ: > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) > if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator problem' > x = rank / (pz*py); > y = mod(rank,(pz*py))/pz > z = mod(mod(rank,pz*py),pz) > newrank = z*py*px + y*px + x; > call mpi_comm_split(PETSC_COMM_WORLD, 1, newrank, newcomm, ierr) > PETSC_COMM_WORLD = newcomm > > I tried to run my code (it works fine with the standard PETSc decomposition) with the new decomposition but I received the error message; I attached the full output. I run with only one processor to test the setup and I commented all the lines where I actually insert/get data into/from the PETSc arrays. > Could you please help fixing this? > > Thanks, > Michele > > > > > > From jed at jedbrown.org Tue Jun 10 01:39:23 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 10 Jun 2014 08:39:23 +0200 Subject: [petsc-users] Problem in MPI communicator In-Reply-To: <539681E5.9050007@uci.edu> References: <539681E5.9050007@uci.edu> Message-ID: <87bnu16zfo.fsf@jedbrown.org> Michele Rosso writes: > Hi, > > I am trying to re-number the mpi ranks in order to have the domain > decomposition obtained from DMDACreate3D() match the default > decomposition provided by MPI_Cart_create(). I followed the method > described in the FAQ: > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) > if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator > problem' > x = rank / (pz*py); > y = mod(rank,(pz*py))/pz > z = mod(mod(rank,pz*py),pz) > newrank = z*py*px + y*px + x; > call mpi_comm_split(PETSC_COMM_WORLD, 1, newrank, newcomm, ierr) > PETSC_COMM_WORLD = newcomm You can't change PETSC_COMM_WORLD after PetscInitialize. Just pass newcomm to DMDACreate3d(), or literally use MPI_Cart_create() and pass that communicator to DMDACreate3d. You could alternatively set PETSC_COMM_WORLD before PetscInitialize, but that is a non-local dependency (a "code smell"). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From mrosso at uci.edu Tue Jun 10 02:19:27 2014 From: mrosso at uci.edu (Michele Rosso) Date: Tue, 10 Jun 2014 00:19:27 -0700 Subject: [petsc-users] Problem in MPI communicator In-Reply-To: References: <539681DD.9030700@gmail.com> Message-ID: <5396B17F.4000303@uci.edu> Barry, thanks for your reply. I tried to do as you suggested but it did not solve anything as you expected. I attached a minimal version on my code. The file "test.f90" contains both a main program and a module. The main program is set to run on 8 processors, but you can easily change the parameters in it. It seems like the issue affects the DMGlobalToLocalBegin() subroutine in my subroutine restore_rhs(). Hope this helps. Thanks, Michele On 06/09/2014 09:06 PM, Barry Smith wrote: > This may not be the only problem but you can never ever ever change PETSC_COMM_WORLD after PetscInitialize(). You need to > > 1) call MPI_Init() your self first then > > 2) do your renumbering using MPI_COMM_WORLD, not PETSC_COMM_WORLD > > 3) set PETSC_COMM_WORLD to your new comm > > 4) call PetscInitialize(). > > Let us know how it goes, if you still get failure send the entire .F file so we can run it to track down the issue. 
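Put together, the four steps quoted above (plus Jed's alternative of simply handing the reordered communicator to DMDACreate3d(), noted in a comment) look roughly like the following C sketch. The px, py, pz process counts and the renumbering formula are taken from the Fortran snippet in this thread; the 2x2x2 grid, the argc/argv handling, and the cleanup at the end are illustrative assumptions, not part of the original code.

#include <petscsys.h>

int main(int argc, char **argv)
{
  MPI_Comm newcomm;
  int      rank, x, y, z, newrank;
  int      px = 2, py = 2, pz = 2;             /* example 2x2x2 process grid (assumption) */

  MPI_Init(&argc, &argv);                      /* 1) initialize MPI yourself              */
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);        /* 2) renumber using MPI_COMM_WORLD,       */
  x = rank / (pz*py);                          /*    same formula as the Fortran code     */
  y = (rank % (pz*py)) / pz;
  z = (rank % (pz*py)) % pz;
  newrank = z*py*px + y*px + x;
  MPI_Comm_split(MPI_COMM_WORLD, 1, newrank, &newcomm);

  PETSC_COMM_WORLD = newcomm;                  /* 3) set the world communicator ...       */
  PetscInitialize(&argc, &argv, NULL, NULL);   /* 4) ... before PetscInitialize()         */

  /* Alternative (Jed): skip step 3, leave PETSC_COMM_WORLD alone, and pass newcomm
     (or a communicator from MPI_Cart_create) directly to DMDACreate3d().          */

  PetscFinalize();                             /* PETSc first, then MPI                   */
  MPI_Comm_free(&newcomm);
  MPI_Finalize();
  return 0;
}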
> > Barry > > On Jun 9, 2014, at 10:56 PM, Michele Rosso wrote: > >> Hi, >> >> I am trying to re-number the mpi ranks in order to have the domain decomposition obtained from DMDACreate3D() match the default decomposition provided by MPI_Cart_create(). I followed the method described in the FAQ: >> >> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >> call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) >> if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator problem' >> x = rank / (pz*py); >> y = mod(rank,(pz*py))/pz >> z = mod(mod(rank,pz*py),pz) >> newrank = z*py*px + y*px + x; >> call mpi_comm_split(PETSC_COMM_WORLD, 1, newrank, newcomm, ierr) >> PETSC_COMM_WORLD = newcomm >> >> I tried to run my code (it works fine with the standard PETSc decomposition) with the new decomposition but I received the error message; I attached the full output. I run with only one processor to test the setup and I commented all the lines where I actually insert/get data into/from the PETSc arrays. >> Could you please help fixing this? >> >> Thanks, >> Michele >> >> >> >> >> >> -------------- next part -------------- A non-text attachment was scrubbed... Name: test.f90 Type: text/x-fortran Size: 10583 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Jun 10 13:13:28 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 10 Jun 2014 13:13:28 -0500 Subject: [petsc-users] Problem in MPI communicator In-Reply-To: <5396B17F.4000303@uci.edu> References: <539681DD.9030700@gmail.com> <5396B17F.4000303@uci.edu> Message-ID: <22B35CAF-5BF8-41BA-AF89-8BE057DABE7E@mcs.anl.gov> You have a bug in you code unrelated the the reordering. Note the variable x is declared in your module and then again inside the subroutine init. Thus when init does call VecDuplicate(b,x,ierr) it sets the value of the inner x, not the outer x. It would be nice if the Fortran compilers warned you about variables like this. Barry module poisson_solver implicit none #include private Vec :: b, x Mat :: A KSP :: ksp DM :: da PetscScalar :: dxdyqdz, dxdzqdy, dydzqdx integer, parameter :: dp = kind(0.0D0) logical :: reset = .false. integer, parameter :: max_number_iter = 15 public init, finalize, solve save b, x, da, A, ksp, dxdyqdz, dxdzqdy, dydzqdx, reset contains subroutine init(px,py,pz,nx,ny,nz,nxl,nyl,nzl) integer, intent(in) :: px,py,pz,nx,ny,nz,nxl,nyl,nzl PetscInt :: nnx(px), nny(py), nnz(pz) PetscErrorCode :: ierr MatNullSpace :: nullspace integer :: newcomm,rank, newrank, x, y, z PetscInt :: xs,xm,ys,zs,ym,zm call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) ! if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator problem' x = rank / (pz*py); y = mod(rank,(pz*py))/pz z = mod(mod(rank,pz*py),pz) newrank = z*py*px + y*px + x; call mpi_comm_split(MPI_COMM_WORLD, 1, newrank, newcomm, ierr) PETSC_COMM_WORLD = newcomm call PetscInitialize(PETSC_NULL_CHARACTER,ierr) ! Create DM context nnx = nxl nny = nyl nnz = nzl call DMDACreate3d( PETSC_COMM_WORLD , & & DMDA_BOUNDARY_PERIODIC , DMDA_BOUNDARY_PERIODIC, & & DMDA_BOUNDARY_PERIODIC , DMDA_STENCIL_STAR, & & nx, ny, nz, px, py, pz, 1, 1, nnx, nny, nnz, da , ierr) call DMDASetInterpolationType(da, DMDA_Q0, ierr); call DMCreateGlobalVector(da,b,ierr) ! Create Global Vectors call VecDuplicate(b,x,ierr) On Jun 10, 2014, at 2:19 AM, Michele Rosso wrote: > Barry, > > thanks for your reply. I tried to do as you suggested but it did not solve anything as you expected. > I attached a minimal version on my code. The file "test.f90" contains both a main program and a module. 
The main program is set to run on 8 processors, but you can easily change the parameters in it. It seems like the issue affects the > DMGlobalToLocalBegin() subroutine in my subroutine restore_rhs(). > Hope this helps. > > Thanks, > Michele > > On 06/09/2014 09:06 PM, Barry Smith wrote: >> This may not be the only problem but you can never ever ever change PETSC_COMM_WORLD after PetscInitialize(). You need to >> >> 1) call MPI_Init() your self first then >> >> 2) do your renumbering using MPI_COMM_WORLD, not PETSC_COMM_WORLD >> >> 3) set PETSC_COMM_WORLD to your new comm >> >> 4) call PetscInitialize(). >> >> Let us know how it goes, if you still get failure send the entire .F file so we can run it to track down the issue. >> >> Barry >> >> On Jun 9, 2014, at 10:56 PM, Michele Rosso wrote: >> >>> Hi, >>> >>> I am trying to re-number the mpi ranks in order to have the domain decomposition obtained from DMDACreate3D() match the default decomposition provided by MPI_Cart_create(). I followed the method described in the FAQ: >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>> call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) >>> if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator problem' >>> x = rank / (pz*py); >>> y = mod(rank,(pz*py))/pz >>> z = mod(mod(rank,pz*py),pz) >>> newrank = z*py*px + y*px + x; >>> call mpi_comm_split(PETSC_COMM_WORLD, 1, newrank, newcomm, ierr) >>> PETSC_COMM_WORLD = newcomm >>> >>> I tried to run my code (it works fine with the standard PETSc decomposition) with the new decomposition but I received the error message; I attached the full output. I run with only one processor to test the setup and I commented all the lines where I actually insert/get data into/from the PETSc arrays. >>> Could you please help fixing this? >>> >>> Thanks, >>> Michele >>> >>> >>> >>> >>> >>> > > From shriram at ualberta.ca Tue Jun 10 15:52:59 2014 From: shriram at ualberta.ca (Shriram Srinivasan) Date: Tue, 10 Jun 2014 14:52:59 -0600 Subject: [petsc-users] reuse ksp in time marching loop Message-ID: <5397702B.4090303@ualberta.ca> Hi all, I am implementing a time marching loop (My problem is not amenable to usage of the TS module) where the coefficient matrix A wont change with time. I was wondering, in the sequence of commands /KSPCreate(PETSC_COMM_SELF, &ksp); // //KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER); // //KSPGetPC(ksp, &pc);// //PCSetType(pc, PCLU); // //PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS);// //KSPSetFromOptions(ksp);// //KSPSolve(ksp, b, x);// //KSPDestroy(&ksp);/ would it be efficient/possible to move some of them out of the loop. If so, which ones can I move out. Thanks, Shriram -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Jun 10 15:58:38 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 10 Jun 2014 22:58:38 +0200 Subject: [petsc-users] reuse ksp in time marching loop In-Reply-To: <5397702B.4090303@ualberta.ca> References: <5397702B.4090303@ualberta.ca> Message-ID: <87lht44h35.fsf@jedbrown.org> Shriram Srinivasan writes: > Hi all, > I am implementing a time marching loop (My problem is not amenable to > usage of the TS module) where the coefficient matrix A wont change with > time. 
I was wondering, in the sequence of commands > > /KSPCreate(PETSC_COMM_SELF, &ksp); // > //KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER); // > //KSPGetPC(ksp, &pc);// > //PCSetType(pc, PCLU); // > //PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS);// > //KSPSetFromOptions(ksp);// > //KSPSolve(ksp, b, x);// > //KSPDestroy(&ksp);/ > > would it be efficient/possible to move some of them out of the loop. > If so, which ones can I move out. Everything except KSPSolve. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Jun 10 15:59:29 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 10 Jun 2014 15:59:29 -0500 Subject: [petsc-users] reuse ksp in time marching loop In-Reply-To: <5397702B.4090303@ualberta.ca> References: <5397702B.4090303@ualberta.ca> Message-ID: <967B8F7F-D2A5-44D9-B952-86EB7E3668AD@mcs.anl.gov> everything except the solve On Jun 10, 2014, at 3:52 PM, Shriram Srinivasan wrote: > Hi all, > I am implementing a time marching loop (My problem is not amenable to usage of the TS module) where the coefficient matrix A wont change with time. I was wondering, in the sequence of commands > > KSPCreate(PETSC_COMM_SELF, &ksp); > KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER); > KSPGetPC(ksp, &pc); > PCSetType(pc, PCLU); > PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); > KSPSetFromOptions(ksp); > KSPSolve(ksp, b, x); > KSPDestroy(&ksp); > > would it be efficient/possible to move some of them out of the loop. > If so, which ones can I move out. > > Thanks, > Shriram > From michele.rosso84 at gmail.com Tue Jun 10 16:34:01 2014 From: michele.rosso84 at gmail.com (Michele Rosso) Date: Tue, 10 Jun 2014 14:34:01 -0700 Subject: [petsc-users] Problem in MPI communicator In-Reply-To: <22B35CAF-5BF8-41BA-AF89-8BE057DABE7E@mcs.anl.gov> References: <539681DD.9030700@gmail.com> <5396B17F.4000303@uci.edu> <22B35CAF-5BF8-41BA-AF89-8BE057DABE7E@mcs.anl.gov> Message-ID: <539779C9.8000902@gmail.com> Barry, thank you for pointing this out. Now it works as expected. Thanks for your help. Michele On 06/10/2014 11:13 AM, Barry Smith wrote: > You have a bug in you code unrelated the the reordering. Note the variable x is declared in your module and then again inside the subroutine init. Thus when init does call VecDuplicate(b,x,ierr) it sets the value of the inner x, not the outer x. > > It would be nice if the Fortran compilers warned you about variables like this. > > Barry > > > module poisson_solver > implicit none > #include > private > Vec :: b, x > Mat :: A > KSP :: ksp > DM :: da > PetscScalar :: dxdyqdz, dxdzqdy, dydzqdx > integer, parameter :: dp = kind(0.0D0) > logical :: reset = .false. > integer, parameter :: max_number_iter = 15 > > > public init, finalize, solve > save b, x, da, A, ksp, dxdyqdz, dxdzqdy, dydzqdx, reset > > contains > > subroutine init(px,py,pz,nx,ny,nz,nxl,nyl,nzl) > integer, intent(in) :: px,py,pz,nx,ny,nz,nxl,nyl,nzl > PetscInt :: nnx(px), nny(py), nnz(pz) > PetscErrorCode :: ierr > MatNullSpace :: nullspace > integer :: newcomm,rank, newrank, x, y, z > PetscInt :: xs,xm,ys,zs,ym,zm > > call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) > ! 
if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator problem' > x = rank / (pz*py); > y = mod(rank,(pz*py))/pz > z = mod(mod(rank,pz*py),pz) > newrank = z*py*px + y*px + x; > call mpi_comm_split(MPI_COMM_WORLD, 1, newrank, newcomm, ierr) > PETSC_COMM_WORLD = newcomm > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > > ! Create DM context > nnx = nxl > nny = nyl > nnz = nzl > call DMDACreate3d( PETSC_COMM_WORLD , & > & DMDA_BOUNDARY_PERIODIC , DMDA_BOUNDARY_PERIODIC, & > & DMDA_BOUNDARY_PERIODIC , DMDA_STENCIL_STAR, & > & nx, ny, nz, px, py, pz, 1, 1, nnx, nny, nnz, da , ierr) > call DMDASetInterpolationType(da, DMDA_Q0, ierr); > call DMCreateGlobalVector(da,b,ierr) ! Create Global Vectors > call VecDuplicate(b,x,ierr) > > > On Jun 10, 2014, at 2:19 AM, Michele Rosso wrote: > >> Barry, >> >> thanks for your reply. I tried to do as you suggested but it did not solve anything as you expected. >> I attached a minimal version on my code. The file "test.f90" contains both a main program and a module. The main program is set to run on 8 processors, but you can easily change the parameters in it. It seems like the issue affects the >> DMGlobalToLocalBegin() subroutine in my subroutine restore_rhs(). >> Hope this helps. >> >> Thanks, >> Michele >> >> On 06/09/2014 09:06 PM, Barry Smith wrote: >>> This may not be the only problem but you can never ever ever change PETSC_COMM_WORLD after PetscInitialize(). You need to >>> >>> 1) call MPI_Init() your self first then >>> >>> 2) do your renumbering using MPI_COMM_WORLD, not PETSC_COMM_WORLD >>> >>> 3) set PETSC_COMM_WORLD to your new comm >>> >>> 4) call PetscInitialize(). >>> >>> Let us know how it goes, if you still get failure send the entire .F file so we can run it to track down the issue. >>> >>> Barry >>> >>> On Jun 9, 2014, at 10:56 PM, Michele Rosso wrote: >>> >>>> Hi, >>>> >>>> I am trying to re-number the mpi ranks in order to have the domain decomposition obtained from DMDACreate3D() match the default decomposition provided by MPI_Cart_create(). I followed the method described in the FAQ: >>>> >>>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>>> call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr) >>>> if(PETSC_COMM_WORLD/=MPI_COMM_WORLD) write(*,*) 'Communicator problem' >>>> x = rank / (pz*py); >>>> y = mod(rank,(pz*py))/pz >>>> z = mod(mod(rank,pz*py),pz) >>>> newrank = z*py*px + y*px + x; >>>> call mpi_comm_split(PETSC_COMM_WORLD, 1, newrank, newcomm, ierr) >>>> PETSC_COMM_WORLD = newcomm >>>> >>>> I tried to run my code (it works fine with the standard PETSc decomposition) with the new decomposition but I received the error message; I attached the full output. I run with only one processor to test the setup and I commented all the lines where I actually insert/get data into/from the PETSc arrays. >>>> Could you please help fixing this? >>>> >>>> Thanks, >>>> Michele >>>> >>>> >>>> >>>> >>>> >>>> >> From shriram at ualberta.ca Tue Jun 10 17:54:40 2014 From: shriram at ualberta.ca (Shriram Srinivasan) Date: Tue, 10 Jun 2014 16:54:40 -0600 Subject: [petsc-users] tridiagonal matrix in kspsolve Message-ID: <53978CB0.7010200@ualberta.ca> Apologies if my question is answered elsewhere already. I could not find this discussed in the manual or mailing list. I have been using matsolvermumps through ksp to solve my linear system. I wanted to simply get it working so I was not concerned about the structure of the matrix. 1) My matrix is tridiagonal, but the elements of the diagonals are not all equal. 
Will a dedicated tridiagonal solver be faster than matsolvermumps in this case ? 2) My matrix is tridiagonal but not in the conventional sense, the minor diagonals are farther apart. But it is possible to renumber the nodes and make it a conventional tridiagonal system. In this case, again, am I losing efficiency using matsolvermumps ? Please advise if you recommend a completely different course of action. Thanks, Shriram From jed at jedbrown.org Tue Jun 10 18:02:41 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 11 Jun 2014 01:02:41 +0200 Subject: [petsc-users] tridiagonal matrix in kspsolve In-Reply-To: <53978CB0.7010200@ualberta.ca> References: <53978CB0.7010200@ualberta.ca> Message-ID: <87egyw4bce.fsf@jedbrown.org> Shriram Srinivasan writes: > Apologies if my question is answered elsewhere already. I could not > find this discussed in the manual or mailing list. > > I have been using matsolvermumps through ksp to solve my linear system. > I wanted to simply get it working so I was not concerned about the > structure of the matrix. > > > 1) My matrix is tridiagonal, but the elements of the diagonals are not > all equal. Will a dedicated tridiagonal solver be faster than > matsolvermumps in this case ? > > 2) My matrix is tridiagonal but not in the conventional sense, the minor > diagonals are farther apart. But it is possible to renumber the nodes > and make it a conventional tridiagonal system. In this case, again, am I > losing efficiency using matsolvermumps ? Yes on all counts, but use a profiler to see where the time is spent. A good implementation will specialize the tridiagonal structure and aggregate all the problems. It can be done efficiently, but I'm not aware of general-purpose libraries. However, I reiterate my earlier statements that ADI is an antiquated method that has fallen out of favor since it is a poor parallel algorithm and has large splitting errors. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From tlk0812 at hotmail.com Tue Jun 10 18:16:37 2014 From: tlk0812 at hotmail.com (LikunTan) Date: Wed, 11 Jun 2014 07:16:37 +0800 Subject: [petsc-users] vec of integer type Message-ID: Dear Petsc developers, Is that possible to define Vec of integer type? I am defining Vec M;PetscInt *aM;VecGetArray(M, &aM); but this gives me an error when compiling:argument of type "PetscInt={int} **" is incompatible with parameter of type "PetscScalar= {PetscReal={double}} ** Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Jun 10 18:25:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 11 Jun 2014 01:25:21 +0200 Subject: [petsc-users] vec of integer type In-Reply-To: References: Message-ID: <877g4o4aam.fsf@jedbrown.org> LikunTan writes: > Dear Petsc developers, > Is that possible to define Vec of integer type? No. Why do you want that? > I am defining > Vec M;PetscInt *aM;VecGetArray(M, &aM); > but this gives me an error when compiling:argument of type "PetscInt={int} **" is incompatible with parameter of type "PetscScalar= {PetscReal={double}} ** > Thanks, > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Jun 10 18:26:19 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 10 Jun 2014 18:26:19 -0500 Subject: [petsc-users] vec of integer type In-Reply-To: References: Message-ID: <3B2BD01D-6F57-4A16-9DD4-65B1AD5D38A2@mcs.anl.gov> No On Jun 10, 2014, at 6:16 PM, LikunTan wrote: > Dear Petsc developers, > > Is that possible to define Vec of integer type? > > I am defining > > Vec M; > PetscInt *aM; > VecGetArray(M, &aM); > > but this gives me an error when compiling: > argument of type "PetscInt={int} **" is incompatible with parameter of type "PetscScalar= {PetscReal={double}} ** > > Thanks, From knepley at gmail.com Tue Jun 10 19:02:00 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Jun 2014 19:02:00 -0500 Subject: [petsc-users] vec of integer type In-Reply-To: References: Message-ID: On Tue, Jun 10, 2014 at 6:16 PM, LikunTan wrote: > Dear Petsc developers, > > Is that possible to define Vec of integer type? > What you want here is IS instead of Vec. This is what I use for parallel integer data. Matt > I am defining > > Vec M; > PetscInt *aM; > VecGetArray(M, &aM); > > but this gives me an error when compiling: > argument of type "PetscInt={int} **" is incompatible with parameter of > type "PetscScalar= {PetscReal={double}} ** > > Thanks, > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlk0812 at hotmail.com Tue Jun 10 19:56:54 2014 From: tlk0812 at hotmail.com (Likun Tan) Date: Tue, 10 Jun 2014 17:56:54 -0700 Subject: [petsc-users] vec of integer type In-Reply-To: References: Message-ID: I c, thank you. > On Jun 10, 2014, at 5:02 PM, Matthew Knepley wrote: > >> On Tue, Jun 10, 2014 at 6:16 PM, LikunTan wrote: >> Dear Petsc developers, >> >> Is that possible to define Vec of integer type? > > What you want here is IS instead of Vec. This is what I use for parallel integer data. > > Matt > >> I am defining >> >> Vec M; >> PetscInt *aM; >> VecGetArray(M, &aM); >> >> but this gives me an error when compiling: >> argument of type "PetscInt={int} **" is incompatible with parameter of type "PetscScalar= {PetscReal={double}} ** >> >> Thanks, > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From shriram at ualberta.ca Wed Jun 11 01:04:56 2014 From: shriram at ualberta.ca (Shriram Srinivasan) Date: Wed, 11 Jun 2014 00:04:56 -0600 Subject: [petsc-users] tridiagonal matrix in kspsolve In-Reply-To: <87egyw4bce.fsf@jedbrown.org> References: <53978CB0.7010200@ualberta.ca> <87egyw4bce.fsf@jedbrown.org> Message-ID: <5397F188.7040507@ualberta.ca> Thank you Jed. Your point is well taken. We are trying to improve upon an existing method, and this is the first attempt. Based on this, we shall proceed. Could you direct me to a reference that discusses issues with ADI ? Also, if I use Lapack's tridiagonal solver, I suppose I shall only need the diagonals, so would it be better to assemble the diagonals directly as an array without using a Mat object and its methods for assembly? 
Assembling the matrix and then extracting the diagonals I expect is inefficient in comparison ? Thanks, Shriram On 06/10/2014 05:02 PM, Jed Brown wrote: > Shriram Srinivasan writes: > >> Apologies if my question is answered elsewhere already. I could not >> find this discussed in the manual or mailing list. >> >> I have been using matsolvermumps through ksp to solve my linear system. >> I wanted to simply get it working so I was not concerned about the >> structure of the matrix. >> >> >> 1) My matrix is tridiagonal, but the elements of the diagonals are not >> all equal. Will a dedicated tridiagonal solver be faster than >> matsolvermumps in this case ? >> >> 2) My matrix is tridiagonal but not in the conventional sense, the minor >> diagonals are farther apart. But it is possible to renumber the nodes >> and make it a conventional tridiagonal system. In this case, again, am I >> losing efficiency using matsolvermumps ? > Yes on all counts, but use a profiler to see where the time is spent. A > good implementation will specialize the tridiagonal structure and > aggregate all the problems. It can be done efficiently, but I'm not > aware of general-purpose libraries. > > However, I reiterate my earlier statements that ADI is an antiquated > method that has fallen out of favor since it is a poor parallel > algorithm and has large splitting errors. From mairhofer at itt.uni-stuttgart.de Wed Jun 11 06:47:03 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Wed, 11 Jun 2014 13:47:03 +0200 Subject: [petsc-users] Problem with assembling 2 matrices In-Reply-To: <5389EF23.30603@itt.uni-stuttgart.de> References: <5389EF23.30603@itt.uni-stuttgart.de> Message-ID: <539841B7.9020406@itt.uni-stuttgart.de> Dear PETSc-Team, I am trying to set a global array X as column 0 of a dense matrix A and another global array Y as the first column of a second dense matrix B. Everything works perfectly fine in serial and parallel as long as I do this only for one of the matrices. However, when I try to do it for both matrices, the entries of matrix A seem to be overwritten by the values of matrix B. This is the output I get (first vectors X and Y, then matrices A and B): Vector Object:Vec_0x84000000_0 1 MPI processes type: mpi Process [0] 1 1 1 Vector Object:Vec_0x84000000_1 1 MPI processes type: mpi Process [0] 2 2 2 Matrix Object: 1 MPI processes type: seqdense 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 Matrix Object: 1 MPI processes type: seqdense 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 I also attached the Fortran code: program main implicit none #include #include #include #include #include #include #include #include #include #include PetscInt i,j,m,n PetscErrorCode ierr PetscScalar one Mat A Mat B PetscInt row(3) DM da Vec X Vec Y PetscScalar,pointer :: X_pointer(:) PetscScalar,pointer :: Y_pointer(:) PetscInt xs,xm !Initialize call PetscInitialize(PETSC_NULL_CHARACTER,ierr) m = 3 !col n = 3 !row call DMDACreate1d(PETSC_COMM_WORLD, & !MPI communicator & DMDA_BOUNDARY_NONE, & !Boundary type at boundary of physical domain & n, & !global dimension of array (if negative number, then value can be changed by user via command line!) 
& 1, & !number of degrees of freedom per grid point (number of unknowns at each grid point) & 0, & !number of ghost points accessible to local vectors & PETSC_NULL_INTEGER, & !could be an array to specify the number of grid points per processor & da, & !the resulting distributed array object & ierr) call DMDAGetCorners(da, & !the distributed array & xs, & !corner index in x direction & PETSC_NULL_INTEGER, & !corner index in y direction & PETSC_NULL_INTEGER, & !corner index in z direction & xm, & !width of locally owned part in x direction & PETSC_NULL_INTEGER, & !width of locally owned part in y direction & PETSC_NULL_INTEGER, & !width of locally owned part in z direction & ierr) !error check one = 1.0 !set indices of matrix rows to be set Do i = xs,xs+xm-1 row(i) = i END DO !Set up vector X and Matrix A call DMCreateGlobalVector(da,X,ierr) call VecSet(X,one,ierr) !set all entries of global array to 1 call VecGetArrayF90(X,X_pointer,ierr) call MatCreateDense(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,PETSC_NULL_CHARACTER,A,ierr) call MatSetValues(A,xm,row(xs:xs+xm-1),1,0,X_pointer(1:xm),INSERT_VALUES,ierr) call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) !Set up vector Y and matrix B call DMCreateGlobalVector(da,Y,ierr) call VecSet(Y,2.d0*one,ierr) !set all entries of global array to 2 call VecGetArrayF90(Y,Y_pointer,ierr) call MatCreateDense(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,PETSC_NULL_CHARACTER,B,ierr) call MatSetValues(B,xm,row(xs:xs+xm-1),1,0,Y_pointer(1:xm),INSERT_VALUES,ierr) call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr) !print out matrices and vectors to check entries call VecView(X,PETSC_VIEWER_STDOUT_WORLD,ierr) call VecView(Y,PETSC_VIEWER_STDOUT_WORLD,ierr) call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr) !free objects call VecRestoreArrayF90(X,X_pointer,ierr) call VecDestroy(X,ierr) call MatDestroy(A,ierr) call VecRestoreArrayF90(Y,Y_pointer,ierr) call VecDestroy(Y,ierr) call MatDestroy(B,ierr) call PetscFinalize(ierr) end Am I doing something wrong in my code? Thank you for your help! Jonas From knepley at gmail.com Wed Jun 11 07:43:35 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 11 Jun 2014 07:43:35 -0500 Subject: [petsc-users] Problem with assembling 2 matrices In-Reply-To: <539841B7.9020406@itt.uni-stuttgart.de> References: <5389EF23.30603@itt.uni-stuttgart.de> <539841B7.9020406@itt.uni-stuttgart.de> Message-ID: On Wed, Jun 11, 2014 at 6:47 AM, Jonas Mairhofer < mairhofer at itt.uni-stuttgart.de> wrote: > Dear PETSc-Team, > > I am trying to set a global array X as column 0 of a dense matrix A and > another global array Y as the first column of a second dense matrix B. > Everything works perfectly fine in serial and parallel as long as I do > this only for one of the matrices. However, when I try to do it for both > matrices, the entries of matrix A seem to be overwritten by the values of > matrix B. 
> > This is the output I get (first vectors X and Y, then matrices A and B): > > Vector Object:Vec_0x84000000_0 1 MPI processes > type: mpi > Process [0] > 1 > 1 > 1 > Vector Object:Vec_0x84000000_1 1 MPI processes > type: mpi > Process [0] > 2 > 2 > 2 > Matrix Object: 1 MPI processes > type: seqdense > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > Matrix Object: 1 MPI processes > type: seqdense > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > I also attached the Fortran code: > > > > > program main > implicit none > > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > PetscInt i,j,m,n > PetscErrorCode ierr > > PetscScalar one > Mat A > Mat B > > PetscInt row(3) > DM da > Vec X > Vec Y > PetscScalar,pointer :: X_pointer(:) > PetscScalar,pointer :: Y_pointer(:) > PetscInt xs,xm > > !Initialize > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > m = 3 !col > n = 3 !row > > > call DMDACreate1d(PETSC_COMM_WORLD, & !MPI communicator > & DMDA_BOUNDARY_NONE, & > !Boundary type at boundary of physical domain > & n, & !global > dimension of array (if negative number, then value can be changed by user > via command line!) > & 1, & > !number of degrees of freedom per grid point (number of unknowns at each > grid point) > & 0, & > !number of ghost points accessible to local vectors > & PETSC_NULL_INTEGER, & > !could be an array to specify the number of grid points per processor > & da, & !the > resulting distributed array object > & ierr) > > call DMDAGetCorners(da, & !the distributed array > & xs, & !corner index in x direction > & PETSC_NULL_INTEGER, & !corner index in y direction > & PETSC_NULL_INTEGER, & !corner index in z direction > & xm, & !width of locally owned part in x > direction > & PETSC_NULL_INTEGER, & !width of locally owned part in y > direction > & PETSC_NULL_INTEGER, & !width of locally owned part in z > direction > & ierr) !error check > > > > one = 1.0 > > > !set indices of matrix rows to be set > Do i = xs,xs+xm-1 > row(i) = i > END DO > > > !Set up vector X and Matrix A > call DMCreateGlobalVector(da,X,ierr) > call VecSet(X,one,ierr) !set all entries of global array to 1 > call VecGetArrayF90(X,X_pointer,ierr) > > call MatCreateDense(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_ > DECIDE,n,m,PETSC_NULL_CHARACTER,A,ierr) > The bug is here (and in the other create call). You should not be using PETSC_NULL_CHARACTER, but PETSC_NULL_SCALAR. This is a good reason for reconsidering the use of Fortran. It just cannot warn you about damaging errors like this. 
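(For comparison only, not part of the original exchange: a rough C sketch of the same two-matrix setup. In C the data-array argument of MatCreateDense() is a typed PetscScalar pointer and a plain NULL is fine, so a wrongly typed "null" argument is caught at compile time, which is the point made above. The helper name FillColumnZero is made up for illustration.)

  #include <petscmat.h>

  /* Copy the locally owned entries of v into column 0 of the dense matrix A. */
  static PetscErrorCode FillColumnZero(Mat A, Vec v)
  {
    const PetscScalar *va;
    PetscInt           rstart, rend, row;
    PetscErrorCode     ierr;

    ierr = VecGetOwnershipRange(v, &rstart, &rend);CHKERRQ(ierr);
    ierr = VecGetArrayRead(v, &va);CHKERRQ(ierr);
    for (row = rstart; row < rend; row++) {
      ierr = MatSetValue(A, row, 0, va[row-rstart], INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = VecRestoreArrayRead(v, &va);CHKERRQ(ierr);
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    return 0;
  }

  /* In the calling code, with n = m = 3 as in the Fortran example:
       MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, m, NULL, &A);
       MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, m, NULL, &B);
       FillColumnZero(A, X);
       FillColumnZero(B, Y);                                                        */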
Thanks, Matt > call MatSetValues(A,xm,row(xs:xs+xm-1),1,0,X_pointer(1:xm), > INSERT_VALUES,ierr) > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > > > !Set up vector Y and matrix B > call DMCreateGlobalVector(da,Y,ierr) > call VecSet(Y,2.d0*one,ierr) !set all entries of global array > to 2 > call VecGetArrayF90(Y,Y_pointer,ierr) > > call MatCreateDense(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_ > DECIDE,n,m,PETSC_NULL_CHARACTER,B,ierr) > call MatSetValues(B,xm,row(xs:xs+xm-1),1,0,Y_pointer(1:xm), > INSERT_VALUES,ierr) > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr) > > > !print out matrices and vectors to check entries > call VecView(X,PETSC_VIEWER_STDOUT_WORLD,ierr) > call VecView(Y,PETSC_VIEWER_STDOUT_WORLD,ierr) > call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr) > > !free objects > call VecRestoreArrayF90(X,X_pointer,ierr) > call VecDestroy(X,ierr) > call MatDestroy(A,ierr) > call VecRestoreArrayF90(Y,Y_pointer,ierr) > call VecDestroy(Y,ierr) > call MatDestroy(B,ierr) > > > call PetscFinalize(ierr) > end > > > > Am I doing something wrong in my code? > Thank you for your help! > Jonas > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mairhofer at itt.uni-stuttgart.de Wed Jun 11 08:08:01 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Wed, 11 Jun 2014 15:08:01 +0200 Subject: [petsc-users] Problem with assembling 2 matrices In-Reply-To: References: <5389EF23.30603@itt.uni-stuttgart.de> <539841B7.9020406@itt.uni-stuttgart.de> Message-ID: <539854B1.7010105@itt.uni-stuttgart.de> Thank you! And of course you are right, the more i work with PETSc the more I think about changing from Fortran to C... Am 11.06.2014 14:43, schrieb Matthew Knepley: > On Wed, Jun 11, 2014 at 6:47 AM, Jonas Mairhofer > > wrote: > > Dear PETSc-Team, > > I am trying to set a global array X as column 0 of a dense matrix > A and another global array Y as the first column of a second dense > matrix B. > Everything works perfectly fine in serial and parallel as long as > I do this only for one of the matrices. However, when I try to do > it for both matrices, the entries of matrix A seem to be > overwritten by the values of matrix B. 
> > This is the output I get (first vectors X and Y, then matrices A > and B): > > Vector Object:Vec_0x84000000_0 1 MPI processes > type: mpi > Process [0] > 1 > 1 > 1 > Vector Object:Vec_0x84000000_1 1 MPI processes > type: mpi > Process [0] > 2 > 2 > 2 > Matrix Object: 1 MPI processes > type: seqdense > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > Matrix Object: 1 MPI processes > type: seqdense > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 2.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > I also attached the Fortran code: > > > > > program main > implicit none > > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > PetscInt i,j,m,n > PetscErrorCode ierr > > PetscScalar one > Mat A > Mat B > > PetscInt row(3) > DM da > Vec X > Vec Y > PetscScalar,pointer :: X_pointer(:) > PetscScalar,pointer :: Y_pointer(:) > PetscInt xs,xm > > !Initialize > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > m = 3 !col > n = 3 !row > > > call DMDACreate1d(PETSC_COMM_WORLD, & !MPI communicator > & DMDA_BOUNDARY_NONE, & !Boundary type at > boundary of physical domain > & n, & !global dimension of array (if negative > number, then value can be changed by user via command line!) > & 1, & !number of degrees of freedom per grid > point (number of unknowns at each grid point) > & 0, & !number of ghost points accessible to > local vectors > & PETSC_NULL_INTEGER, & !could be an array to > specify the number of grid points per processor > & da, & !the resulting distributed array object > & ierr) > > call DMDAGetCorners(da, & !the distributed array > & xs, & !corner index in x direction > & PETSC_NULL_INTEGER, & !corner index in y direction > & PETSC_NULL_INTEGER, & !corner index in z direction > & xm, & !width of locally owned part in x > direction > & PETSC_NULL_INTEGER, & !width of locally owned part > in y direction > & PETSC_NULL_INTEGER, & !width of locally owned part > in z direction > & ierr) !error check > > > > one = 1.0 > > > !set indices of matrix rows to be set > Do i = xs,xs+xm-1 > row(i) = i > END DO > > > !Set up vector X and Matrix A > call DMCreateGlobalVector(da,X,ierr) > call VecSet(X,one,ierr) !set all entries of global > array to 1 > call VecGetArrayF90(X,X_pointer,ierr) > > call > MatCreateDense(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,PETSC_NULL_CHARACTER,A,ierr) > > > > The bug is here (and in the other create call). You should not be > using PETSC_NULL_CHARACTER, but PETSC_NULL_SCALAR. > > This is a good reason for reconsidering the use of Fortran. It just > cannot warn you about damaging errors like this. 
> > Thanks, > > Matt > > call > MatSetValues(A,xm,row(xs:xs+xm-1),1,0,X_pointer(1:xm),INSERT_VALUES,ierr) > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > > > !Set up vector Y and matrix B > call DMCreateGlobalVector(da,Y,ierr) > call VecSet(Y,2.d0*one,ierr) !set all entries of global > array to 2 > call VecGetArrayF90(Y,Y_pointer,ierr) > > call > MatCreateDense(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,PETSC_NULL_CHARACTER,B,ierr) > > call > MatSetValues(B,xm,row(xs:xs+xm-1),1,0,Y_pointer(1:xm),INSERT_VALUES,ierr) > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr) > > > !print out matrices and vectors to check entries > call VecView(X,PETSC_VIEWER_STDOUT_WORLD,ierr) > call VecView(Y,PETSC_VIEWER_STDOUT_WORLD,ierr) > call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr) > > !free objects > call VecRestoreArrayF90(X,X_pointer,ierr) > call VecDestroy(X,ierr) > call MatDestroy(A,ierr) > call VecRestoreArrayF90(Y,Y_pointer,ierr) > call VecDestroy(Y,ierr) > call MatDestroy(B,ierr) > > > call PetscFinalize(ierr) > end > > > > Am I doing something wrong in my code? > Thank you for your help! > Jonas > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajsai24 at gmail.com Wed Jun 11 08:18:09 2014 From: rajsai24 at gmail.com (Sai Rajeshwar) Date: Wed, 11 Jun 2014 18:48:09 +0530 Subject: [petsc-users] (no subject) Message-ID: Dear sir, Im a masters student from Indian Institute of Technology delhi. Im working on PETSc.. for performance, which is my area of interest. Can you please help me in knowing 'How to run PETSc on MIC' . That would be of great help to me. *thanks and regards..* *M. Sai Rajeswar* *M-tech Computer Technology* *IIT Delhi----------------------------------Cogito Ergo Sum---------* -------------- next part -------------- An HTML attachment was scrubbed... URL: From gbisht at lbl.gov Wed Jun 11 09:41:58 2014 From: gbisht at lbl.gov (Gautam Bisht) Date: Wed, 11 Jun 2014 07:41:58 -0700 Subject: [petsc-users] Ensuring non-negative solutions Message-ID: PETSc, I had few questions regarding adaptive time stepping inTS. - Can temporal adaptivity in TS ensure non-negative solution of a PDE? - For coupled multi-physics problems, can TS ensure non-negative solution for only one of the physics (eg. ensuring non-negative surface pressure for a coupled surface-subsurface flow simulation)? - Are there examples for such a case? At this stage I'm interested in finding out if ensuring non-negative solution is feasible and what would be required for such an implementation. If you could point me to some references that would be much appreciated. Thanks, -Gautam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rupp at iue.tuwien.ac.at Wed Jun 11 10:04:25 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Wed, 11 Jun 2014 17:04:25 +0200 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: <53986FF9.3060701@iue.tuwien.ac.at> Hi, > Im a masters student from Indian Institute of Technology delhi. Im > working on PETSc.. for performance, which is my area of interest. Can > you please help me in knowing 'How to run PETSc on MIC' . 
That would > be of great help to me. my experience is that 'performance' and 'MIC' for bandwidth-limited operations don't go together. Regardless, you can use ViennaCL by building via --download-viennacl for using the MIC via OpenCL, but you are usually much better off with a proper multi-socket CPU node. Feel free to have a look at my recent slides from the Intl. OpenCL Workshop here: http://iwocl.org/wp-content/uploads/iwocl-2014-tech-presentation-Karl-Rupp.pdf PDF page 32 shows that in the OpenCL case one achieves only up to 20% of peak bandwidth for 1900 different kernel configurations even for simple kernels such as vector copy, vector addition, dot products, or dense matrix-vector products. With some tricks one can probably get 30%, but that's it. PETSc does not provide any 'native' OpenMP execution on MIC for similar reasons. Best regards, Karli From jyang29 at uh.edu Wed Jun 11 11:44:14 2014 From: jyang29 at uh.edu (Brian Yang) Date: Wed, 11 Jun 2014 11:44:14 -0500 Subject: [petsc-users] Irregular Matrix Partitioning problem Message-ID: Hi all, To solve a linear system, Ax=b, I am trying to solve a irregular matrix partitioning problem. For example, I want to create a parallel sparse matrix A, which is a square and 1000*1000. Due to the requirements of the problem, for the first and last 100 rows, I just want to put 1.0 in the main diagonal. For the middle 800 rows, I will set other values around the main diagonal. So what I want to do is that first and last computing node take care of the first and last 100 rows and other nodes distribute the work in between. Is there any specific way to achieve this considering it's a parallel matrix? If yes, should I also be care of the method to create the related parallel vector? Thanks. Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jun 11 11:51:06 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 11 Jun 2014 11:51:06 -0500 Subject: [petsc-users] Irregular Matrix Partitioning problem In-Reply-To: References: Message-ID: On Wed, Jun 11, 2014 at 11:44 AM, Brian Yang wrote: > Hi all, > > To solve a linear system, Ax=b, I am trying to solve a irregular matrix > partitioning problem. > > For example, I want to create a parallel sparse matrix A, which is a > square and 1000*1000. > > Due to the requirements of the problem, for the first and last 100 rows, I > just want to put 1.0 in the main diagonal. > > For the middle 800 rows, I will set other values around the main diagonal. > So what I want to do is that first and last computing node take care of the > first and last 100 rows and other nodes distribute the work in between. > > Is there any specific way to achieve this considering it's a parallel > matrix? If yes, should I also be care of the method to create the related > parallel vector? > You are completely in charge of matrix partitioning when you call MatSetSizes() which sets the local and global sizes. Does this make sense? Thanks, Matt > Thanks. > > Brian > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
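(A rough C illustration, not from the thread, of how the ownership split Brian describes could be set up with MatSetSizes(): the first and last rank own the 100 identity rows each, the remaining ranks share the middle 800. It assumes at least 3 ranks and that 800 divides evenly; error checking is omitted.)

  #include <petscmat.h>

  int main(int argc, char **argv)
  {
    Mat         A;
    PetscInt    N = 1000, nlocal, rstart, rend, i;
    PetscMPIInt rank, size;

    PetscInitialize(&argc, &argv, NULL, NULL);
    MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
    MPI_Comm_size(PETSC_COMM_WORLD, &size);

    if (rank == 0 || rank == size-1) nlocal = 100;          /* identity blocks      */
    else                             nlocal = 800/(size-2); /* assumes an even split */

    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, nlocal, nlocal, N, N);    /* local sizes chosen by us, not PETSc */
    MatSetType(A, MATAIJ);
    MatSeqAIJSetPreallocation(A, 3, NULL);
    MatMPIAIJSetPreallocation(A, 3, NULL, 2, NULL);
    MatGetOwnershipRange(A, &rstart, &rend);
    for (i = rstart; i < rend; i++) {
      if (i < 100 || i >= N-100) {
        MatSetValue(A, i, i, 1.0, INSERT_VALUES);   /* 1.0 on the diagonal          */
      } else {
        /* ... set the banded entries for the middle 800 rows here ... */
      }
    }
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

    /* Matching vectors get the same local size:
         VecCreate(PETSC_COMM_WORLD, &x); VecSetSizes(x, nlocal, N); VecSetFromOptions(x); */

    MatDestroy(&A);
    PetscFinalize();
    return 0;
  }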
URL: From knepley at gmail.com Wed Jun 11 13:06:40 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 11 Jun 2014 13:06:40 -0500 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S In-Reply-To: <537CF9E3.4010908@columbi.edu> References: <537BCA1E.6010508@columbi.edu> <537CF9E3.4010908@columbi.edu> Message-ID: On Wed, May 21, 2014 at 2:09 PM, Luc Berger-Vergiat wrote: > So I just pulled an updated version of petsc-dev today (I switched from > the *next* branch to the *master* branch due to some compilation error > existing with the last commit on *next*). > I still have the same error and I believe this is the whole error message > I have. > I mean I am running multiple time steps for my simulation so I have the > same message at each time step, but I don't think that it is important to > report these duplicates, is it? > Sorry I am just getting back to this. I asked for more of the error message, because I assumed that some other PETSc function was calling DMGetGlobalVector(), not you, since you say it goes away with a different PC. If you are calling it and its generating this error, it is because you are calling it before setting a Vec in the DMShell. Thanks, Matt > Best, > Luc > > On 05/20/2014 09:14 PM, Matthew Knepley wrote: > > > On Tue, May 20, 2014 at 4:33 PM, Luc Berger-Vergiat > wrote: > >> Hi all, >> I am running an FEM simulation that uses Petsc as a linear solver. >> I am setting up ISs and pass them to a DMShell in order to use the >> FieldSplit capabilities of Petsc. >> >> When I pass the following options to Petsc: >> >> " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur >> -pc_fieldsplit_schur_factorization_type full >> -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_0_fields 1,2 >> -pc_fieldsplit_1_fields 0 -fieldsplit_0_ksp_type preonly >> -fieldsplit_0_pc_type ilu -fieldsplit_Field_0_ksp_type gmres >> -fieldsplit_Field_0_pc_type mg -malloc_log mlog -log_summary time.log" >> >> I get an error message: >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: >> [0]PETSC ERROR: Must call DMShellSetGlobalVector() or >> DMShellSetCreateGlobalVector() >> [0]PETSC ERROR: See >> http://http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >> shooting. >> [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-5071-g1163a46 GIT >> Date: 2014-03-26 22:20:51 -0500 >> [0]PETSC ERROR: /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap >> on a arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 >> [0]PETSC ERROR: Configure options --download-cmake --download-hypre >> --download-metis --download-mpich --download-parmetis --with-debugging=no >> --with-share-libraries=no >> [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in >> /home/luc/research/petsc/src/dm/impls/shell/dmshell.c >> [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in >> /home/luc/research/petsc/src/dm/interface/dm.c >> [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in >> /home/luc/research/petsc/src/dm/interface/dmget.c >> > > Always always always give the entire error message. > > Matt > > >> I am not really sure why this happens but it only happens when >> -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have no >> problems. I attached the ksp_view in case that's any use. 
>> >> -- >> Best, >> Luc >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jun 11 13:20:43 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 11 Jun 2014 13:20:43 -0500 Subject: [petsc-users] Irregular Matrix Partitioning problem In-Reply-To: References: Message-ID: Just give the full size matrix and vectors to KSP as usual and use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCREDISTRIBUTE.html note that a matrix of 1000 rows is not large enough for parallelism. Generally you need at least 10,000 rows per process for good parallel performance. Barry On Jun 11, 2014, at 11:44 AM, Brian Yang wrote: > Hi all, > > To solve a linear system, Ax=b, I am trying to solve a irregular matrix partitioning problem. > > For example, I want to create a parallel sparse matrix A, which is a square and 1000*1000. > > Due to the requirements of the problem, for the first and last 100 rows, I just want to put 1.0 in the main diagonal. > > For the middle 800 rows, I will set other values around the main diagonal. So what I want to do is that first and last computing node take care of the first and last 100 rows and other nodes distribute the work in between. > > Is there any specific way to achieve this considering it's a parallel matrix? If yes, should I also be care of the method to create the related parallel vector? > > Thanks. > > Brian From brianyang1106 at gmail.com Wed Jun 11 13:21:30 2014 From: brianyang1106 at gmail.com (Brian Yang) Date: Wed, 11 Jun 2014 13:21:30 -0500 Subject: [petsc-users] Irregular Matrix Partitioning problem In-Reply-To: References: Message-ID: Yes, thanks. So I am guessing that I need to take care of the first and last rank and the middle part separately. On Wed, Jun 11, 2014 at 11:51 AM, Matthew Knepley wrote: > On Wed, Jun 11, 2014 at 11:44 AM, Brian Yang wrote: > >> Hi all, >> >> To solve a linear system, Ax=b, I am trying to solve a irregular matrix >> partitioning problem. >> >> For example, I want to create a parallel sparse matrix A, which is a >> square and 1000*1000. >> >> Due to the requirements of the problem, for the first and last 100 rows, >> I just want to put 1.0 in the main diagonal. >> >> For the middle 800 rows, I will set other values around the main >> diagonal. So what I want to do is that first and last computing node take >> care of the first and last 100 rows and other nodes distribute the work in >> between. >> >> Is there any specific way to achieve this considering it's a parallel >> matrix? If yes, should I also be care of the method to create the related >> parallel vector? >> > > You are completely in charge of matrix partitioning when you call > > MatSetSizes() > > which sets the local and global sizes. Does this make sense? > > Thanks, > > Matt > > >> Thanks. >> >> Brian >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Brian Yang U of Houston -------------- next part -------------- An HTML attachment was scrubbed... 
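(A code-level sketch of Barry's PCREDISTRIBUTE suggestion above, using the PETSc 3.4-era KSPSetOperators() signature seen elsewhere in these threads. A, b and x are the full-size operator and vectors from the question; the redistributed inner solve is then configured through the usual options database, see the man page Barry links. This is an assumption-laden sketch, not a tested recipe.)

  #include <petscksp.h>

  /* A, b, x: the full 1000x1000 system described above, assembled on all ranks. */
  PetscErrorCode SolveWithRedistribute(Mat A, Vec b, Vec x)
  {
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr); /* 3.4-era call */
    ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);   /* typical with PCREDISTRIBUTE */
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCREDISTRIBUTE);CHKERRQ(ierr); /* handles the diagonal-only rows */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    return 0;
  }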
URL: From knepley at gmail.com Wed Jun 11 14:05:16 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 11 Jun 2014 14:05:16 -0500 Subject: [petsc-users] Plex mesh reference element In-Reply-To: References: Message-ID: On Mon, Feb 17, 2014 at 1:23 PM, Dharmendar Reddy wrote: > Hello, > My petsc code is at version e4acbc4 on next. I get undefined > reference error: > undefined reference to `petscdualspacecreatereferencecell_' > > Here is the code, i used. > > print*,'Testing Reference Cell' > call PetscDualSpaceCreate(comm, Q, ierr) > Kdim=3 > call PetscDualSpaceCreateReferenceCell(Q,Kdim, PETSC_TRUE, K, ierr) > call PetscViewerPushFormat(PETSC_VIEWER_STDOUT_WORLD, > PETSC_VIEWER_ASCII_INFO_DETAIL, ierr) > call DMView(K, PETSC_VIEWER_STDOUT_WORLD, ierr) > I am very sorry. This fell through the email cracks. I have put in the Fortran bindings, and docs for all the PetscDualSpace functions into next. Thanks, Matt > Thanks > Reddy > > On Mon, Feb 17, 2014 at 7:28 AM, Matthew Knepley > wrote: > > On Tue, Feb 11, 2014 at 7:50 AM, Dharmendar Reddy < > dharmareddy84 at gmail.com> > > wrote: > >> > >> Hello, > >> Where do i find the information on reference elements used by > >> plex for local numbering. > > > > > > You can use > > > > PetscDualSpaceCreateReferenceCell(sp, dim, isSimplex, &refCell) > > PetscViewerPushFormat(PETSC_VIEWER_STDOUT_WORLD, > > PETSC_VIEWER_ASCII_INFO_DETAIL) > > DMView(refCell, PETSC_VIEWER_STDOUT_WORLD) > > > > Matt > > > >> I currently use the following for the tetrahedron. Is this correct ? > >> what is the corresponding information for a triangle ? > >> > >> ! Vertex coordiantes > >> vert(1:3,1)=[0.0, 0.0, 0.0] > >> vert(1:3,2)=[1.0, 0.0, 0.0] > >> vert(1:3,3)=[0.0, 1.0, 0.0] > >> vert(1:3,4)=[0.0, 0.0, 1.0] > >> > >> ! vertex intdices > >> vs(1:4)= [1, 2, 3, 4] > >> > >> ! Edges id to node Id > >> edge(:,1) = [3, 4] > >> edge(:,2) = [2, 4] > >> edge(:,3) = [2, 3] > >> edge(:,4) = [1, 4] > >> edge(:,5) = [1, 3] > >> edge(:,6) = [1, 2] > >> > >> ! Faces id to node id > >> face(1:3,1)=[2,3,4] > >> face(1:3,2)=[1,3,4] > >> face(1:3,3)=[1,2,4] > >> face(1:3,4)=[1,2,3] > >> numFacet = 4 > >> > >> ! Tets > >> tet(1:4,1) = [1,2,3,4] > >> thanks > >> Reddy > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. > > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajsai24 at gmail.com Thu Jun 12 01:28:18 2014 From: rajsai24 at gmail.com (Sai Rajeshwar) Date: Thu, 12 Jun 2014 11:58:18 +0530 Subject: [petsc-users] (no subject) In-Reply-To: <53986FF9.3060701@iue.tuwien.ac.at> References: <53986FF9.3060701@iue.tuwien.ac.at> Message-ID: ok, so considering performance on MIC can the library MAGMA be used as an alternate to Viennacl for PETSc or FEniCS? http://www.nics.tennessee.edu/files/pdf/hpcss/04_03_LinearAlgebraPar.pdf (from slide 37 onwards) MAGMA seems to have sparse version which i think is doing all that any sparse non linear solver can do.. MAGMA-sparse.. will this be helpful in using with MIC *with regards..* *M. 
Sai Rajeswar* *M-tech Computer Technology* *IIT Delhi----------------------------------Cogito Ergo Sum---------* On Wed, Jun 11, 2014 at 8:34 PM, Karl Rupp wrote: > Hi, > > > Im a masters student from Indian Institute of Technology delhi. Im > >> working on PETSc.. for performance, which is my area of interest. Can >> you please help me in knowing 'How to run PETSc on MIC' . That would >> be of great help to me. >> > > my experience is that 'performance' and 'MIC' for bandwidth-limited > operations don't go together. Regardless, you can use ViennaCL by building > via > --download-viennacl > for using the MIC via OpenCL, but you are usually much better off with a > proper multi-socket CPU node. > > Feel free to have a look at my recent slides from the Intl. OpenCL > Workshop here: > http://iwocl.org/wp-content/uploads/iwocl-2014-tech- > presentation-Karl-Rupp.pdf > PDF page 32 shows that in the OpenCL case one achieves only up to 20% of > peak bandwidth for 1900 different kernel configurations even for simple > kernels such as vector copy, vector addition, dot products, or dense > matrix-vector products. With some tricks one can probably get 30%, but > that's it. > > PETSc does not provide any 'native' OpenMP execution on MIC for similar > reasons. > > Best regards, > Karli > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rupp at iue.tuwien.ac.at Thu Jun 12 03:14:27 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Thu, 12 Jun 2014 10:14:27 +0200 Subject: [petsc-users] (no subject) In-Reply-To: References: <53986FF9.3060701@iue.tuwien.ac.at> Message-ID: <53996163.6050607@iue.tuwien.ac.at> Hi, > so considering performance on MIC > > can the library MAGMA be used as an alternate to Viennacl for PETSc or > FEniCS? No, there is no interface to MAGMA in PETSc yet. Contributions are always welcome, yet it is not our priority to come up with an interface of our own. I don't think it will provide any substantial benefits, though, because there is no magic one can apply to overcome the memory wall. > > http://www.nics.tennessee.edu/files/pdf/hpcss/04_03_LinearAlgebraPar.pdf > (from slide 37 onwards) > > MAGMA seems to have sparse version which i think is doing all that any > sparse non linear solver can do.. MAGMA-sparse.. > > will this be helpful in using with MIC This depends on what you are looking for. If you are looking a maximizing FLOP rates for a fixed algorithm, then MAGMA may help you if it happens to provide an implementation for this particular algorithm. However, if you're looking for a way to minimize time-to-solution for a given problem, then it's usually better to build a good preconditioner with the many options PETSc provides, such as field-split and multigrid preconditioners. Purely CPU-based implementations usually still beat accelerator-based approaches on larger scale, simply because it allows you to use better algorithms rather than throwing massive parallelism at it, which severely restricts your options. If you really want to play with accelerators in PETSc, use GPUs (higher memory bandwidth), not MIC. Best regards, Karli From billingsjj at ornl.gov Thu Jun 12 07:45:55 2014 From: billingsjj at ornl.gov (Jay J. Billings) Date: Thu, 12 Jun 2014 08:45:55 -0400 Subject: [petsc-users] Complex examples with PETSc? Message-ID: <5399A103.5060402@ornl.gov> Everyone, What's the best example of solving a Complex problem with PETSc? Are there any examples of solving Schrodinger's equation around? 
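(As the reply below explains, this needs a PETSc built with --with-scalar-type=complex, after which PetscScalar is a double-precision complex type. A minimal C sketch of working with complex scalars, not tied to any particular PDE; it will only compile against such a complex build.)

  #include <petscvec.h>

  int main(int argc, char **argv)
  {
    Vec         x;
    PetscScalar alpha;

    PetscInitialize(&argc, &argv, NULL, NULL);
    alpha = 1.0 + 2.0*PETSC_i;          /* PETSC_i is the imaginary unit in complex builds */
    VecCreateSeq(PETSC_COMM_SELF, 4, &x);
    VecSet(x, alpha);
    VecView(x, PETSC_VIEWER_STDOUT_SELF);
    VecDestroy(&x);
    PetscFinalize();
    return 0;
  }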
Thanks, Jay -- Jay Jay Billings Oak Ridge National Laboratory Twitter Handle: @jayjaybillings From bsmith at mcs.anl.gov Thu Jun 12 16:02:48 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 12 Jun 2014 16:02:48 -0500 Subject: [petsc-users] Complex examples with PETSc? In-Reply-To: <5399A103.5060402@ornl.gov> References: <5399A103.5060402@ornl.gov> Message-ID: Jay, You need to ./configure PETSc with ?with-scalar-type=complex (yes sadly you cannot mix some real vectors/matrices in the same PETSc program as some complex ones). Then proceed as usual. PetscScalar is of type complex double and PetscReal of double. No examples of solving Schrodinger's equation around, sorry. Barry On Jun 12, 2014, at 7:45 AM, Jay J. Billings wrote: > Everyone, > > What's the best example of solving a Complex problem with PETSc? Are there any examples of solving Schrodinger's equation around? > > Thanks, > Jay > > -- > Jay Jay Billings > Oak Ridge National Laboratory > Twitter Handle: @jayjaybillings > From steve at optinemailtrade.com Thu Jun 12 18:19:37 2014 From: steve at optinemailtrade.com (steve) Date: Thu, 12 Jun 2014 16:19:37 -0700 Subject: [petsc-users] Solar & Renewable Professional Contact List Message-ID: <5E947A318CDE4222B0AAA08BF7F90DC1@DATAROCK.COM> Hello, Would you be interested in acquiring Solar Industry Professional Contact list of 2014 for your marketing initiative which includes complete key Decision Makers contact details and verified Email Addresses of - Solar Installers both PV & Thermal Solar contractors Distributors Manufacturers Utilities Solar/Wind Energy Developers Renewable Professionals EPCs Solar Integrators Solar Financial Institutes Biomass Installers Solar Roofing contractors HVAC contractors, Electrical Contractors, Solar panel Suppliers and Retailers Green Building Professionals Solar Engineering consultants Home Owners and other solar industry professionals and also solar industry trade shows and many more across US/UK/CANADA, Australia, Europe and all over the world. If your target audience is not mentioned above, kindly let me know. Target Industry: ............ Target Geography: ........... Target Job Title: ............ Few Samples: Please let me know your exact target audience you wish to reach, so that I can fetch you some samples records, this will help you analyze the quality of our services. Looking forward to hear from you Thanks and Regards, Steve Perry Online Marketing Manager Phone: 760-990-1454 -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Fri Jun 13 01:17:09 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 13 Jun 2014 06:17:09 +0000 Subject: [petsc-users] MatNestGetISs in fortran In-Reply-To: References: <64c4658aeb7441abbe20e4aa252554a2@MAR190N1.marin.local>, Message-ID: Perhaps this message from May 27 "slipped through the email cracks" as Matt puts it? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl ________________________________________ From: Klaij, Christiaan Sent: Monday, June 02, 2014 9:54 AM To: petsc-users at mcs.anl.gov Subject: RE: MatNestGetISs in fortran Just a reminder. Could you please add fortran support for MatNestGetISs? 
________________________________________ From: Klaij, Christiaan Sent: Tuesday, May 27, 2014 3:47 PM To: petsc-users at mcs.anl.gov Subject: MatNestGetISs in fortran I'm trying to use MatNestGetISs in a fortran program but it seems to be missing from the fortran include file (PETSc 3.4). From asmund.ervik at ntnu.no Fri Jun 13 06:57:35 2014 From: asmund.ervik at ntnu.no (=?ISO-8859-1?Q?=C5smund_Ervik?=) Date: Fri, 13 Jun 2014 13:57:35 +0200 Subject: [petsc-users] Interpreting -log_summary, amount of communication Message-ID: <539AE72F.60909@ntnu.no> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Dear PETSc, First of all, bug report for the manual (for petsc-current): in Fig. 20 and 21, something has not gone well with \href and listings, so I can't understand those figures properly. I read the chapter in the manual, 12.1, but it din't answer my question. When I run a parallel code with -log_summary, how can I see details of the time spent on communication? To be more specific: I have some code that does communication at the start of each time step, and I guess KSP has to do some communications when it is solving my Poisson equation. If I understand correctly, both these communications are listed under "VecAssemblyEnd", but how do I tell the division of the time between those two? Do I have to register some stages, etc.? Below is the output from the performance summary. Is this bad in terms of the time spent on communication? This is using 4 nodes, 4 cores per node on a small cluster with Intel E5-2670 8-core CPUs. The Streams benchmark indicated that I can't really utilize more than 4 cores per node. The interconnect is 1 Gb/s ethernet. The speedup vs. 1 core is 9x. I'm solving incompressible Navier-Stokes on a 128^3 grid, with a pressure Poisson equation. In this case I used SOR and BiCGStab. This cluster is where I'm learning the ropes, and I will be using more tightly-coupled systems in the future (Infiniband). Should I expect an increase in speedup when I use those? Best regards, ?smund Ervik - ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./run on a double-real named compute-3-11.local with 16 processors, by asmunder Thu Jun 12 14:22:08 2014 Using Petsc Release Version 3.4.2, Jul, 02, 2013 Max Max/Min Avg Total Time (sec): 3.246e+03 1.00000 3.246e+03 Objects: 9.800e+02 1.00000 9.800e+02 Flops: 1.667e+12 1.00447 1.663e+12 2.661e+13 Flops/sec: 5.134e+08 1.00447 5.122e+08 8.196e+09 MPI Messages: 6.163e+05 1.33327 5.393e+05 8.629e+06 MPI Message Lengths: 4.626e+10 1.49807 7.151e+04 6.171e+11 MPI Reductions: 4.576e+05 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 3.2462e+03 100.0% 2.6605e+13 100.0% 8.629e+06 100.0% 7.151e+04 100.0% 4.576e+05 100.0% - ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. 
len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) - ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s - ------------------------------------------------------------------------------------------------------------------------ - --- Event Stage 0: Main Stage PetscBarrier 9 1.0 1.3815e+02356592.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 VecDot 150040 1.0 5.0756e+01 2.1 3.93e+10 1.0 0.0e+00 0.0e+00 1.5e+05 1 2 0 0 33 1 2 0 0 33 12399 VecDotNorm2 75020 1.0 3.1857e+01 1.9 3.93e+10 1.0 0.0e+00 0.0e+00 7.5e+04 1 2 0 0 16 1 2 0 0 16 19754 VecNorm 76620 1.0 2.4263e+01 3.0 2.01e+10 1.0 0.0e+00 0.0e+00 7.7e+04 1 1 0 0 17 1 1 0 0 17 13245 VecCopy 1600 1.0 3.7060e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 1632 1.0 7.2079e-01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 800 1.0 3.7573e-01 1.7 2.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8931 VecAXPBYCZ 150040 1.0 3.9948e+01 1.1 7.87e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 31507 VecWAXPY 150040 1.0 3.6130e+01 1.2 3.93e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 17418 VecAssemblyBegin 801 1.0 8.6365e+00 8.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+03 0 0 0 0 1 0 0 0 0 1 0 VecAssemblyEnd 801 1.0 1.2031e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 157298 1.0 2.1938e+01 1.8 0.00e+00 0.0 8.6e+06 7.1e+04 2.7e+01 1 0100100 0 1 0100100 0 0 VecScatterEnd 157271 1.0 1.1521e+03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 25 0 0 0 0 25 0 0 0 0 0 MatMult 150840 1.0 1.7150e+03 1.9 7.24e+11 1.0 8.4e+06 7.0e+04 0.0e+00 42 43 98 96 0 42 43 98 96 0 6721 MatSOR 151640 1.0 1.0433e+03 1.4 7.25e+11 1.0 0.0e+00 0.0e+00 0.0e+00 26 44 0 0 0 26 44 0 0 0 11126 MatAssemblyBegin 2 1.0 7.9169e-03 9.9 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 2.1147e-02 1.2 0.00e+00 0.0 1.1e+02 1.8e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 1 1.0 3.4809e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 800 1.0 2.9176e+03 1.0 1.67e+12 1.0 8.4e+06 7.0e+04 4.6e+05 90100 98 96100 90100 98 96100 9119 PCSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 151640 1.0 1.0436e+03 1.4 7.25e+11 1.0 0.0e+00 0.0e+00 0.0e+00 26 44 0 0 0 26 44 0 0 0 11123 - ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
- --- Event Stage 0: Main Stage Vector 859 812 840120768 0 Vector Scatter 39 27 17388 0 Matrix 3 0 0 0 Matrix Null Space 1 0 0 0 Distributed Mesh 4 0 0 0 Bipartite Graph 8 0 0 0 Index Set 55 55 2636500 0 IS L to G Mapping 7 0 0 0 Krylov Solver 1 0 0 0 DMKSP interface 1 0 0 0 Preconditioner 1 0 0 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 0.000110197 Average time for zero size MPI_Send(): 2.65092e-05 #PETSc Option Table entries: - -log_summary #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Fri Sep 13 14:49:18 2013 Configure options: PETSC_ARCH=double-real --with-precision=double - --with-scalar-type=real --with-cc=mpicc --with-cxx=mpicxx - --with-fc=mpif90 --with-mpiexec=mpiexec --with-debugging=0 - --COPTFLAGS="-O2 -fp-model extended" --FOPTFLAGS="-O2 -fltconsistency" - --with-blas-lapack-dir=/share/apps/modulessoftware/intel/compilers/13.0.1/mkl/lib/intel64 - --with-64-bit-indices=0 --with-clanguage=c++ --with-shared-libraries=0 - --download-ml --download-hypre - ----------------------------------------- Libraries compiled on Fri Sep 13 14:49:18 2013 on rocks.hpc.ntnu.no Machine characteristics: Linux-2.6.18-308.1.1.el5-x86_64-with-redhat-5.6-Tikanga Using PETSc directory: /share/apps/modulessoftware/petsc/petsc-3.4.2 Using PETSc arch: double-real - ----------------------------------------- Using C compiler: mpicxx -wd1572 -O3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -O2 -fltconsistency ${FOPTFLAGS} ${FFLAGS} - ----------------------------------------- Using include paths: - -I/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/include - -I/share/apps/modulessoftware/petsc/petsc-3.4.2/include - -I/share/apps/modulessoftware/petsc/petsc-3.4.2/include - -I/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/include - -I/share/apps/modulessoftware/openmpi/openmpi-1.7.2-intel/include - ----------------------------------------- Using C linker: mpicxx Using Fortran linker: mpif90 Using libraries: - -Wl,-rpath,/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/lib - -L/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/lib - -lpetsc - -Wl,-rpath,/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/lib - -L/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/lib -lml - -lHYPRE - -Wl,-rpath,/share/apps/modulessoftware/intel/compilers/13.0.1/mkl/lib/intel64 - -L/share/apps/modulessoftware/intel/compilers/13.0.1/mkl/lib/intel64 - -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 - -lpthread - -L/share/apps/modulessoftware/openmpi/openmpi-1.7.2-intel/lib - -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/compiler/lib/intel64 - -L/share/apps/modulessoftware/intel/compilers/opt/intel/mic/coi/host-linux-release/lib - -L/share/apps/modulessoftware/intel/compilers/opt/intel/mic/myo/lib - -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/mpirt/lib/intel64 - -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/ipp/lib/intel64 - -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/mkl/lib/intel64 - -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/tbb/lib/intel64 - 
-L/gpfs/shareapps/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/compiler/lib/intel64 - -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -lmpi_usempif08 - -lmpi_usempi_ignore_tkr -lmpi_mpifh -lifport -lifcore -lm -lm - -lmpi_cxx -ldl -lmpi -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts - -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -ldl - ----------------------------------------- -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTmucvAAoJED+FDAHgGz19vsYH/3E+g74VQFYIAIcf4tN/99WR c2ofaByZyXbU9e7NiQyn0gbqwBjDtYKOWe8vMRkWx7AdVBgS0z2ChjpZHK5TrtlF tW2JNztHBB7hgTisd5/2N5toNiCQWxUJu4/8jzbvjoaXrfU+aV3igLTLNbcT/2Rz KSmPxxc77JYj55vd4v8E8yxA1sfwppMCcyTwzlOSGRO8yiie1fgaDvQFySeoNEL5 ZMBwicNH4YBFYmEI8TH0DP6AjElW9mQOsEM+ktpupxmoFxwG3ciMKxrzpt3ID8Dw X6gv+F8F73tzsLN09SPkjmz/vPtoS03om9ZnkQYm+qaLQ+n1wz6RcnpG/Bo3y6Q= =DTtk -----END PGP SIGNATURE----- From jed at jedbrown.org Fri Jun 13 07:43:34 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 13 Jun 2014 07:43:34 -0500 Subject: [petsc-users] Ensuring non-negative solutions In-Reply-To: References: Message-ID: <87ppidkmix.fsf@jedbrown.org> Gautam Bisht writes: > PETSc, > > I had few questions regarding adaptive time stepping inTS. > > - Can temporal adaptivity in TS ensure non-negative solution of a PDE? If short steps are sufficient to prevent negative solutions, then it can. The simplest way is to set your own step acceptance criteria and reject steps that become negative (or set a domain violation for the function, in which case the solve will fail and the step will be rejected, shortening the step). > - For coupled multi-physics problems, can TS ensure non-negative solution > for only one of the physics (eg. ensuring non-negative surface pressure for > a coupled surface-subsurface flow simulation)? Yes, as above. > - Are there examples for such a case? Not currently, but it would be good to develop some. Depending on the physical process by which negative values occur, it may be sufficient to use an SSP integrator and respect some a priori step size criteria. > At this stage I'm interested in finding out if ensuring non-negative > solution is feasible and what would be required for such an implementation. > If you could point me to some references that would be much appreciated. > > Thanks, > -Gautam. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From anush at bu.edu Fri Jun 13 13:23:54 2014 From: anush at bu.edu (Anush Krishnan) Date: Fri, 13 Jun 2014 14:23:54 -0400 Subject: [petsc-users] Multi-DOF DMDA Vec Message-ID: Hello petsc-users, I created a vector using DMDACreate with 3 degrees of freedom. Is it possible for me to access each vector corresponding to a degree of freedom? Seeing that I need to access the array as [k][j][i][dof], does it mean that the values of each component are not contiguous? Also, what is the difference between DMDAVecGetArray and DMDAVecGetArrayDOF? Thank you, Anush -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Jun 13 13:43:13 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 13 Jun 2014 13:43:13 -0500 Subject: [petsc-users] Multi-DOF DMDA Vec In-Reply-To: References: Message-ID: <87d2eclkfy.fsf@jedbrown.org> Anush Krishnan writes: > Hello petsc-users, > > I created a vector using DMDACreate with 3 degrees of freedom. 
Is it > possible for me to access each vector corresponding to a degree of freedom? > Seeing that I need to access the array as [k][j][i][dof], does it mean that > the values of each component are not contiguous? The values are interlaced. This is generally better for memory performance (cache reuse). See, for example, the PETSc-FUN3D papers or the various discretization frameworks that rediscover this every once in a while. > Also, what is the difference between DMDAVecGetArray and DMDAVecGetArrayDOF? With DMDAVecGetArray for multi-component problems, you usually write typedef struct {PetscScalar u,v,w;} Field; Field ***x; DMDAVecGetArray(dm,X,&x); ... x[k][j][i].u = 1; With DMDAVecGetArrayDOF, you use an extra set of indices instead of the struct. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From anush at bu.edu Fri Jun 13 13:56:47 2014 From: anush at bu.edu (Anush Krishnan) Date: Fri, 13 Jun 2014 14:56:47 -0400 Subject: [petsc-users] Multi-DOF DMDA Vec In-Reply-To: <87d2eclkfy.fsf@jedbrown.org> References: <87d2eclkfy.fsf@jedbrown.org> Message-ID: Thanks, Jed! On 13 June 2014 14:43, Jed Brown wrote: > Anush Krishnan writes: > > > Hello petsc-users, > > > > I created a vector using DMDACreate with 3 degrees of freedom. Is it > > possible for me to access each vector corresponding to a degree of > freedom? > > Seeing that I need to access the array as [k][j][i][dof], does it mean > that > > the values of each component are not contiguous? > > The values are interlaced. This is generally better for memory > performance (cache reuse). See, for example, the PETSc-FUN3D papers or > the various discretization frameworks that rediscover this every once in > a while. > > > Also, what is the difference between DMDAVecGetArray and > DMDAVecGetArrayDOF? > > With DMDAVecGetArray for multi-component problems, you usually write > > typedef struct {PetscScalar u,v,w;} Field; > Field ***x; > DMDAVecGetArray(dm,X,&x); > ... > x[k][j][i].u = 1; > > With DMDAVecGetArrayDOF, you use an extra set of indices instead of the > struct. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jun 13 13:59:08 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 13 Jun 2014 13:59:08 -0500 Subject: [petsc-users] Multi-DOF DMDA Vec In-Reply-To: <87d2eclkfy.fsf@jedbrown.org> References: <87d2eclkfy.fsf@jedbrown.org> Message-ID: <2B5F659B-5182-46D6-8D21-89E9E69E2DB1@mcs.anl.gov> On Jun 13, 2014, at 1:43 PM, Jed Brown wrote: > Anush Krishnan writes: > >> Hello petsc-users, >> >> I created a vector using DMDACreate with 3 degrees of freedom. Is it >> possible for me to access each vector corresponding to a degree of freedom? >> Seeing that I need to access the array as [k][j][i][dof], does it mean that >> the values of each component are not contiguous? > > The values are interlaced. This is generally better for memory > performance (cache reuse). See, for example, the PETSc-FUN3D papers or > the various discretization frameworks that rediscover this every once in > a while. You can pull out a single set of DOF with VecStrideGather() and put it back with VecStrideScatter() see also VecStrideGatherAll(), VecStrideScatterAll() also see VecStrideNorm() etc. > >> Also, what is the difference between DMDAVecGetArray and DMDAVecGetArrayDOF? 
> > With DMDAVecGetArray for multi-component problems, you usually write > > typedef struct {PetscScalar u,v,w;} Field; > Field ***x; > DMDAVecGetArray(dm,X,&x); > ... > x[k][j][i].u = 1; > > With DMDAVecGetArrayDOF, you use an extra set of indices instead of the > struct. From bsmith at mcs.anl.gov Fri Jun 13 15:18:43 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 13 Jun 2014 15:18:43 -0500 Subject: [petsc-users] Interpreting -log_summary, amount of communication In-Reply-To: <539AE72F.60909@ntnu.no> References: <539AE72F.60909@ntnu.no> Message-ID: On Jun 13, 2014, at 6:57 AM, ?smund Ervik wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Dear PETSc, > > First of all, bug report for the manual (for petsc-current): in Fig. > 20 and 21, something has not gone well with \href and listings, so I > can't understand those figures properly. > > I read the chapter in the manual, 12.1, but it din't answer my > question. When I run a parallel code with -log_summary, how can I see > details of the time spent on communication? To be more specific: I > have some code that does communication at the start of each time step, > and I guess KSP has to do some communications when it is solving my > Poisson equation. If I understand correctly, both these communications > are listed under "VecAssemblyEnd", but how do I tell the division of > the time between those two? Do I have to register some stages, etc.? It is not really possibly to say ?oh 80% of the time is in computation and 20% in communication? From your log file VecScatterEnd 157271 1.0 1.1521e+03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 25 0 0 0 0 25 0 0 0 0 0 See the 25 near the end, this is the percent of total time going into the VecScatter end which is the communication (except for reductions) in the KSPSolve(). So lots of time in the communication but this is expected with an ethernet network. When you switch to a better network this will drop a good amount. > > Below is the output from the performance summary. Is this bad in terms > of the time spent on communication? This is using 4 nodes, 4 cores per > node on a small cluster with Intel E5-2670 8-core CPUs. The Streams > benchmark indicated that I can't really utilize more than 4 cores per > node. The interconnect is 1 Gb/s ethernet. The speedup vs. 1 core is > 9x. I'm solving incompressible Navier-Stokes on a 128^3 grid, with a > pressure Poisson equation. In this case I used SOR and BiCGStab. Likely you will want to use -pc_type gamg for this solve > > This cluster is where I'm learning the ropes, and I will be using more > tightly-coupled systems in the future (Infiniband). Should I expect an > increase in speedup when I use those? Yes. Your current machine is terrible for parallelism. > Average time for MPI_Barrier(): 0.000110197 > Average time for zero size MPI_Send(): 2.65092e-05 From my perspective your code is running reasonably on this machine. 
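On the question of registering stages: a minimal sketch of how that could look (the stage names and their placement in a time loop are made up here, error checking omitted), so that your own halo exchange and the Poisson solve each get a separate table in -log_summary:

   PetscLogStage comm_stage, solve_stage;

   /* once, after PetscInitialize() */
   PetscLogStageRegister("UserComm", &comm_stage);
   PetscLogStageRegister("PoissonSolve", &solve_stage);

   /* inside the time loop */
   PetscLogStagePush(comm_stage);
   /* ... your own VecScatterBegin()/VecScatterEnd() halo exchange ... */
   PetscLogStagePop();

   PetscLogStagePush(solve_stage);
   KSPSolve(ksp, b, x);
   PetscLogStagePop();

Each stage then gets its own event listing in the summary, so the scatter time of your own communication is reported separately from the scatters inside KSPSolve().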
Barry > > Best regards, > ?smund Ervik > > > > > - ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > > ./run on a double-real named compute-3-11.local with 16 processors, by > asmunder Thu Jun 12 14:22:08 2014 > Using Petsc Release Version 3.4.2, Jul, 02, 2013 > > Max Max/Min Avg Total > Time (sec): 3.246e+03 1.00000 3.246e+03 > Objects: 9.800e+02 1.00000 9.800e+02 > Flops: 1.667e+12 1.00447 1.663e+12 2.661e+13 > Flops/sec: 5.134e+08 1.00447 5.122e+08 8.196e+09 > MPI Messages: 6.163e+05 1.33327 5.393e+05 8.629e+06 > MPI Message Lengths: 4.626e+10 1.49807 7.151e+04 6.171e+11 > MPI Reductions: 4.576e+05 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length > N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 3.2462e+03 100.0% 2.6605e+13 100.0% 8.629e+06 > 100.0% 7.151e+04 100.0% 4.576e+05 100.0% > > - > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). > %T - percent time in this phase %f - percent flops in > this phase > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). 
> %T - percent time in this phase %f - percent flops in > this phase > %M - percent messages in this phase %L - percent message > lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > - > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > - > ------------------------------------------------------------------------------------------------------------------------ > > - --- Event Stage 0: Main Stage > > PetscBarrier 9 1.0 1.3815e+02356592.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 > VecDot 150040 1.0 5.0756e+01 2.1 3.93e+10 1.0 0.0e+00 > 0.0e+00 1.5e+05 1 2 0 0 33 1 2 0 0 33 12399 > VecDotNorm2 75020 1.0 3.1857e+01 1.9 3.93e+10 1.0 0.0e+00 > 0.0e+00 7.5e+04 1 2 0 0 16 1 2 0 0 16 19754 > VecNorm 76620 1.0 2.4263e+01 3.0 2.01e+10 1.0 0.0e+00 > 0.0e+00 7.7e+04 1 1 0 0 17 1 1 0 0 17 13245 > VecCopy 1600 1.0 3.7060e-01 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 1632 1.0 7.2079e-01 4.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 800 1.0 3.7573e-01 1.7 2.10e+08 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8931 > VecAXPBYCZ 150040 1.0 3.9948e+01 1.1 7.87e+10 1.0 0.0e+00 > 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 31507 > VecWAXPY 150040 1.0 3.6130e+01 1.2 3.93e+10 1.0 0.0e+00 > 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 17418 > VecAssemblyBegin 801 1.0 8.6365e+00 8.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 2.4e+03 0 0 0 0 1 0 0 0 0 1 0 > VecAssemblyEnd 801 1.0 1.2031e-03 1.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 157298 1.0 2.1938e+01 1.8 0.00e+00 0.0 8.6e+06 > 7.1e+04 2.7e+01 1 0100100 0 1 0100100 0 0 > VecScatterEnd 157271 1.0 1.1521e+03 3.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 25 0 0 0 0 25 0 0 0 0 0 > MatMult 150840 1.0 1.7150e+03 1.9 7.24e+11 1.0 8.4e+06 > 7.0e+04 0.0e+00 42 43 98 96 0 42 43 98 96 0 6721 > MatSOR 151640 1.0 1.0433e+03 1.4 7.25e+11 1.0 0.0e+00 > 0.0e+00 0.0e+00 26 44 0 0 0 26 44 0 0 0 11126 > MatAssemblyBegin 2 1.0 7.9169e-03 9.9 0.00e+00 0.0 0.0e+00 > 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 2 1.0 2.1147e-02 1.2 0.00e+00 0.0 1.1e+02 > 1.8e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetUp 1 1.0 3.4809e-03 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 800 1.0 2.9176e+03 1.0 1.67e+12 1.0 8.4e+06 > 7.0e+04 4.6e+05 90100 98 96100 90100 98 96100 9119 > PCSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > PCApply 151640 1.0 1.0436e+03 1.4 7.25e+11 1.0 0.0e+00 > 0.0e+00 0.0e+00 26 44 0 0 0 26 44 0 0 0 11123 > - > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' > Mem. > Reports information only for process 0. 
> > - --- Event Stage 0: Main Stage > > Vector 859 812 840120768 0 > Vector Scatter 39 27 17388 0 > Matrix 3 0 0 0 > Matrix Null Space 1 0 0 0 > Distributed Mesh 4 0 0 0 > Bipartite Graph 8 0 0 0 > Index Set 55 55 2636500 0 > IS L to G Mapping 7 0 0 0 > Krylov Solver 1 0 0 0 > DMKSP interface 1 0 0 0 > Preconditioner 1 0 0 0 > Viewer 1 0 0 0 > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 0.000110197 > Average time for zero size MPI_Send(): 2.65092e-05 > #PETSc Option Table entries: > - -log_summary > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure run at: Fri Sep 13 14:49:18 2013 > Configure options: PETSC_ARCH=double-real --with-precision=double > - --with-scalar-type=real --with-cc=mpicc --with-cxx=mpicxx > - --with-fc=mpif90 --with-mpiexec=mpiexec --with-debugging=0 > - --COPTFLAGS="-O2 -fp-model extended" --FOPTFLAGS="-O2 -fltconsistency" > - > --with-blas-lapack-dir=/share/apps/modulessoftware/intel/compilers/13.0.1/mkl/lib/intel64 > - --with-64-bit-indices=0 --with-clanguage=c++ --with-shared-libraries=0 > - --download-ml --download-hypre > - ----------------------------------------- > Libraries compiled on Fri Sep 13 14:49:18 2013 on rocks.hpc.ntnu.no > Machine characteristics: > Linux-2.6.18-308.1.1.el5-x86_64-with-redhat-5.6-Tikanga > Using PETSc directory: /share/apps/modulessoftware/petsc/petsc-3.4.2 > Using PETSc arch: double-real > - ----------------------------------------- > > Using C compiler: mpicxx -wd1572 -O3 ${COPTFLAGS} ${CFLAGS} > Using Fortran compiler: mpif90 -O2 -fltconsistency ${FOPTFLAGS} > ${FFLAGS} > - ----------------------------------------- > Using include paths: > - -I/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/include > - -I/share/apps/modulessoftware/petsc/petsc-3.4.2/include > - -I/share/apps/modulessoftware/petsc/petsc-3.4.2/include > - -I/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/include > - -I/share/apps/modulessoftware/openmpi/openmpi-1.7.2-intel/include > - ----------------------------------------- > > Using C linker: mpicxx > Using Fortran linker: mpif90 > Using libraries: > - -Wl,-rpath,/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/lib > - -L/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/lib > - -lpetsc > - -Wl,-rpath,/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/lib > - -L/share/apps/modulessoftware/petsc/petsc-3.4.2/double-real/lib -lml > - -lHYPRE > - > -Wl,-rpath,/share/apps/modulessoftware/intel/compilers/13.0.1/mkl/lib/intel64 > - -L/share/apps/modulessoftware/intel/compilers/13.0.1/mkl/lib/intel64 > - -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 > - -lpthread > - -L/share/apps/modulessoftware/openmpi/openmpi-1.7.2-intel/lib > - > -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/compiler/lib/intel64 > - > -L/share/apps/modulessoftware/intel/compilers/opt/intel/mic/coi/host-linux-release/lib > - -L/share/apps/modulessoftware/intel/compilers/opt/intel/mic/myo/lib > - > -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/mpirt/lib/intel64 > - > -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/ipp/lib/intel64 > - > 
-L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/mkl/lib/intel64 > - > -L/share/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/tbb/lib/intel64 > - > -L/gpfs/shareapps/apps/modulessoftware/intel/compilers/13.0.1/composer_xe_2013.1.117/compiler/lib/intel64 > - -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -lmpi_usempif08 > - -lmpi_usempi_ignore_tkr -lmpi_mpifh -lifport -lifcore -lm -lm > - -lmpi_cxx -ldl -lmpi -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts > - -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -ldl > - ----------------------------------------- > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.22 (GNU/Linux) > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ > > iQEcBAEBAgAGBQJTmucvAAoJED+FDAHgGz19vsYH/3E+g74VQFYIAIcf4tN/99WR > c2ofaByZyXbU9e7NiQyn0gbqwBjDtYKOWe8vMRkWx7AdVBgS0z2ChjpZHK5TrtlF > tW2JNztHBB7hgTisd5/2N5toNiCQWxUJu4/8jzbvjoaXrfU+aV3igLTLNbcT/2Rz > KSmPxxc77JYj55vd4v8E8yxA1sfwppMCcyTwzlOSGRO8yiie1fgaDvQFySeoNEL5 > ZMBwicNH4YBFYmEI8TH0DP6AjElW9mQOsEM+ktpupxmoFxwG3ciMKxrzpt3ID8Dw > X6gv+F8F73tzsLN09SPkjmz/vPtoS03om9ZnkQYm+qaLQ+n1wz6RcnpG/Bo3y6Q= > =DTtk > -----END PGP SIGNATURE----- From anush at bu.edu Fri Jun 13 21:04:03 2014 From: anush at bu.edu (Anush Krishnan) Date: Fri, 13 Jun 2014 22:04:03 -0400 Subject: [petsc-users] Multi-DOF DMDA Vec In-Reply-To: <2B5F659B-5182-46D6-8D21-89E9E69E2DB1@mcs.anl.gov> References: <87d2eclkfy.fsf@jedbrown.org> <2B5F659B-5182-46D6-8D21-89E9E69E2DB1@mcs.anl.gov> Message-ID: On 13 June 2014 14:59, Barry Smith wrote: > > On Jun 13, 2014, at 1:43 PM, Jed Brown wrote: > > > Anush Krishnan writes: > > > >> Hello petsc-users, > >> > >> I created a vector using DMDACreate with 3 degrees of freedom. Is it > >> possible for me to access each vector corresponding to a degree of > freedom? > >> Seeing that I need to access the array as [k][j][i][dof], does it mean > that > >> the values of each component are not contiguous? > > > > The values are interlaced. This is generally better for memory > > performance (cache reuse). See, for example, the PETSc-FUN3D papers or > > the various discretization frameworks that rediscover this every once in > > a while. > > You can pull out a single set of DOF with VecStrideGather() and put > it back with VecStrideScatter() see also VecStrideGatherAll(), > VecStrideScatterAll() also see VecStrideNorm() etc. > > With regard to the interlaced memory performing better: If I used three vectors created from the same DMDA for each degree of freedom, how different would that be in performance compared to a fully interlaced vector? Wouldn't cache reuse be about the same for both cases? > > > >> Also, what is the difference between DMDAVecGetArray and > DMDAVecGetArrayDOF? > > > > With DMDAVecGetArray for multi-component problems, you usually write > > > > typedef struct {PetscScalar u,v,w;} Field; > > Field ***x; > > DMDAVecGetArray(dm,X,&x); > > ... > > x[k][j][i].u = 1; > > > > With DMDAVecGetArrayDOF, you use an extra set of indices instead of the > > struct. > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Fri Jun 13 21:09:34 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 13 Jun 2014 21:09:34 -0500 Subject: [petsc-users] Multi-DOF DMDA Vec In-Reply-To: References: <87d2eclkfy.fsf@jedbrown.org> <2B5F659B-5182-46D6-8D21-89E9E69E2DB1@mcs.anl.gov> Message-ID: <87mwdgi6n5.fsf@jedbrown.org> Anush Krishnan writes: > With regard to the interlaced memory performing better: If I used three > vectors created from the same DMDA for each degree of freedom, how > different would that be in performance compared to a fully interlaced > vector? Wouldn't cache reuse be about the same for both cases? No, when you traverse the grid accessing all three components, you will have three times as many prefetch streams (typically reducing prefetch capability, thus generating more cold cache misses) and will spill irregularly over cache lines more frequently, thus reducing the effective cache size. This can result in an integer-factor slowdown as compared to interlaced storage. By all means, run the experiment, but the expected result for memory bandwidth/cache-limited operations is that interlaced delivers significantly better performance. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Fri Jun 13 21:22:40 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 13 Jun 2014 21:22:40 -0500 Subject: [petsc-users] Multi-DOF DMDA Vec In-Reply-To: <87mwdgi6n5.fsf@jedbrown.org> References: <87d2eclkfy.fsf@jedbrown.org> <2B5F659B-5182-46D6-8D21-89E9E69E2DB1@mcs.anl.gov> <87mwdgi6n5.fsf@jedbrown.org> Message-ID: The main reason to ?pull out? a single component is, for example, to solve a linear system for that single component; that is, to work on that single component a great deal. You wouldn?t pull out the individual components to iterate on them all together. Barry On Jun 13, 2014, at 9:09 PM, Jed Brown wrote: > Anush Krishnan writes: >> With regard to the interlaced memory performing better: If I used three >> vectors created from the same DMDA for each degree of freedom, how >> different would that be in performance compared to a fully interlaced >> vector? Wouldn't cache reuse be about the same for both cases? > > No, when you traverse the grid accessing all three components, you will > have three times as many prefetch streams (typically reducing prefetch > capability, thus generating more cold cache misses) and will spill > irregularly over cache lines more frequently, thus reducing the > effective cache size. This can result in an integer-factor slowdown as > compared to interlaced storage. By all means, run the experiment, but > the expected result for memory bandwidth/cache-limited operations is > that interlaced delivers significantly better performance. From anush at bu.edu Fri Jun 13 21:41:25 2014 From: anush at bu.edu (Anush Krishnan) Date: Fri, 13 Jun 2014 22:41:25 -0400 Subject: [petsc-users] Multi-DOF DMDA Vec In-Reply-To: References: <87d2eclkfy.fsf@jedbrown.org> <2B5F659B-5182-46D6-8D21-89E9E69E2DB1@mcs.anl.gov> <87mwdgi6n5.fsf@jedbrown.org> Message-ID: On 13 June 2014 22:22, Barry Smith wrote: > > The main reason to ?pull out? a single component is, for example, to > solve a linear system for that single component; that is, to work on that > single component a great deal. You wouldn?t pull out the individual > components to iterate on them all together. 
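For concreteness, a sketch of that gather/scatter for one component of a 3-dof interlaced vector (the names and the choice of component 0 are only illustrative, error checking omitted):

   Vec      U;   /* interlaced dof=3 vector, e.g. from DMCreateGlobalVector() on the DMDA */
   Vec      u;   /* will hold a single component, 1/3 of the entries of U */
   PetscInt n;
   VecGetLocalSize(U, &n);
   VecCreateMPI(PETSC_COMM_WORLD, n/3, PETSC_DECIDE, &u);
   VecStrideGather(U, 0, u, INSERT_VALUES);    /* copy component 0 of U into u */
   /* ... work on u alone, e.g. a MatMult() or a single-component solve ... */
   VecStrideScatter(u, 0, U, INSERT_VALUES);   /* copy it back into U */
   VecDestroy(&u);

The gather and scatter do copy the entries, since a single component is not contiguous in the interlaced layout.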
> I needed to pull out a single component to perform a matrix-vector multiply for further processing, but I realised I could just rearrange the matrix instead. Thank you, Barry and Jed. That was very helpful. > > Barry > > On Jun 13, 2014, at 9:09 PM, Jed Brown wrote: > > > Anush Krishnan writes: > >> With regard to the interlaced memory performing better: If I used three > >> vectors created from the same DMDA for each degree of freedom, how > >> different would that be in performance compared to a fully interlaced > >> vector? Wouldn't cache reuse be about the same for both cases? > > > > No, when you traverse the grid accessing all three components, you will > > have three times as many prefetch streams (typically reducing prefetch > > capability, thus generating more cold cache misses) and will spill > > irregularly over cache lines more frequently, thus reducing the > > effective cache size. This can result in an integer-factor slowdown as > > compared to interlaced storage. By all means, run the experiment, but > > the expected result for memory bandwidth/cache-limited operations is > > that interlaced delivers significantly better performance. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hus003 at ucsd.edu Sat Jun 14 17:18:32 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 14 Jun 2014 22:18:32 +0000 Subject: [petsc-users] Using operators in KSP Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6C23E5@XMAIL-MBX-BH1.AD.UCSD.EDU> I try to program 2D Stokes equation solver, so it is a linear PDE, and there are u, v, p on every grid point. One way on my mind is to form a matrix-free block matrix. In that way, if the discretization is n by n, then the matrix is 3n by 3n. However, I'm also thinking if it is possible to define the PDE operator as what DMDASNESSetFunctionLocal does in SNES example ex19? In that example, the unknowns (u, v, omega, T) are defined as a struct of four PestsScalar on every grid point, and then the interface converts PestcScalar** to Vec, and an operator instead of a matrix is formed. Is there a function in KSP similar to DMDASNESSetFunctionLocal in SNES? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat Jun 14 17:40:18 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 14 Jun 2014 17:40:18 -0500 Subject: [petsc-users] Using operators in KSP In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C23E5@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C23E5@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87sin7f73h.fsf@jedbrown.org> "Sun, Hui" writes: > I try to program 2D Stokes equation solver, so it is a linear PDE, and > there are u, v, p on every grid point. One way on my mind is to form a > matrix-free block matrix. In that way, if the discretization is n by > n, then the matrix is 3n by 3n. However, I'm also thinking if it is > possible to define the PDE operator as what DMDASNESSetFunctionLocal > does in SNES example ex19? In that example, the unknowns (u, v, omega, > T) are defined as a struct of four PestsScalar on every grid point, > and then the interface converts PestcScalar** to Vec, and an operator > instead of a matrix is formed. > > Is there a function in KSP similar to DMDASNESSetFunctionLocal in SNES? DMKSPSetComputeOperators() is sort of similar, but you still build matrices. You can use MatShell to wrap a "matrix-free" (unassembled) linear operator. 
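A bare-bones sketch of that route (MyStokesMult, the sizes nloc/N, user_ctx and ksp are placeholders here, and error checking is omitted):

   /* user routine that applies the Stokes operator: y = A x */
   PetscErrorCode MyStokesMult(Mat A, Vec x, Vec y);

   Mat A;
   MatCreateShell(PETSC_COMM_WORLD, nloc, nloc, N, N, (void*)user_ctx, &A);
   MatShellSetOperation(A, MATOP_MULT, (void (*)(void))MyStokesMult);
   KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);   /* petsc-3.4 calling sequence */

Inside MyStokesMult you can still use DMDAVecGetArray() on x and y, so applying the operator reads much like a residual evaluation.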
If you like the interface DMDASNESSetFunctionLocal, I recommend just using that interface and -snes_type ksponly (just do one linear solve and don't bother checking for convergence; recommended for linear problems formulated using SNES). This is also the easiest route if you might later generalize to a nonlinear problem (e.g., non-Newtonian rheology or Navier-Stokes). Note that most preconditioners will require a matrix to be assembled. You can set a sparsity pattern for the Stokes operator (or use the pattern from DMDA if it's good enough) and use coloring (default) to assemble a matrix. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat Jun 14 17:43:12 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 14 Jun 2014 17:43:12 -0500 Subject: [petsc-users] Using operators in KSP In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C23E5@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C23E5@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <35C5FC64-B8D3-45EE-87CC-89B5FCA40984@mcs.anl.gov> MatCreateShell() is what I think you need. You provide a ?matrix-free? linear operator with MatShellSetOperation(mat,MATOP_MULT,yourfunction) and yourfunction applies the linear operator any way it likes. You then provide this matrix as the first Mat argument to KSPSetOperators(). Barry On Jun 14, 2014, at 5:18 PM, Sun, Hui wrote: > I try to program 2D Stokes equation solver, so it is a linear PDE, and there are u, v, p on every grid point. One way on my mind is to form a matrix-free block matrix. In that way, if the discretization is n by n, then the matrix is 3n by 3n. However, I'm also thinking if it is possible to define the PDE operator as what DMDASNESSetFunctionLocal does in SNES example ex19? In that example, the unknowns (u, v, omega, T) are defined as a struct of four PestsScalar on every grid point, and then the interface converts PestcScalar** to Vec, and an operator instead of a matrix is formed. > > Is there a function in KSP similar to DMDASNESSetFunctionLocal in SNES? From quecat001 at gmail.com Sun Jun 15 09:51:02 2014 From: quecat001 at gmail.com (Que Cat) Date: Sun, 15 Jun 2014 09:51:02 -0500 Subject: [petsc-users] FE mesh partition Message-ID: Dear Petsc-users, I have tried to partition a FE mesh by following part 3.5 (page 68) of the manual. If I do the partition by element, then I have the "is" for new destination of elements and "isg" for their new number. Is there any way to also partition the node and renumber the node based on the partition? It would be helpful if you could point me to any similar examples. Thank you. Que -------------- next part -------------- An HTML attachment was scrubbed... URL: From jianjun.xiao at kit.edu Sun Jun 15 15:32:32 2014 From: jianjun.xiao at kit.edu (Xiao, Jianjun (IKET)) Date: Sun, 15 Jun 2014 22:32:32 +0200 Subject: [petsc-users] memory speed & the speed of sparse matrix computation Message-ID: <56D054AF2E93E044AC1D2685709D2868D8C575E6D5@KIT-MSX-07.kit.edu> Hello, In PETSc FAQ, I found such a statement below: This is because the speed of sparse matrix computations is almost totally determined by the speed of the memory, not the speed of the CPU. I am not an expert in matrix solving. Could anybody explain why the memory speed is so critical for the speed of sparse matrix computation? Is there any reference? Thank you. 
Best regards JJ From knepley at gmail.com Sun Jun 15 19:18:22 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 15 Jun 2014 19:18:22 -0500 Subject: [petsc-users] memory speed & the speed of sparse matrix computation In-Reply-To: <56D054AF2E93E044AC1D2685709D2868D8C575E6D5@KIT-MSX-07.kit.edu> References: <56D054AF2E93E044AC1D2685709D2868D8C575E6D5@KIT-MSX-07.kit.edu> Message-ID: On Sun, Jun 15, 2014 at 3:32 PM, Xiao, Jianjun (IKET) wrote: > Hello, > > In PETSc FAQ, I found such a statement below: > > This is because the speed of sparse matrix computations is almost totally > determined by the speed of the memory, not the speed of the CPU. > > I am not an expert in matrix solving. Could anybody explain why the memory > speed is so critical for the speed of sparse matrix computation? Is there > any reference? > Yes, http://www.mcs.anl.gov/~kaushik/Papers/pcfd99_gkks.pdf Matt > Thank you. > > Best regards > JJ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hus003 at ucsd.edu Mon Jun 16 00:40:40 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Mon, 16 Jun 2014 05:40:40 +0000 Subject: [petsc-users] Using operators in KSP In-Reply-To: <35C5FC64-B8D3-45EE-87CC-89B5FCA40984@mcs.anl.gov> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C23E5@XMAIL-MBX-BH1.AD.UCSD.EDU>, <35C5FC64-B8D3-45EE-87CC-89B5FCA40984@mcs.anl.gov> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6C2444@XMAIL-MBX-BH1.AD.UCSD.EDU> Thank you Barry and Jed for answering my question. Here is another question: If I have a nonlinear PDE, for example, Poisson Boltzmann with an inhomogeneous dielectric constant epsilon. I want to solve for psi, but not epsilon. But to form the SetFunctionLocal, one needs to provide psi as well as epsilon. The user defined function which is passed to DMDASNESSetFunctionLocal should be of the format PetscErrorCode (*func)(DMDALocalInfo*,void*,void*,void*), where the first void* corresponds to x, the dimensional pointer to state at which to evaluate residual; and the second void* corresponds to f, the dimensional pointer to residual, write the residual here; and the third void* is the optional context passed above. The scalar field epsilon is not part of x, neither is it part of f, it seems that the only choice is to pass it in the third void*. However, if I have some other parameters, should I set up a struct, which includes those other parameters, and a distributed Vec for epsilon, and pass it in the third void*? Best, Hui ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Saturday, June 14, 2014 3:43 PM To: Sun, Hui Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Using operators in KSP MatCreateShell() is what I think you need. You provide a ?matrix-free? linear operator with MatShellSetOperation(mat,MATOP_MULT,yourfunction) and yourfunction applies the linear operator any way it likes. You then provide this matrix as the first Mat argument to KSPSetOperators(). Barry On Jun 14, 2014, at 5:18 PM, Sun, Hui wrote: > I try to program 2D Stokes equation solver, so it is a linear PDE, and there are u, v, p on every grid point. One way on my mind is to form a matrix-free block matrix. In that way, if the discretization is n by n, then the matrix is 3n by 3n. 
However, I'm also thinking if it is possible to define the PDE operator as what DMDASNESSetFunctionLocal does in SNES example ex19? In that example, the unknowns (u, v, omega, T) are defined as a struct of four PestsScalar on every grid point, and then the interface converts PestcScalar** to Vec, and an operator instead of a matrix is formed. > > Is there a function in KSP similar to DMDASNESSetFunctionLocal in SNES? From knepley at gmail.com Mon Jun 16 05:09:08 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 16 Jun 2014 05:09:08 -0500 Subject: [petsc-users] Using operators in KSP In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C2444@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C23E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <35C5FC64-B8D3-45EE-87CC-89B5FCA40984@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6C2444@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On Mon, Jun 16, 2014 at 12:40 AM, Sun, Hui wrote: > Thank you Barry and Jed for answering my question. Here is another > question: If I have a nonlinear PDE, for example, Poisson Boltzmann with an > inhomogeneous dielectric constant epsilon. I want to solve for psi, but not > epsilon. But to form the SetFunctionLocal, one needs to provide psi as well > as epsilon. > > The user defined function which is passed to DMDASNESSetFunctionLocal > should be of the format > PetscErrorCode (*func)(DMDALocalInfo*,void*,void*,void*), > where the first void* corresponds to x, the dimensional pointer to state > at which to evaluate residual; and the second void* corresponds to f, the > dimensional pointer to residual, write the residual here; and the third > void* is the optional context passed above. > > The scalar field epsilon is not part of x, neither is it part of f, it > seems that the only choice is to pass it in the third void*. However, if I > have some other parameters, should I set up a struct, which includes those > other parameters, and a distributed Vec for epsilon, and pass it in the > third void*? > Yes, that is the idea. For example, http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex5.c.html defines a struct with problem parameters (so does ex19) Thanks, Matt > Best, > Hui > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Saturday, June 14, 2014 3:43 PM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Using operators in KSP > > MatCreateShell() is what I think you need. You provide a ?matrix-free? > linear operator with MatShellSetOperation(mat,MATOP_MULT,yourfunction) and > yourfunction applies the linear operator any way it likes. > > You then provide this matrix as the first Mat argument to > KSPSetOperators(). > > > Barry > > > > On Jun 14, 2014, at 5:18 PM, Sun, Hui wrote: > > > I try to program 2D Stokes equation solver, so it is a linear PDE, and > there are u, v, p on every grid point. One way on my mind is to form a > matrix-free block matrix. In that way, if the discretization is n by n, > then the matrix is 3n by 3n. However, I'm also thinking if it is possible > to define the PDE operator as what DMDASNESSetFunctionLocal does in SNES > example ex19? In that example, the unknowns (u, v, omega, T) are defined as > a struct of four PestsScalar on every grid point, and then the interface > converts PestcScalar** to Vec, and an operator instead of a matrix is > formed. > > > > Is there a function in KSP similar to DMDASNESSetFunctionLocal in SNES? 
> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Jun 16 07:50:41 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 16 Jun 2014 07:50:41 -0500 Subject: [petsc-users] Using operators in KSP In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C2444@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C23E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <35C5FC64-B8D3-45EE-87CC-89B5FCA40984@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6C2444@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87zjhdc926.fsf@jedbrown.org> "Sun, Hui" writes: > The scalar field epsilon is not part of x, neither is it part of f, it > seems that the only choice is to pass it in the third void*. However, > if I have some other parameters, should I set up a struct, which > includes those other parameters, and a distributed Vec for epsilon, > and pass it in the third void*? Yes, this is the typical approach. See src/snes/examples/tutorials/ex28.c for an example of a way to structured code that reuses local physics for a 2-field problem and can solve for either field or both fields. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From oliver.browne at upm.es Mon Jun 16 08:59:33 2014 From: oliver.browne at upm.es (Oliver Browne) Date: Mon, 16 Jun 2014 15:59:33 +0200 Subject: [petsc-users] ILU MPI Message-ID: <6691d286e5c16e3b1bbf282b658bddde@upm.es> Hi, Could someone explain if there is any difference in these two approaches when it comes to MPI. When using ILU and running with; -sub_pc_type ilu -sub_pc_factor_levels 1 -sub_ksp_type preonly or using the external package hypre; -pc_type hypre -pc_hypre_type euclid -pc_hypre_euclid_levels k Thanks Ollie From jed at jedbrown.org Mon Jun 16 09:03:16 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 16 Jun 2014 09:03:16 -0500 Subject: [petsc-users] ILU MPI In-Reply-To: <6691d286e5c16e3b1bbf282b658bddde@upm.es> References: <6691d286e5c16e3b1bbf282b658bddde@upm.es> Message-ID: <87oaxtc5p7.fsf@jedbrown.org> Oliver Browne writes: > Hi, > > Could someone explain if there is any difference in these two approaches > when it comes to MPI. When using ILU and running with; > > -sub_pc_type ilu -sub_pc_factor_levels 1 -sub_ksp_type preonly This is (assuming you didn't change it) block Jacobi with ILU in blocks. > or using the external package hypre; > > -pc_type hypre -pc_hypre_type euclid -pc_hypre_euclid_levels k This is parallel ILU. This might be stronger, but also likely more expensive. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From balay at mcs.anl.gov Mon Jun 16 10:41:40 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 16 Jun 2014 10:41:40 -0500 Subject: [petsc-users] MatNestGetISs in fortran In-Reply-To: References: <64c4658aeb7441abbe20e4aa252554a2@MAR190N1.marin.local>, Message-ID: perhaps this routine does not need custom fortran interface. Does the attached src/mat/impls/nest/ftn-auto/matnestf.c work [with petsc-3.4]? 
If so - I'll add this to petsc dev [master] thanks, Satish On Fri, 13 Jun 2014, Klaij, Christiaan wrote: > Perhaps this message from May 27 "slipped through the email cracks" as Matt puts it? > > Chris > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > ________________________________________ > From: Klaij, Christiaan > Sent: Monday, June 02, 2014 9:54 AM > To: petsc-users at mcs.anl.gov > Subject: RE: MatNestGetISs in fortran > > Just a reminder. Could you please add fortran support for MatNestGetISs? > ________________________________________ > From: Klaij, Christiaan > Sent: Tuesday, May 27, 2014 3:47 PM > To: petsc-users at mcs.anl.gov > Subject: MatNestGetISs in fortran > > I'm trying to use MatNestGetISs in a fortran program but it seems to be missing from the fortran include file (PETSc 3.4). > > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: matnestf.c URL: From alan.avbs at rocketmail.com Mon Jun 16 12:07:39 2014 From: alan.avbs at rocketmail.com (alan.avbs at rocketmail.com) Date: Mon, 16 Jun 2014 10:07:39 -0700 Subject: [petsc-users] MatNestGetISs in fortran In-Reply-To: Message-ID: <1402938459.90659.YahooMailAndroidMobile@web122005.mail.ne1.yahoo.com> Sent from Yahoo Mail on Android -------------- next part -------------- An HTML attachment was scrubbed... URL: From keceli at gmail.com Tue Jun 17 11:17:28 2014 From: keceli at gmail.com (=?UTF-8?Q?murat_ke=C3=A7eli?=) Date: Tue, 17 Jun 2014 11:17:28 -0500 Subject: [petsc-users] memory speed & the speed of sparse matrix computation In-Reply-To: References: <56D054AF2E93E044AC1D2685709D2868D8C575E6D5@KIT-MSX-07.kit.edu> Message-ID: The link gives 404 error. On Sun, Jun 15, 2014 at 7:18 PM, Matthew Knepley wrote: > On Sun, Jun 15, 2014 at 3:32 PM, Xiao, Jianjun (IKET) > wrote: >> >> Hello, >> >> In PETSc FAQ, I found such a statement below: >> >> This is because the speed of sparse matrix computations is almost totally >> determined by the speed of the memory, not the speed of the CPU. >> >> I am not an expert in matrix solving. Could anybody explain why the memory >> speed is so critical for the speed of sparse matrix computation? Is there >> any reference? > > > Yes, http://www.mcs.anl.gov/~kaushik/Papers/pcfd99_gkks.pdf > > Matt > >> >> Thank you. >> >> Best regards >> JJ > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener From pvsang002 at gmail.com Tue Jun 17 14:12:04 2014 From: pvsang002 at gmail.com (Sang pham van) Date: Tue, 17 Jun 2014 15:12:04 -0400 Subject: [petsc-users] SNES divergence Message-ID: Hi, I am using DM structure and SNES to solve a 3D problem. In the problem I have 3 variables. I got SNES converged with a grid. Obtain result are physically right. However when I refine the grid, SNES does not always converge, the reason of divergence is line search fail or linear solver failed. (I also tried other type of SNES, line search seems to be the one best fits my problem) Can you please give me a suggestion to figure out the problem with my solver? What options should I use to have pure Newton method in SNES? Is there any advance option I can use with line search to improve SNES convergence. 
Thank you very much. Minh. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jun 17 14:20:42 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 17 Jun 2014 14:20:42 -0500 Subject: [petsc-users] SNES divergence In-Reply-To: References: Message-ID: <28EE022E-CE78-4F88-8DBB-55E645A747D1@mcs.anl.gov> On Jun 17, 2014, at 2:12 PM, Sang pham van wrote: > Hi, > > I am using DM structure and SNES to solve a 3D problem. In the problem I have 3 variables. > > I got SNES converged with a grid. Obtain result are physically right. However when I refine the grid, SNES does not always converge, With the refined grid how are you starting the solution? Do you use the interpolated solution from the coarser grid (called grid sequencing) or just some ?not good? initial guess? > the reason of divergence is line search fail or linear solver failed. (I also tried other type of SNES, line search seems to be the one best fits my problem) > > Can you please give me a suggestion to figure out the problem with my solver? What options should I use to have pure Newton method in SNES? Is there any advance option I can use with line search to improve SNES convergence. > You should use grid sequencing, not only does it get convergence when you may not otherwise get it but it will also solve the problem faster. With PETSc DM you can use -snes_grid_sequence n or SNESSetGridSequence() in the code to do n levels of grid sequencing. Barry > Thank you very much. > > Minh. > From pvsang002 at gmail.com Tue Jun 17 14:30:11 2014 From: pvsang002 at gmail.com (Sang pham van) Date: Tue, 17 Jun 2014 15:30:11 -0400 Subject: [petsc-users] SNES divergence In-Reply-To: <28EE022E-CE78-4F88-8DBB-55E645A747D1@mcs.anl.gov> References: <28EE022E-CE78-4F88-8DBB-55E645A747D1@mcs.anl.gov> Message-ID: Thanks Barry, To refine the grid, I just put more points in the direction. For both fine and coarse meshes, I used simple initial guess (say constant values in whole domain for all variables). By using grid sequencing, is the finest mesh is the one I first input the solver? Can you let me know what options should I use to have pure Newton method? S. On Tue, Jun 17, 2014 at 3:20 PM, Barry Smith wrote: > > On Jun 17, 2014, at 2:12 PM, Sang pham van wrote: > > > Hi, > > > > I am using DM structure and SNES to solve a 3D problem. In the problem I > have 3 variables. > > > > I got SNES converged with a grid. Obtain result are physically right. > However when I refine the grid, SNES does not always converge, > > With the refined grid how are you starting the solution? Do you use the > interpolated solution from the coarser grid (called grid sequencing) or > just some ?not good? initial guess? > > > the reason of divergence is line search fail or linear solver failed. (I > also tried other type of SNES, line search seems to be the one best fits my > problem) > > > > Can you please give me a suggestion to figure out the problem with my > solver? What options should I use to have pure Newton method in SNES? Is > there any advance option I can use with line search to improve SNES > convergence. > > > > You should use grid sequencing, not only does it get convergence when > you may not otherwise get it but it will also solve the problem faster. > With PETSc DM you can use -snes_grid_sequence n or SNESSetGridSequence() > in the code to do n levels of grid sequencing. > > Barry > > > Thank you very much. > > > > Minh. 
> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jun 17 14:46:54 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 17 Jun 2014 14:46:54 -0500 Subject: [petsc-users] SNES divergence In-Reply-To: References: <28EE022E-CE78-4F88-8DBB-55E645A747D1@mcs.anl.gov> Message-ID: <6AC313E4-592F-46D9-BC42-BA39E1BF467F@mcs.anl.gov> On Jun 17, 2014, at 2:30 PM, Sang pham van wrote: > Thanks Barry, > > To refine the grid, I just put more points in the direction. For both fine and coarse meshes, I used simple initial guess (say constant values in whole domain for all variables). > > By using grid sequencing, is the finest mesh is the one I first input the solver? No, you pass in the coarse one. > > Can you let me know what options should I use to have pure Newton method? Not sure what you mean by pure Newton method, maybe without grid sequencing? It simply may not be possible to get convergence from a ?poor? initial guess. One should always use grid sequencing if possible. Barry > > S. > > > On Tue, Jun 17, 2014 at 3:20 PM, Barry Smith wrote: > > On Jun 17, 2014, at 2:12 PM, Sang pham van wrote: > > > Hi, > > > > I am using DM structure and SNES to solve a 3D problem. In the problem I have 3 variables. > > > > I got SNES converged with a grid. Obtain result are physically right. However when I refine the grid, SNES does not always converge, > > With the refined grid how are you starting the solution? Do you use the interpolated solution from the coarser grid (called grid sequencing) or just some ?not good? initial guess? > > > the reason of divergence is line search fail or linear solver failed. (I also tried other type of SNES, line search seems to be the one best fits my problem) > > > > Can you please give me a suggestion to figure out the problem with my solver? What options should I use to have pure Newton method in SNES? Is there any advance option I can use with line search to improve SNES convergence. > > > > You should use grid sequencing, not only does it get convergence when you may not otherwise get it but it will also solve the problem faster. With PETSc DM you can use -snes_grid_sequence n or SNESSetGridSequence() in the code to do n levels of grid sequencing. > > Barry > > > Thank you very much. > > > > Minh. > > > > From pvsang002 at gmail.com Tue Jun 17 15:03:40 2014 From: pvsang002 at gmail.com (Sang pham van) Date: Tue, 17 Jun 2014 16:03:40 -0400 Subject: [petsc-users] SNES divergence In-Reply-To: <6AC313E4-592F-46D9-BC42-BA39E1BF467F@mcs.anl.gov> References: <28EE022E-CE78-4F88-8DBB-55E645A747D1@mcs.anl.gov> <6AC313E4-592F-46D9-BC42-BA39E1BF467F@mcs.anl.gov> Message-ID: Hi Barry, I want to try the original version of Newton method, no line search nor trust region... Using grid sequencing confuses me at the point that I don't know explicitly the final mesh size, or even if I can estimate the mesh size from the input -snes_grid_sequence n it's not so convenient for me when some of my routines need such info from very first moment. I wish PETSc could allow us to pass in the finest mesh for grid sequencing. S, On Tue, Jun 17, 2014 at 3:46 PM, Barry Smith wrote: > > On Jun 17, 2014, at 2:30 PM, Sang pham van wrote: > > > Thanks Barry, > > > > To refine the grid, I just put more points in the direction. For both > fine and coarse meshes, I used simple initial guess (say constant values in > whole domain for all variables). > > > > By using grid sequencing, is the finest mesh is the one I first input > the solver? 
> > No, you pass in the coarse one. > > > > Can you let me know what options should I use to have pure Newton method? > > Not sure what you mean by pure Newton method, maybe without grid > sequencing? It simply may not be possible to get convergence from a ?poor? > initial guess. One should always use grid sequencing if possible. > > Barry > > > > > S. > > > > > > On Tue, Jun 17, 2014 at 3:20 PM, Barry Smith wrote: > > > > On Jun 17, 2014, at 2:12 PM, Sang pham van wrote: > > > > > Hi, > > > > > > I am using DM structure and SNES to solve a 3D problem. In the problem > I have 3 variables. > > > > > > I got SNES converged with a grid. Obtain result are physically right. > However when I refine the grid, SNES does not always converge, > > > > With the refined grid how are you starting the solution? Do you use > the interpolated solution from the coarser grid (called grid sequencing) or > just some ?not good? initial guess? > > > > > the reason of divergence is line search fail or linear solver failed. > (I also tried other type of SNES, line search seems to be the one best fits > my problem) > > > > > > Can you please give me a suggestion to figure out the problem with my > solver? What options should I use to have pure Newton method in SNES? Is > there any advance option I can use with line search to improve SNES > convergence. > > > > > > > You should use grid sequencing, not only does it get convergence when > you may not otherwise get it but it will also solve the problem faster. > With PETSc DM you can use -snes_grid_sequence n or SNESSetGridSequence() > in the code to do n levels of grid sequencing. > > > > Barry > > > > > Thank you very much. > > > > > > Minh. > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jun 17 15:09:58 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 17 Jun 2014 15:09:58 -0500 Subject: [petsc-users] SNES divergence In-Reply-To: References: <28EE022E-CE78-4F88-8DBB-55E645A747D1@mcs.anl.gov> <6AC313E4-592F-46D9-BC42-BA39E1BF467F@mcs.anl.gov> Message-ID: <46328E64-A7F2-4273-877C-3A6EAB847540@mcs.anl.gov> On Jun 17, 2014, at 3:03 PM, Sang pham van wrote: > Hi Barry, > > I want to try the original version of Newton method, no line search nor trust region? This will often fail, original version Newton just doesn?t work without a good initial solution. > Using grid sequencing confuses me at the point that I don't know explicitly the final mesh size, or even if I can estimate the mesh size from the input -snes_grid_sequence n it's not so convenient for me when some of my routines need such info from very first moment. To do grid sequencing your code needs to be able to compute the function on several grids, and compute the Jacobian on several grids. If your code is hardwired to work on only one grid (with global variables or whatever) you cannot use grid sequencing and you out of luck. You should organize your code so that it doesn?t depend on any hardwired grid sizes. Barry > I wish PETSc could allow us to pass in the finest mesh for grid sequencing. > > > S, > > > On Tue, Jun 17, 2014 at 3:46 PM, Barry Smith wrote: > > On Jun 17, 2014, at 2:30 PM, Sang pham van wrote: > > > Thanks Barry, > > > > To refine the grid, I just put more points in the direction. For both fine and coarse meshes, I used simple initial guess (say constant values in whole domain for all variables). > > > > By using grid sequencing, is the finest mesh is the one I first input the solver? 
> > No, you pass in the coarse one. > > > > Can you let me know what options should I use to have pure Newton method? > > Not sure what you mean by pure Newton method, maybe without grid sequencing? It simply may not be possible to get convergence from a ?poor? initial guess. One should always use grid sequencing if possible. > > Barry > > > > > S. > > > > > > On Tue, Jun 17, 2014 at 3:20 PM, Barry Smith wrote: > > > > On Jun 17, 2014, at 2:12 PM, Sang pham van wrote: > > > > > Hi, > > > > > > I am using DM structure and SNES to solve a 3D problem. In the problem I have 3 variables. > > > > > > I got SNES converged with a grid. Obtain result are physically right. However when I refine the grid, SNES does not always converge, > > > > With the refined grid how are you starting the solution? Do you use the interpolated solution from the coarser grid (called grid sequencing) or just some ?not good? initial guess? > > > > > the reason of divergence is line search fail or linear solver failed. (I also tried other type of SNES, line search seems to be the one best fits my problem) > > > > > > Can you please give me a suggestion to figure out the problem with my solver? What options should I use to have pure Newton method in SNES? Is there any advance option I can use with line search to improve SNES convergence. > > > > > > > You should use grid sequencing, not only does it get convergence when you may not otherwise get it but it will also solve the problem faster. With PETSc DM you can use -snes_grid_sequence n or SNESSetGridSequence() in the code to do n levels of grid sequencing. > > > > Barry > > > > > Thank you very much. > > > > > > Minh. > > > > > > > > > From balay at mcs.anl.gov Tue Jun 17 15:31:37 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 17 Jun 2014 15:31:37 -0500 Subject: [petsc-users] memory speed & the speed of sparse matrix computation In-Reply-To: References: <56D054AF2E93E044AC1D2685709D2868D8C575E6D5@KIT-MSX-07.kit.edu> Message-ID: alternate url: http://www.cs.illinois.edu/~wgropp/bib/papers/1999/pcfd99/gkks.ps satish On Tue, 17 Jun 2014, murat ke?eli wrote: > The link gives 404 error. > > On Sun, Jun 15, 2014 at 7:18 PM, Matthew Knepley wrote: > > On Sun, Jun 15, 2014 at 3:32 PM, Xiao, Jianjun (IKET) > > wrote: > >> > >> Hello, > >> > >> In PETSc FAQ, I found such a statement below: > >> > >> This is because the speed of sparse matrix computations is almost totally > >> determined by the speed of the memory, not the speed of the CPU. > >> > >> I am not an expert in matrix solving. Could anybody explain why the memory > >> speed is so critical for the speed of sparse matrix computation? Is there > >> any reference? > > > > > > Yes, http://www.mcs.anl.gov/~kaushik/Papers/pcfd99_gkks.pdf > > > > Matt > > > >> > >> Thank you. > >> > >> Best regards > >> JJ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments > > is infinitely more interesting than any results to which their experiments > > lead. > > -- Norbert Wiener > From keceli at gmail.com Tue Jun 17 15:41:38 2014 From: keceli at gmail.com (=?UTF-8?Q?murat_ke=C3=A7eli?=) Date: Tue, 17 Jun 2014 15:41:38 -0500 Subject: [petsc-users] memory speed & the speed of sparse matrix computation In-Reply-To: References: <56D054AF2E93E044AC1D2685709D2868D8C575E6D5@KIT-MSX-07.kit.edu> Message-ID: Thank you Satish. 
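As a rough back-of-the-envelope reading of the FAQ statement quoted above (the numbers are approximate and not taken from the paper): an AIJ/CSR matrix-vector product does about 2 flops per nonzero (one multiply, one add) but must move at least 12 bytes per nonzero (an 8-byte value plus a 4-byte column index), before counting the vector and row-pointer traffic. That is well under 0.2 flops per byte of memory traffic, while CPUs can execute far more flops per byte of available bandwidth than that, so the cores spend most of a sparse solve waiting on memory. The Gropp/Kaushik/Keyes/Smith paper linked above turns this observation into concrete performance bounds.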
Murat On Tue, Jun 17, 2014 at 3:31 PM, Satish Balay wrote: > alternate url: > > http://www.cs.illinois.edu/~wgropp/bib/papers/1999/pcfd99/gkks.ps > > satish > > On Tue, 17 Jun 2014, murat ke?eli wrote: > >> The link gives 404 error. >> >> On Sun, Jun 15, 2014 at 7:18 PM, Matthew Knepley wrote: >> > On Sun, Jun 15, 2014 at 3:32 PM, Xiao, Jianjun (IKET) >> > wrote: >> >> >> >> Hello, >> >> >> >> In PETSc FAQ, I found such a statement below: >> >> >> >> This is because the speed of sparse matrix computations is almost totally >> >> determined by the speed of the memory, not the speed of the CPU. >> >> >> >> I am not an expert in matrix solving. Could anybody explain why the memory >> >> speed is so critical for the speed of sparse matrix computation? Is there >> >> any reference? >> > >> > >> > Yes, http://www.mcs.anl.gov/~kaushik/Papers/pcfd99_gkks.pdf >> > >> > Matt >> > >> >> >> >> Thank you. >> >> >> >> Best regards >> >> JJ >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their experiments >> > is infinitely more interesting than any results to which their experiments >> > lead. >> > -- Norbert Wiener >> From akurlej at gmail.com Tue Jun 17 17:03:51 2014 From: akurlej at gmail.com (Arthur Kurlej) Date: Tue, 17 Jun 2014 17:03:51 -0500 Subject: [petsc-users] Retrieving a SubMatrix from a Matrix in parallel Message-ID: Hi all, I wish to extract a parallel AIJ submatrix (B) from another parallel AIJ matrix (A), in particular, if I have an NxN matrix, I wish to extract the first N-1 rows and N-1 columns of A. I can do this perfectly fine when running my code sequentially using ISCreateStride and MatGetSubMatrix like so: ISCreateStride(PETSC_COMM_WORLD,N,1,1,&is); MatGetSubMatrix(A,is,is,MAT_INITIAL_MATRIX,&B); And it behaves as I would expect, but when running in parallel, the B matrix increases past it's allocated size of N-1xN-1. I've pasted the example of a simple example output below: ORIGINAL SEQUENTIAL MATRIX row 0: (0, 1) (1, 2) (2, 3) (3, 4) row 1: (0, 1) (1, 2) (2, 3) (3, 4) row 2: (0, 1) (1, 2) (2, 3) (3, 4) row 3: (0, 1) (1, 2) (2, 3) (3, 4) SEQUENTIAL SUBMATRIX row 0: (0, 1) (1, 2) (2, 3) row 1: (0, 1) (1, 2) (2, 3) row 2: (0, 1) (1, 2) (2, 3) ORIGINAL PARALLEL MATRIX row 0: (0, 1) (1, 2) (2, 3) (3, 4) row 1: (0, 1) (1, 2) (2, 3) (3, 4) row 2: (0, 1) (1, 2) (2, 3) (3, 4) row 3: (0, 1) (1, 2) (2, 3) (3, 4) PARALLEL "SUBMATRIX" row 0: (3, 1) (4, 2) (5, 3) row 1: (3, 1) (4, 2) (5, 3) row 2: (3, 1) (4, 2) (5, 3) row 3: (3, 1) (4, 2) (5, 3) row 4: (3, 1) (4, 2) (5, 3) row 5: (3, 1) (4, 2) (5, 3) I'm not really sure what is going on with the index set, and I would really appreciate some help. Thanks, Arthur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jun 17 17:16:25 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 17 Jun 2014 17:16:25 -0500 Subject: [petsc-users] Retrieving a SubMatrix from a Matrix in parallel In-Reply-To: References: Message-ID: On Jun 17, 2014, at 5:03 PM, Arthur Kurlej wrote: > Hi all, > > I wish to extract a parallel AIJ submatrix (B) from another parallel AIJ matrix (A), in particular, if I have an NxN matrix, I wish to extract the first N-1 rows and N-1 columns of A. > > I can do this perfectly fine when running my code sequentially using ISCreateStride and MatGetSubMatrix like so: > ISCreateStride(PETSC_COMM_WORLD,N,1,1,&is); ISCreateStride() does not work this way. 
Each process should provide only the part of the stride space of rows that it wants. What you indicate above has each process taking the stride space from 1 to N. So for your parallel matrix below you could have process 0 use in ISCreateStride( , 2,0,1,&isrow) while process 1 has ( ,1,2,1,&isrow) so you are saying the first process wants rows 0 and 1 while the second process wants rows 2. Meanwhile both processes want columns 0,1,2 so they would each create an IS with ISCreateStride( , 3,0,1,&iscol) Barry > MatGetSubMatrix(A,is,is,MAT_INITIAL_MATRIX,&B); > > And it behaves as I would expect, but when running in parallel, the B matrix increases past it's allocated size of N-1xN-1. I've pasted the example of a simple example output below: > > > ORIGINAL SEQUENTIAL MATRIX > row 0: (0, 1) (1, 2) (2, 3) (3, 4) > row 1: (0, 1) (1, 2) (2, 3) (3, 4) > row 2: (0, 1) (1, 2) (2, 3) (3, 4) > row 3: (0, 1) (1, 2) (2, 3) (3, 4) > SEQUENTIAL SUBMATRIX > row 0: (0, 1) (1, 2) (2, 3) > row 1: (0, 1) (1, 2) (2, 3) > row 2: (0, 1) (1, 2) (2, 3) > > > ORIGINAL PARALLEL MATRIX > row 0: (0, 1) (1, 2) (2, 3) (3, 4) > row 1: (0, 1) (1, 2) (2, 3) (3, 4) > row 2: (0, 1) (1, 2) (2, 3) (3, 4) > row 3: (0, 1) (1, 2) (2, 3) (3, 4) > PARALLEL "SUBMATRIX" > row 0: (3, 1) (4, 2) (5, 3) > row 1: (3, 1) (4, 2) (5, 3) > row 2: (3, 1) (4, 2) (5, 3) > row 3: (3, 1) (4, 2) (5, 3) > row 4: (3, 1) (4, 2) (5, 3) > row 5: (3, 1) (4, 2) (5, 3) > > > I'm not really sure what is going on with the index set, and I would really appreciate some help. > > > Thanks, > Arthur > From pvsang002 at gmail.com Tue Jun 17 20:26:14 2014 From: pvsang002 at gmail.com (Sang pham van) Date: Tue, 17 Jun 2014 21:26:14 -0400 Subject: [petsc-users] SNES divergence In-Reply-To: <46328E64-A7F2-4273-877C-3A6EAB847540@mcs.anl.gov> References: <28EE022E-CE78-4F88-8DBB-55E645A747D1@mcs.anl.gov> <6AC313E4-592F-46D9-BC42-BA39E1BF467F@mcs.anl.gov> <46328E64-A7F2-4273-877C-3A6EAB847540@mcs.anl.gov> Message-ID: Hi Barry, When using -snes_grid_sequence , how does SNES refines the mesh? Does it detect where should the mesh be refine locally like an adaptive refinement? Thank you. S. On Tue, Jun 17, 2014 at 4:09 PM, Barry Smith wrote: > > On Jun 17, 2014, at 3:03 PM, Sang pham van wrote: > > > Hi Barry, > > > > I want to try the original version of Newton method, no line search nor > trust region? > > This will often fail, original version Newton just doesn?t work without > a good initial solution. > > > Using grid sequencing confuses me at the point that I don't know > explicitly the final mesh size, or even if I can estimate the mesh size > from the input -snes_grid_sequence n it's not so convenient for me when > some of my routines need such info from very first moment. > > To do grid sequencing your code needs to be able to compute the > function on several grids, and compute the Jacobian on several grids. If > your code is hardwired to work on only one grid (with global variables or > whatever) you cannot use grid sequencing and you out of luck. You should > organize your code so that it doesn?t depend on any hardwired grid sizes. > > Barry > > > > > I wish PETSc could allow us to pass in the finest mesh for grid > sequencing. > > > > > > S, > > > > > > On Tue, Jun 17, 2014 at 3:46 PM, Barry Smith wrote: > > > > On Jun 17, 2014, at 2:30 PM, Sang pham van wrote: > > > > > Thanks Barry, > > > > > > To refine the grid, I just put more points in the direction. 
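Generalizing the two-process example above to an arbitrary layout, a hedged sketch of keeping everything but the last global row and column of an N x N parallel AIJ matrix A (variable names are illustrative):

    PetscInt N, rstart, rend, nlocal;
    IS       isrow, iscol;
    Mat      B;

    ierr = MatGetSize(A,&N,NULL);CHKERRQ(ierr);
    ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
    nlocal = rend - rstart;                    /* rows this process owns ...            */
    if (rend == N) nlocal--;                   /* ... minus the dropped last global row */
    ierr = ISCreateStride(PETSC_COMM_WORLD,nlocal,rstart,1,&isrow);CHKERRQ(ierr);
    ierr = ISCreateStride(PETSC_COMM_WORLD,N-1,0,1,&iscol);CHKERRQ(ierr); /* every process lists all wanted columns, as in the 2-process example */
    ierr = MatGetSubMatrix(A,isrow,iscol,MAT_INITIAL_MATRIX,&B);CHKERRQ(ierr);
    ierr = ISDestroy(&isrow);CHKERRQ(ierr);
    ierr = ISDestroy(&iscol);CHKERRQ(ierr);

Each process contributes only the rows it owns (skipping the dropped row if it happens to own it), so the row index sets together partition 0..N-2 instead of every process asking for the whole range.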
For both > fine and coarse meshes, I used simple initial guess (say constant values in > whole domain for all variables). > > > > > > By using grid sequencing, is the finest mesh is the one I first input > the solver? > > > > No, you pass in the coarse one. > > > > > > Can you let me know what options should I use to have pure Newton > method? > > > > Not sure what you mean by pure Newton method, maybe without grid > sequencing? It simply may not be possible to get convergence from a ?poor? > initial guess. One should always use grid sequencing if possible. > > > > Barry > > > > > > > > S. > > > > > > > > > On Tue, Jun 17, 2014 at 3:20 PM, Barry Smith > wrote: > > > > > > On Jun 17, 2014, at 2:12 PM, Sang pham van > wrote: > > > > > > > Hi, > > > > > > > > I am using DM structure and SNES to solve a 3D problem. In the > problem I have 3 variables. > > > > > > > > I got SNES converged with a grid. Obtain result are physically > right. However when I refine the grid, SNES does not always converge, > > > > > > With the refined grid how are you starting the solution? Do you use > the interpolated solution from the coarser grid (called grid sequencing) or > just some ?not good? initial guess? > > > > > > > the reason of divergence is line search fail or linear solver > failed. (I also tried other type of SNES, line search seems to be the one > best fits my problem) > > > > > > > > Can you please give me a suggestion to figure out the problem with > my solver? What options should I use to have pure Newton method in SNES? Is > there any advance option I can use with line search to improve SNES > convergence. > > > > > > > > > > You should use grid sequencing, not only does it get convergence > when you may not otherwise get it but it will also solve the problem > faster. With PETSc DM you can use -snes_grid_sequence n or > SNESSetGridSequence() in the code to do n levels of grid sequencing. > > > > > > Barry > > > > > > > Thank you very much. > > > > > > > > Minh. > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jun 17 21:14:42 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 17 Jun 2014 21:14:42 -0500 Subject: [petsc-users] SNES divergence In-Reply-To: References: <28EE022E-CE78-4F88-8DBB-55E645A747D1@mcs.anl.gov> <6AC313E4-592F-46D9-BC42-BA39E1BF467F@mcs.anl.gov> <46328E64-A7F2-4273-877C-3A6EAB847540@mcs.anl.gov> Message-ID: <275E214B-376A-4E28-A08F-8EE3450AAC21@mcs.anl.gov> On Jun 17, 2014, at 8:26 PM, Sang pham van wrote: > Hi Barry, > > When using -snes_grid_sequence , how does SNES refines the mesh? Does it detect where should the mesh be refine locally like an adaptive refinement? No, with DMDA it merely cuts each dimension in half. You can control the refinement slightly with http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMDASetRefinementFactor.html Barry > > Thank you. > > S. > > > On Tue, Jun 17, 2014 at 4:09 PM, Barry Smith wrote: > > On Jun 17, 2014, at 3:03 PM, Sang pham van wrote: > > > Hi Barry, > > > > I want to try the original version of Newton method, no line search nor trust region? > > This will often fail, original version Newton just doesn?t work without a good initial solution. > > > Using grid sequencing confuses me at the point that I don't know explicitly the final mesh size, or even if I can estimate the mesh size from the input -snes_grid_sequence n it's not so convenient for me when some of my routines need such info from very first moment. 
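If halving every dimension at each level is not what is wanted, the per-direction factor can be changed before the solve; a small sketch, assuming the same illustrative da and snes as before (the command-line equivalents should be the -da_refine_x/y/z options together with -snes_grid_sequence, but check your PETSc version):

    ierr = DMDASetRefinementFactor(da,4,2,2);CHKERRQ(ierr);  /* refine x by 4, y and z by 2 at each level */
    ierr = SNESSetGridSequence(snes,3);CHKERRQ(ierr);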
> > To do grid sequencing your code needs to be able to compute the function on several grids, and compute the Jacobian on several grids. If your code is hardwired to work on only one grid (with global variables or whatever) you cannot use grid sequencing and you out of luck. You should organize your code so that it doesn?t depend on any hardwired grid sizes. > > Barry > > > > > I wish PETSc could allow us to pass in the finest mesh for grid sequencing. > > > > > > S, > > > > > > On Tue, Jun 17, 2014 at 3:46 PM, Barry Smith wrote: > > > > On Jun 17, 2014, at 2:30 PM, Sang pham van wrote: > > > > > Thanks Barry, > > > > > > To refine the grid, I just put more points in the direction. For both fine and coarse meshes, I used simple initial guess (say constant values in whole domain for all variables). > > > > > > By using grid sequencing, is the finest mesh is the one I first input the solver? > > > > No, you pass in the coarse one. > > > > > > Can you let me know what options should I use to have pure Newton method? > > > > Not sure what you mean by pure Newton method, maybe without grid sequencing? It simply may not be possible to get convergence from a ?poor? initial guess. One should always use grid sequencing if possible. > > > > Barry > > > > > > > > S. > > > > > > > > > On Tue, Jun 17, 2014 at 3:20 PM, Barry Smith wrote: > > > > > > On Jun 17, 2014, at 2:12 PM, Sang pham van wrote: > > > > > > > Hi, > > > > > > > > I am using DM structure and SNES to solve a 3D problem. In the problem I have 3 variables. > > > > > > > > I got SNES converged with a grid. Obtain result are physically right. However when I refine the grid, SNES does not always converge, > > > > > > With the refined grid how are you starting the solution? Do you use the interpolated solution from the coarser grid (called grid sequencing) or just some ?not good? initial guess? > > > > > > > the reason of divergence is line search fail or linear solver failed. (I also tried other type of SNES, line search seems to be the one best fits my problem) > > > > > > > > Can you please give me a suggestion to figure out the problem with my solver? What options should I use to have pure Newton method in SNES? Is there any advance option I can use with line search to improve SNES convergence. > > > > > > > > > > You should use grid sequencing, not only does it get convergence when you may not otherwise get it but it will also solve the problem faster. With PETSc DM you can use -snes_grid_sequence n or SNESSetGridSequence() in the code to do n levels of grid sequencing. > > > > > > Barry > > > > > > > Thank you very much. > > > > > > > > Minh. > > > > > > > > > > > > > > > > From jifengzhao2015 at u.northwestern.edu Wed Jun 18 15:37:15 2014 From: jifengzhao2015 at u.northwestern.edu (jifeng zhao) Date: Wed, 18 Jun 2014 15:37:15 -0500 Subject: [petsc-users] Performance improvement (finite element model + slepc) Message-ID: Hello, I am a new user to Petsc + Slepc. I am trying to extract natural frequency of a finite element model using Slepc. The way I do is 1. Use other software (Abaqus) to assembly the stiffness and mass matrix. 2. Use Slepc to solve a generalized eigenvalue problem. K x = lamda M x with K, M being stiffness and mass matrix. I wrote my petsc/slepc code based on the examples on slepc web. They all compiled and working correctly. The question I am raising here is what solvers (solver combinations) should I use to be most efficient? 
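To make the "no hardwired grid sizes" requirement quoted above concrete, here is a hedged 2D, single-component sketch of a residual callback written against DMDALocalInfo; because every size and loop bound comes from the info structure, the same routine works unchanged on every grid that grid sequencing generates:

    PetscErrorCode FormFunctionLocal(DMDALocalInfo *info,PetscScalar **x,PetscScalar **f,void *ctx)
    {
      PetscInt  i,j;
      PetscReal hx = 1.0/(PetscReal)(info->mx-1);  /* spacing recomputed from the CURRENT grid */
      PetscReal hy = 1.0/(PetscReal)(info->my-1);

      for (j=info->ys; j<info->ys+info->ym; j++) {
        for (i=info->xs; i<info->xs+info->xm; i++) {
          f[j][i] = 0.0;  /* placeholder: stencil evaluation using hx, hy and x[j][i] goes here */
        }
      }
      return 0;
    }

(The problem described earlier in this thread is 3D with three unknowns per point, so the real version would use three-dimensional arrays, a struct of three fields, and the z bounds from info as well.)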
Right now I am using "bcgsl" (BiCGSTAB) solver for KSP linear solvers, "JD" jacobian-davison for eigen solver, and "bjacobi" (block jacobian) for my preconditioner. It works, but I need it to be more efficient to solver big problem (millions of degrees of freedom). I am not an expert on knowing how these solvers are different at all! Is there anybody who has extracted eigenvalues of a Finite element model using Slepc? How can I possibly improve the performance? Thank you! PS: my running command reads like: ./eigen_solver -f1 petsc_stiff1.dat -f2 petsc_mass1.dat -eps_nev 40 -eps_target 0.0 -eps_type jd -st_type precond -st_ksp_type bcgsl -st_pc_type bjacobi -st_ksp_rtol 0.001 -eps_tol 1e-5 -eps_harmonic -- Jifeng Zhao PhD candidate at Northwestern University, US Theoretical and Applied Mechanics Program -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Jun 19 03:32:38 2014 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 19 Jun 2014 10:32:38 +0200 Subject: [petsc-users] Performance improvement (finite element model + slepc) In-Reply-To: References: Message-ID: El 18/06/2014, a las 22:37, jifeng zhao escribi?: > Hello, > > I am a new user to Petsc + Slepc. I am trying to extract natural frequency of a finite element model using Slepc. The way I do is > > 1. Use other software (Abaqus) to assembly the stiffness and mass matrix. > 2. Use Slepc to solve a generalized eigenvalue problem. K x = lamda M x > with K, M being stiffness and mass matrix. > > I wrote my petsc/slepc code based on the examples on slepc web. They all compiled and working correctly. > > The question I am raising here is what solvers (solver combinations) should I use to be most efficient? > > Right now I am using "bcgsl" (BiCGSTAB) solver for KSP linear solvers, "JD" jacobian-davison for eigen solver, and "bjacobi" (block jacobian) for my preconditioner. It works, but I need it to be more efficient to solver big problem (millions of degrees of freedom). I am not an expert on knowing how these solvers are different at all! > > Is there anybody who has extracted eigenvalues of a Finite element model using Slepc? How can I possibly improve the performance? > > Thank you! > > PS: my running command reads like: > ./eigen_solver -f1 petsc_stiff1.dat -f2 petsc_mass1.dat -eps_nev 40 -eps_target 0.0 -eps_type jd -st_type precond -st_ksp_type bcgsl -st_pc_type bjacobi -st_ksp_rtol 0.001 -eps_tol 1e-5 -eps_harmonic > > -- > Jifeng Zhao > PhD candidate at Northwestern University, US > Theoretical and Applied Mechanics Program For not-too-difficult problems, GD will be faster than JD. The options for tuning Davidson solvers are described here: http://dx.doi.org/10.1145/2543696 You can also try a preconditioner provided by an external package such as Hypre or pARMS. Alternatively, you can try with Krylov-Schur and exact shift-and-invert (with a parallel external solver such as MUMPS). Jose From werner.vetter at vetenco.ch Thu Jun 19 06:47:23 2014 From: werner.vetter at vetenco.ch (Werner Vetter) Date: Thu, 19 Jun 2014 13:47:23 +0200 Subject: [petsc-users] Cross compilation Message-ID: <53A2CDCB.9020604@vetenco.ch> Hi all I use Slepc + Petsc to calculate eigenvalues. I have written my program under Linux in Fortran 77. I configured Petsc + Slepc also under Linux and build static libraries. Everything works so far very well. Now I want to cross compile the program to build a target for Windows. For other programs I use mingw to do this. 
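Back on the eigenvalue thread above: in command-line form, the two directions Jose suggests look roughly like the following (same data files and nev as the original command; the tolerances and the choice of LU are illustrative, and the second variant needs PETSc configured with MUMPS):

    # Davidson (GD) with a simple preconditioner
    ./eigen_solver -f1 petsc_stiff1.dat -f2 petsc_mass1.dat -eps_nev 40 -eps_target 0.0 \
        -eps_type gd -st_type precond -st_pc_type bjacobi -eps_tol 1e-5 -eps_harmonic

    # Krylov-Schur with exact shift-and-invert through a parallel direct solver
    ./eigen_solver -f1 petsc_stiff1.dat -f2 petsc_mass1.dat -eps_nev 40 -eps_target 0.0 \
        -eps_type krylovschur -st_type sinvert -st_ksp_type preonly \
        -st_pc_type lu -st_pc_factor_mat_solver_package mumps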
How do I have to configure Petsc + Slepc with mingw to build the static libraries. Thanks Werner From cedric.doucet at inria.fr Thu Jun 19 07:14:39 2014 From: cedric.doucet at inria.fr (Cedric Doucet) Date: Thu, 19 Jun 2014 14:14:39 +0200 (CEST) Subject: [petsc-users] Assignement operator overloading for vectors In-Reply-To: References: Message-ID: <673438294.3657963.1403180079214.JavaMail.zimbra@inria.fr> Hello, I need to overload the assignement operator for a C++ class which has a Petsc vector as a data member: class Foo { public: Foo & operator=(Foo const & copy); private: Vec m_vec; }; The algorithm for overloading the assignement operator should look like this : Foo & Foo::operator=(Foo const & copy) { if ( this != copy ) { // destroy this->m_vec // allocate this->m_vec with the same size as copy.m_vec's size // copy the content of copy.m_vec into this->m_vec } return *this; } I thought that VecCopy(copy.m_vec,m_vec) does everything I need but I have an error during the execution. So I tried to call first VecDuplicate(copy.m_vec,&m_vec) and then VecCopy(copy.m_vec,m_vec) but I still have an error. How shoud assignement operator overloading be implemented? Do I have to call VecDestroy before calling VecDuplicate? Must input vectors have the same size in VecCopy? Thank you very much for your help! Best regards, C?dric Doucet From cedric.doucet at inria.fr Thu Jun 19 07:30:56 2014 From: cedric.doucet at inria.fr (Cedric Doucet) Date: Thu, 19 Jun 2014 14:30:56 +0200 (CEST) Subject: [petsc-users] Assignement operator overloading for vectors In-Reply-To: <673438294.3657963.1403180079214.JavaMail.zimbra@inria.fr> References: <673438294.3657963.1403180079214.JavaMail.zimbra@inria.fr> Message-ID: <1730861503.3666442.1403181056053.JavaMail.zimbra@inria.fr> I have the same error in copy contructor : Foo::Foo(Foo const & copy) { VecDuplicate(copy.m_vec,&m_vec); VecCopy(copy.m_vec,m_vec); } What am I doing wrong here? C?dric ----- Mail original ----- > De: "Cedric Doucet" > ?: petsc-users at mcs.anl.gov > Envoy?: Jeudi 19 Juin 2014 14:14:39 > Objet: Assignement operator overloading for vectors > > Hello, > > I need to overload the assignement operator for a C++ class which has a Petsc > vector as a data member: > > class Foo > { > public: > Foo & operator=(Foo const & copy); > private: > Vec m_vec; > }; > > The algorithm for overloading the assignement operator should look like this > : > > Foo & Foo::operator=(Foo const & copy) > { > if ( this != copy ) > { > // destroy this->m_vec > // allocate this->m_vec with the same size as copy.m_vec's size > // copy the content of copy.m_vec into this->m_vec > } > return *this; > } > > I thought that VecCopy(copy.m_vec,m_vec) does everything I need but I have an > error during the execution. > So I tried to call first VecDuplicate(copy.m_vec,&m_vec) and then > VecCopy(copy.m_vec,m_vec) but I still have an error. > How shoud assignement operator overloading be implemented? > Do I have to call VecDestroy before calling VecDuplicate? > Must input vectors have the same size in VecCopy? > > Thank you very much for your help! > > Best regards, > > C?dric Doucet From bsmith at mcs.anl.gov Thu Jun 19 07:56:36 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 19 Jun 2014 07:56:36 -0500 Subject: [petsc-users] Cross compilation In-Reply-To: <53A2CDCB.9020604@vetenco.ch> References: <53A2CDCB.9020604@vetenco.ch> Message-ID: On Jun 19, 2014, at 6:47 AM, Werner Vetter wrote: > Hi all > I use Slepc + Petsc to calculate eigenvalues. 
I have written my program under Linux in Fortran 77. I configured Petsc + Slepc also under Linux and build static libraries. Everything works so far very well. > Now I want to cross compile the program to build a target for Windows. For other programs I use mingw to do this. > How do I have to configure Petsc + Slepc with mingw to build the static libraries. Are you planning to use the resulting static libraries from Microsoft Developers studio? If so you need to follow the instructions at http://www.mcs.anl.gov/petsc/documentation/installation.html#windows It is also possible to build PETSc using the cygwin gnu gcc and gfortran but we don?t generally recommend it. Unfortunately we don?t have a way to compile PETSc with mingw Barry > > Thanks > Werner From bsmith at mcs.anl.gov Thu Jun 19 08:06:57 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 19 Jun 2014 08:06:57 -0500 Subject: [petsc-users] Assignement operator overloading for vectors In-Reply-To: <673438294.3657963.1403180079214.JavaMail.zimbra@inria.fr> References: <673438294.3657963.1403180079214.JavaMail.zimbra@inria.fr> Message-ID: ?an error? is not very informative. Always send all possible information about errors you get; cut and paste all error messages etc.Also better to send the entire (hopefully small) code that reproduces the problem, than abstract snippets. Barry On Jun 19, 2014, at 7:14 AM, Cedric Doucet wrote: > Hello, > > I need to overload the assignement operator for a C++ class which has a Petsc vector as a data member: > > class Foo > { > public: > Foo & operator=(Foo const & copy); > private: > Vec m_vec; > }; > > The algorithm for overloading the assignement operator should look like this : > > Foo & Foo::operator=(Foo const & copy) > { > if ( this != copy ) > { > // destroy this->m_vec > // allocate this->m_vec with the same size as copy.m_vec's size > // copy the content of copy.m_vec into this->m_vec > } > return *this; > } > > I thought that VecCopy(copy.m_vec,m_vec) does everything I need but I have an error during the execution. > So I tried to call first VecDuplicate(copy.m_vec,&m_vec) and then VecCopy(copy.m_vec,m_vec) but I still have an error. > How shoud assignement operator overloading be implemented? > Do I have to call VecDestroy before calling VecDuplicate? > Must input vectors have the same size in VecCopy? > > Thank you very much for your help! > > Best regards, > > C?dric Doucet From bsmith at mcs.anl.gov Thu Jun 19 08:16:18 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 19 Jun 2014 08:16:18 -0500 Subject: [petsc-users] Assignement operator overloading for vectors In-Reply-To: <2139550936.3695284.1403183588513.JavaMail.zimbra@inria.fr> References: <673438294.3657963.1403180079214.JavaMail.zimbra@inria.fr> <2139550936.3695284.1403183588513.JavaMail.zimbra@inria.fr> Message-ID: Write a tiny standalone code that only does the one thing of trying to do the copy constructor and send that (there is no need to send the entire complicated code). Barry On Jun 19, 2014, at 8:13 AM, Cedric Doucet wrote: > > Hello, > > thank you for your answer. > > It is difficult to send the code as it is very big and difficult to read without some documentation. > > The problem appears in the copy contructor of class like this : > > class Foo { > public: > Foo(){} > Foo(Foo const & copy) > { > VecDuplicate(copy.m_vec,&m_vec); > VecCopy(copy.m_vec,m_vec); > } > private: > Vec m_vec; > } > > The copy constructor of Foo is implicitly called in the code during the copy of some std::vector. 
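The tiny standalone test asked for above only needs three Vec calls; as a hedged C-level sketch (FooVecAssign is an illustrative name), this is the sequence a guarded copy constructor or operator= would perform, and it also answers the earlier questions: yes, destroy the old target before duplicating again, and VecCopy() needs matching layouts, which VecDuplicate() guarantees:

    PetscErrorCode FooVecAssign(Vec src,Vec *dst)
    {
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      if (!src) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"source Vec was never created");
      if (*dst) {ierr = VecDestroy(dst);CHKERRQ(ierr);}  /* drop any previous contents of the target */
      ierr = VecDuplicate(src,dst);CHKERRQ(ierr);        /* same size, layout and type as the source */
      ierr = VecCopy(src,*dst);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

The "Wrong type of object: Parameter # 1" in the log that follows points at the first argument of VecDuplicate(), i.e. at the source vector, which suggests copy.m_vec was never created (for instance when std::vector copies a default-constructed Foo); initializing m_vec to NULL in the default constructor makes that case detectable.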
> > Here is the error message: > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, unknown > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: ./main on a debug-openmpi-1.6.5 named pl-59031 by cdoucet Thu Jun 19 14:59:33 2014 > [0]PETSC ERROR: Libraries linked from /home/cdoucet/local/petsc/master/debug-openmpi-1.6.5/lib > [0]PETSC ERROR: Configure run at Fri May 23 17:21:26 2014 > [0]PETSC ERROR: Configure options CXXFLAGS=-m64 CFLAGS=-m64 FCFLAGS=-m64 FFLAGS=-m64 LDFLAGS=-m64 --with-debugging=yes --with-dynamic-loading --with-shared-libraries --download-metis=1 --download-parmetis=1 --download-superlu_dist=1 --download-scalapack=1 --download-blacs=1 --download-mumps=1 --with-mpi-dir=/home/cdoucet/local/openmpi/1.6.5 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: VecDuplicate() line 511 in /home/cdoucet/local/petsc/master/src/vec/vec/interface/vector.c > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, unknown > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: ./main on a debug-openmpi-1.6.5 named pl-59031 by cdoucet Thu Jun 19 14:59:33 2014 > [0]PETSC ERROR: Libraries linked from /home/cdoucet/local/petsc/master/debug-openmpi-1.6.5/lib > [0]PETSC ERROR: Configure run at Fri May 23 17:21:26 2014 > [0]PETSC ERROR: Configure options CXXFLAGS=-m64 CFLAGS=-m64 FCFLAGS=-m64 FFLAGS=-m64 LDFLAGS=-m64 --with-debugging=yes --with-dynamic-loading --with-shared-libraries --download-metis=1 --download-parmetis=1 --download-superlu_dist=1 --download-scalapack=1 --download-blacs=1 --download-mumps=1 --with-mpi-dir=/home/cdoucet/local/openmpi/1.6.5 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: VecCopy() line 1681 in /home/cdoucet/local/petsc/master/src/vec/vec/interface/vector.c > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Corrupt argument: > see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind! > [0]PETSC ERROR: Invalid Pointer to Object: Parameter # 1! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, unknown > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: ./main on a debug-openmpi-1.6.5 named pl-59031 by cdoucet Thu Jun 19 14:59:33 2014 > [0]PETSC ERROR: Libraries linked from /home/cdoucet/local/petsc/master/debug-openmpi-1.6.5/lib > [0]PETSC ERROR: Configure run at Fri May 23 17:21:26 2014 > [0]PETSC ERROR: Configure options CXXFLAGS=-m64 CFLAGS=-m64 FCFLAGS=-m64 FFLAGS=-m64 LDFLAGS=-m64 --with-debugging=yes --with-dynamic-loading --with-shared-libraries --download-metis=1 --download-parmetis=1 --download-superlu_dist=1 --download-scalapack=1 --download-blacs=1 --download-mumps=1 --with-mpi-dir=/home/cdoucet/local/openmpi/1.6.5 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: VecDuplicate() line 511 in /home/cdoucet/local/petsc/master/src/vec/vec/interface/vector.c > Erreur de segmentation (core dumped) > > > ----- Mail original ----- >> De: "Barry Smith" >> ?: "Cedric Doucet" >> Cc: petsc-users at mcs.anl.gov >> Envoy?: Jeudi 19 Juin 2014 15:06:57 >> Objet: Re: [petsc-users] Assignement operator overloading for vectors >> >> >> ?an error? is not very informative. Always send all possible information >> about errors you get; cut and paste all error messages etc.Also better to >> send the entire (hopefully small) code that reproduces the problem, than >> abstract snippets. >> >> Barry >> >> On Jun 19, 2014, at 7:14 AM, Cedric Doucet wrote: >> >>> Hello, >>> >>> I need to overload the assignement operator for a C++ class which has a >>> Petsc vector as a data member: >>> >>> class Foo >>> { >>> public: >>> Foo & operator=(Foo const & copy); >>> private: >>> Vec m_vec; >>> }; >>> >>> The algorithm for overloading the assignement operator should look like >>> this : >>> >>> Foo & Foo::operator=(Foo const & copy) >>> { >>> if ( this != copy ) >>> { >>> // destroy this->m_vec >>> // allocate this->m_vec with the same size as copy.m_vec's size >>> // copy the content of copy.m_vec into this->m_vec >>> } >>> return *this; >>> } >>> >>> I thought that VecCopy(copy.m_vec,m_vec) does everything I need but I have >>> an error during the execution. >>> So I tried to call first VecDuplicate(copy.m_vec,&m_vec) and then >>> VecCopy(copy.m_vec,m_vec) but I still have an error. >>> How shoud assignement operator overloading be implemented? >>> Do I have to call VecDestroy before calling VecDuplicate? >>> Must input vectors have the same size in VecCopy? >>> >>> Thank you very much for your help! >>> >>> Best regards, >>> >>> C?dric Doucet >> >> From Vincent.De-Groof at uibk.ac.at Fri Jun 20 06:24:51 2014 From: Vincent.De-Groof at uibk.ac.at (De Groof, Vincent Frans Maria) Date: Fri, 20 Jun 2014 11:24:51 +0000 Subject: [petsc-users] Access to factored matrix. Message-ID: <17A78B9D13564547AC894B88C159674720397814@XMBX4.uibk.ac.at> Hi, I'm looking for a way to access the factor of a matrix to be able to change it. I'm using CHOLMOD for the factorization which uses the cholmod_factor object. The data contained in this object is basically what I want to get access to. Ideally, I'll be able to use the petsc-functions as is. Make some changes to the factored matrix and afterwards continue usingthe petsc-functions. Do you have any suggestions how this could be done? I had a look at PCFactorGetMatrix, but the matrix does not seem to contain numerical data of the factor. thanks, Vincent -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Fri Jun 20 06:50:49 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 20 Jun 2014 06:50:49 -0500 Subject: [petsc-users] Access to factored matrix. In-Reply-To: <17A78B9D13564547AC894B88C159674720397814@XMBX4.uibk.ac.at> References: <17A78B9D13564547AC894B88C159674720397814@XMBX4.uibk.ac.at> Message-ID: <87egyjer52.fsf@jedbrown.org> "De Groof, Vincent Frans Maria" writes: > Hi, > > > I'm looking for a way to access the factor of a matrix to be able to > change it. I'm using CHOLMOD for the factorization which uses the > cholmod_factor object. The data contained in this object is basically > what I want to get access to. Ideally, I'll be able to use the > petsc-functions as is. Make some changes to the factored matrix and > afterwards continue usingthe petsc-functions. What changes do you want to make? This is typically not supported by direct solver packages. You can include the private header and dig into it to your hearts content, but PETSc doesn't supply a public interface for this. #include <../src/mat/impls/sbaij/seq/cholmod/cholmodimpl.h> -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jifengzhao2015 at u.northwestern.edu Fri Jun 20 11:15:26 2014 From: jifengzhao2015 at u.northwestern.edu (jifeng zhao) Date: Fri, 20 Jun 2014 11:15:26 -0500 Subject: [petsc-users] Performance improvement (finite element model + slepc) In-Reply-To: References: Message-ID: Hi Jose, Thank you very much. The instruction you gave is of particular useful. I do think I can use deflation feature to reduce my subspace a lot. I will let you know if I made further progress. One more thing to check with you. I am right now using bJacobi as preconditioner. I know it is among one of the simplest preconditioners, but it turns out it gives faster convergence than SOR (maybe ASM). Is Hypre or pARMS designed to be suited for FEM model, and will usually be faster than bjacobi? Since Hypre or pARMS is external packages not readily available in SLEPc, I need ask my server administrator to install it. So it is a not-so-easy job for me. Thank you so much, Jifeng Zhao Best regards, Jifeng Zhao On Thu, Jun 19, 2014 at 3:32 AM, Jose E. Roman wrote: > > El 18/06/2014, a las 22:37, jifeng zhao escribi?: > > > Hello, > > > > I am a new user to Petsc + Slepc. I am trying to extract natural > frequency of a finite element model using Slepc. The way I do is > > > > 1. Use other software (Abaqus) to assembly the stiffness and mass matrix. > > 2. Use Slepc to solve a generalized eigenvalue problem. K x = lamda M x > > with K, M being stiffness and mass matrix. > > > > I wrote my petsc/slepc code based on the examples on slepc web. They all > compiled and working correctly. > > > > The question I am raising here is what solvers (solver combinations) > should I use to be most efficient? > > > > Right now I am using "bcgsl" (BiCGSTAB) solver for KSP linear solvers, > "JD" jacobian-davison for eigen solver, and "bjacobi" (block jacobian) for > my preconditioner. It works, but I need it to be more efficient to solver > big problem (millions of degrees of freedom). I am not an expert on knowing > how these solvers are different at all! > > > > Is there anybody who has extracted eigenvalues of a Finite element model > using Slepc? How can I possibly improve the performance? > > > > Thank you! 
> > > > PS: my running command reads like: > > ./eigen_solver -f1 petsc_stiff1.dat -f2 petsc_mass1.dat -eps_nev 40 > -eps_target 0.0 -eps_type jd -st_type precond -st_ksp_type bcgsl > -st_pc_type bjacobi -st_ksp_rtol 0.001 -eps_tol 1e-5 -eps_harmonic > > > > -- > > Jifeng Zhao > > PhD candidate at Northwestern University, US > > Theoretical and Applied Mechanics Program > > For not-too-difficult problems, GD will be faster than JD. The options for > tuning Davidson solvers are described here: > http://dx.doi.org/10.1145/2543696 > You can also try a preconditioner provided by an external package such as > Hypre or pARMS. > > Alternatively, you can try with Krylov-Schur and exact shift-and-invert > (with a parallel external solver such as MUMPS). > > Jose > > -- Jifeng Zhao PhD candidate at Northwestern University, US Theoretical and Applied Mechanics Program -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Jun 20 12:36:14 2014 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 20 Jun 2014 19:36:14 +0200 Subject: [petsc-users] Performance improvement (finite element model + slepc) In-Reply-To: References: Message-ID: <3A07FAA1-C96F-408E-B374-9E935931DA02@dsic.upv.es> El 20/06/2014, a las 18:15, jifeng zhao escribi?: > Hi Jose, > > Thank you very much. The instruction you gave is of particular useful. I do think I can use deflation feature to reduce my subspace a lot. I will let you know if I made further progress. > > One more thing to check with you. I am right now using bJacobi as preconditioner. I know it is among one of the simplest preconditioners, but it turns out it gives faster convergence than SOR (maybe ASM). Is Hypre or pARMS designed to be suited for FEM model, and will usually be faster than bjacobi? > > Since Hypre or pARMS is external packages not readily available in SLEPc, I need ask my server administrator to install it. So it is a not-so-easy job for me. > > Thank you so much, > Jifeng Zhao I don't know if they will be faster than block Jacobi. Jose From hus003 at ucsd.edu Sat Jun 21 13:25:28 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 21 Jun 2014 18:25:28 +0000 Subject: [petsc-users] Vec Set DM and Mat Set DM Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU> I'm thinking about defining a distributed Vec using grid information. The usual way to do that is to call VecCreateMPI, or VecCreate and VecSetSizes. However, that does not necessarily distribute Vec according to the grid information, DM. I'm thinking of doing something like: ierr = DMDAGetInfo(da,0,&mx,&my,0,0,0,0,0,0,0,0,0,0);CHKERRQ(ierr); ierr = DMDAGetCorners(da,&xs,&ys,NULL,&xm,&ym,NULL); and then define the Vec according to (xs,ys,xm,ym). But I'm not sure what exactly should I do. There are functions KSPSetDM and SNESSetDM that allows one to pass the DM information into the KSP and SNES. Are there some functions that allow one to pass DM info to a Vec or Mat? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat Jun 21 13:28:52 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 21 Jun 2014 13:28:52 -0500 Subject: [petsc-users] Vec Set DM and Mat Set DM In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87mwd66rrv.fsf@jedbrown.org> "Sun, Hui" writes: > I'm thinking about defining a distributed Vec using grid information. 
The usual way to do that is to call VecCreateMPI, or VecCreate and VecSetSizes. However, that does not necessarily distribute Vec according to the grid information, DM. I'm thinking of doing something like: > > ierr = DMDAGetInfo(da,0,&mx,&my,0,0,0,0,0,0,0,0,0,0);CHKERRQ(ierr); > > ierr = DMDAGetCorners(da,&xs,&ys,NULL,&xm,&ym,NULL); > > > and then define the Vec according to (xs,ys,xm,ym). Use DMCreateGlobalVector() and DMCreateLocalVector(). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From hus003 at ucsd.edu Sat Jun 21 13:58:19 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 21 Jun 2014 18:58:19 +0000 Subject: [petsc-users] Vec Set DM and Mat Set DM In-Reply-To: <87mwd66rrv.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87mwd66rrv.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6C29D2@XMAIL-MBX-BH1.AD.UCSD.EDU> Thank you. ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, June 21, 2014 11:28 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Vec Set DM and Mat Set DM "Sun, Hui" writes: > I'm thinking about defining a distributed Vec using grid information. The usual way to do that is to call VecCreateMPI, or VecCreate and VecSetSizes. However, that does not necessarily distribute Vec according to the grid information, DM. I'm thinking of doing something like: > > ierr = DMDAGetInfo(da,0,&mx,&my,0,0,0,0,0,0,0,0,0,0);CHKERRQ(ierr); > > ierr = DMDAGetCorners(da,&xs,&ys,NULL,&xm,&ym,NULL); > > > and then define the Vec according to (xs,ys,xm,ym). Use DMCreateGlobalVector() and DMCreateLocalVector(). From hus003 at ucsd.edu Sat Jun 21 17:34:18 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 21 Jun 2014 22:34:18 +0000 Subject: [petsc-users] Vec Set DM and Mat Set DM In-Reply-To: <87mwd66rrv.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87mwd66rrv.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6C29E6@XMAIL-MBX-BH1.AD.UCSD.EDU> If I have a DM object with dof=3. So in each grid point, there are 3 components. Now if I set a Mat according to this DM object, I have something like: MatSetValuesStencil(jac,1,&row,5,col,v,INSERT_VALUES); What shall I put in col? Usually I should have something like: col[0].i = i; col[0].j = j-1; col[1].i = i; col[1].j = j; etc ... But now I want to do something like: col[0].i.u = i; col[0].j.u = j-1; col[0].i.v = i; col[0].j.v = j-1; col[0].i.p = i; col[0].j.p = j-1; etc ... What is the right syntax for doing this? After setting up the stencils I want to call: MatSetValuesStencil(jac,1,&row,15,col,v,INSERT_VALUES); Should I put 15 as the number of columns being entered. (5 point stencil, 3 degrees of freedom in each point) Best, Hui ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, June 21, 2014 11:28 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Vec Set DM and Mat Set DM "Sun, Hui" writes: > I'm thinking about defining a distributed Vec using grid information. The usual way to do that is to call VecCreateMPI, or VecCreate and VecSetSizes. However, that does not necessarily distribute Vec according to the grid information, DM. 
I'm thinking of doing something like: > > ierr = DMDAGetInfo(da,0,&mx,&my,0,0,0,0,0,0,0,0,0,0);CHKERRQ(ierr); > > ierr = DMDAGetCorners(da,&xs,&ys,NULL,&xm,&ym,NULL); > > > and then define the Vec according to (xs,ys,xm,ym). Use DMCreateGlobalVector() and DMCreateLocalVector(). From jed at jedbrown.org Sat Jun 21 17:43:27 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 21 Jun 2014 17:43:27 -0500 Subject: [petsc-users] Vec Set DM and Mat Set DM In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C29E6@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU> <87mwd66rrv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6C29E6@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <871tuh7uk0.fsf@jedbrown.org> "Sun, Hui" writes: > If I have a DM object with dof=3. So in each grid point, there are 3 components. Now if I set a Mat according to this DM object, I have something like: > MatSetValuesStencil(jac,1,&row,5,col,v,INSERT_VALUES); > > What shall I put in col? Usually I should have something like: > col[0].i = i; col[0].j = j-1; col[1].i = i; col[1].j = j; etc ... > > But now I want to do something like: > col[0].i.u = i; col[0].j.u = j-1; col[0].i.v = i; col[0].j.v = j-1; col[0].i.p = i; col[0].j.p = j-1; etc ... MatStencil has col[0].c = 0 {1,2,...}, but I recommend using MatSetValuesBlockedStencil() if possible. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From hus003 at ucsd.edu Sat Jun 21 18:35:47 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 21 Jun 2014 23:35:47 +0000 Subject: [petsc-users] Vec Set DM and Mat Set DM In-Reply-To: <871tuh7uk0.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU> <87mwd66rrv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6C29E6@XMAIL-MBX-BH1.AD.UCSD.EDU>, <871tuh7uk0.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6C29F8@XMAIL-MBX-BH1.AD.UCSD.EDU> I was reading examples about MatSetValuesStencil and MatSetValuesBlockStencil, I cannot see any difference. Why would recommend MatSetValuesBlockedStencil please? It seems that there are many examples about MatSetValuesStencil but the only example about MatSetValuesBlockedStencil is snes/.../ex48. Hui ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, June 21, 2014 3:43 PM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Vec Set DM and Mat Set DM "Sun, Hui" writes: > If I have a DM object with dof=3. So in each grid point, there are 3 components. Now if I set a Mat according to this DM object, I have something like: > MatSetValuesStencil(jac,1,&row,5,col,v,INSERT_VALUES); > > What shall I put in col? Usually I should have something like: > col[0].i = i; col[0].j = j-1; col[1].i = i; col[1].j = j; etc ... > > But now I want to do something like: > col[0].i.u = i; col[0].j.u = j-1; col[0].i.v = i; col[0].j.v = j-1; col[0].i.p = i; col[0].j.p = j-1; etc ... MatStencil has col[0].c = 0 {1,2,...}, but I recommend using MatSetValuesBlockedStencil() if possible. 
From hus003 at ucsd.edu Sat Jun 21 18:42:03 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 21 Jun 2014 23:42:03 +0000 Subject: [petsc-users] Vec Set DM and Mat Set DM In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C29F8@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU> <87mwd66rrv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6C29E6@XMAIL-MBX-BH1.AD.UCSD.EDU>, <871tuh7uk0.fsf@jedbrown.org>, <7501CC2B7BBCC44A92ECEEC316170ECB6C29F8@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6C2A03@XMAIL-MBX-BH1.AD.UCSD.EDU> Sorry, maybe I see some difference. I'm reading line 1253 from http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex48.c.html MatSetValuesBlockedStencil(B,8,rc,8,rc,&Ke[0][0],ADD_VALUES); The Ke is 16 by 16, but rc is of length 8, so that means a 2 by 2 block is written into each stencil point of B. Is that correct? Now, why the compiler necessarily knows that Ke is of size 16 by 16? Best, Hui ________________________________________ From: Sun, Hui Sent: Saturday, June 21, 2014 4:35 PM To: Jed Brown; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Vec Set DM and Mat Set DM I was reading examples about MatSetValuesStencil and MatSetValuesBlockStencil, I cannot see any difference. Why would recommend MatSetValuesBlockedStencil please? It seems that there are many examples about MatSetValuesStencil but the only example about MatSetValuesBlockedStencil is snes/.../ex48. Hui ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, June 21, 2014 3:43 PM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Vec Set DM and Mat Set DM "Sun, Hui" writes: > If I have a DM object with dof=3. So in each grid point, there are 3 components. Now if I set a Mat according to this DM object, I have something like: > MatSetValuesStencil(jac,1,&row,5,col,v,INSERT_VALUES); > > What shall I put in col? Usually I should have something like: > col[0].i = i; col[0].j = j-1; col[1].i = i; col[1].j = j; etc ... > > But now I want to do something like: > col[0].i.u = i; col[0].j.u = j-1; col[0].i.v = i; col[0].j.v = j-1; col[0].i.p = i; col[0].j.p = j-1; etc ... MatStencil has col[0].c = 0 {1,2,...}, but I recommend using MatSetValuesBlockedStencil() if possible. From jed at jedbrown.org Sun Jun 22 00:27:58 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 22 Jun 2014 00:27:58 -0500 Subject: [petsc-users] Vec Set DM and Mat Set DM In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C2A03@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU> <87mwd66rrv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6C29E6@XMAIL-MBX-BH1.AD.UCSD.EDU> <871tuh7uk0.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6C29F8@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6C2A03@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87simx5x9d.fsf@jedbrown.org> "Sun, Hui" writes: > Sorry, maybe I see some difference. I'm reading line 1253 from > http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex48.c.html > > MatSetValuesBlockedStencil(B,8,rc,8,rc,&Ke[0][0],ADD_VALUES); > > The Ke is 16 by 16, but rc is of length 8, so that means a 2 by 2 > block is written into each stencil point of B. Is that correct? Now, > why the compiler necessarily knows that Ke is of size 16 by 16? 
Ke is declared that way in this example, but MatSetValuesBlockedStencil just interprets the array passed as an m*bs by n*bs array in row-major ordering. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Sun Jun 22 00:28:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 22 Jun 2014 00:28:28 -0500 Subject: [petsc-users] Vec Set DM and Mat Set DM In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6C29F8@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6C29A8@XMAIL-MBX-BH1.AD.UCSD.EDU> <87mwd66rrv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6C29E6@XMAIL-MBX-BH1.AD.UCSD.EDU> <871tuh7uk0.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6C29F8@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ppi15x8j.fsf@jedbrown.org> "Sun, Hui" writes: > I was reading examples about MatSetValuesStencil and > MatSetValuesBlockStencil, I cannot see any difference. Why would > recommend MatSetValuesBlockedStencil please? The indices operate on blocks, so there is less bookkeeping. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bogdan at lmn.pub.ro Mon Jun 23 04:16:39 2014 From: bogdan at lmn.pub.ro (Bogdan Dita) Date: Mon, 23 Jun 2014 09:16:39 -0000 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: <274e1b592f3c2232a9b2a946c19bd7b6.squirrel@wm.lmn.pub.ro> Hello, I wanted to see how well Umfpack performs in PETSc but i encountered a problem regarding the matrix type used and the way is distributed. I'm trying to solve for multiple frequencies in parallel, where system matrix A = omega*E-C, and omega is a frequency dependent scalar. Can i send matrix E and C to each process, plus a set of frequencies to compute omega and form A_k = omega_k * E - C and then solve for every omega? Is this even possible given that Umfpack only works with SeqAij? Best regards, Bogdan ---- Bogdan DITA, PhD Student UPB - EE Dept.- CIEAC-LMN Splaiul Independentei 313 060042 Bucharest, Romania email: bogdan at lmn.pub.ro From dave.mayhem23 at gmail.com Mon Jun 23 06:12:30 2014 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 23 Jun 2014 13:12:30 +0200 Subject: [petsc-users] (no subject) In-Reply-To: <274e1b592f3c2232a9b2a946c19bd7b6.squirrel@wm.lmn.pub.ro> References: <274e1b592f3c2232a9b2a946c19bd7b6.squirrel@wm.lmn.pub.ro> Message-ID: Yes, just assemble the same sequential matrices on each rank. To do this, create the matrix using the communicator PETSC_COMM_SELF Cheers Dave On Monday, 23 June 2014, Bogdan Dita wrote: > > > Hello, > > I wanted to see how well Umfpack performs in PETSc but i encountered a > problem regarding the matrix type used and the way is distributed. I'm > trying to solve for multiple frequencies in parallel, where system matrix > A = omega*E-C, and omega is a frequency dependent scalar. Can i send > matrix E and C to each process, plus a set of frequencies to compute omega > and form A_k = omega_k * E - C and then solve for every omega? Is this > even possible given that Umfpack only works with SeqAij? > > > Best regards, > Bogdan > > > ---- > Bogdan DITA, PhD Student > UPB - EE Dept.- CIEAC-LMN > Splaiul Independentei 313 > 060042 Bucharest, Romania > email: bogdan at lmn.pub.ro > -------------- next part -------------- An HTML attachment was scrubbed... 
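For the dof=3 stencil question in the Vec Set DM / Mat Set DM thread above, a hedged sketch of one row of a 5-point stencil entered both ways (i, j, c, jac and all values are illustrative):

    MatStencil  row, col[15];      /* per-component interface: .c picks the component          */
    MatStencil  brow, bcol[5];     /* blocked interface: indices address whole grid points     */
    PetscScalar v[15], bv[3*5*3];  /* blocked values: (1*bs) x (5*bs) = 3 x 15, row-major      */

    row.i = i; row.j = j; row.c = c;              /* equation c at point (i,j)                 */
    col[0].i = i; col[0].j = j-1; col[0].c = 0;
    col[1].i = i; col[1].j = j-1; col[1].c = 1;
    /* ... the remaining 13 column stencils (5 points times 3 components) and v[] ...          */
    ierr = MatSetValuesStencil(jac,1,&row,15,col,v,INSERT_VALUES);CHKERRQ(ierr);

    brow.i = i; brow.j = j;                       /* .c is not used by the blocked form        */
    bcol[0].i = i;   bcol[0].j = j-1;
    bcol[1].i = i-1; bcol[1].j = j;
    bcol[2].i = i;   bcol[2].j = j;
    bcol[3].i = i+1; bcol[3].j = j;
    bcol[4].i = i;   bcol[4].j = j+1;
    /* ... fill bv as a 3 x 15 row-major array: one 3x3 block per neighbouring point ...       */
    ierr = MatSetValuesBlockedStencil(jac,1,&brow,5,bcol,bv,INSERT_VALUES);CHKERRQ(ierr);

So the answer to "should I put 15" is yes for the per-component call (5 points times 3 components), while the blocked call keeps 5 column entries and moves the factor of 3 into the block of values, which is exactly the bookkeeping the blocked interface saves.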
URL: From dave.mayhem23 at gmail.com Mon Jun 23 06:14:34 2014 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 23 Jun 2014 13:14:34 +0200 Subject: [petsc-users] (no subject) In-Reply-To: References: <274e1b592f3c2232a9b2a946c19bd7b6.squirrel@wm.lmn.pub.ro> Message-ID: And use the same sequential communicator when you create the KSP on each rank On Monday, 23 June 2014, Dave May wrote: > Yes, just assemble the same sequential matrices on each rank. To do this, > create the matrix using the communicator PETSC_COMM_SELF > > Cheers > Dave > > On Monday, 23 June 2014, Bogdan Dita > wrote: > >> >> >> Hello, >> >> I wanted to see how well Umfpack performs in PETSc but i encountered a >> problem regarding the matrix type used and the way is distributed. I'm >> trying to solve for multiple frequencies in parallel, where system matrix >> A = omega*E-C, and omega is a frequency dependent scalar. Can i send >> matrix E and C to each process, plus a set of frequencies to compute omega >> and form A_k = omega_k * E - C and then solve for every omega? Is this >> even possible given that Umfpack only works with SeqAij? >> >> >> Best regards, >> Bogdan >> >> >> ---- >> Bogdan DITA, PhD Student >> UPB - EE Dept.- CIEAC-LMN >> Splaiul Independentei 313 >> 060042 Bucharest, Romania >> email: bogdan at lmn.pub.ro >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiangwen84 at gmail.com Mon Jun 23 10:33:10 2014 From: jiangwen84 at gmail.com (Wen Jiang) Date: Mon, 23 Jun 2014 11:33:10 -0400 Subject: [petsc-users] pardiso interface Message-ID: Hi, Could anyone tell me whether the current version of petsc supports pardiso? If so, are there any tricks to configure and use it? Thanks. Regards, Wen -------------- next part -------------- An HTML attachment was scrubbed... URL: From wumeng07maths at qq.com Mon Jun 23 11:01:06 2014 From: wumeng07maths at qq.com (=?ISO-8859-1?B?T28gICAgICA=?=) Date: Tue, 24 Jun 2014 00:01:06 +0800 Subject: [petsc-users] PETSC ERROR: PetscInitialize() Message-ID: Dear all, Recently, I try to write a simple PETSC code for solving linear system problem. At the beginning of main function, I use PetscInitialize(). However, when I run this program with "-start_in_debugger", there is a PETSC ERROR caused by this line. The PETSC ERROR is: PETSC: Attaching gdb to /Users/meng/MyWork/BuildTools/bin/Parameterization of pid 7342 on display /tmp/launch-0lpYbb/org.macosforge.xquartz:0 on machine vis163d Warning: locale not supported by Xlib, locale set to C Do you have some suggestions to deal with this problem? Thanks a lot! Meng -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jun 23 11:09:02 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jun 2014 11:09:02 -0500 Subject: [petsc-users] PETSC ERROR: PetscInitialize() In-Reply-To: References: Message-ID: What happens at this point? Did the xterm open with the debugger or not? What is printed below seems like only a warning? If you are running with only a single process you can do -start_in_debugger noxterm and it will open the debugger in the current window instead of starting a new xterm. Barry On Jun 23, 2014, at 11:01 AM, Oo wrote: > > Dear all, > > Recently, I try to write a simple PETSC code for solving linear system problem. > At the beginning of main function, I use PetscInitialize(). > However, when I run this program with "-start_in_debugger", > there is a PETSC ERROR caused by this line. 
> > > The PETSC ERROR is: > PETSC: Attaching gdb to /Users/meng/MyWork/BuildTools/bin/Parameterization of pid 7342 on display /tmp/launch-0lpYbb/org.macosforge.xquartz:0 on machine vis163d > Warning: locale not supported by Xlib, locale set to C > > Do you have some suggestions to deal with this problem? > > Thanks a lot! > > Meng From balay at mcs.anl.gov Mon Jun 23 11:13:28 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 23 Jun 2014 11:13:28 -0500 Subject: [petsc-users] PETSC ERROR: PetscInitialize() In-Reply-To: References: Message-ID: On Mon, 23 Jun 2014, Barry Smith wrote: > If you are running with only a single process you can do -start_in_debugger noxterm and it will open the debugger in the current window instead of starting a new xterm. > or just: gdb executable Satish From mpovolot at purdue.edu Mon Jun 23 11:48:18 2014 From: mpovolot at purdue.edu (Michael Povolotskyi) Date: Mon, 23 Jun 2014 12:48:18 -0400 Subject: [petsc-users] pardiso interface In-Reply-To: References: Message-ID: <53A85A52.6050901@purdue.edu> On 6/23/2014 11:33 AM, Wen Jiang wrote: > Hi, > > Could anyone tell me whether the current version of petsc supports > pardiso? If so, are there any tricks to configure and use it? Thanks. > > Regards, > Wen We have built the interface and sent it to PETSc team. It works nicely for us. Dear Petsc developers, have you approved it for the public release? Michael. From bsmith at mcs.anl.gov Mon Jun 23 11:53:09 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jun 2014 11:53:09 -0500 Subject: [petsc-users] pardiso interface In-Reply-To: <53A85A52.6050901@purdue.edu> References: <53A85A52.6050901@purdue.edu> Message-ID: <23A64FBB-DB0F-4AF9-9A95-BF7BF1799E3C@mcs.anl.gov> On Jun 23, 2014, at 11:48 AM, Michael Povolotskyi wrote: > On 6/23/2014 11:33 AM, Wen Jiang wrote: >> Hi, >> >> Could anyone tell me whether the current version of petsc supports pardiso? If so, are there any tricks to configure and use it? Thanks. >> >> Regards, >> Wen > We have built the interface and sent it to PETSc team. > It works nicely for us. > Dear Petsc developers, have you approved it for the public release? > Michael. Hmm, I don?t remember seeing it. Could you please let us know where we can download it or send it again. Barry From jed at jedbrown.org Mon Jun 23 11:59:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 23 Jun 2014 11:59:28 -0500 Subject: [petsc-users] pardiso interface In-Reply-To: <23A64FBB-DB0F-4AF9-9A95-BF7BF1799E3C@mcs.anl.gov> References: <53A85A52.6050901@purdue.edu> <23A64FBB-DB0F-4AF9-9A95-BF7BF1799E3C@mcs.anl.gov> Message-ID: <87fviv36kv.fsf@jedbrown.org> Barry Smith writes: > Hmm, I don?t remember seeing it. Could you please let us know where we can download it or send it again. The PR is on bitbucket: https://bitbucket.org/petsc/petsc/pull-request/105/added-support-for-mkl-pardiso-solver/commits Since this uses the Pardiso in MKL, we should be able to test it without the licensing problems of upstream Pardiso. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From wumeng07maths at qq.com Mon Jun 23 13:32:02 2014 From: wumeng07maths at qq.com (=?ISO-8859-1?B?T28gICAgICA=?=) Date: Tue, 24 Jun 2014 02:32:02 +0800 Subject: [petsc-users] PETSC ERROR: PetscInitialize() In-Reply-To: References: Message-ID: Hi Barry, There is no other error in the following. I get a result. Do you mean that this error must don't impact on the result? 
Thanks for your reply. :) Meng ------------------ Original ------------------ From: "Barry Smith";; Date: Jun 24, 2014 To: "Oo "; Cc: "petsc-users"; Subject: Re: [petsc-users] PETSC ERROR: PetscInitialize() What happens at this point? Did the xterm open with the debugger or not? What is printed below seems like only a warning? If you are running with only a single process you can do -start_in_debugger noxterm and it will open the debugger in the current window instead of starting a new xterm. Barry On Jun 23, 2014, at 11:01 AM, Oo wrote: > > Dear all, > > Recently, I try to write a simple PETSC code for solving linear system problem. > At the beginning of main function, I use PetscInitialize(). > However, when I run this program with "-start_in_debugger", > there is a PETSC ERROR caused by this line. > > > The PETSC ERROR is: > PETSC: Attaching gdb to /Users/meng/MyWork/BuildTools/bin/Parameterization of pid 7342 on display /tmp/launch-0lpYbb/org.macosforge.xquartz:0 on machine vis163d > Warning: locale not supported by Xlib, locale set to C > > Do you have some suggestions to deal with this problem? > > Thanks a lot! > > Meng . -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jun 23 14:01:16 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jun 2014 14:01:16 -0500 Subject: [petsc-users] PETSC ERROR: PetscInitialize() In-Reply-To: References: Message-ID: <5DC8EA9F-4B8E-42CB-9E3A-6114717C50F7@mcs.anl.gov> On Jun 23, 2014, at 1:32 PM, Oo wrote: > > Hi Barry, > There is no other error in the following. > I get a result. Did an xterm open? Did the program just run to completion? Is this your machine? org.macosforge.xquartz Are you running on some remote machine? Barry > > Do you mean that this error must don't impact on the result? > > Thanks for your reply. > :) > > Meng > > > ------------------ Original ------------------ > From: "Barry Smith";; > Date: Jun 24, 2014 > To: "Oo "; > Cc: "petsc-users"; > Subject: Re: [petsc-users] PETSC ERROR: PetscInitialize() > > > What happens at this point? Did the xterm open with the debugger or not? What is printed below seems like only a warning? > > If you are running with only a single process you can do -start_in_debugger noxterm and it will open the debugger in the current window instead of starting a new xterm. > > Barry > > On Jun 23, 2014, at 11:01 AM, Oo wrote: > > > > > Dear all, > > > > Recently, I try to write a simple PETSC code for solving linear system problem. > > At the beginning of main function, I use PetscInitialize(). > > However, when I run this program with "-start_in_debugger", > > there is a PETSC ERROR caused by this line. > > > > > > The PETSC ERROR is: > > PETSC: Attaching gdb to /Users/meng/MyWork/BuildTools/bin/Parameterization of pid 7342 on display /tmp/launch-0lpYbb/org.macosforge.xquartz:0 on machine vis163d > > Warning: locale not supported by Xlib, locale set to C > > > > Do you have some suggestions to deal with this problem? > > > > Thanks a lot! > > > > Meng > > . From jiangwen84 at gmail.com Mon Jun 23 14:46:04 2014 From: jiangwen84 at gmail.com (Wen Jiang) Date: Mon, 23 Jun 2014 15:46:04 -0400 Subject: [petsc-users] pardiso interface (Barry Smith) Message-ID: Could you tell us when it will be available to public? Thanks. 
Wen On Mon, Jun 23, 2014 at 1:00 PM, wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." > > > Today's Topics: > > 1. Re: pardiso interface (Barry Smith) > 2. Re: pardiso interface (Jed Brown) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 23 Jun 2014 11:53:09 -0500 > From: Barry Smith > To: Michael Povolotskyi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] pardiso interface > Message-ID: <23A64FBB-DB0F-4AF9-9A95-BF7BF1799E3C at mcs.anl.gov> > Content-Type: text/plain; charset="windows-1252" > > > On Jun 23, 2014, at 11:48 AM, Michael Povolotskyi > wrote: > > > On 6/23/2014 11:33 AM, Wen Jiang wrote: > >> Hi, > >> > >> Could anyone tell me whether the current version of petsc supports > pardiso? If so, are there any tricks to configure and use it? Thanks. > >> > >> Regards, > >> Wen > > We have built the interface and sent it to PETSc team. > > It works nicely for us. > > Dear Petsc developers, have you approved it for the public release? > > Michael. > > Hmm, I don?t remember seeing it. Could you please let us know where we > can download it or send it again. > > Barry > > > > > ------------------------------ > > Message: 2 > Date: Mon, 23 Jun 2014 11:59:28 -0500 > From: Jed Brown > To: Barry Smith , Michael Povolotskyi > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] pardiso interface > Message-ID: <87fviv36kv.fsf at jedbrown.org> > Content-Type: text/plain; charset="utf-8" > > Barry Smith writes: > > Hmm, I don?t remember seeing it. Could you please let us know where > we can download it or send it again. > > The PR is on bitbucket: > > > https://bitbucket.org/petsc/petsc/pull-request/105/added-support-for-mkl-pardiso-solver/commits > > Since this uses the Pardiso in MKL, we should be able to test it without > the licensing problems of upstream Pardiso. > -------------- next part -------------- > A non-text attachment was scrubbed... > Name: not available > Type: application/pgp-signature > Size: 818 bytes > Desc: not available > URL: < > http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140623/81f587c9/attachment-0001.pgp > > > > ------------------------------ > > _______________________________________________ > petsc-users mailing list > petsc-users at mcs.anl.gov > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > End of petsc-users Digest, Vol 66, Issue 37 > ******************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jun 23 15:26:54 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jun 2014 15:26:54 -0500 Subject: [petsc-users] pardiso interface (Barry Smith) In-Reply-To: References: Message-ID: I?ve asked Satish to put it into next for testing. BTW: when someone puts up a publicly available branch like this, ANYONE can git it and try it out: it is just the regular PETSc plus their additions. You don?t need to wait until it is merged into PETSc (though it may be a bit more stable once it has been merged in). 
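[Editor's note: not part of the original thread. As a concrete illustration of "git it and try it out", fetching and building a publicly available pull-request branch can look roughly like this; the branch name is a placeholder and the configure options are whatever your usual PETSc build uses.]

git clone https://bitbucket.org/petsc/petsc.git
cd petsc
git checkout name-of-the-pull-request-branch    # placeholder: use the branch shown on the pull request page
./configure <your usual configure options>
make all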
Barry On Jun 23, 2014, at 2:46 PM, Wen Jiang wrote: > Could you tell us when it will be available to public? Thanks. > > Wen > > On Mon, Jun 23, 2014 at 1:00 PM, wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." > > > Today's Topics: > > 1. Re: pardiso interface (Barry Smith) > 2. Re: pardiso interface (Jed Brown) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 23 Jun 2014 11:53:09 -0500 > From: Barry Smith > To: Michael Povolotskyi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] pardiso interface > Message-ID: <23A64FBB-DB0F-4AF9-9A95-BF7BF1799E3C at mcs.anl.gov> > Content-Type: text/plain; charset="windows-1252" > > > On Jun 23, 2014, at 11:48 AM, Michael Povolotskyi wrote: > > > On 6/23/2014 11:33 AM, Wen Jiang wrote: > >> Hi, > >> > >> Could anyone tell me whether the current version of petsc supports pardiso? If so, are there any tricks to configure and use it? Thanks. > >> > >> Regards, > >> Wen > > We have built the interface and sent it to PETSc team. > > It works nicely for us. > > Dear Petsc developers, have you approved it for the public release? > > Michael. > > Hmm, I don?t remember seeing it. Could you please let us know where we can download it or send it again. > > Barry > > > > > ------------------------------ > > Message: 2 > Date: Mon, 23 Jun 2014 11:59:28 -0500 > From: Jed Brown > To: Barry Smith , Michael Povolotskyi > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] pardiso interface > Message-ID: <87fviv36kv.fsf at jedbrown.org> > Content-Type: text/plain; charset="utf-8" > > Barry Smith writes: > > Hmm, I don?t remember seeing it. Could you please let us know where we can download it or send it again. > > The PR is on bitbucket: > > https://bitbucket.org/petsc/petsc/pull-request/105/added-support-for-mkl-pardiso-solver/commits > > Since this uses the Pardiso in MKL, we should be able to test it without > the licensing problems of upstream Pardiso. > -------------- next part -------------- > A non-text attachment was scrubbed... > Name: not available > Type: application/pgp-signature > Size: 818 bytes > Desc: not available > URL: > > ------------------------------ > > _______________________________________________ > petsc-users mailing list > petsc-users at mcs.anl.gov > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > End of petsc-users Digest, Vol 66, Issue 37 > ******************************************* > From jed at jedbrown.org Mon Jun 23 15:41:08 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 23 Jun 2014 15:41:08 -0500 Subject: [petsc-users] pardiso interface (Barry Smith) In-Reply-To: References: Message-ID: <87bntj1hqz.fsf@jedbrown.org> Barry Smith writes: > BTW: when someone puts up a publicly available branch like this, > ANYONE can git it and try it out: it is just the regular PETSc plus > their additions. You don?t need to wait until it is merged into > PETSc (though it may be a bit more stable once it has been merged > in). 
And especially in cases like this where it involves an external package that is not installed uniformly, it helps to hear to hear your experience I tried this branch with MKL-x.y and apart from the "warning: a b c", it worked correctly and the performance was competitive with Package Z. (or whatever you encountered) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jiangwen84 at gmail.com Mon Jun 23 21:55:17 2014 From: jiangwen84 at gmail.com (Wen Jiang) Date: Mon, 23 Jun 2014 22:55:17 -0400 Subject: [petsc-users] multi-threaded assembly Message-ID: Dear all, I am trying to change my MPI finite element code to OPENMP one. I am not familiar with the usage of OPENMP in PETSc and could anyone give me some suggestions? To assemble the matrix in parallel using OpenMP pragmas, can I directly call MATSETVALUES(ADD_VALUES) or do I need to add some locks around it? Thanks! Wen -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Jun 23 21:59:38 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 23 Jun 2014 21:59:38 -0500 Subject: [petsc-users] multi-threaded assembly In-Reply-To: References: Message-ID: <874mzb1085.fsf@jedbrown.org> Wen Jiang writes: > Dear all, > > I am trying to change my MPI finite element code to OPENMP one. I am not > familiar with the usage of OPENMP in PETSc and could anyone give me some > suggestions? > > To assemble the matrix in parallel using OpenMP pragmas, can I directly > call MATSETVALUES(ADD_VALUES) or do I need to add some locks around it? You need to ensure that only one thread is setting values on a given matrix at any one time. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From wumeng07maths at qq.com Tue Jun 24 04:43:41 2014 From: wumeng07maths at qq.com (=?ISO-8859-1?B?T28gICAgICA=?=) Date: Tue, 24 Jun 2014 17:43:41 +0800 Subject: [petsc-users] KSP iteration information out put as an txt file Message-ID: Dear All, How to put out the KSP iteration info. as an .txt file? When I used "-ksp_gmres_restart 200 -ksp_max_it 200 -ksp_monitor_true_residual -ksp_monitor_singular_value", the information of KSP iterations is printed on the terminal. How can I put them out to an independent file (like txt file, or other form)? Thanks, Meng -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 24 06:50:14 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 24 Jun 2014 04:50:14 -0700 Subject: [petsc-users] KSP iteration information out put as an txt file In-Reply-To: References: Message-ID: On Tue, Jun 24, 2014 at 2:43 AM, Oo wrote: > > Dear All, > > How to put out the KSP iteration info. as an .txt file? > > When I used "-ksp_gmres_restart 200 -ksp_max_it 200 > -ksp_monitor_true_residual -ksp_monitor_singular_value", > Give the filename as an argument -ksp_monitor_true_residual monitor.txt Thanks, Matt > the information of KSP iterations is printed on the terminal. > How can I put them out to an independent file (like txt file, or other > form)? > > Thanks, > > Meng > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
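[Editor's note: not part of the original thread. Besides giving the monitor option a file name as Matt describes, the monitor output (which goes to stdout) can simply be redirected by the shell; ./myprog below is a placeholder for the user's executable.]

./myprog -ksp_gmres_restart 200 -ksp_max_it 200 -ksp_monitor_true_residual -ksp_monitor_singular_value > monitor.txt 2>&1

The 2>&1 also captures anything printed to stderr, such as error tracebacks.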
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiangwen84 at gmail.com Tue Jun 24 08:54:48 2014 From: jiangwen84 at gmail.com (Wen Jiang) Date: Tue, 24 Jun 2014 09:54:48 -0400 Subject: [petsc-users] multi-threaded assembly In-Reply-To: <874mzb1085.fsf@jedbrown.org> References: <874mzb1085.fsf@jedbrown.org> Message-ID: Thanks for this information. Could you tell me an efficient way to do this in PETSc? I am planning to use at least 32 threads and need to minimize the synchronization overhead Any suggestions? Thanks! Wen On Mon, Jun 23, 2014 at 10:59 PM, Jed Brown wrote: > Wen Jiang writes: > > > Dear all, > > > > I am trying to change my MPI finite element code to OPENMP one. I am not > > familiar with the usage of OPENMP in PETSc and could anyone give me some > > suggestions? > > > > To assemble the matrix in parallel using OpenMP pragmas, can I directly > > call MATSETVALUES(ADD_VALUES) or do I need to add some locks around it? > > You need to ensure that only one thread is setting values on a given > matrix at any one time. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Jun 24 09:00:01 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 24 Jun 2014 08:00:01 -0600 Subject: [petsc-users] multi-threaded assembly In-Reply-To: References: <874mzb1085.fsf@jedbrown.org> Message-ID: <87mwd2z9um.fsf@jedbrown.org> Wen Jiang writes: > Thanks for this information. Could you tell me an efficient way to do this > in PETSc? I am planning to use at least 32 threads and need to minimize the > synchronization overhead Any suggestions? You're probably better off using more MPI processes. Perhaps surprisingly given the hype, pure MPI remains the fastest option in many cases. And always profile the simplest method. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From fd.kong at siat.ac.cn Tue Jun 24 09:50:38 2014 From: fd.kong at siat.ac.cn (Fande Kong) Date: Tue, 24 Jun 2014 22:50:38 +0800 Subject: [petsc-users] Any examples to output a dmplex mesh as a hdf5 file? Message-ID: Hi all, There are some functions called DMPlex_load_hdf5 and DMPlex_view_hdf5 in petsc-dev. They are really good functions for outputting the solution as a hdf5 file in parallel. Are there any examples to show how to use these functions? Or are there some printed hdf5 and xdmf files that can be visualized by paraview? Thanks, Fande, -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashwinsrnth at gmail.com Tue Jun 24 21:15:49 2014 From: ashwinsrnth at gmail.com (Ashwin Srinath) Date: Tue, 24 Jun 2014 22:15:49 -0400 Subject: [petsc-users] Viewing DM Vector stored on Multiple GPUs Message-ID: Hello, petsc-users I'm having trouble *viewing* an mpicusp vector. 
Here's the simplest case that reproduces the problem: int main(int argc, char** argv) { PetscInitialize(&argc, &argv, NULL, NULL); DM da; Vec V; DMDACreate2d( PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DMDA_STENCIL_BOX, 5, 5, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, &da); DMCreateGlobalVector(da, &V); VecSet(V, 1); VecView(V, PETSC_VIEWER_STDOUT_WORLD); PetscFinalize(); return 0; } I get the error: [1]PETSC ERROR: Null argument, when expecting valid pointer [0]PETSC ERROR: Trying to copy from a null pointer I executed with the following command: mpiexec -n 2 ./main -dm_vec_type cusp -vec_type cusp Both GPUs are attached to two different processes. This program works fine for vecmpi vectors, i.e., -dm_vec_type mpi and -vec_type mpi. Also, I don't get an error unless I try to *view* the vector. Can someone please point out what I'm doing wrong? Thanks for your time, Ashwin Srinath -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashwinsrnth at gmail.com Tue Jun 24 21:23:37 2014 From: ashwinsrnth at gmail.com (Ashwin Srinath) Date: Tue, 24 Jun 2014 22:23:37 -0400 Subject: [petsc-users] Viewing DM Vector stored on Multiple GPUs In-Reply-To: References: Message-ID: Here's the error message in it's entirety: Vec Object: 2 MPI processes type: mpicusp [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Null argument, when expecting valid pointer [0]PETSC ERROR: Trying to copy from a null pointer [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4683-ga6c8f22 GIT Date: 2014-06-24 11:28:06 -0500 [0]PETSC ERROR: /newscratch/atrikut/issue/main on a arch-linux2-cxx-debug named node1774 by atrikut Tue Jun 24 21:59:13 2014 [0]PETSC ERROR: Configure options --with-cuda=1 --with-cusp=1 --with-cusp-dir=/home/atrikut/local/cusplibrary --with-thrust=1 --with-precision=double --with-cuda-arch=sm_21 --with-clanguage=cxx --download-txpetscgpu=1 --with-shared-libraries=1 --with-cuda-dir=/opt/cuda-toolkit/5.5.22 --with-mpi-dir=/opt/mpich2/1.4 [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: Trying to copy from a null pointer [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4683-ga6c8f22 GIT Date: 2014-06-24 11:28:06 -0500 [1]PETSC ERROR: /newscratch/atrikut/issue/main on a arch-linux2-cxx-debug named node1774 by atrikut Tue Jun 24 21:59:13 2014 [1]PETSC ERROR: Configure options --with-cuda=1 --with-cusp=1 --with-cusp-dir=/home/atrikut/local/cusplibrary --with-thrust=1 --with-precision=double --with-cuda-arch=sm_21 --with-clanguage=cxx --download-txpetscgpu=1 --with-shared-libraries=1 --with-cuda-dir=/opt/cuda-toolkit/5.5.22 --with-mpi-dir=/opt/mpich2/1.4 [1]PETSC ERROR: #1 PetscMemcpy() line 1892 in /home/atrikut/local/petsc-dev/include/petscsys.h [1]PETSC ERROR: #1 PetscMemcpy() line 1892 in /home/atrikut/local/petsc-dev/include/petscsys.h [0]PETSC ERROR: #2 VecScatterBegin_1() line 124 in /home/atrikut/local/petsc-dev/include/../src/vec/vec/utils/vpscat.h [0]PETSC ERROR: #3 VecScatterBegin() line 1724 in /home/atrikut/local/petsc-dev/src/vec/vec/utils/vscat.c [0]PETSC ERROR: #4 DMDAGlobalToNaturalBegin() line 171 in /home/atrikut/local/petsc-dev/src/dm/impls/da/dagtol.c #2 VecScatterBegin_1() line 124 in /home/atrikut/local/petsc-dev/include/../src/vec/vec/utils/vpscat.h [1]PETSC ERROR: #3 VecScatterBegin() line 1724 in /home/atrikut/local/petsc-dev/src/vec/vec/utils/vscat.c [0]PETSC ERROR: #5 VecView_MPI_DA() line 721 in /home/atrikut/local/petsc-dev/src/dm/impls/da/gr2.c [0]PETSC ERROR: [1]PETSC ERROR: #4 DMDAGlobalToNaturalBegin() line 171 in /home/atrikut/local/petsc-dev/src/dm/impls/da/dagtol.c [1]PETSC ERROR: #6 VecView() line 601 in /home/atrikut/local/petsc-dev/src/vec/vec/interface/vector.c #5 VecView_MPI_DA() line 721 in /home/atrikut/local/petsc-dev/src/dm/impls/da/gr2.c [1]PETSC ERROR: #6 VecView() line 601 in /home/atrikut/local/petsc-dev/src/vec/vec/interface/vector.c WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-vec_type value: cusp On Tue, Jun 24, 2014 at 10:15 PM, Ashwin Srinath wrote: > Hello, petsc-users > > I'm having trouble *viewing* an mpicusp vector. Here's the simplest case > that reproduces the problem: > > int main(int argc, char** argv) { > > PetscInitialize(&argc, &argv, NULL, NULL); > > DM da; > Vec V; > > DMDACreate2d( PETSC_COMM_WORLD, > DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, > DMDA_STENCIL_BOX, > 5, 5, > PETSC_DECIDE, PETSC_DECIDE, > 1, > 1, > NULL, NULL, > &da); > > DMCreateGlobalVector(da, &V); > > VecSet(V, 1); > VecView(V, PETSC_VIEWER_STDOUT_WORLD); > > PetscFinalize(); > return 0; > } > > I get the error: > [1]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Trying to copy from a null pointer > > I executed with the following command: > mpiexec -n 2 ./main -dm_vec_type cusp -vec_type cusp > Both GPUs are attached to two different processes. > > This program works fine for vecmpi vectors, i.e., -dm_vec_type mpi and > -vec_type mpi. Also, I don't get an error unless I try to *view* the > vector. Can someone please point out what I'm doing wrong? > > Thanks for your time, > Ashwin Srinath > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rupp at iue.tuwien.ac.at Wed Jun 25 04:56:49 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Wed, 25 Jun 2014 11:56:49 +0200 Subject: [petsc-users] Viewing DM Vector stored on Multiple GPUs In-Reply-To: References: Message-ID: <53AA9CE1.4010105@iue.tuwien.ac.at> Hi Ashwin, this stems from a problem related to scattering GPU data across the network (you can see VecScatterBegin() in the trace), which we are currently working on. Here is the associated pull request: https://bitbucket.org/petsc/petsc/pull-request/158/pcbjacobi-with-cusp-and-cusparse-solver It may still take some time to complete, so please remain patient. Best regards, Karli On 06/25/2014 04:23 AM, Ashwin Srinath wrote: > Here's the error message in it's entirety: > > Vec Object: 2 MPI processes > type: mpicusp > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Trying to copy from a null pointer > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4683-ga6c8f22 > GIT Date: 2014-06-24 11:28:06 -0500 > [0]PETSC ERROR: /newscratch/atrikut/issue/main on a > arch-linux2-cxx-debug named node1774 by atrikut Tue Jun 24 21:59:13 2014 > [0]PETSC ERROR: Configure options --with-cuda=1 --with-cusp=1 > --with-cusp-dir=/home/atrikut/local/cusplibrary --with-thrust=1 > --with-precision=double --with-cuda-arch=sm_21 --with-clanguage=cxx > --download-txpetscgpu=1 --with-shared-libraries=1 > --with-cuda-dir=/opt/cuda-toolkit/5.5.22 --with-mpi-dir=/opt/mpich2/1.4 > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Trying to copy from a null pointer > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [1]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4683-ga6c8f22 > GIT Date: 2014-06-24 11:28:06 -0500 > [1]PETSC ERROR: /newscratch/atrikut/issue/main on a > arch-linux2-cxx-debug named node1774 by atrikut Tue Jun 24 21:59:13 2014 > [1]PETSC ERROR: Configure options --with-cuda=1 --with-cusp=1 > --with-cusp-dir=/home/atrikut/local/cusplibrary --with-thrust=1 > --with-precision=double --with-cuda-arch=sm_21 --with-clanguage=cxx > --download-txpetscgpu=1 --with-shared-libraries=1 > --with-cuda-dir=/opt/cuda-toolkit/5.5.22 --with-mpi-dir=/opt/mpich2/1.4 > [1]PETSC ERROR: #1 PetscMemcpy() line 1892 in > /home/atrikut/local/petsc-dev/include/petscsys.h > [1]PETSC ERROR: #1 PetscMemcpy() line 1892 in > /home/atrikut/local/petsc-dev/include/petscsys.h > [0]PETSC ERROR: #2 VecScatterBegin_1() line 124 in > /home/atrikut/local/petsc-dev/include/../src/vec/vec/utils/vpscat.h > [0]PETSC ERROR: #3 VecScatterBegin() line 1724 in > /home/atrikut/local/petsc-dev/src/vec/vec/utils/vscat.c > [0]PETSC ERROR: #4 DMDAGlobalToNaturalBegin() line 171 in > /home/atrikut/local/petsc-dev/src/dm/impls/da/dagtol.c > #2 VecScatterBegin_1() line 124 in > /home/atrikut/local/petsc-dev/include/../src/vec/vec/utils/vpscat.h > [1]PETSC ERROR: #3 VecScatterBegin() line 1724 in > /home/atrikut/local/petsc-dev/src/vec/vec/utils/vscat.c > [0]PETSC ERROR: #5 VecView_MPI_DA() line 721 in > /home/atrikut/local/petsc-dev/src/dm/impls/da/gr2.c > [0]PETSC ERROR: [1]PETSC ERROR: #4 DMDAGlobalToNaturalBegin() line 171 > in /home/atrikut/local/petsc-dev/src/dm/impls/da/dagtol.c > [1]PETSC ERROR: #6 VecView() line 601 in > /home/atrikut/local/petsc-dev/src/vec/vec/interface/vector.c > #5 VecView_MPI_DA() line 721 in > /home/atrikut/local/petsc-dev/src/dm/impls/da/gr2.c > [1]PETSC ERROR: #6 VecView() line 601 in > /home/atrikut/local/petsc-dev/src/vec/vec/interface/vector.c > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > Option left: name:-vec_type value: cusp > > > > On Tue, Jun 24, 2014 at 10:15 PM, Ashwin Srinath > wrote: > > Hello, petsc-users > > I'm having trouble /viewing/ an mpicusp vector. Here's the simplest > case that reproduces the problem: > > int main(int argc, char** argv) { > > PetscInitialize(&argc, &argv, NULL, NULL); > > DM da; > Vec V; > > DMDACreate2d( PETSC_COMM_WORLD, > DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, > DMDA_STENCIL_BOX, > 5, 5, > PETSC_DECIDE, PETSC_DECIDE, > 1, > 1, > NULL, NULL, > &da); > > DMCreateGlobalVector(da, &V); > > VecSet(V, 1); > VecView(V, PETSC_VIEWER_STDOUT_WORLD); > > PetscFinalize(); > return 0; > } > > I get the error: > [1]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Trying to copy from a null pointer > > I executed with the following command: > mpiexec -n 2 ./main -dm_vec_type cusp -vec_type cusp > Both GPUs are attached to two different processes. > > This program works fine for vecmpi vectors, i.e., -dm_vec_type mpi > and -vec_type mpi. Also, I don't get an error unless I try to /view/ > the vector. Can someone please point out what I'm doing wrong? 
> > Thanks for your time, > Ashwin Srinath > > > > > > From mathisfriesdorf at gmail.com Wed Jun 25 05:31:43 2014 From: mathisfriesdorf at gmail.com (Mathis Friesdorf) Date: Wed, 25 Jun 2014 12:31:43 +0200 Subject: [petsc-users] Unexpected "Out of memory error" with SLEPC Message-ID: Dear all, after a very useful email exchange with Jed Brown quite a while ago, I was able to find the lowest eigenvalue of a large matrix which is constructed as a tensor product. Admittedly the solution is a bit hacked, but is based on a Matrix shell and Armadillo and therefore reasonably fast. The problem seems to work well for smaller systems, but once the vectors reach a certain size, I get "out of memory" errors. I have tested the initialization of a vector of that size and multiplication by the matrix. This works fine and takes roughly 20GB of memory. There are 256 GB available, so I see no reason why the esp solvers should complain. Does anyone have an idea what goes wrong here? The error message is not very helpful and claims that a memory is requested that is way beyond any reasonable number: *Memory requested 18446744056529684480.* Thanks and all the best, Mathis Friesdorf *Output of the Program:* *mathis at n180:~/localisation$ ./local_plus 27* *System Size: 27--------------------------------------------------------------------------[[30558,1],0]: A high-performance Open MPI point-to-point messaging modulewas unable to find any relevant network interfaces:Module: OpenFabrics (openib) Host: n180Another transport will be used instead, although this may result inlower performance.--------------------------------------------------------------------------[0]PETSC ERROR: --------------------- Error Message ------------------------------------[0]PETSC ERROR: Out of memory. 
This could be due to allocating[0]PETSC ERROR: too large an object or bleeding by not properly[0]PETSC ERROR: destroying unneeded objects.[0]PETSC ERROR: Memory allocated 3221286704 Memory used by process 3229827072[0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.[0]PETSC ERROR: Memory requested 18446744056529684480![0]PETSC ERROR: ------------------------------------------------------------------------[0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates.[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.[0]PETSC ERROR: See docs/index.html for manual pages.[0]PETSC ERROR: ------------------------------------------------------------------------[0]PETSC ERROR: ./local_plus on a arch-linux2-cxx-debug named n180 by mathis Wed Jun 25 12:23:01 2014[0]PETSC ERROR: Libraries linked from /home/mathis/bin_nodebug/lib[0]PETSC ERROR: Configure run at Wed Jun 25 00:03:34 2014[0]PETSC ERROR: Configure options PETSC_DIR=/home/mathis/petsc-3.4.4 --with-debugging=1 COPTFLAGS="-O3 -march=p4 -mtune=p4" --with-fortran=0 -with-mpi=1 --with-mpi-dir=/usr/lib/openmpi --with-clanguage=cxx --prefix=/home/mathis/bin_nodebug[0]PETSC ERROR: ------------------------------------------------------------------------[0]PETSC ERROR: PetscMallocAlign() line 46 in /home/mathis/petsc-3.4.4/src/sys/memory/mal.c[0]PETSC ERROR: PetscTrMallocDefault() line 189 in /home/mathis/petsc-3.4.4/src/sys/memory/mtr.c[0]PETSC ERROR: VecDuplicateVecs_Contiguous() line 62 in src/vec/contiguous.c[0]PETSC ERROR: VecDuplicateVecs() line 589 in /home/mathis/petsc-3.4.4/src/vec/vec/interface/vector.c[0]PETSC ERROR: EPSAllocateSolution() line 51 in src/eps/interface/mem.c[0]PETSC ERROR: EPSSetUp_KrylovSchur() line 141 in src/eps/impls/krylov/krylovschur/krylovschur.c[0]PETSC ERROR: EPSSetUp() line 147 in src/eps/interface/setup.c[0]PETSC ERROR: EPSSolve() line 90 in src/eps/interface/solve.c[0]PETSC ERROR: main() line 48 in local_plus.cpp--------------------------------------------------------------------------MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 55.NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.You may or may not see output from other processes, depending onexactly when Open MPI kills them.--------------------------------------------------------------------------* *Output of make:* *mathis at n180:~/localisation$ make local_plusmpicxx -o local_plus.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -fPIC -I/home/mathis/armadillo-4.300.8/include -lblas -llapack -L/home/mathis/armadillo-4.300.8 -O3 -larmadillo -fomit-frame-pointer -I/home/mathis/bin_nodebug/include -I/home/mathis/bin_nodebug/include -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi -D__INSDIR__= -I/home/mathis/bin_nodebug -I/home/mathis/bin_nodebug//include -I/home/mathis/bin_nodebug/include local_plus.cpplocal_plus.cpp:22:0: warning: "__FUNCT__" redefined [enabled by default]In file included from /home/mathis/bin_nodebug/include/petscvec.h:10:0, from local_plus.cpp:10:/home/mathis/bin_nodebug/include/petscviewer.h:386:0: note: this is the location of the previous definitiong++ -o local_plus local_plus.o -Wl,-rpath,/home/mathis/bin_nodebug//lib -L/home/mathis/bin_nodebug//lib -lslepc -Wl,-rpath,/home/mathis/bin_nodebug/lib -L/home/mathis/bin_nodebug/lib -lpetsc -llapack -lblas -lpthread -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.7 
-L/usr/lib/gcc/x86_64-linux-gnu/4.7 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl/bin/rm -f local_plus.o* *Code:* *//Author: Mathis Friesdorf//MathisFriesdorf at gmail.com static char help[] = "1D chain\n";#include #include #include #include #include #include #include #include #include #include PetscErrorCode BathMult(Mat H, Vec x,Vec y);PetscInt L=30,d=2,dsys;PetscErrorCode ierr;arma::mat hint = "1.0 0 0 0.0; 0 -1.0 2.0 0; 0 2.0 -1.0 0; 0 0 0 1.0;";#define __FUNCT__ "main"int main(int argc, char **argv){ Mat H; EPS eps; Vec xr,xi; PetscScalar kr,ki; PetscInt j, nconv; L = strtol(argv[1],NULL,10); dsys = pow(d,L); printf("%s","System Size: "); printf("%i",L); printf("%s","\n"); SlepcInitialize(&argc,&argv,(char*)0,help); MatCreateShell(PETSC_COMM_WORLD,dsys,dsys,dsys,dsys,NULL,&H); MatShellSetOperation(H,MATOP_MULT,(void(*)())BathMult); ierr = MatGetVecs(H,NULL,&xr); CHKERRQ(ierr); ierr = MatGetVecs(H,NULL,&xi); CHKERRQ(ierr); ierr = EPSCreate(PETSC_COMM_WORLD, &eps); CHKERRQ(ierr); ierr = EPSSetOperators(eps, H, NULL); CHKERRQ(ierr); ierr = EPSSetProblemType(eps, EPS_HEP); CHKERRQ(ierr); ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); CHKERRQ(ierr); ierr = EPSSetFromOptions( eps ); CHKERRQ(ierr); ierr = EPSSolve(eps); CHKERRQ(ierr); ierr = EPSGetConverged(eps, &nconv); CHKERRQ(ierr); for (j=0; j<1; j++) { EPSGetEigenpair(eps, j, &kr, &ki, xr, xi); printf("%s","Lowest Eigenvalue: "); PetscPrintf(PETSC_COMM_WORLD,"%9F",kr); PetscPrintf(PETSC_COMM_WORLD,"\n"); } EPSDestroy(&eps); ierr = SlepcFinalize(); return 0;}#undef __FUNCT__#define __FUNCT__ "BathMult"PetscErrorCode BathMult(Mat H, Vec x, Vec y){ PetscInt l; uint slice; PetscScalar *arrayin,*arrayout; VecGetArray(x,&arrayin); VecGetArray(y,&arrayout); arma::cube A = arma::cube(arrayin,1,1,pow(d,L), /*copy_aux_mem*/false,/*strict*/true); arma::mat result = arma::mat(arrayout,pow(d,L),1, /*copy_aux_mem*/false,/*strict*/true); for (l=0;l From rupp at iue.tuwien.ac.at Wed Jun 25 05:42:31 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Wed, 25 Jun 2014 12:42:31 +0200 Subject: [petsc-users] Unexpected "Out of memory error" with SLEPC In-Reply-To: References: Message-ID: <53AAA797.5030808@iue.tuwien.ac.at> Hi Mathis, this looks very much like an integer overflow: http://www.mcs.anl.gov/petsc/documentation/faq.html#with-64-bit-indices Best regards, Karli On 06/25/2014 12:31 PM, Mathis Friesdorf wrote: > Dear all, > > after a very useful email exchange with Jed Brown quite a while ago, I > was able to find the lowest eigenvalue of a large matrix which is > constructed as a tensor product. Admittedly the solution is a bit > hacked, but is based on a Matrix shell and Armadillo and therefore > reasonably fast. The problem seems to work well for smaller systems, but > once the vectors reach a certain size, I get "out of memory" errors. I > have tested the initialization of a vector of that size and > multiplication by the matrix. This works fine and takes roughly 20GB of > memory. There are 256 GB available, so I see no reason why the esp > solvers should complain. Does anyone have an idea what goes wrong here? 
> The error message is not very helpful and claims that a memory is > requested that is way beyond any reasonable number: /Memory requested > 18446744056529684480./ > > Thanks and all the best, Mathis Friesdorf > > > *Output of the Program:* > /mathis at n180:~/localisation$ ./local_plus 27/ > /System Size: 27 > -------------------------------------------------------------------------- > [[30558,1],0]: A high-performance Open MPI point-to-point messaging module > was unable to find any relevant network interfaces: > > Module: OpenFabrics (openib) > Host: n180 > > Another transport will be used instead, although this may result in > lower performance. > -------------------------------------------------------------------------- > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 3221286704 Memory used by process > 3229827072 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 18446744056529684480! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./local_plus on a arch-linux2-cxx-debug named n180 by > mathis Wed Jun 25 12:23:01 2014 > [0]PETSC ERROR: Libraries linked from /home/mathis/bin_nodebug/lib > [0]PETSC ERROR: Configure run at Wed Jun 25 00:03:34 2014 > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mathis/petsc-3.4.4 > --with-debugging=1 COPTFLAGS="-O3 -march=p4 -mtune=p4" --with-fortran=0 > -with-mpi=1 --with-mpi-dir=/usr/lib/openmpi --with-clanguage=cxx > --prefix=/home/mathis/bin_nodebug > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 46 in > /home/mathis/petsc-3.4.4/src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 189 in > /home/mathis/petsc-3.4.4/src/sys/memory/mtr.c > [0]PETSC ERROR: VecDuplicateVecs_Contiguous() line 62 in > src/vec/contiguous.c > [0]PETSC ERROR: VecDuplicateVecs() line 589 in > /home/mathis/petsc-3.4.4/src/vec/vec/interface/vector.c > [0]PETSC ERROR: EPSAllocateSolution() line 51 in src/eps/interface/mem.c > [0]PETSC ERROR: EPSSetUp_KrylovSchur() line 141 in > src/eps/impls/krylov/krylovschur/krylovschur.c > [0]PETSC ERROR: EPSSetUp() line 147 in src/eps/interface/setup.c > [0]PETSC ERROR: EPSSolve() line 90 in src/eps/interface/solve.c > [0]PETSC ERROR: main() line 48 in local_plus.cpp > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 55. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. 
> -------------------------------------------------------------------------- > > / > *Output of make: > */mathis at n180:~/localisation$ make local_plus > mpicxx -o local_plus.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -g -fPIC > -I/home/mathis/armadillo-4.300.8/include -lblas -llapack > -L/home/mathis/armadillo-4.300.8 -O3 -larmadillo -fomit-frame-pointer > -I/home/mathis/bin_nodebug/include -I/home/mathis/bin_nodebug/include > -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi > -D__INSDIR__= -I/home/mathis/bin_nodebug > -I/home/mathis/bin_nodebug//include -I/home/mathis/bin_nodebug/include > local_plus.cpp > local_plus.cpp:22:0: warning: "__FUNCT__" redefined [enabled by default] > In file included from /home/mathis/bin_nodebug/include/petscvec.h:10:0, > from local_plus.cpp:10: > /home/mathis/bin_nodebug/include/petscviewer.h:386:0: note: this is the > location of the previous definition > g++ -o local_plus local_plus.o -Wl,-rpath,/home/mathis/bin_nodebug//lib > -L/home/mathis/bin_nodebug//lib -lslepc > -Wl,-rpath,/home/mathis/bin_nodebug/lib -L/home/mathis/bin_nodebug/lib > -lpetsc -llapack -lblas -lpthread -Wl,-rpath,/usr/lib/openmpi/lib > -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.7 > -L/usr/lib/gcc/x86_64-linux-gnu/4.7 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte > -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl > /bin/rm -f local_plus.o > / > *Code: > *///Author: Mathis Friesdorf > //MathisFriesdorf at gmail.com > > static char help[] = "1D chain\n"; > > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > PetscErrorCode BathMult(Mat H, Vec x,Vec y); > PetscInt L=30,d=2,dsys; > PetscErrorCode ierr; > arma::mat hint = "1.0 0 0 0.0; 0 -1.0 2.0 0; 0 2.0 -1.0 0; 0 0 0 1.0;"; > > #define __FUNCT__ "main" > int main(int argc, char **argv) > { > Mat H; > EPS eps; > Vec xr,xi; > PetscScalar kr,ki; > PetscInt j, nconv; > > L = strtol(argv[1],NULL,10); > dsys = pow(d,L); > printf("%s","System Size: "); > printf("%i",L); > printf("%s","\n"); > SlepcInitialize(&argc,&argv,(char*)0,help); > > MatCreateShell(PETSC_COMM_WORLD,dsys,dsys,dsys,dsys,NULL,&H); > MatShellSetOperation(H,MATOP_MULT,(void(*)())BathMult); > ierr = MatGetVecs(H,NULL,&xr); CHKERRQ(ierr); > ierr = MatGetVecs(H,NULL,&xi); CHKERRQ(ierr); > > ierr = EPSCreate(PETSC_COMM_WORLD, &eps); CHKERRQ(ierr); > ierr = EPSSetOperators(eps, H, NULL); CHKERRQ(ierr); > ierr = EPSSetProblemType(eps, EPS_HEP); CHKERRQ(ierr); > ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); CHKERRQ(ierr); > ierr = EPSSetFromOptions( eps ); CHKERRQ(ierr); > ierr = EPSSolve(eps); CHKERRQ(ierr); > ierr = EPSGetConverged(eps, &nconv); CHKERRQ(ierr); > for (j=0; j<1; j++) { > EPSGetEigenpair(eps, j, &kr, &ki, xr, xi); > printf("%s","Lowest Eigenvalue: "); > PetscPrintf(PETSC_COMM_WORLD,"%9F",kr); > PetscPrintf(PETSC_COMM_WORLD,"\n"); > } > EPSDestroy(&eps); > > ierr = SlepcFinalize(); > return 0; > } > #undef __FUNCT__ > > #define __FUNCT__ "BathMult" > PetscErrorCode BathMult(Mat H, Vec x, Vec y) > { > PetscInt l; > uint slice; > PetscScalar *arrayin,*arrayout; > > VecGetArray(x,&arrayin); > VecGetArray(y,&arrayout); > arma::cube A = arma::cube(arrayin,1,1,pow(d,L), > /*copy_aux_mem*/false,/*strict*/true); > arma::mat result = arma::mat(arrayout,pow(d,L),1, > /*copy_aux_mem*/false,/*strict*/true); > for (l=0;l 
A.reshape(pow(d,L-2-l),pow(d,2),pow(d,l)); > result.reshape(pow(d,L-l),pow(d,l)); > for (slice=0;slice result.col(slice) += vectorise(A.slice(slice)*hint); > } > } > arrayin = A.memptr(); > ierr = VecRestoreArray(x,&arrayin); CHKERRQ(ierr); > arrayout = result.memptr(); > ierr = VecRestoreArray(y,&arrayout); CHKERRQ(ierr); > PetscFunctionReturn(0); > } > #undef __FUNCT__/* > * From jansen.gunnar at gmail.com Wed Jun 25 08:44:23 2014 From: jansen.gunnar at gmail.com (Gunnar Jansen) Date: Wed, 25 Jun 2014 15:44:23 +0200 Subject: [petsc-users] Irritating behavior of MUMPS with PETSc Message-ID: Hi, i try to solve a problem in parallel with MUMPS as the direct solver. As long as I run the program on only 1 node with 6 processors everything works fine! But using 2 nodes with 3 processors each gets mumps stuck in the factorization. For the purpose of testing I run the ex2.c on a resolution of 100x100 (which is of course way to small for a direct solver in parallel). The code is run with : mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100 -mat_mumps_icntl_4 3 The petsc-configuration I used is: --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no --download-mumps --download-scalapack --download-parmetis --download-metis Is this common behavior? Or is there an error in the petsc configuration I am using here? Best, Gunnar -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Jun 25 08:52:39 2014 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 25 Jun 2014 15:52:39 +0200 Subject: [petsc-users] Irritating behavior of MUMPS with PETSc In-Reply-To: References: Message-ID: This sounds weird. The launch line you provided doesn't include any information regarding how many processors (nodes/nodes per core to use). I presume you are using a queuing system. My guess is that there could be an issue with either (i) your job script, (ii) the configuration of the job scheduler on the machine, or (iii) the mpi installation on the machine. Have you been able to successfully run other petsc (or any mpi) codes with the same launch options (2 nodes, 3 procs per node)? Cheers. Dave On 25 June 2014 15:44, Gunnar Jansen wrote: > Hi, > > i try to solve a problem in parallel with MUMPS as the direct solver. As > long as I run the program on only 1 node with 6 processors everything works > fine! But using 2 nodes with 3 processors each gets mumps stuck in the > factorization. > > For the purpose of testing I run the ex2.c on a resolution of 100x100 > (which is of course way to small for a direct solver in parallel). > > The code is run with : > mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package > mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100 > -mat_mumps_icntl_4 3 > > The petsc-configuration I used is: > --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes > --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no --download-mumps > --download-scalapack --download-parmetis --download-metis > > Is this common behavior? Or is there an error in the petsc configuration I > am using here? > > Best, > Gunnar > -------------- next part -------------- An HTML attachment was scrubbed... 
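[Editor's note: not part of the original thread. One quick way to narrow such a hang down, which is also suggested further down this thread, is to rerun the identical case with a different parallel direct solver such as SuperLU_DIST. This assumes PETSc was configured with --download-superlu_dist, which the configure line above does not include.]

mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_type preonly -log_summary -options_left -m 100 -n 100

If this run also hangs on two nodes, the problem is more likely in the MPI installation or the job launch than in MUMPS or its PETSc interface.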
URL: From jansen.gunnar at gmail.com Wed Jun 25 09:09:29 2014 From: jansen.gunnar at gmail.com (Gunnar Jansen) Date: Wed, 25 Jun 2014 16:09:29 +0200 Subject: [petsc-users] Irritating behavior of MUMPS with PETSc In-Reply-To: References: Message-ID: You are right about the queuing system. The job is submitted with a PBS script specifying the number of nodes/processors. On the cluster petsc is configured in a module environment which sets the appropriate flags for compilers/rules etc. The same exact job script on the same exact nodes with a standard krylov method does not give any trouble but executes nicely on all processors (and also give the correct result). Therefore my suspicion is a missing flag in the mumps interface. Is this maybe rather a topic for the mumps-dev team? Best, Gunnar 2014-06-25 15:52 GMT+02:00 Dave May : > This sounds weird. > > The launch line you provided doesn't include any information regarding how > many processors (nodes/nodes per core to use). I presume you are using a > queuing system. My guess is that there could be an issue with either (i) > your job script, (ii) the configuration of the job scheduler on the > machine, or (iii) the mpi installation on the machine. > > Have you been able to successfully run other petsc (or any mpi) codes with > the same launch options (2 nodes, 3 procs per node)? > > Cheers. > Dave > > > > > On 25 June 2014 15:44, Gunnar Jansen wrote: > >> Hi, >> >> i try to solve a problem in parallel with MUMPS as the direct solver. As >> long as I run the program on only 1 node with 6 processors everything works >> fine! But using 2 nodes with 3 processors each gets mumps stuck in the >> factorization. >> >> For the purpose of testing I run the ex2.c on a resolution of 100x100 >> (which is of course way to small for a direct solver in parallel). >> >> The code is run with : >> mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package >> mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100 >> -mat_mumps_icntl_4 3 >> >> The petsc-configuration I used is: >> --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes >> --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no --download-mumps >> --download-scalapack --download-parmetis --download-metis >> >> Is this common behavior? Or is there an error in the petsc configuration >> I am using here? >> >> Best, >> Gunnar >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jun 25 10:08:57 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Jun 2014 08:08:57 -0700 Subject: [petsc-users] Irritating behavior of MUMPS with PETSc In-Reply-To: References: Message-ID: On Wed, Jun 25, 2014 at 7:09 AM, Gunnar Jansen wrote: > You are right about the queuing system. The job is submitted with a PBS > script specifying the number of nodes/processors. On the cluster petsc is > configured in a module environment which sets the appropriate flags for > compilers/rules etc. > > The same exact job script on the same exact nodes with a standard krylov > method does not give any trouble but executes nicely on all processors (and > also give the correct result). > > Therefore my suspicion is a missing flag in the mumps interface. Is this > maybe rather a topic for the mumps-dev team? > I doubt this. The whole point of MPI is to shield code from these details. Can you first try this system with SuperLU_dist? Thanks, MAtt > Best, Gunnar > > > > 2014-06-25 15:52 GMT+02:00 Dave May : > > This sounds weird. 
>> >> The launch line you provided doesn't include any information regarding >> how many processors (nodes/nodes per core to use). I presume you are using >> a queuing system. My guess is that there could be an issue with either (i) >> your job script, (ii) the configuration of the job scheduler on the >> machine, or (iii) the mpi installation on the machine. >> >> Have you been able to successfully run other petsc (or any mpi) codes >> with the same launch options (2 nodes, 3 procs per node)? >> >> Cheers. >> Dave >> >> >> >> >> On 25 June 2014 15:44, Gunnar Jansen wrote: >> >>> Hi, >>> >>> i try to solve a problem in parallel with MUMPS as the direct solver. As >>> long as I run the program on only 1 node with 6 processors everything works >>> fine! But using 2 nodes with 3 processors each gets mumps stuck in the >>> factorization. >>> >>> For the purpose of testing I run the ex2.c on a resolution of 100x100 >>> (which is of course way to small for a direct solver in parallel). >>> >>> The code is run with : >>> mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package >>> mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100 >>> -mat_mumps_icntl_4 3 >>> >>> The petsc-configuration I used is: >>> --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes >>> --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no --download-mumps >>> --download-scalapack --download-parmetis --download-metis >>> >>> Is this common behavior? Or is there an error in the petsc configuration >>> I am using here? >>> >>> Best, >>> Gunnar >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Jun 25 10:17:55 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 25 Jun 2014 10:17:55 -0500 Subject: [petsc-users] Irritating behavior of MUMPS with PETSc In-Reply-To: References: Message-ID: Suggest running the non-mumps case with -log_summary [to confirm that '-np 6' is actually used in both cases] Secondly - you can try a 'release' version of openmpi or mpich and see if that works. [I don't see a mention of openmpi-1.9a on the website] Also you can try -log_trace to see where its hanging [or figure out how to run code in debugger on this cluster]. But that might not help in figuring out the solution to the hang.. Satish On Wed, 25 Jun 2014, Matthew Knepley wrote: > On Wed, Jun 25, 2014 at 7:09 AM, Gunnar Jansen > wrote: > > > You are right about the queuing system. The job is submitted with a PBS > > script specifying the number of nodes/processors. On the cluster petsc is > > configured in a module environment which sets the appropriate flags for > > compilers/rules etc. > > > > The same exact job script on the same exact nodes with a standard krylov > > method does not give any trouble but executes nicely on all processors (and > > also give the correct result). > > > > Therefore my suspicion is a missing flag in the mumps interface. Is this > > maybe rather a topic for the mumps-dev team? > > > > I doubt this. The whole point of MPI is to shield code from these details. > > Can you first try this system with SuperLU_dist? > > Thanks, > > MAtt > > > > Best, Gunnar > > > > > > > > 2014-06-25 15:52 GMT+02:00 Dave May : > > > > This sounds weird. 
> >> > >> The launch line you provided doesn't include any information regarding > >> how many processors (nodes/nodes per core to use). I presume you are using > >> a queuing system. My guess is that there could be an issue with either (i) > >> your job script, (ii) the configuration of the job scheduler on the > >> machine, or (iii) the mpi installation on the machine. > >> > >> Have you been able to successfully run other petsc (or any mpi) codes > >> with the same launch options (2 nodes, 3 procs per node)? > >> > >> Cheers. > >> Dave > >> > >> > >> > >> > >> On 25 June 2014 15:44, Gunnar Jansen wrote: > >> > >>> Hi, > >>> > >>> i try to solve a problem in parallel with MUMPS as the direct solver. As > >>> long as I run the program on only 1 node with 6 processors everything works > >>> fine! But using 2 nodes with 3 processors each gets mumps stuck in the > >>> factorization. > >>> > >>> For the purpose of testing I run the ex2.c on a resolution of 100x100 > >>> (which is of course way to small for a direct solver in parallel). > >>> > >>> The code is run with : > >>> mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package > >>> mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100 > >>> -mat_mumps_icntl_4 3 > >>> > >>> The petsc-configuration I used is: > >>> --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes > >>> --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no --download-mumps > >>> --download-scalapack --download-parmetis --download-metis > >>> > >>> Is this common behavior? Or is there an error in the petsc configuration > >>> I am using here? > >>> > >>> Best, > >>> Gunnar > >>> > >> > >> > > > > > From knepley at gmail.com Wed Jun 25 10:46:36 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Jun 2014 08:46:36 -0700 Subject: [petsc-users] Any examples to output a dmplex mesh as a hdf5 file? In-Reply-To: References: Message-ID: On Tue, Jun 24, 2014 at 7:50 AM, Fande Kong wrote: > Hi all, > > There are some functions called DMPlex_load_hdf5 and DMPlex_view_hdf5 in > petsc-dev. They are really good functions for outputting the solution as a > hdf5 file in parallel. Are there any examples to show how to use these > functions? Or are there some printed hdf5 and xdmf files that can be > visualized by paraview? > This is very new code. I plan to write a manual section as soon as the functionality solidifies. However, here is how I am currently using it. Anywhere that you think about viewing something add ierr = PetscObjectViewFromOptions((PetscObject) obj, prefix, "-my_obj_view");CHKERRQ(ierr); Then you can use the standard option style -my_obj_view hdf5:my.h5 This extends nicely to many objects. Here is what I use for my magma dynamics output -dm_view hdf5:sol_solver_debug.h5 -magma_view_solution hdf5:sol_solver_debug.h5::append -compaction_vec_view hdf5:sol_solver_debug.h5:HDF5_VIZ:append There is still a problem in that you cannot choose multiple formats using this method. I am going to extend the view options format type:file:format:mode to allow type:file:format,format,format:mode to handle this. Thanks, Matt > Thanks, > > Fande, > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
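To make the viewing pattern above a bit more concrete, here is a rough sketch of how those PetscObjectViewFromOptions() calls could sit in user code. The option names, file names and the helper function are placeholders invented for illustration; the HDF5 output itself still relies on the petsc-dev viewer code Matt mentions, so details may change as that functionality solidifies:

#include <petscdm.h>

PetscErrorCode DumpSolution(DM dm, Vec sol)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* Honors e.g.  -dm_view hdf5:sol.h5  on the command line */
  ierr = PetscObjectViewFromOptions((PetscObject) dm, NULL, "-dm_view");CHKERRQ(ierr);
  /* Honors e.g.  -sol_vec_view hdf5:sol.h5::append  to add the vector to the same file */
  ierr = PetscObjectViewFromOptions((PetscObject) sol, NULL, "-sol_vec_view");CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The resulting .h5 file is then what an XDMF wrapper would point at for ParaView, which is the use case in the original question.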
URL: From akurlej at gmail.com Wed Jun 25 15:10:37 2014 From: akurlej at gmail.com (Arthur Kurlej) Date: Wed, 25 Jun 2014 15:10:37 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? Message-ID: Hi all, While running my code, I have found that MatMult() returns different values depending on the number of processors I use (and there is quite the variance in the values). The setup of my code is as follows (I can go into more depth/background if needed): -Generate parallel AIJ matrix of size NxN, denoted as A -Retrieve parallel AIJ submatrix from the last N-1 rows&columns from A, denoted as C -Generate vector of length N-1, denoted as x -Find C*x=b I have already checked that A, C, and x are all equivalent when ran for any number of processors, it is only the values of vector b that varies. Does anyone have an idea about what's going on? Thanks, Arthur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jun 25 16:06:24 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 25 Jun 2014 16:06:24 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: References: Message-ID: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> How different are the values in b? Can you send back a few examples of the different b?s? Any idea of the condition number of C? Barry On Jun 25, 2014, at 3:10 PM, Arthur Kurlej wrote: > Hi all, > > While running my code, I have found that MatMult() returns different values depending on the number of processors I use (and there is quite the variance in the values). > > The setup of my code is as follows (I can go into more depth/background if needed): > -Generate parallel AIJ matrix of size NxN, denoted as A > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns from A, denoted as C > -Generate vector of length N-1, denoted as x > -Find C*x=b > > I have already checked that A, C, and x are all equivalent when ran for any number of processors, it is only the values of vector b that varies. > > Does anyone have an idea about what's going on? > > > Thanks, > Arthur > From akurlej at gmail.com Wed Jun 25 16:37:59 2014 From: akurlej at gmail.com (Arthur Kurlej) Date: Wed, 25 Jun 2014 16:37:59 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> Message-ID: Hi Barry, So for the matrix C that I am currently testing (size 162x162), the condition number is roughly 10^4. For reference, I'm porting MATLAB code into PETSc, and for one processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. So I know that for one processor, my program is performing as expected. I've included examples below of values for b (also of size 162), ranging from indices 131 to 141. 
#processors=1: 0 1.315217173959314e-20 1.315217173959314e-20 4.843201487740107e-17 4.843201487740107e-17 8.166104700666665e-14 8.166104700666665e-14 6.303834267553249e-11 6.303834267553249e-11 2.227932688485483e-08 2.227932688485483e-08 # processors=2: 5.480410831461926e-22 2.892553944350444e-22 2.892553944350444e-22 7.524038923310717e-24 7.524038923214420e-24 -3.340766769043093e-26 -7.558372155761972e-27 5.551561288838557e-25 5.550551546879874e-25 -1.579397982093437e-22 2.655766754178065e-22 # processors = 4: 5.480410831461926e-22 2.892553944351728e-22 2.892553944351728e-22 7.524092205125593e-24 7.524092205125593e-24 -2.584939414228212e-26 -2.584939414228212e-26 0 0 -1.245940797657998e-23 -1.245940797657998e-23 # processors = 8: 5.480410831461926e-22 2.892553944023035e-22 2.892553944023035e-22 7.524065744581494e-24 7.524065744581494e-24 -2.250265175188197e-26 -2.250265175188197e-26 -6.543127892265160e-26 1.544288143499193e-317 8.788794008375919e-25 8.788794008375919e-25 Thanks, Arthur On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith wrote: > > How different are the values in b? Can you send back a few examples of > the different b?s? Any idea of the condition number of C? > > Barry > > On Jun 25, 2014, at 3:10 PM, Arthur Kurlej wrote: > > > Hi all, > > > > While running my code, I have found that MatMult() returns different > values depending on the number of processors I use (and there is quite the > variance in the values). > > > > The setup of my code is as follows (I can go into more depth/background > if needed): > > -Generate parallel AIJ matrix of size NxN, denoted as A > > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns from A, > denoted as C > > -Generate vector of length N-1, denoted as x > > -Find C*x=b > > > > I have already checked that A, C, and x are all equivalent when ran for > any number of processors, it is only the values of vector b that varies. > > > > Does anyone have an idea about what's going on? > > > > > > Thanks, > > Arthur > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jun 25 18:24:05 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 25 Jun 2014 18:24:05 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> Message-ID: <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> Can you send the code that reproduces this behavior? Barry On Jun 25, 2014, at 4:37 PM, Arthur Kurlej wrote: > Hi Barry, > > So for the matrix C that I am currently testing (size 162x162), the condition number is roughly 10^4. > > For reference, I'm porting MATLAB code into PETSc, and for one processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. So I know that for one processor, my program is performing as expected. > > I've included examples below of values for b (also of size 162), ranging from indices 131 to 141. 
> > #processors=1: > 0 > 1.315217173959314e-20 > 1.315217173959314e-20 > 4.843201487740107e-17 > 4.843201487740107e-17 > 8.166104700666665e-14 > 8.166104700666665e-14 > 6.303834267553249e-11 > 6.303834267553249e-11 > 2.227932688485483e-08 > 2.227932688485483e-08 > > # processors=2: > 5.480410831461926e-22 > 2.892553944350444e-22 > 2.892553944350444e-22 > 7.524038923310717e-24 > 7.524038923214420e-24 > -3.340766769043093e-26 > -7.558372155761972e-27 > 5.551561288838557e-25 > 5.550551546879874e-25 > -1.579397982093437e-22 > 2.655766754178065e-22 > > # processors = 4: > 5.480410831461926e-22 > 2.892553944351728e-22 > 2.892553944351728e-22 > 7.524092205125593e-24 > 7.524092205125593e-24 > -2.584939414228212e-26 > -2.584939414228212e-26 > 0 > 0 > -1.245940797657998e-23 > -1.245940797657998e-23 > > # processors = 8: > 5.480410831461926e-22 > 2.892553944023035e-22 > 2.892553944023035e-22 > 7.524065744581494e-24 > 7.524065744581494e-24 > -2.250265175188197e-26 > -2.250265175188197e-26 > -6.543127892265160e-26 > 1.544288143499193e-317 > 8.788794008375919e-25 > 8.788794008375919e-25 > > > Thanks, > Arthur > > > > On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith wrote: > > How different are the values in b? Can you send back a few examples of the different b?s? Any idea of the condition number of C? > > Barry > > On Jun 25, 2014, at 3:10 PM, Arthur Kurlej wrote: > > > Hi all, > > > > While running my code, I have found that MatMult() returns different values depending on the number of processors I use (and there is quite the variance in the values). > > > > The setup of my code is as follows (I can go into more depth/background if needed): > > -Generate parallel AIJ matrix of size NxN, denoted as A > > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns from A, denoted as C > > -Generate vector of length N-1, denoted as x > > -Find C*x=b > > > > I have already checked that A, C, and x are all equivalent when ran for any number of processors, it is only the values of vector b that varies. > > > > Does anyone have an idea about what's going on? > > > > > > Thanks, > > Arthur > > > > From mathisfriesdorf at gmail.com Thu Jun 26 05:08:23 2014 From: mathisfriesdorf at gmail.com (Mathis Friesdorf) Date: Thu, 26 Jun 2014 12:08:23 +0200 Subject: [petsc-users] Unexpected "Out of memory error" with SLEPC In-Reply-To: <53AAA797.5030808@iue.tuwien.ac.at> References: <53AAA797.5030808@iue.tuwien.ac.at> Message-ID: Dear Karl, thanks a lot! This indeed worked. I recompiled PETSC with ./configure --with-64-bit-indices=1 PETSC_ARCH = arch-linux2-c-debug-int64 and the error is gone. Again thanks for your help! All the best, Mathis On Wed, Jun 25, 2014 at 12:42 PM, Karl Rupp wrote: > Hi Mathis, > > this looks very much like an integer overflow: > http://www.mcs.anl.gov/petsc/documentation/faq.html#with-64-bit-indices > > Best regards, > Karli > > > On 06/25/2014 12:31 PM, Mathis Friesdorf wrote: > >> Dear all, >> >> after a very useful email exchange with Jed Brown quite a while ago, I >> was able to find the lowest eigenvalue of a large matrix which is >> constructed as a tensor product. Admittedly the solution is a bit >> hacked, but is based on a Matrix shell and Armadillo and therefore >> reasonably fast. The problem seems to work well for smaller systems, but >> once the vectors reach a certain size, I get "out of memory" errors. I >> have tested the initialization of a vector of that size and >> multiplication by the matrix. This works fine and takes roughly 20GB of >> memory. 
There are 256 GB available, so I see no reason why the esp >> solvers should complain. Does anyone have an idea what goes wrong here? >> The error message is not very helpful and claims that a memory is >> requested that is way beyond any reasonable number: /Memory requested >> 18446744056529684480./ >> >> >> Thanks and all the best, Mathis Friesdorf >> >> >> *Output of the Program:* >> /mathis at n180:~/localisation$ ./local_plus 27/ >> /System Size: 27 >> >> ------------------------------------------------------------ >> -------------- >> [[30558,1],0]: A high-performance Open MPI point-to-point messaging module >> was unable to find any relevant network interfaces: >> >> Module: OpenFabrics (openib) >> Host: n180 >> >> Another transport will be used instead, although this may result in >> lower performance. >> ------------------------------------------------------------ >> -------------- >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Out of memory. This could be due to allocating >> [0]PETSC ERROR: too large an object or bleeding by not properly >> [0]PETSC ERROR: destroying unneeded objects. >> [0]PETSC ERROR: Memory allocated 3221286704 Memory used by process >> 3229827072 >> [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. >> [0]PETSC ERROR: Memory requested 18446744056529684480! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./local_plus on a arch-linux2-cxx-debug named n180 by >> mathis Wed Jun 25 12:23:01 2014 >> [0]PETSC ERROR: Libraries linked from /home/mathis/bin_nodebug/lib >> [0]PETSC ERROR: Configure run at Wed Jun 25 00:03:34 2014 >> [0]PETSC ERROR: Configure options PETSC_DIR=/home/mathis/petsc-3.4.4 >> --with-debugging=1 COPTFLAGS="-O3 -march=p4 -mtune=p4" --with-fortran=0 >> -with-mpi=1 --with-mpi-dir=/usr/lib/openmpi --with-clanguage=cxx >> --prefix=/home/mathis/bin_nodebug >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: PetscMallocAlign() line 46 in >> /home/mathis/petsc-3.4.4/src/sys/memory/mal.c >> [0]PETSC ERROR: PetscTrMallocDefault() line 189 in >> /home/mathis/petsc-3.4.4/src/sys/memory/mtr.c >> [0]PETSC ERROR: VecDuplicateVecs_Contiguous() line 62 in >> src/vec/contiguous.c >> [0]PETSC ERROR: VecDuplicateVecs() line 589 in >> /home/mathis/petsc-3.4.4/src/vec/vec/interface/vector.c >> [0]PETSC ERROR: EPSAllocateSolution() line 51 in src/eps/interface/mem.c >> [0]PETSC ERROR: EPSSetUp_KrylovSchur() line 141 in >> src/eps/impls/krylov/krylovschur/krylovschur.c >> [0]PETSC ERROR: EPSSetUp() line 147 in src/eps/interface/setup.c >> [0]PETSC ERROR: EPSSolve() line 90 in src/eps/interface/solve.c >> [0]PETSC ERROR: main() line 48 in local_plus.cpp >> ------------------------------------------------------------ >> -------------- >> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >> with errorcode 55. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. 
>> ------------------------------------------------------------ >> -------------- >> >> / >> *Output of make: >> */mathis at n180:~/localisation$ make local_plus >> >> mpicxx -o local_plus.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >> -Wno-unknown-pragmas -g -fPIC >> -I/home/mathis/armadillo-4.300.8/include -lblas -llapack >> -L/home/mathis/armadillo-4.300.8 -O3 -larmadillo -fomit-frame-pointer >> -I/home/mathis/bin_nodebug/include -I/home/mathis/bin_nodebug/include >> -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi >> -D__INSDIR__= -I/home/mathis/bin_nodebug >> -I/home/mathis/bin_nodebug//include -I/home/mathis/bin_nodebug/include >> local_plus.cpp >> local_plus.cpp:22:0: warning: "__FUNCT__" redefined [enabled by default] >> In file included from /home/mathis/bin_nodebug/include/petscvec.h:10:0, >> from local_plus.cpp:10: >> /home/mathis/bin_nodebug/include/petscviewer.h:386:0: note: this is the >> location of the previous definition >> g++ -o local_plus local_plus.o -Wl,-rpath,/home/mathis/bin_nodebug//lib >> -L/home/mathis/bin_nodebug//lib -lslepc >> -Wl,-rpath,/home/mathis/bin_nodebug/lib -L/home/mathis/bin_nodebug/lib >> -lpetsc -llapack -lblas -lpthread -Wl,-rpath,/usr/lib/openmpi/lib >> -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.7 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.7 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte >> -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl >> /bin/rm -f local_plus.o >> / >> *Code: >> *///Author: Mathis Friesdorf >> //MathisFriesdorf at gmail.com >> >> >> static char help[] = "1D chain\n"; >> >> #include >> #include >> #include >> #include >> #include >> #include >> #include >> #include >> #include >> #include >> >> PetscErrorCode BathMult(Mat H, Vec x,Vec y); >> PetscInt L=30,d=2,dsys; >> PetscErrorCode ierr; >> arma::mat hint = "1.0 0 0 0.0; 0 -1.0 2.0 0; 0 2.0 -1.0 0; 0 0 0 1.0;"; >> >> #define __FUNCT__ "main" >> int main(int argc, char **argv) >> { >> Mat H; >> EPS eps; >> Vec xr,xi; >> PetscScalar kr,ki; >> PetscInt j, nconv; >> >> L = strtol(argv[1],NULL,10); >> dsys = pow(d,L); >> printf("%s","System Size: "); >> printf("%i",L); >> printf("%s","\n"); >> SlepcInitialize(&argc,&argv,(char*)0,help); >> >> MatCreateShell(PETSC_COMM_WORLD,dsys,dsys,dsys,dsys,NULL,&H); >> MatShellSetOperation(H,MATOP_MULT,(void(*)())BathMult); >> ierr = MatGetVecs(H,NULL,&xr); CHKERRQ(ierr); >> ierr = MatGetVecs(H,NULL,&xi); CHKERRQ(ierr); >> >> ierr = EPSCreate(PETSC_COMM_WORLD, &eps); CHKERRQ(ierr); >> ierr = EPSSetOperators(eps, H, NULL); CHKERRQ(ierr); >> ierr = EPSSetProblemType(eps, EPS_HEP); CHKERRQ(ierr); >> ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); CHKERRQ(ierr); >> ierr = EPSSetFromOptions( eps ); CHKERRQ(ierr); >> ierr = EPSSolve(eps); CHKERRQ(ierr); >> ierr = EPSGetConverged(eps, &nconv); CHKERRQ(ierr); >> for (j=0; j<1; j++) { >> EPSGetEigenpair(eps, j, &kr, &ki, xr, xi); >> printf("%s","Lowest Eigenvalue: "); >> PetscPrintf(PETSC_COMM_WORLD,"%9F",kr); >> PetscPrintf(PETSC_COMM_WORLD,"\n"); >> } >> EPSDestroy(&eps); >> >> ierr = SlepcFinalize(); >> return 0; >> } >> #undef __FUNCT__ >> >> #define __FUNCT__ "BathMult" >> PetscErrorCode BathMult(Mat H, Vec x, Vec y) >> { >> PetscInt l; >> uint slice; >> PetscScalar *arrayin,*arrayout; >> >> VecGetArray(x,&arrayin); >> VecGetArray(y,&arrayout); >> arma::cube A = arma::cube(arrayin,1,1,pow(d,L), >> /*copy_aux_mem*/false,/*strict*/true); >> 
arma::mat result = arma::mat(arrayout,pow(d,L),1, >> /*copy_aux_mem*/false,/*strict*/true); >> for (l=0;l> A.reshape(pow(d,L-2-l),pow(d,2),pow(d,l)); >> result.reshape(pow(d,L-l),pow(d,l)); >> for (slice=0;slice> result.col(slice) += vectorise(A.slice(slice)*hint); >> } >> } >> arrayin = A.memptr(); >> ierr = VecRestoreArray(x,&arrayin); CHKERRQ(ierr); >> arrayout = result.memptr(); >> ierr = VecRestoreArray(y,&arrayout); CHKERRQ(ierr); >> PetscFunctionReturn(0); >> } >> #undef __FUNCT__/* >> * >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jansen.gunnar at gmail.com Thu Jun 26 07:37:17 2014 From: jansen.gunnar at gmail.com (Gunnar Jansen) Date: Thu, 26 Jun 2014 14:37:17 +0200 Subject: [petsc-users] Irritating behavior of MUMPS with PETSc In-Reply-To: References: Message-ID: Ok, I tried superlu_dist as well. Unfortunately the system seems to hang at more or less the same position. Sadly I can not check another version of openmpi since only this version is installed on the cluster at the time (which it needs to be because of CUDA for other programmers). The -info command told me that the processes were successfully started on both nodes. In the GMRES case this also leads to a clean run-through of the program. The -log_trace tells me that the problem occurs within the numeric factorization of the matrix. [5] 0.00311184 Event begin: MatLUFactorSym [1] 0.0049789 Event begin: MatLUFactorSym [3] 0.00316596 Event begin: MatLUFactorSym [4] 0.00345397 Event begin: MatLUFactorSym [0] 0.00546789 Event end: MatLUFactorSym [0] 0.0054841 Event begin: MatLUFactorNum [2] 0.00545907 Event end: MatLUFactorSym [2] 0.005476 Event begin: MatLUFactorNum [1] 0.00542402 Event end: MatLUFactorSym [1] 0.00544 Event begin: MatLUFactorNum [4] 0.00369906 Event end: MatLUFactorSym [4] 0.00372505 Event begin: MatLUFactorNum [3] 0.00371909 Event end: MatLUFactorSym [3] 0.00374603 Event begin: MatLUFactorNum [5] 0.00367594 Event end: MatLUFactorSym [5] 0.00370193 Event begin: MatLUFactorNum Any hints? 2014-06-25 17:17 GMT+02:00 Satish Balay : > Suggest running the non-mumps case with -log_summary [to confirm that > '-np 6' is actually used in both cases] > > Secondly - you can try a 'release' version of openmpi or mpich and see > if that works. [I don't see a mention of openmpi-1.9a on the website] > > Also you can try -log_trace to see where its hanging [or figure out how > to run code in debugger on this cluster]. But that might not help in > figuring out the solution to the hang.. > > Satish > > On Wed, 25 Jun 2014, Matthew Knepley wrote: > > > On Wed, Jun 25, 2014 at 7:09 AM, Gunnar Jansen > > wrote: > > > > > You are right about the queuing system. The job is submitted with a PBS > > > script specifying the number of nodes/processors. On the cluster petsc > is > > > configured in a module environment which sets the appropriate flags for > > > compilers/rules etc. > > > > > > The same exact job script on the same exact nodes with a standard > krylov > > > method does not give any trouble but executes nicely on all processors > (and > > > also give the correct result). > > > > > > Therefore my suspicion is a missing flag in the mumps interface. Is > this > > > maybe rather a topic for the mumps-dev team? > > > > > > > I doubt this. The whole point of MPI is to shield code from these > details. > > > > Can you first try this system with SuperLU_dist? 
> > > > > Thanks, > > > > MAtt > > > > > > > Best, Gunnar > > > > > > > > > > > > 2014-06-25 15:52 GMT+02:00 Dave May : > > > > > > This sounds weird. > > >> > > >> The launch line you provided doesn't include any information regarding > > >> how many processors (nodes/nodes per core to use). I presume you are > using > > >> a queuing system. My guess is that there could be an issue with > either (i) > > >> your job script, (ii) the configuration of the job scheduler on the > > >> machine, or (iii) the mpi installation on the machine. > > >> > > >> Have you been able to successfully run other petsc (or any mpi) codes > > >> with the same launch options (2 nodes, 3 procs per node)? > > >> > > >> Cheers. > > >> Dave > > >> > > >> > > >> > > >> > > >> On 25 June 2014 15:44, Gunnar Jansen wrote: > > >> > > >>> Hi, > > >>> > > >>> i try to solve a problem in parallel with MUMPS as the direct > solver. As > > >>> long as I run the program on only 1 node with 6 processors > everything works > > >>> fine! But using 2 nodes with 3 processors each gets mumps stuck in > the > > >>> factorization. > > >>> > > >>> For the purpose of testing I run the ex2.c on a resolution of 100x100 > > >>> (which is of course way to small for a direct solver in parallel). > > >>> > > >>> The code is run with : > > >>> mpirun ./ex2 -on_error_abort -pc_type lu > -pc_factor_mat_solver_package > > >>> mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100 > > >>> -mat_mumps_icntl_4 3 > > >>> > > >>> The petsc-configuration I used is: > > >>> --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes > > >>> --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no > --download-mumps > > >>> --download-scalapack --download-parmetis --download-metis > > >>> > > >>> Is this common behavior? Or is there an error in the petsc > configuration > > >>> I am using here? > > >>> > > >>> Best, > > >>> Gunnar > > >>> > > >> > > >> > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jun 26 12:37:28 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 26 Jun 2014 12:37:28 -0500 Subject: [petsc-users] Irritating behavior of MUMPS with PETSc In-Reply-To: References: Message-ID: <3770D890-E65B-4539-9D90-44BD1E91A79A@mcs.anl.gov> -display -start_in_debugger -debugger_nodes 1 for example just have the debugger run on node one. The tricky part is setting up the X windows so that the compute node can open the xterm back on your machine Then when the program ?hangs? hit control C in the debugger window and then type where and look around to see why it might be hanging. Barry On Jun 26, 2014, at 7:37 AM, Gunnar Jansen wrote: > Ok, I tried superlu_dist as well. Unfortunately the system seems to hang at more or less the same position. > > Sadly I can not check another version of openmpi since only this version is installed on the cluster at the time (which it needs to be because of CUDA for other programmers). > > The -info command told me that the processes were successfully started on both nodes. In the GMRES case this also leads to a clean run-through of the program. > > The -log_trace tells me that the problem occurs within the numeric factorization of the matrix. 
> > [5] 0.00311184 Event begin: MatLUFactorSym > [1] 0.0049789 Event begin: MatLUFactorSym > [3] 0.00316596 Event begin: MatLUFactorSym > [4] 0.00345397 Event begin: MatLUFactorSym > [0] 0.00546789 Event end: MatLUFactorSym > [0] 0.0054841 Event begin: MatLUFactorNum > [2] 0.00545907 Event end: MatLUFactorSym > [2] 0.005476 Event begin: MatLUFactorNum > [1] 0.00542402 Event end: MatLUFactorSym > [1] 0.00544 Event begin: MatLUFactorNum > [4] 0.00369906 Event end: MatLUFactorSym > [4] 0.00372505 Event begin: MatLUFactorNum > [3] 0.00371909 Event end: MatLUFactorSym > [3] 0.00374603 Event begin: MatLUFactorNum > [5] 0.00367594 Event end: MatLUFactorSym > [5] 0.00370193 Event begin: MatLUFactorNum > > Any hints? > > > > > 2014-06-25 17:17 GMT+02:00 Satish Balay : > Suggest running the non-mumps case with -log_summary [to confirm that > '-np 6' is actually used in both cases] > > Secondly - you can try a 'release' version of openmpi or mpich and see > if that works. [I don't see a mention of openmpi-1.9a on the website] > > Also you can try -log_trace to see where its hanging [or figure out how > to run code in debugger on this cluster]. But that might not help in > figuring out the solution to the hang.. > > Satish > > On Wed, 25 Jun 2014, Matthew Knepley wrote: > > > On Wed, Jun 25, 2014 at 7:09 AM, Gunnar Jansen > > wrote: > > > > > You are right about the queuing system. The job is submitted with a PBS > > > script specifying the number of nodes/processors. On the cluster petsc is > > > configured in a module environment which sets the appropriate flags for > > > compilers/rules etc. > > > > > > The same exact job script on the same exact nodes with a standard krylov > > > method does not give any trouble but executes nicely on all processors (and > > > also give the correct result). > > > > > > Therefore my suspicion is a missing flag in the mumps interface. Is this > > > maybe rather a topic for the mumps-dev team? > > > > > > > I doubt this. The whole point of MPI is to shield code from these details. > > > > Can you first try this system with SuperLU_dist? > > > > > Thanks, > > > > MAtt > > > > > > > Best, Gunnar > > > > > > > > > > > > 2014-06-25 15:52 GMT+02:00 Dave May : > > > > > > This sounds weird. > > >> > > >> The launch line you provided doesn't include any information regarding > > >> how many processors (nodes/nodes per core to use). I presume you are using > > >> a queuing system. My guess is that there could be an issue with either (i) > > >> your job script, (ii) the configuration of the job scheduler on the > > >> machine, or (iii) the mpi installation on the machine. > > >> > > >> Have you been able to successfully run other petsc (or any mpi) codes > > >> with the same launch options (2 nodes, 3 procs per node)? > > >> > > >> Cheers. > > >> Dave > > >> > > >> > > >> > > >> > > >> On 25 June 2014 15:44, Gunnar Jansen wrote: > > >> > > >>> Hi, > > >>> > > >>> i try to solve a problem in parallel with MUMPS as the direct solver. As > > >>> long as I run the program on only 1 node with 6 processors everything works > > >>> fine! But using 2 nodes with 3 processors each gets mumps stuck in the > > >>> factorization. > > >>> > > >>> For the purpose of testing I run the ex2.c on a resolution of 100x100 > > >>> (which is of course way to small for a direct solver in parallel). 
> > >>> > > >>> The code is run with : > > >>> mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package > > >>> mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100 > > >>> -mat_mumps_icntl_4 3 > > >>> > > >>> The petsc-configuration I used is: > > >>> --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes > > >>> --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no --download-mumps > > >>> --download-scalapack --download-parmetis --download-metis > > >>> > > >>> Is this common behavior? Or is there an error in the petsc configuration > > >>> I am using here? > > >>> > > >>> Best, > > >>> Gunnar > > >>> > > >> > > >> > > > > > > > > > > > From mc0710 at gmail.com Thu Jun 26 13:19:40 2014 From: mc0710 at gmail.com (Mani Chandra) Date: Thu, 26 Jun 2014 13:19:40 -0500 Subject: [petsc-users] Computing time derivatives in the residual for TS Message-ID: Hi, Suppose the following is the residual function that TS needs: void residualFunction(TS ts, PetscScalar t, Vec X, Vec dX_dt, Vec F, void *ptr) and this returns the following finite volume residual at each grid point dU_dt + gradF + sourceTerms = 0 where dU_dt are the time derivatives of the conserved variables. The value of dU_dt needs to be computed from the values of the time derivatives of the primitive variables given in dX_dt, i.e. dU_dt is an analytic function of dX_dt. But is it possible to compute dU_dt numerically using the vector X and it's value at a previous time step like the following? dU_dt = (U(X) - U(X_old) )/dt -- (1) (instead of analytically computing dU_dt which is a function of the vectors X and dX_dt which is hard to do if the function is very complicated) So, is computing dU_dt using (1) permissible using TS? The jacobian will be correctly assembled for implicit methods? Cheers, Mani -------------- next part -------------- An HTML attachment was scrubbed... URL: From akurlej at gmail.com Thu Jun 26 16:26:46 2014 From: akurlej at gmail.com (Arthur Kurlej) Date: Thu, 26 Jun 2014 16:26:46 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> Message-ID: I cannot send the original code, but I reproduced the problem in another code. I have attached a makefile the code, and the data for the x vector and A matrix. I think the problem may be with my ShortenMatrix function, but it's not clear to me what exactly is going wrong and how to fix it. So I would appreciate some assistance there. Thanks, Arthur On Wed, Jun 25, 2014 at 6:24 PM, Barry Smith wrote: > > Can you send the code that reproduces this behavior? > > Barry > > On Jun 25, 2014, at 4:37 PM, Arthur Kurlej wrote: > > > Hi Barry, > > > > So for the matrix C that I am currently testing (size 162x162), the > condition number is roughly 10^4. > > > > For reference, I'm porting MATLAB code into PETSc, and for one > processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. > So I know that for one processor, my program is performing as expected. > > > > I've included examples below of values for b (also of size 162), ranging > from indices 131 to 141. 
> > > > #processors=1: > > 0 > > 1.315217173959314e-20 > > 1.315217173959314e-20 > > 4.843201487740107e-17 > > 4.843201487740107e-17 > > 8.166104700666665e-14 > > 8.166104700666665e-14 > > 6.303834267553249e-11 > > 6.303834267553249e-11 > > 2.227932688485483e-08 > > 2.227932688485483e-08 > > > > # processors=2: > > 5.480410831461926e-22 > > 2.892553944350444e-22 > > 2.892553944350444e-22 > > 7.524038923310717e-24 > > 7.524038923214420e-24 > > -3.340766769043093e-26 > > -7.558372155761972e-27 > > 5.551561288838557e-25 > > 5.550551546879874e-25 > > -1.579397982093437e-22 > > 2.655766754178065e-22 > > > > # processors = 4: > > 5.480410831461926e-22 > > 2.892553944351728e-22 > > 2.892553944351728e-22 > > 7.524092205125593e-24 > > 7.524092205125593e-24 > > -2.584939414228212e-26 > > -2.584939414228212e-26 > > 0 > > 0 > > -1.245940797657998e-23 > > -1.245940797657998e-23 > > > > # processors = 8: > > 5.480410831461926e-22 > > 2.892553944023035e-22 > > 2.892553944023035e-22 > > 7.524065744581494e-24 > > 7.524065744581494e-24 > > -2.250265175188197e-26 > > -2.250265175188197e-26 > > -6.543127892265160e-26 > > 1.544288143499193e-317 > > 8.788794008375919e-25 > > 8.788794008375919e-25 > > > > > > Thanks, > > Arthur > > > > > > > > On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith wrote: > > > > How different are the values in b? Can you send back a few examples > of the different b?s? Any idea of the condition number of C? > > > > Barry > > > > On Jun 25, 2014, at 3:10 PM, Arthur Kurlej wrote: > > > > > Hi all, > > > > > > While running my code, I have found that MatMult() returns different > values depending on the number of processors I use (and there is quite the > variance in the values). > > > > > > The setup of my code is as follows (I can go into more > depth/background if needed): > > > -Generate parallel AIJ matrix of size NxN, denoted as A > > > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns from > A, denoted as C > > > -Generate vector of length N-1, denoted as x > > > -Find C*x=b > > > > > > I have already checked that A, C, and x are all equivalent when ran > for any number of processors, it is only the values of vector b that varies. > > > > > > Does anyone have an idea about what's going on? > > > > > > > > > Thanks, > > > Arthur > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matmult_problem.zip Type: application/zip Size: 4768 bytes Desc: not available URL: From prbrune at gmail.com Thu Jun 26 17:32:25 2014 From: prbrune at gmail.com (Peter Brune) Date: Thu, 26 Jun 2014 17:32:25 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> Message-ID: MatGetSubMatrix() is collective on Mat and ISCreateXXX is collective on the provided comm, so the logic you have built to call it on one proc at a time is unnecessary at best and most likely incorrect and likely to produce strange results. You can forgo the if statement and loop over processors, create the ISes on the same comm as x, and then call MatGetSubMatrix() once. - Peter On Thu, Jun 26, 2014 at 4:26 PM, Arthur Kurlej wrote: > I cannot send the original code, but I reproduced the problem in another > code. I have attached a makefile the code, and the data for the x vector > and A matrix. 
> > I think the problem may be with my ShortenMatrix function, but it's not > clear to me what exactly is going wrong and how to fix it. So I would > appreciate some assistance there. > > > Thanks, > Arthur > > > > On Wed, Jun 25, 2014 at 6:24 PM, Barry Smith wrote: > >> >> Can you send the code that reproduces this behavior? >> >> Barry >> >> On Jun 25, 2014, at 4:37 PM, Arthur Kurlej wrote: >> >> > Hi Barry, >> > >> > So for the matrix C that I am currently testing (size 162x162), the >> condition number is roughly 10^4. >> > >> > For reference, I'm porting MATLAB code into PETSc, and for one >> processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. >> So I know that for one processor, my program is performing as expected. >> > >> > I've included examples below of values for b (also of size 162), >> ranging from indices 131 to 141. >> > >> > #processors=1: >> > 0 >> > 1.315217173959314e-20 >> > 1.315217173959314e-20 >> > 4.843201487740107e-17 >> > 4.843201487740107e-17 >> > 8.166104700666665e-14 >> > 8.166104700666665e-14 >> > 6.303834267553249e-11 >> > 6.303834267553249e-11 >> > 2.227932688485483e-08 >> > 2.227932688485483e-08 >> > >> > # processors=2: >> > 5.480410831461926e-22 >> > 2.892553944350444e-22 >> > 2.892553944350444e-22 >> > 7.524038923310717e-24 >> > 7.524038923214420e-24 >> > -3.340766769043093e-26 >> > -7.558372155761972e-27 >> > 5.551561288838557e-25 >> > 5.550551546879874e-25 >> > -1.579397982093437e-22 >> > 2.655766754178065e-22 >> > >> > # processors = 4: >> > 5.480410831461926e-22 >> > 2.892553944351728e-22 >> > 2.892553944351728e-22 >> > 7.524092205125593e-24 >> > 7.524092205125593e-24 >> > -2.584939414228212e-26 >> > -2.584939414228212e-26 >> > 0 >> > 0 >> > -1.245940797657998e-23 >> > -1.245940797657998e-23 >> > >> > # processors = 8: >> > 5.480410831461926e-22 >> > 2.892553944023035e-22 >> > 2.892553944023035e-22 >> > 7.524065744581494e-24 >> > 7.524065744581494e-24 >> > -2.250265175188197e-26 >> > -2.250265175188197e-26 >> > -6.543127892265160e-26 >> > 1.544288143499193e-317 >> > 8.788794008375919e-25 >> > 8.788794008375919e-25 >> > >> > >> > Thanks, >> > Arthur >> > >> > >> > >> > On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith >> wrote: >> > >> > How different are the values in b? Can you send back a few examples >> of the different b?s? Any idea of the condition number of C? >> > >> > Barry >> > >> > On Jun 25, 2014, at 3:10 PM, Arthur Kurlej wrote: >> > >> > > Hi all, >> > > >> > > While running my code, I have found that MatMult() returns different >> values depending on the number of processors I use (and there is quite the >> variance in the values). >> > > >> > > The setup of my code is as follows (I can go into more >> depth/background if needed): >> > > -Generate parallel AIJ matrix of size NxN, denoted as A >> > > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns from >> A, denoted as C >> > > -Generate vector of length N-1, denoted as x >> > > -Find C*x=b >> > > >> > > I have already checked that A, C, and x are all equivalent when ran >> for any number of processors, it is only the values of vector b that varies. >> > > >> > > Does anyone have an idea about what's going on? >> > > >> > > >> > > Thanks, >> > > Arthur >> > > >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
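For what it is worth, a self-contained sketch of the collective pattern Peter describes might look like the code below. The function name and the 'begin' offset (1, to drop the first row and column) are modeled on Arthur's snippet purely for illustration and are not his actual ShortenMatrix code; the key point is that every rank makes each call exactly once, with the index sets living on the same communicator as x:

#include <petscmat.h>

PetscErrorCode ShortenMatrixCollective(Mat A, Vec x, PetscInt begin, Mat *AA)
{
  PetscErrorCode ierr;
  PetscInt       length, low;
  IS             isrow, iscol;

  PetscFunctionBeginUser;
  ierr = VecGetLocalSize(x, &length);CHKERRQ(ierr);
  ierr = VecGetOwnershipRange(x, &low, NULL);CHKERRQ(ierr);
  /* Each rank contributes only its own (shifted) range, but the index sets are
     created collectively on the communicator of x */
  ierr = ISCreateStride(PetscObjectComm((PetscObject) x), length, low + begin, 1, &isrow);CHKERRQ(ierr);
  ierr = ISCreateStride(PetscObjectComm((PetscObject) x), length, low + begin, 1, &iscol);CHKERRQ(ierr);
  /* One collective call on all ranks, with no branching on the rank or on the number of processes */
  ierr = MatGetSubMatrix(A, isrow, iscol, MAT_INITIAL_MATRIX, AA);CHKERRQ(ierr);
  ierr = ISDestroy(&isrow);CHKERRQ(ierr);
  ierr = ISDestroy(&iscol);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}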
URL: From fd.kong at siat.ac.cn Thu Jun 26 20:52:12 2014 From: fd.kong at siat.ac.cn (Fande Kong) Date: Fri, 27 Jun 2014 09:52:12 +0800 Subject: [petsc-users] Any examples to output a dmplex mesh as a hdf5 file? In-Reply-To: References: Message-ID: Hi Matt, Is it possible to send me some printed hdf5 and xdmf files (representing a simple mesh) that can be visualized by Paraview? Similarly like that printed by pylith. From these files I think I can figure out how to write a hdf5 viewer by myself. Sorry for bothering you. Thanks, On Wed, Jun 25, 2014 at 11:46 PM, Matthew Knepley wrote: > On Tue, Jun 24, 2014 at 7:50 AM, Fande Kong wrote: > >> Hi all, >> >> There are some functions called DMPlex_load_hdf5 and DMPlex_view_hdf5 in >> petsc-dev. They are really good functions for outputting the solution as a >> hdf5 file in parallel. Are there any examples to show how to use these >> functions? Or are there some printed hdf5 and xdmf files that can be >> visualized by paraview? >> > > This is very new code. I plan to write a manual section as soon as the > functionality solidifies. However, here is how I am currently using it. > Anywhere that you think about viewing something add > > ierr = PetscObjectViewFromOptions((PetscObject) obj, prefix, > "-my_obj_view");CHKERRQ(ierr); > > Then you can use the standard option style > > -my_obj_view hdf5:my.h5 > > This extends nicely to many objects. Here is what I use for my magma > dynamics output > > -dm_view hdf5:sol_solver_debug.h5 -magma_view_solution > hdf5:sol_solver_debug.h5::append -compaction_vec_view > hdf5:sol_solver_debug.h5:HDF5_VIZ:append > > There is still a problem in that you cannot choose multiple formats using > this method. I am going to extend > the view options format > > type:file:format:mode > > to allow > > type:file:format,format,format:mode > > to handle this. > > Thanks, > > Matt > > >> Thanks, >> >> Fande, >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From akurlej at gmail.com Thu Jun 26 20:56:13 2014 From: akurlej at gmail.com (Arthur Kurlej) Date: Thu, 26 Jun 2014 20:56:13 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> Message-ID: Hello, I'm sorry, but this is a bit new for me, and I'm still not quite sure I follow. Are you recommending that opposed to doing this: if(procs!=1){ for(i=0;i wrote: > MatGetSubMatrix() is collective on Mat and ISCreateXXX is collective on > the provided comm, so the logic you have built to call it on one proc at a > time is unnecessary at best and most likely incorrect and likely to produce > strange results. You can forgo the if statement and loop over processors, > create the ISes on the same comm as x, and then call MatGetSubMatrix() once. > > - Peter > > > On Thu, Jun 26, 2014 at 4:26 PM, Arthur Kurlej wrote: > >> I cannot send the original code, but I reproduced the problem in another >> code. I have attached a makefile the code, and the data for the x vector >> and A matrix. >> >> I think the problem may be with my ShortenMatrix function, but it's not >> clear to me what exactly is going wrong and how to fix it. So I would >> appreciate some assistance there. 
>> >> >> Thanks, >> Arthur >> >> >> >> On Wed, Jun 25, 2014 at 6:24 PM, Barry Smith wrote: >> >>> >>> Can you send the code that reproduces this behavior? >>> >>> Barry >>> >>> On Jun 25, 2014, at 4:37 PM, Arthur Kurlej wrote: >>> >>> > Hi Barry, >>> > >>> > So for the matrix C that I am currently testing (size 162x162), the >>> condition number is roughly 10^4. >>> > >>> > For reference, I'm porting MATLAB code into PETSc, and for one >>> processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. >>> So I know that for one processor, my program is performing as expected. >>> > >>> > I've included examples below of values for b (also of size 162), >>> ranging from indices 131 to 141. >>> > >>> > #processors=1: >>> > 0 >>> > 1.315217173959314e-20 >>> > 1.315217173959314e-20 >>> > 4.843201487740107e-17 >>> > 4.843201487740107e-17 >>> > 8.166104700666665e-14 >>> > 8.166104700666665e-14 >>> > 6.303834267553249e-11 >>> > 6.303834267553249e-11 >>> > 2.227932688485483e-08 >>> > 2.227932688485483e-08 >>> > >>> > # processors=2: >>> > 5.480410831461926e-22 >>> > 2.892553944350444e-22 >>> > 2.892553944350444e-22 >>> > 7.524038923310717e-24 >>> > 7.524038923214420e-24 >>> > -3.340766769043093e-26 >>> > -7.558372155761972e-27 >>> > 5.551561288838557e-25 >>> > 5.550551546879874e-25 >>> > -1.579397982093437e-22 >>> > 2.655766754178065e-22 >>> > >>> > # processors = 4: >>> > 5.480410831461926e-22 >>> > 2.892553944351728e-22 >>> > 2.892553944351728e-22 >>> > 7.524092205125593e-24 >>> > 7.524092205125593e-24 >>> > -2.584939414228212e-26 >>> > -2.584939414228212e-26 >>> > 0 >>> > 0 >>> > -1.245940797657998e-23 >>> > -1.245940797657998e-23 >>> > >>> > # processors = 8: >>> > 5.480410831461926e-22 >>> > 2.892553944023035e-22 >>> > 2.892553944023035e-22 >>> > 7.524065744581494e-24 >>> > 7.524065744581494e-24 >>> > -2.250265175188197e-26 >>> > -2.250265175188197e-26 >>> > -6.543127892265160e-26 >>> > 1.544288143499193e-317 >>> > 8.788794008375919e-25 >>> > 8.788794008375919e-25 >>> > >>> > >>> > Thanks, >>> > Arthur >>> > >>> > >>> > >>> > On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith >>> wrote: >>> > >>> > How different are the values in b? Can you send back a few examples >>> of the different b?s? Any idea of the condition number of C? >>> > >>> > Barry >>> > >>> > On Jun 25, 2014, at 3:10 PM, Arthur Kurlej wrote: >>> > >>> > > Hi all, >>> > > >>> > > While running my code, I have found that MatMult() returns different >>> values depending on the number of processors I use (and there is quite the >>> variance in the values). >>> > > >>> > > The setup of my code is as follows (I can go into more >>> depth/background if needed): >>> > > -Generate parallel AIJ matrix of size NxN, denoted as A >>> > > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns from >>> A, denoted as C >>> > > -Generate vector of length N-1, denoted as x >>> > > -Find C*x=b >>> > > >>> > > I have already checked that A, C, and x are all equivalent when ran >>> for any number of processors, it is only the values of vector b that varies. >>> > > >>> > > Does anyone have an idea about what's going on? >>> > > >>> > > >>> > > Thanks, >>> > > Arthur >>> > > >>> > >>> > >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From prbrune at gmail.com Thu Jun 26 21:05:03 2014 From: prbrune at gmail.com (Peter Brune) Date: Thu, 26 Jun 2014 21:05:03 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? 
In-Reply-To: References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> Message-ID: Yep! Looks good. One other thing is that you can pre/postface all PETSc calls with ierr = ... CHKERRQ(ierr); and it will properly trace other problems if they arise. - Peter On Thu, Jun 26, 2014 at 8:56 PM, Arthur Kurlej wrote: > Hello, > > I'm sorry, but this is a bit new for me, and I'm still not quite sure I > follow. > > Are you recommending that opposed to doing this: > if(procs!=1){ > for(i=0;i if(rank==i){ > VecGetLocalSize(*x,&length); > VecGetOwnershipRange(*x,&div,NULL); > ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&iscol); > ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&isrow); > ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); > CHKERRQ(ierr); > } > } > } > else{ > ISCreateStride(PETSC_COMM_SELF,final_size,begin,1,&iscol); > ISCreateStride(PETSC_COMM_SELF,final_size,begin,1,&isrow); > ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); > CHKERRQ(ierr); > } > > > The proper implementation would instead just be the following: > VecGetLocalSize(*x,&length); > VecGetOwnershipRange(*x,&div,NULL); > ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&iscol); > ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&isrow); > ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); > CHKERRQ(ierr); > ? > > > > > > On Thu, Jun 26, 2014 at 5:32 PM, Peter Brune wrote: > >> MatGetSubMatrix() is collective on Mat and ISCreateXXX is collective on >> the provided comm, so the logic you have built to call it on one proc at a >> time is unnecessary at best and most likely incorrect and likely to produce >> strange results. You can forgo the if statement and loop over processors, >> create the ISes on the same comm as x, and then call MatGetSubMatrix() once. >> >> - Peter >> >> >> On Thu, Jun 26, 2014 at 4:26 PM, Arthur Kurlej wrote: >> >>> I cannot send the original code, but I reproduced the problem in another >>> code. I have attached a makefile the code, and the data for the x vector >>> and A matrix. >>> >>> I think the problem may be with my ShortenMatrix function, but it's not >>> clear to me what exactly is going wrong and how to fix it. So I would >>> appreciate some assistance there. >>> >>> >>> Thanks, >>> Arthur >>> >>> >>> >>> On Wed, Jun 25, 2014 at 6:24 PM, Barry Smith wrote: >>> >>>> >>>> Can you send the code that reproduces this behavior? >>>> >>>> Barry >>>> >>>> On Jun 25, 2014, at 4:37 PM, Arthur Kurlej wrote: >>>> >>>> > Hi Barry, >>>> > >>>> > So for the matrix C that I am currently testing (size 162x162), the >>>> condition number is roughly 10^4. >>>> > >>>> > For reference, I'm porting MATLAB code into PETSc, and for one >>>> processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. >>>> So I know that for one processor, my program is performing as expected. >>>> > >>>> > I've included examples below of values for b (also of size 162), >>>> ranging from indices 131 to 141. 
>>>> > >>>> > #processors=1: >>>> > 0 >>>> > 1.315217173959314e-20 >>>> > 1.315217173959314e-20 >>>> > 4.843201487740107e-17 >>>> > 4.843201487740107e-17 >>>> > 8.166104700666665e-14 >>>> > 8.166104700666665e-14 >>>> > 6.303834267553249e-11 >>>> > 6.303834267553249e-11 >>>> > 2.227932688485483e-08 >>>> > 2.227932688485483e-08 >>>> > >>>> > # processors=2: >>>> > 5.480410831461926e-22 >>>> > 2.892553944350444e-22 >>>> > 2.892553944350444e-22 >>>> > 7.524038923310717e-24 >>>> > 7.524038923214420e-24 >>>> > -3.340766769043093e-26 >>>> > -7.558372155761972e-27 >>>> > 5.551561288838557e-25 >>>> > 5.550551546879874e-25 >>>> > -1.579397982093437e-22 >>>> > 2.655766754178065e-22 >>>> > >>>> > # processors = 4: >>>> > 5.480410831461926e-22 >>>> > 2.892553944351728e-22 >>>> > 2.892553944351728e-22 >>>> > 7.524092205125593e-24 >>>> > 7.524092205125593e-24 >>>> > -2.584939414228212e-26 >>>> > -2.584939414228212e-26 >>>> > 0 >>>> > 0 >>>> > -1.245940797657998e-23 >>>> > -1.245940797657998e-23 >>>> > >>>> > # processors = 8: >>>> > 5.480410831461926e-22 >>>> > 2.892553944023035e-22 >>>> > 2.892553944023035e-22 >>>> > 7.524065744581494e-24 >>>> > 7.524065744581494e-24 >>>> > -2.250265175188197e-26 >>>> > -2.250265175188197e-26 >>>> > -6.543127892265160e-26 >>>> > 1.544288143499193e-317 >>>> > 8.788794008375919e-25 >>>> > 8.788794008375919e-25 >>>> > >>>> > >>>> > Thanks, >>>> > Arthur >>>> > >>>> > >>>> > >>>> > On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith >>>> wrote: >>>> > >>>> > How different are the values in b? Can you send back a few >>>> examples of the different b?s? Any idea of the condition number of C? >>>> > >>>> > Barry >>>> > >>>> > On Jun 25, 2014, at 3:10 PM, Arthur Kurlej wrote: >>>> > >>>> > > Hi all, >>>> > > >>>> > > While running my code, I have found that MatMult() returns >>>> different values depending on the number of processors I use (and there is >>>> quite the variance in the values). >>>> > > >>>> > > The setup of my code is as follows (I can go into more >>>> depth/background if needed): >>>> > > -Generate parallel AIJ matrix of size NxN, denoted as A >>>> > > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns >>>> from A, denoted as C >>>> > > -Generate vector of length N-1, denoted as x >>>> > > -Find C*x=b >>>> > > >>>> > > I have already checked that A, C, and x are all equivalent when ran >>>> for any number of processors, it is only the values of vector b that varies. >>>> > > >>>> > > Does anyone have an idea about what's going on? >>>> > > >>>> > > >>>> > > Thanks, >>>> > > Arthur >>>> > > >>>> > >>>> > >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From akurlej at gmail.com Thu Jun 26 21:10:01 2014 From: akurlej at gmail.com (Arthur Kurlej) Date: Thu, 26 Jun 2014 21:10:01 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> Message-ID: Hmm, so I made that change, but my program still exhibits the same problem with it's output. On Thu, Jun 26, 2014 at 9:05 PM, Peter Brune wrote: > Yep! Looks good. One other thing is that you can pre/postface all PETSc > calls with ierr = ... CHKERRQ(ierr); and it will properly trace other > problems if they arise. > > - Peter > > > On Thu, Jun 26, 2014 at 8:56 PM, Arthur Kurlej wrote: > >> Hello, >> >> I'm sorry, but this is a bit new for me, and I'm still not quite sure I >> follow. 
>> >> Are you recommending that opposed to doing this: >> if(procs!=1){ >> for(i=0;i> if(rank==i){ >> VecGetLocalSize(*x,&length); >> VecGetOwnershipRange(*x,&div,NULL); >> ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&iscol); >> ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&isrow); >> ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); >> CHKERRQ(ierr); >> } >> } >> } >> else{ >> ISCreateStride(PETSC_COMM_SELF,final_size,begin,1,&iscol); >> ISCreateStride(PETSC_COMM_SELF,final_size,begin,1,&isrow); >> ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); >> CHKERRQ(ierr); >> } >> >> >> The proper implementation would instead just be the following: >> VecGetLocalSize(*x,&length); >> VecGetOwnershipRange(*x,&div,NULL); >> ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&iscol); >> ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&isrow); >> ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); >> CHKERRQ(ierr); >> ? >> >> >> >> >> >> On Thu, Jun 26, 2014 at 5:32 PM, Peter Brune wrote: >> >>> MatGetSubMatrix() is collective on Mat and ISCreateXXX is collective on >>> the provided comm, so the logic you have built to call it on one proc at a >>> time is unnecessary at best and most likely incorrect and likely to produce >>> strange results. You can forgo the if statement and loop over processors, >>> create the ISes on the same comm as x, and then call MatGetSubMatrix() once. >>> >>> - Peter >>> >>> >>> On Thu, Jun 26, 2014 at 4:26 PM, Arthur Kurlej >>> wrote: >>> >>>> I cannot send the original code, but I reproduced the problem in >>>> another code. I have attached a makefile the code, and the data for the x >>>> vector and A matrix. >>>> >>>> I think the problem may be with my ShortenMatrix function, but it's not >>>> clear to me what exactly is going wrong and how to fix it. So I would >>>> appreciate some assistance there. >>>> >>>> >>>> Thanks, >>>> Arthur >>>> >>>> >>>> >>>> On Wed, Jun 25, 2014 at 6:24 PM, Barry Smith >>>> wrote: >>>> >>>>> >>>>> Can you send the code that reproduces this behavior? >>>>> >>>>> Barry >>>>> >>>>> On Jun 25, 2014, at 4:37 PM, Arthur Kurlej wrote: >>>>> >>>>> > Hi Barry, >>>>> > >>>>> > So for the matrix C that I am currently testing (size 162x162), the >>>>> condition number is roughly 10^4. >>>>> > >>>>> > For reference, I'm porting MATLAB code into PETSc, and for one >>>>> processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. >>>>> So I know that for one processor, my program is performing as expected. >>>>> > >>>>> > I've included examples below of values for b (also of size 162), >>>>> ranging from indices 131 to 141. 
>>>>> > >>>>> > #processors=1: >>>>> > 0 >>>>> > 1.315217173959314e-20 >>>>> > 1.315217173959314e-20 >>>>> > 4.843201487740107e-17 >>>>> > 4.843201487740107e-17 >>>>> > 8.166104700666665e-14 >>>>> > 8.166104700666665e-14 >>>>> > 6.303834267553249e-11 >>>>> > 6.303834267553249e-11 >>>>> > 2.227932688485483e-08 >>>>> > 2.227932688485483e-08 >>>>> > >>>>> > # processors=2: >>>>> > 5.480410831461926e-22 >>>>> > 2.892553944350444e-22 >>>>> > 2.892553944350444e-22 >>>>> > 7.524038923310717e-24 >>>>> > 7.524038923214420e-24 >>>>> > -3.340766769043093e-26 >>>>> > -7.558372155761972e-27 >>>>> > 5.551561288838557e-25 >>>>> > 5.550551546879874e-25 >>>>> > -1.579397982093437e-22 >>>>> > 2.655766754178065e-22 >>>>> > >>>>> > # processors = 4: >>>>> > 5.480410831461926e-22 >>>>> > 2.892553944351728e-22 >>>>> > 2.892553944351728e-22 >>>>> > 7.524092205125593e-24 >>>>> > 7.524092205125593e-24 >>>>> > -2.584939414228212e-26 >>>>> > -2.584939414228212e-26 >>>>> > 0 >>>>> > 0 >>>>> > -1.245940797657998e-23 >>>>> > -1.245940797657998e-23 >>>>> > >>>>> > # processors = 8: >>>>> > 5.480410831461926e-22 >>>>> > 2.892553944023035e-22 >>>>> > 2.892553944023035e-22 >>>>> > 7.524065744581494e-24 >>>>> > 7.524065744581494e-24 >>>>> > -2.250265175188197e-26 >>>>> > -2.250265175188197e-26 >>>>> > -6.543127892265160e-26 >>>>> > 1.544288143499193e-317 >>>>> > 8.788794008375919e-25 >>>>> > 8.788794008375919e-25 >>>>> > >>>>> > >>>>> > Thanks, >>>>> > Arthur >>>>> > >>>>> > >>>>> > >>>>> > On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith >>>>> wrote: >>>>> > >>>>> > How different are the values in b? Can you send back a few >>>>> examples of the different b?s? Any idea of the condition number of C? >>>>> > >>>>> > Barry >>>>> > >>>>> > On Jun 25, 2014, at 3:10 PM, Arthur Kurlej >>>>> wrote: >>>>> > >>>>> > > Hi all, >>>>> > > >>>>> > > While running my code, I have found that MatMult() returns >>>>> different values depending on the number of processors I use (and there is >>>>> quite the variance in the values). >>>>> > > >>>>> > > The setup of my code is as follows (I can go into more >>>>> depth/background if needed): >>>>> > > -Generate parallel AIJ matrix of size NxN, denoted as A >>>>> > > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns >>>>> from A, denoted as C >>>>> > > -Generate vector of length N-1, denoted as x >>>>> > > -Find C*x=b >>>>> > > >>>>> > > I have already checked that A, C, and x are all equivalent when >>>>> ran for any number of processors, it is only the values of vector b that >>>>> varies. >>>>> > > >>>>> > > Does anyone have an idea about what's going on? >>>>> > > >>>>> > > >>>>> > > Thanks, >>>>> > > Arthur >>>>> > > >>>>> > >>>>> > >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jun 26 21:41:53 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 26 Jun 2014 21:41:53 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> Message-ID: Using the point wise differences in the two computed vectors is not a meaningful way to check if the matrix-vector product is correct. 
I loaded the reduced matrix up in matlab (after doing a MatView(C,PETSC_VIEWER_BINARY_WORLD) in the code) and did the calculation >> A(82:162,1:81)*x(1:81) + A(82:162,82:162)*x(82:162) - A(82:162,:)*x ans = -1.333828737741757e-23 3.970466940254533e-22 3.970466940254533e-22 6.606856988583543e-20 6.606856988583543e-20 -4.878909776184770e-19 -4.878909776184770e-19 0 0 0 0 0 0 -2.628894374088577e-31 -2.628894374088577e-31 -1.709836290543913e-26 -1.709836290543913e-26 -2.136210087789533e-24 -2.205680334546916e-24 -2.584939414228211e-26 -2.584939414228211e-26 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 >> A(1:81,1:81)*x(1:81) + A(1:81,82:162)*x(82:162) - A(1:81,:)*x ans = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1.734723475976807e-18 0 0 0 1.925929944387236e-34 1.925929944387236e-34 1.972152263052530e-31 1.972152263052530e-31 3.453317498695501e-26 2.019483917365790e-28 -1.333828737741757e-23 Note that the two different ways of computing the same product (which would be identical in ?infinite precision?) gives differences on the same order as running the PETSc program on two processes. When running in parallel the order that the computations are done (and any movement of the partial sums from registers to memory) is different for different number of processes hence different results. Note also that the Intel floating point registers are 80 bits wide while double precision numbers in memory are 64 bit. Thus moving a partially computed sum to memory and then back to register wipes out 16 bits of the number. Barry On Jun 26, 2014, at 9:10 PM, Arthur Kurlej wrote: > Hmm, so I made that change, but my program still exhibits the same problem with it's output. > > > On Thu, Jun 26, 2014 at 9:05 PM, Peter Brune wrote: > Yep! Looks good. One other thing is that you can pre/postface all PETSc calls with ierr = ... CHKERRQ(ierr); and it will properly trace other problems if they arise. > > - Peter > > > On Thu, Jun 26, 2014 at 8:56 PM, Arthur Kurlej wrote: > Hello, > > I'm sorry, but this is a bit new for me, and I'm still not quite sure I follow. > > Are you recommending that opposed to doing this: > if(procs!=1){ > for(i=0;i if(rank==i){ > VecGetLocalSize(*x,&length); > VecGetOwnershipRange(*x,&div,NULL); > ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&iscol); > ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&isrow); > ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); CHKERRQ(ierr); > } > } > } > else{ > ISCreateStride(PETSC_COMM_SELF,final_size,begin,1,&iscol); > ISCreateStride(PETSC_COMM_SELF,final_size,begin,1,&isrow); > ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); CHKERRQ(ierr); > } > > > The proper implementation would instead just be the following: > VecGetLocalSize(*x,&length); > VecGetOwnershipRange(*x,&div,NULL); > ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&iscol); > ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,&isrow); > ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); CHKERRQ(ierr); > ? > > > > > > On Thu, Jun 26, 2014 at 5:32 PM, Peter Brune wrote: > MatGetSubMatrix() is collective on Mat and ISCreateXXX is collective on the provided comm, so the logic you have built to call it on one proc at a time is unnecessary at best and most likely incorrect and likely to produce strange results. 
You can forgo the if statement and loop over processors, create the ISes on the same comm as x, and then call MatGetSubMatrix() once. > > - Peter > > > On Thu, Jun 26, 2014 at 4:26 PM, Arthur Kurlej wrote: > I cannot send the original code, but I reproduced the problem in another code. I have attached a makefile the code, and the data for the x vector and A matrix. > > I think the problem may be with my ShortenMatrix function, but it's not clear to me what exactly is going wrong and how to fix it. So I would appreciate some assistance there. > > > Thanks, > Arthur > > > > On Wed, Jun 25, 2014 at 6:24 PM, Barry Smith wrote: > > Can you send the code that reproduces this behavior? > > Barry > > On Jun 25, 2014, at 4:37 PM, Arthur Kurlej wrote: > > > Hi Barry, > > > > So for the matrix C that I am currently testing (size 162x162), the condition number is roughly 10^4. > > > > For reference, I'm porting MATLAB code into PETSc, and for one processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. So I know that for one processor, my program is performing as expected. > > > > I've included examples below of values for b (also of size 162), ranging from indices 131 to 141. > > > > #processors=1: > > 0 > > 1.315217173959314e-20 > > 1.315217173959314e-20 > > 4.843201487740107e-17 > > 4.843201487740107e-17 > > 8.166104700666665e-14 > > 8.166104700666665e-14 > > 6.303834267553249e-11 > > 6.303834267553249e-11 > > 2.227932688485483e-08 > > 2.227932688485483e-08 > > > > # processors=2: > > 5.480410831461926e-22 > > 2.892553944350444e-22 > > 2.892553944350444e-22 > > 7.524038923310717e-24 > > 7.524038923214420e-24 > > -3.340766769043093e-26 > > -7.558372155761972e-27 > > 5.551561288838557e-25 > > 5.550551546879874e-25 > > -1.579397982093437e-22 > > 2.655766754178065e-22 > > > > # processors = 4: > > 5.480410831461926e-22 > > 2.892553944351728e-22 > > 2.892553944351728e-22 > > 7.524092205125593e-24 > > 7.524092205125593e-24 > > -2.584939414228212e-26 > > -2.584939414228212e-26 > > 0 > > 0 > > -1.245940797657998e-23 > > -1.245940797657998e-23 > > > > # processors = 8: > > 5.480410831461926e-22 > > 2.892553944023035e-22 > > 2.892553944023035e-22 > > 7.524065744581494e-24 > > 7.524065744581494e-24 > > -2.250265175188197e-26 > > -2.250265175188197e-26 > > -6.543127892265160e-26 > > 1.544288143499193e-317 > > 8.788794008375919e-25 > > 8.788794008375919e-25 > > > > > > Thanks, > > Arthur > > > > > > > > On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith wrote: > > > > How different are the values in b? Can you send back a few examples of the different b?s? Any idea of the condition number of C? > > > > Barry > > > > On Jun 25, 2014, at 3:10 PM, Arthur Kurlej wrote: > > > > > Hi all, > > > > > > While running my code, I have found that MatMult() returns different values depending on the number of processors I use (and there is quite the variance in the values). > > > > > > The setup of my code is as follows (I can go into more depth/background if needed): > > > -Generate parallel AIJ matrix of size NxN, denoted as A > > > -Retrieve parallel AIJ submatrix from the last N-1 rows&columns from A, denoted as C > > > -Generate vector of length N-1, denoted as x > > > -Find C*x=b > > > > > > I have already checked that A, C, and x are all equivalent when ran for any number of processors, it is only the values of vector b that varies. > > > > > > Does anyone have an idea about what's going on? 
> > > > > > > > > Thanks, > > > Arthur > > > > > > > > > > > > > From jed at jedbrown.org Thu Jun 26 22:48:09 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 26 Jun 2014 22:48:09 -0500 Subject: [petsc-users] MatMult() returning different values depending on # of processors? In-Reply-To: References: <9C2B3350-9C6D-456A-9657-78E8874607D6@mcs.anl.gov> <6879BACE-9F82-4DF9-B99B-9D52FF1B1E86@mcs.anl.gov> Message-ID: <871tubrp1i.fsf@jedbrown.org> Barry Smith writes: > Note also that the Intel floating point registers are 80 bits wide > while double precision numbers in memory are 64 bit. Thus moving a > partially computed sum to memory and then back to register wipes > out 16 bits of the number. This comment refers to the x87 unit which fell out of favor when SSE provided better performance and vectorization. It is only used now if you compile for x86 (32-bit) without SSE support. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Thu Jun 26 23:31:51 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 26 Jun 2014 21:31:51 -0700 Subject: [petsc-users] Any examples to output a dmplex mesh as a hdf5 file? In-Reply-To: References: Message-ID: On Thu, Jun 26, 2014 at 6:52 PM, Fande Kong wrote: > Hi Matt, > > Is it possible to send me some printed hdf5 and xdmf files (representing > a simple mesh) that can be visualized by Paraview? Similarly like that > printed by pylith. From these files I think I can figure out how to write a > hdf5 viewer by myself. > I can, here is a representative HDF5 file (you can generate Xdmf using bin/pythonscripts/petsc_gen_xdmf.py). Did the viewer not work for you? Thanks, Matt > Sorry for bothering you. > > Thanks, > > > On Wed, Jun 25, 2014 at 11:46 PM, Matthew Knepley > wrote: > >> On Tue, Jun 24, 2014 at 7:50 AM, Fande Kong wrote: >> >>> Hi all, >>> >>> There are some functions called DMPlex_load_hdf5 and DMPlex_view_hdf5 in >>> petsc-dev. They are really good functions for outputting the solution as a >>> hdf5 file in parallel. Are there any examples to show how to use these >>> functions? Or are there some printed hdf5 and xdmf files that can be >>> visualized by paraview? >>> >> >> This is very new code. I plan to write a manual section as soon as the >> functionality solidifies. However, here is how I am currently using it. >> Anywhere that you think about viewing something add >> >> ierr = PetscObjectViewFromOptions((PetscObject) obj, prefix, >> "-my_obj_view");CHKERRQ(ierr); >> >> Then you can use the standard option style >> >> -my_obj_view hdf5:my.h5 >> >> This extends nicely to many objects. Here is what I use for my magma >> dynamics output >> >> -dm_view hdf5:sol_solver_debug.h5 -magma_view_solution >> hdf5:sol_solver_debug.h5::append -compaction_vec_view >> hdf5:sol_solver_debug.h5:HDF5_VIZ:append >> >> There is still a problem in that you cannot choose multiple formats using >> this method. I am going to extend >> the view options format >> >> type:file:format:mode >> >> to allow >> >> type:file:format,format,format:mode >> >> to handle this. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> >>> Fande, >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sol.h5 Type: application/octet-stream Size: 246288 bytes Desc: not available URL: From quecat001 at gmail.com Fri Jun 27 12:22:31 2014 From: quecat001 at gmail.com (Que Cat) Date: Fri, 27 Jun 2014 12:22:31 -0500 Subject: [petsc-users] Data type for PetscViewer Message-ID: Dear PetscUsers, How could we specify the data type for PetscView? For example, if we have a vector and we want to output to hdf5 with the integer number ( not float like 0.000). Thanks. Que -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 27 12:34:40 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 27 Jun 2014 10:34:40 -0700 Subject: [petsc-users] Data type for PetscViewer In-Reply-To: References: Message-ID: On Fri, Jun 27, 2014 at 10:22 AM, Que Cat wrote: > Dear PetscUsers, > > How could we specify the data type for PetscView? For example, if we have > a vector and we want to output to hdf5 with the integer number ( not float > like 0.000). Thanks. > Vecs hold only PetscScalar, so you cannot set this. Thanks, Matt > Que > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lu_qin_2000 at yahoo.com Fri Jun 27 23:22:37 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Fri, 27 Jun 2014 21:22:37 -0700 Subject: [petsc-users] Combining preconditioners Message-ID: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> Hello, I would like to combine two preconditioners in PETSc linear solver. The first preconditioner is user defined, the second one is just PETSc ILU, and the residual is updated after application of each preconditioner (the multiplicative form). There are two questions: 1. Shall I use PCShellSetApply to set the user defined preconditioner, and then use PCCompositeAddPC to combine the 2 preconditioners? 2. The user defined preconditioner only applies to part of the components of the unknowns, in other words, the rank of the first preconditioner matrix is less than the rank of the full matrix. How can I let PETSc know how to update the residual after the application of the first preconditioner? Can I define a routine of residual updating for PETSc? Many thanks for your help. Best Regards, Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jun 27 23:41:45 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 27 Jun 2014 23:41:45 -0500 Subject: [petsc-users] Combining preconditioners In-Reply-To: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> References: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> Message-ID: <1C39CDF7-A7DA-4BEA-8122-A573D421A543@mcs.anl.gov> On Jun 27, 2014, at 11:22 PM, Qin Lu wrote: > Hello, > > I would like to combine two preconditioners in PETSc linear solver. The first preconditioner is user defined, the second one is just PETSc ILU, and the residual is updated after application of each preconditioner (the multiplicative form). 
There are two questions: > > 1. Shall I use PCShellSetApply to set the user defined preconditioner, and then use PCCompositeAddPC to combine the 2 preconditioners? Yes > 2. The user defined preconditioner only applies to part of the components of the unknowns, in other words, the rank of the first preconditioner matrix is less than the rank of the full matrix. How can I let PETSc know how to update the residual after the application of the first preconditioner? Can I define a routine of residual updating for PETSc? At first just use PCCOMPOSITE and let PETSc compute the residual by doing the usual complete matrix-vector product. Usually the cost of the matrix vector product is much less then a preconditioner so it is not worth optimizing. If the composed preconditioner works very well and the shell PC affects only a small percentage of the components of the problem then you can switch to PCFIELDSPLIT which does support only updating a portion of the residual. Barry > > Many thanks for your help. > > Best Regards, > Qin From lu_qin_2000 at yahoo.com Sat Jun 28 12:43:53 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Sat, 28 Jun 2014 10:43:53 -0700 Subject: [petsc-users] Combining preconditioners In-Reply-To: <1C39CDF7-A7DA-4BEA-8122-A573D421A543@mcs.anl.gov> References: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> <1C39CDF7-A7DA-4BEA-8122-A573D421A543@mcs.anl.gov> Message-ID: <1403977433.27032.YahooMailNeo@web160201.mail.bf1.yahoo.com> 1. About using?PCCOMPOSITE: I didn't state correctly in my first email. Actually, the rank of first preconditioner matrix is, say, half of the rank of the full matrix (the latter is used as the second preconditioner matrix), and the first preconditioner solves half of the unknowns (say, unknowns with odd index), how can I let PETSc know this info, so that the solution of the first preconditioner can be applied to the full matrix and update the full residual before applying the second preconditioner? In other words, does?PCCOMPOSITE require that the ranks of all preconditioner matrices be the same as the full matrix? 2. If I use?PCFIELDSPLIT, does it also need?PCCOMPOSITE to define multiple preconditioners? Thanks ?a lot, Qin ________________________________ From: Barry Smith To: Qin Lu Cc: petsc-users Sent: Friday, June 27, 2014 11:41 PM Subject: Re: [petsc-users] Combining preconditioners On Jun 27, 2014, at 11:22 PM, Qin Lu wrote: > Hello, > > I would like to combine two preconditioners in PETSc linear solver. The first preconditioner is user defined, the second one is just PETSc ILU, and the residual is updated after application of each preconditioner (the multiplicative form). There are two questions: > > 1. Shall I use PCShellSetApply to set the user defined preconditioner, and then use PCCompositeAddPC to combine the 2 preconditioners? ? Yes > 2. The user defined preconditioner only applies to part of the components of the unknowns, in other words, the rank of the first preconditioner matrix is less than the rank of the full matrix. How can I let PETSc know how to update the residual after the application of the first preconditioner? Can I define a routine of residual updating for PETSc? ? At first just use PCCOMPOSITE and let PETSc compute the residual by doing the usual complete matrix-vector product. Usually the cost of the matrix vector product is much less then a preconditioner so it is not worth optimizing. ? 
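For concreteness, a minimal sketch of that setup (the names MyShellApply() and SetupCompositePC(), and the use of VecCopy() as a placeholder action, are assumptions for illustration, not code from this thread):

#include <petscksp.h>

/* user-defined preconditioner application: y = B1 x.
   VecCopy() is only a placeholder so the sketch compiles; the real code would
   apply the user preconditioner here. */
static PetscErrorCode MyShellApply(PC pc, Vec x, Vec y)
{
  PetscErrorCode ierr;
  ierr = VecCopy(x, y);CHKERRQ(ierr);
  return 0;
}

/* attach a multiplicative composite PC (user shell first, ILU second)
   to an already-created KSP */
static PetscErrorCode SetupCompositePC(KSP ksp)
{
  PC             pc, shellpc;
  PetscErrorCode ierr;

  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCCOMPOSITE);CHKERRQ(ierr);
  ierr = PCCompositeSetType(pc, PC_COMPOSITE_MULTIPLICATIVE);CHKERRQ(ierr);
  ierr = PCCompositeAddPC(pc, PCSHELL);CHKERRQ(ierr);            /* first: user-defined PC */
  ierr = PCCompositeAddPC(pc, PCILU);CHKERRQ(ierr);              /* second: ILU            */
  ierr = PCCompositeGetPC(pc, 0, &shellpc);CHKERRQ(ierr);
  ierr = PCShellSetApply(shellpc, MyShellApply);CHKERRQ(ierr);
  return 0;
}

With PC_COMPOSITE_MULTIPLICATIVE the residual update between the two stages is done for you with the full operator.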
If the composed preconditioner works very well and the shell PC affects only a small percentage of the components of the problem then you can switch to PCFIELDSPLIT which does support only updating a portion of the residual. ? Barry > > Many thanks for your help. > > Best Regards, > Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jun 28 12:53:12 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 28 Jun 2014 12:53:12 -0500 Subject: [petsc-users] Combining preconditioners In-Reply-To: <1403977433.27032.YahooMailNeo@web160201.mail.bf1.yahoo.com> References: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> <1C39CDF7-A7DA-4BEA-8122-A573D421A543@mcs.anl.gov> <1403977433.27032.YahooMailNeo@web160201.mail.bf1.yahoo.com> Message-ID: On Jun 28, 2014, at 12:43 PM, Qin Lu wrote: > 1. About using PCCOMPOSITE: I didn't state correctly in my first email. Actually, the rank of first preconditioner matrix is, say, half of the rank of the full matrix (the latter is used as the second preconditioner matrix), and the first preconditioner solves half of the unknowns (say, unknowns with odd index), how can I let PETSc know this info, so that the solution of the first preconditioner can be applied to the full matrix and update the full residual before applying the second preconditioner? > > In other words, does PCCOMPOSITE require that the ranks of all preconditioner matrices be the same as the full matrix? No. You don?t have to tell it anything special. > > 2. If I use PCFIELDSPLIT, does it also need PCCOMPOSITE to define multiple preconditioners? No field split is a different way of handling multiple preconditioners Barry > > Thanks a lot, > Qin > > From: Barry Smith > To: Qin Lu > Cc: petsc-users > Sent: Friday, June 27, 2014 11:41 PM > Subject: Re: [petsc-users] Combining preconditioners > > > On Jun 27, 2014, at 11:22 PM, Qin Lu wrote: > > > Hello, > > > > I would like to combine two preconditioners in PETSc linear solver. The first preconditioner is user defined, the second one is just PETSc ILU, and the residual is updated after application of each preconditioner (the multiplicative form). There are two questions: > > > > 1. Shall I use PCShellSetApply to set the user defined preconditioner, and then use PCCompositeAddPC to combine the 2 preconditioners? > > Yes > > > 2. The user defined preconditioner only applies to part of the components of the unknowns, in other words, the rank of the first preconditioner matrix is less than the rank of the full matrix. How can I let PETSc know how to update the residual after the application of the first preconditioner? Can I define a routine of residual updating for PETSc? > > At first just use PCCOMPOSITE and let PETSc compute the residual by doing the usual complete matrix-vector product. Usually the cost of the matrix vector product is much less then a preconditioner so it is not worth optimizing. > > If the composed preconditioner works very well and the shell PC affects only a small percentage of the components of the problem then you can switch to PCFIELDSPLIT which does support only updating a portion of the residual. > > Barry > > > > > > Many thanks for your help. 
> > > > Best Regards, > > Qin > > From lu_qin_2000 at yahoo.com Sat Jun 28 14:13:29 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Sat, 28 Jun 2014 12:13:29 -0700 Subject: [petsc-users] Combining preconditioners In-Reply-To: References: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> <1C39CDF7-A7DA-4BEA-8122-A573D421A543@mcs.anl.gov> <1403977433.27032.YahooMailNeo@web160201.mail.bf1.yahoo.com> Message-ID: <1403982809.7932.YahooMailNeo@web160204.mail.bf1.yahoo.com> About 1, I don't get it. How does PETSc know what unknowns the first preconditioner solves? i.e., how does PETSc know the first preconditioner solves the unknowns with odd index rather than with even index? This info is necessary for updating the full residual, probably through a restriction (mapping) matrix/vector? Thanks, Qin ________________________________ From: Barry Smith To: Qin Lu Cc: petsc-users Sent: Saturday, June 28, 2014 12:53 PM Subject: Re: [petsc-users] Combining preconditioners On Jun 28, 2014, at 12:43 PM, Qin Lu wrote: > 1. About using PCCOMPOSITE: I didn't state correctly in my first email. Actually, the rank of first preconditioner matrix is, say, half of the rank of the full matrix (the latter is used as the second preconditioner matrix), and the first preconditioner solves half of the unknowns (say, unknowns with odd index), how can I let PETSc know this info, so that the solution of the first preconditioner can be applied to the full matrix and update the full residual before applying the second preconditioner? > > In other words, does PCCOMPOSITE require that the ranks of all preconditioner matrices be the same as the full matrix? ? No. You don?t have to tell it anything special. > > 2. If I use PCFIELDSPLIT, does it also need PCCOMPOSITE to define multiple preconditioners? ? No field split is a different way of handling multiple preconditioners ? Barry > > Thanks? a lot, > Qin > > From: Barry Smith > To: Qin Lu > Cc: petsc-users > Sent: Friday, June 27, 2014 11:41 PM > Subject: Re: [petsc-users] Combining preconditioners > > > On Jun 27, 2014, at 11:22 PM, Qin Lu wrote: > > > Hello, > > > > I would like to combine two preconditioners in PETSc linear solver. The first preconditioner is user defined, the second one is just PETSc ILU, and the residual is updated after application of each preconditioner (the multiplicative form). There are two questions: > > > > 1. Shall I use PCShellSetApply to set the user defined preconditioner, and then use PCCompositeAddPC to combine the 2 preconditioners? > >? Yes > > > 2. The user defined preconditioner only applies to part of the components of the unknowns, in other words, the rank of the first preconditioner matrix is less than the rank of the full matrix. How can I let PETSc know how to update the residual after the application of the first preconditioner? Can I define a routine of residual updating for PETSc? > >? At first just use PCCOMPOSITE and let PETSc compute the residual by doing the usual complete matrix-vector product. Usually the cost of the matrix vector product is much less then a preconditioner so it is not worth optimizing. > >? If the composed preconditioner works very well and the shell PC affects only a small percentage of the components of the problem then you can switch to PCFIELDSPLIT which does support only updating a portion of the residual. > >? Barry > > > > > > Many thanks for your help. > > > > Best Regards, > > Qin > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Sat Jun 28 14:44:59 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 28 Jun 2014 14:44:59 -0500 Subject: [petsc-users] Combining preconditioners In-Reply-To: <1403982809.7932.YahooMailNeo@web160204.mail.bf1.yahoo.com> References: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> <1C39CDF7-A7DA-4BEA-8122-A573D421A543@mcs.anl.gov> <1403977433.27032.YahooMailNeo@web160201.mail.bf1.yahoo.com> <1403982809.7932.YahooMailNeo@web160204.mail.bf1.yahoo.com> Message-ID: <26D1B77C-F705-4E4A-A892-E1EABF3A1EBE@mcs.anl.gov> With composite it updates the entire residual. Yes, a portion of the computations just make a zero update but as I said before part of a matrix vector product is generally much cheaper than a full preconditioner. The field split preconditioner has code to update just portions of the residual, but don?t use it until you see that your preconditioner has very good convergence properties. Barry On Jun 28, 2014, at 2:13 PM, Qin Lu wrote: > About 1, I don't get it. How does PETSc know what unknowns the first preconditioner solves? i.e., how does PETSc know the first preconditioner solves the unknowns with odd index rather than with even index? This info is necessary for updating the full residual, probably through a restriction (mapping) matrix/vector? > > Thanks, > Qin > > From: Barry Smith > To: Qin Lu > Cc: petsc-users > Sent: Saturday, June 28, 2014 12:53 PM > Subject: Re: [petsc-users] Combining preconditioners > > > On Jun 28, 2014, at 12:43 PM, Qin Lu wrote: > > > 1. About using PCCOMPOSITE: I didn't state correctly in my first email. Actually, the rank of first preconditioner matrix is, say, half of the rank of the full matrix (the latter is used as the second preconditioner matrix), and the first preconditioner solves half of the unknowns (say, unknowns with odd index), how can I let PETSc know this info, so that the solution of the first preconditioner can be applied to the full matrix and update the full residual before applying the second preconditioner? > > > > In other words, does PCCOMPOSITE require that the ranks of all preconditioner matrices be the same as the full matrix? > > No. You don?t have to tell it anything special. > > > > 2. If I use PCFIELDSPLIT, does it also need PCCOMPOSITE to define multiple preconditioners? > > No field split is a different way of handling multiple preconditioners > > Barry > > > > > > > > Thanks a lot, > > Qin > > > > From: Barry Smith > > To: Qin Lu > > Cc: petsc-users > > Sent: Friday, June 27, 2014 11:41 PM > > Subject: Re: [petsc-users] Combining preconditioners > > > > > > On Jun 27, 2014, at 11:22 PM, Qin Lu wrote: > > > > > Hello, > > > > > > I would like to combine two preconditioners in PETSc linear solver. The first preconditioner is user defined, the second one is just PETSc ILU, and the residual is updated after application of each preconditioner (the multiplicative form). There are two questions: > > > > > > 1. Shall I use PCShellSetApply to set the user defined preconditioner, and then use PCCompositeAddPC to combine the 2 preconditioners? > > > > Yes > > > > > 2. The user defined preconditioner only applies to part of the components of the unknowns, in other words, the rank of the first preconditioner matrix is less than the rank of the full matrix. How can I let PETSc know how to update the residual after the application of the first preconditioner? Can I define a routine of residual updating for PETSc? 
> > > > At first just use PCCOMPOSITE and let PETSc compute the residual by doing the usual complete matrix-vector product. Usually the cost of the matrix vector product is much less then a preconditioner so it is not worth optimizing. > > > > If the composed preconditioner works very well and the shell PC affects only a small percentage of the components of the problem then you can switch to PCFIELDSPLIT which does support only updating a portion of the residual. > > > > Barry > > > > > > > > > > Many thanks for your help. > > > > > > Best Regards, > > > Qin > > > > > > From lu_qin_2000 at yahoo.com Sat Jun 28 18:32:13 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Sat, 28 Jun 2014 16:32:13 -0700 Subject: [petsc-users] Combining preconditioners In-Reply-To: <26D1B77C-F705-4E4A-A892-E1EABF3A1EBE@mcs.anl.gov> References: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> <1C39CDF7-A7DA-4BEA-8122-A573D421A543@mcs.anl.gov> <1403977433.27032.YahooMailNeo@web160201.mail.bf1.yahoo.com> <1403982809.7932.YahooMailNeo@web160204.mail.bf1.yahoo.com> <26D1B77C-F705-4E4A-A892-E1EABF3A1EBE@mcs.anl.gov> Message-ID: <1403998333.87655.YahooMailNeo@web160204.mail.bf1.yahoo.com> I am not sure if I understand you correctly. See page 83 of PETSc manual (revision 3.4). B1 and B2 are two preconditioner matrices. The full residual is updated after application of the first preconditioner: y = B1x w1 = x - Ay In my case, say A is 200 by 200, but B1 is 100 by 100, so y is 1 by 100. It seems to me we need a prolongation matrix P to make Ay possible, that is, w1 = x - APy. If this is the case, how can I pass P to PETSc? Thanks, Qin ________________________________ From: Barry Smith To: Qin Lu Cc: petsc-users Sent: Saturday, June 28, 2014 2:44 PM Subject: Re: [petsc-users] Combining preconditioners ? With composite it updates the entire residual. Yes, a portion of the computations just make a zero update but as I said before part of a matrix vector product is generally much cheaper than a full preconditioner. ? The field split preconditioner has code to update just portions of the residual, but don?t use it until you see that your preconditioner has very good convergence properties. ? Barry On Jun 28, 2014, at 2:13 PM, Qin Lu wrote: > About 1, I don't get it. How does PETSc know what unknowns the first preconditioner solves? i.e., how does PETSc know the first preconditioner solves the unknowns with odd index rather than with even index? This info is necessary for updating the full residual, probably through a restriction (mapping) matrix/vector? > > Thanks, > Qin > > From: Barry Smith > To: Qin Lu > Cc: petsc-users > Sent: Saturday, June 28, 2014 12:53 PM > Subject: Re: [petsc-users] Combining preconditioners > > > On Jun 28, 2014, at 12:43 PM, Qin Lu wrote: > > > 1. About using PCCOMPOSITE: I didn't state correctly in my first email. Actually, the rank of first preconditioner matrix is, say, half of the rank of the full matrix (the latter is used as the second preconditioner matrix), and the first preconditioner solves half of the unknowns (say, unknowns with odd index), how can I let PETSc know this info, so that the solution of the first preconditioner can be applied to the full matrix and update the full residual before applying the second preconditioner? > > > > In other words, does PCCOMPOSITE require that the ranks of all preconditioner matrices be the same as the full matrix? > >? No. You don?t have to tell it anything special. > > > > 2. 
If I use PCFIELDSPLIT, does it also need PCCOMPOSITE to define multiple preconditioners? > >? No field split is a different way of handling multiple preconditioners > >? Barry > > > > > > > > Thanks? a lot, > > Qin > > > > From: Barry Smith > > To: Qin Lu > > Cc: petsc-users > > Sent: Friday, June 27, 2014 11:41 PM > > Subject: Re: [petsc-users] Combining preconditioners > > > > > > On Jun 27, 2014, at 11:22 PM, Qin Lu wrote: > > > > > Hello, > > > > > > I would like to combine two preconditioners in PETSc linear solver. The first preconditioner is user defined, the second one is just PETSc ILU, and the residual is updated after application of each preconditioner (the multiplicative form). There are two questions: > > > > > > 1. Shall I use PCShellSetApply to set the user defined preconditioner, and then use PCCompositeAddPC to combine the 2 preconditioners? > > > >? Yes > > > > > 2. The user defined preconditioner only applies to part of the components of the unknowns, in other words, the rank of the first preconditioner matrix is less than the rank of the full matrix. How can I let PETSc know how to update the residual after the application of the first preconditioner? Can I define a routine of residual updating for PETSc? > > > >? At first just use PCCOMPOSITE and let PETSc compute the residual by doing the usual complete matrix-vector product. Usually the cost of the matrix vector product is much less then a preconditioner so it is not worth optimizing. > > > >? If the composed preconditioner works very well and the shell PC affects only a small percentage of the components of the problem then you can switch to PCFIELDSPLIT which does support only updating a portion of the residual. > > > >? Barry > > > > > > > > > > Many thanks for your help. > > > > > > Best Regards, > > > Qin > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jun 28 19:07:50 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 28 Jun 2014 19:07:50 -0500 Subject: [petsc-users] Combining preconditioners In-Reply-To: <1403998333.87655.YahooMailNeo@web160204.mail.bf1.yahoo.com> References: <1403929357.93477.YahooMailNeo@web160206.mail.bf1.yahoo.com> <1C39CDF7-A7DA-4BEA-8122-A573D421A543@mcs.anl.gov> <1403977433.27032.YahooMailNeo@web160201.mail.bf1.yahoo.com> <1403982809.7932.YahooMailNeo@web160204.mail.bf1.yahoo.com> <26D1B77C-F705-4E4A-A892-E1EABF3A1EBE@mcs.anl.gov> <1403998333.87655.YahooMailNeo@web160204.mail.bf1.yahoo.com> Message-ID: Ok, I see what you are getting at: It is part of your PCShell?s responsibility to get any subset of the variables you are applying your ?preconditioner" (restriction) to and then prolongation after the application your ?preconditioner?. You can use the PCGALERKIN if want to handle the ?restriction? and ?prolongation? for you. Hence you would still use PCCOMPOSITE but the first PC inside it would be a PCGALERKIN (the second ILU) then inside the PCGALERKIN would be whatever your ?smaller? preconditioner is. Barry On Jun 28, 2014, at 6:32 PM, Qin Lu wrote: > I am not sure if I understand you correctly. See page 83 of PETSc manual (revision 3.4). B1 and B2 are two preconditioner matrices. The full residual is updated after application of the first preconditioner: > > y = B1x > w1 = x - Ay > > In my case, say A is 200 by 200, but B1 is 100 by 100, so y is 1 by 100. It seems to me we need a prolongation matrix P to make Ay possible, that is, w1 = x - APy. If this is the case, how can I pass P to PETSc? 
> > Thanks, > Qin > > From: Barry Smith > To: Qin Lu > Cc: petsc-users > Sent: Saturday, June 28, 2014 2:44 PM > Subject: Re: [petsc-users] Combining preconditioners > > > With composite it updates the entire residual. Yes, a portion of the computations just make a zero update but as I said before part of a matrix vector product is generally much cheaper than a full preconditioner. > > The field split preconditioner has code to update just portions of the residual, but don?t use it until you see that your preconditioner has very good convergence properties. > > Barry > > > > On Jun 28, 2014, at 2:13 PM, Qin Lu wrote: > > > About 1, I don't get it. How does PETSc know what unknowns the first preconditioner solves? i.e., how does PETSc know the first preconditioner solves the unknowns with odd index rather than with even index? This info is necessary for updating the full residual, probably through a restriction (mapping) matrix/vector? > > > > Thanks, > > Qin > > > > From: Barry Smith > > To: Qin Lu > > Cc: petsc-users > > Sent: Saturday, June 28, 2014 12:53 PM > > Subject: Re: [petsc-users] Combining preconditioners > > > > > > On Jun 28, 2014, at 12:43 PM, Qin Lu wrote: > > > > > 1. About using PCCOMPOSITE: I didn't state correctly in my first email. Actually, the rank of first preconditioner matrix is, say, half of the rank of the full matrix (the latter is used as the second preconditioner matrix), and the first preconditioner solves half of the unknowns (say, unknowns with odd index), how can I let PETSc know this info, so that the solution of the first preconditioner can be applied to the full matrix and update the full residual before applying the second preconditioner? > > > > > > In other words, does PCCOMPOSITE require that the ranks of all preconditioner matrices be the same as the full matrix? > > > > No. You don?t have to tell it anything special. > > > > > > 2. If I use PCFIELDSPLIT, does it also need PCCOMPOSITE to define multiple preconditioners? > > > > No field split is a different way of handling multiple preconditioners > > > > Barry > > > > > > > > > > > > > > Thanks a lot, > > > Qin > > > > > > From: Barry Smith > > > To: Qin Lu > > > Cc: petsc-users > > > Sent: Friday, June 27, 2014 11:41 PM > > > Subject: Re: [petsc-users] Combining preconditioners > > > > > > > > > On Jun 27, 2014, at 11:22 PM, Qin Lu wrote: > > > > > > > Hello, > > > > > > > > I would like to combine two preconditioners in PETSc linear solver. The first preconditioner is user defined, the second one is just PETSc ILU, and the residual is updated after application of each preconditioner (the multiplicative form). There are two questions: > > > > > > > > 1. Shall I use PCShellSetApply to set the user defined preconditioner, and then use PCCompositeAddPC to combine the 2 preconditioners? > > > > > > Yes > > > > > > > 2. The user defined preconditioner only applies to part of the components of the unknowns, in other words, the rank of the first preconditioner matrix is less than the rank of the full matrix. How can I let PETSc know how to update the residual after the application of the first preconditioner? Can I define a routine of residual updating for PETSc? > > > > > > At first just use PCCOMPOSITE and let PETSc compute the residual by doing the usual complete matrix-vector product. Usually the cost of the matrix vector product is much less then a preconditioner so it is not worth optimizing. 
> > > > > > If the composed preconditioner works very well and the shell PC affects only a small percentage of the components of the problem then you can switch to PCFIELDSPLIT which does support only updating a portion of the residual. > > > > > > Barry > > > > > > > > > > > > > > Many thanks for your help. > > > > > > > > Best Regards, > > > > Qin > > > > > > > > > > > > From stali at geology.wisc.edu Sun Jun 29 11:28:52 2014 From: stali at geology.wisc.edu (Tabrez Ali) Date: Sun, 29 Jun 2014 11:28:52 -0500 Subject: [petsc-users] Valgrind (invalid read) error with ASM in petsc-dev Message-ID: <53B03EC4.3020504@geology.wisc.edu> Hello I get the following Valgrind error (below) while using ASM in petsc-dev. No such error occurs with petsc-3.4.x. Any ideas as to what could be wrong? Thanks in advance. Tabrez ==22745== Memcheck, a memory error detector ==22745== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. ==22745== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info ==22745== Command: ./defmod -f examples/two_quads_qs.inp -log_summary ==22745== Reading input ... Reading mesh data ... Forming [K] ... Forming RHS ... Setting up solver ... Solving ... ==22745== Invalid read of size 4 ==22745== at 0x4231567: ISStrideGetInfo (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x678C5DF: ??? ==22745== Address 0x678c3ec is 0 bytes after a block of size 12 alloc'd ==22745== at 0x40278A4: memalign (vg_replace_malloc.c:694) ==22745== by 0x40FBE89: PetscMallocAlign (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x42288FD: ISCreate_General (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x4221494: ISSetType (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x4227745: ISCreateGeneral (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x45F104F: MatIncreaseOverlap_SeqAIJ (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x473B4F2: MatIncreaseOverlap (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x4A9CFC0: PCSetUp_ASM (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x49BF314: PCSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x4ABAB25: KSPSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x4ABC8A8: KSPSolve (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x4AE2DE0: kspsolve_ (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== Recovering stress ... Cleaning up ... Finished ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./defmod on a arch-linux2-c-debug named i5 with 1 processor, by stali Sun Jun 29 11:02:25 2014 Using Petsc Development GIT revision: v3.4.4-4963-g821251a GIT Date: 2014-06-28 12:39:08 -0500 Max Max/Min Avg Total Time (sec): 2.261e+00 1.00000 2.261e+00 Objects: 3.100e+01 1.00000 3.100e+01 Flops: 1.207e+03 1.00000 1.207e+03 1.207e+03 Flops/sec: 5.339e+02 1.00000 5.339e+02 5.339e+02 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.2451e+00 99.3% 1.2070e+03 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. 
# # # ########################################################## Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKer 1 1.0 7.3960e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ThreadCommBarrie 1 1.0 8.2803e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 1 1.0 1.0480e-02 1.0 2.30e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 0 VecNorm 2 1.0 1.2341e-02 1.0 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 0 VecScale 2 1.0 1.2405e-02 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 0 VecCopy 1 1.0 9.0401e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 10 1.0 4.3447e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 1.1247e-02 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 0 VecMAXPY 2 1.0 6.3708e-03 1.0 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 0 VecAssemblyBegin 1 1.0 2.5654e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAssemblyEnd 1 1.0 9.2661e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 5 1.0 8.8482e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2 1.0 3.3422e-02 1.0 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 6 0 0 0 1 6 0 0 0 0 MatMult 1 1.0 1.5871e-02 1.0 2.12e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 1 18 0 0 0 0 MatSolve 2 1.0 3.1852e-02 1.0 4.24e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 35 0 0 0 1 35 0 0 0 0 MatLUFactorNum 1 1.0 9.2084e-02 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 4 33 0 0 0 4 33 0 0 0 0 MatILUFactorSym 1 1.0 5.1554e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatAssemblyBegin 2 1.0 6.5899e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 2.8979e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 1 1.0 2.5041e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 2.8049e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetOrdering 1 1.0 5.0988e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatIncreaseOvrlp 1 1.0 3.7770e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 KSPGMRESOrthog 1 1.0 2.9687e-02 1.0 4.70e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 0 KSPSetUp 2 1.0 1.8465e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 KSPSolve 1 1.0 7.4545e-01 1.0 1.21e+03 1.0 0.0e+00 0.0e+00 0.0e+00 33100 0 0 0 33100 0 0 0 0 PCSetUp 2 1.0 4.4820e-01 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 20 33 0 0 0 20 33 0 0 0 0 PCSetUpOnBlocks 1 1.0 2.3395e-01 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 10 33 0 0 0 10 33 0 0 0 0 PCApply 2 1.0 7.8817e-02 1.0 4.24e+02 1.0 0.0e+00 0.0e+00 0.0e+00 3 35 0 0 0 4 35 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Vector 11 11 10840 0 Vector Scatter 2 2 784 0 Matrix 3 3 10120 0 Index Set 10 10 4872 0 Krylov Solver 2 2 18316 0 Preconditioner 2 2 1204 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0.000123882 #PETSc Option Table entries: -f examples/two_quads_qs.inp -log_summary #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-fc=gfortran --with-cc=gcc --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS=-O2 --FOPTFLAGS=-O2 --with-shared-libraries ----------------------------------------- Libraries compiled on Sun Jun 29 08:08:16 2014 on i5 Machine characteristics: Linux-3.2.0-4-686-pae-i686-with-debian-7.5 Using PETSc directory: /home/stali/petsc-dev Using PETSc arch: arch-linux2-c-debug ----------------------------------------- Using C compiler: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O2 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O2 ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/stali/petsc-dev/arch-linux2-c-debug/include -I/home/stali/petsc-dev/include -I/home/stali/petsc-dev/include -I/home/stali/petsc-dev/arch-linux2-c-debug/include ----------------------------------------- Using C linker: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpicc Using Fortran linker: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpif90 Using libraries: -Wl,-rpath,/home/stali/petsc-dev/arch-linux2-c-debug/lib -L/home/stali/petsc-dev/arch-linux2-c-debug/lib -lpetsc -llapack -lblas -Wl,-rpath,/home/stali/petsc-dev/arch-linux2-c-debug/lib -L/home/stali/petsc-dev/arch-linux2-c-debug/lib -lmetis -lX11 -lpthread -lssl -lcrypto -lm -Wl,-rpath,/usr/lib/gcc/i486-linux-gnu/4.7 -L/usr/lib/gcc/i486-linux-gnu/4.7 -Wl,-rpath,/usr/lib/i386-linux-gnu -L/usr/lib/i386-linux-gnu -Wl,-rpath,/lib/i386-linux-gnu -L/lib/i386-linux-gnu -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl ----------------------------------------- ==22745== ==22745== HEAP SUMMARY: ==22745== in use at exit: 0 bytes in 0 blocks ==22745== total heap usage: 3,723 allocs, 3,723 frees, 282,348 bytes allocated ==22745== ==22745== All heap blocks were freed -- no leaks are possible ==22745== ==22745== For counts of detected and suppressed errors, rerun with: -v ==22745== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 61 from 8) From bsmith at mcs.anl.gov Sun Jun 29 12:41:23 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 29 Jun 2014 12:41:23 -0500 Subject: [petsc-users] Valgrind (invalid read) error with ASM in petsc-dev In-Reply-To: <53B03EC4.3020504@geology.wisc.edu> References: <53B03EC4.3020504@geology.wisc.edu> Message-ID: Hmm, what code is calling the ISStrideGetInfo(). It is a ISGENERAL so ISStrideGetInfo() shouldn?t get called on it. Barry ==22745== Invalid read of size 4 ==22745== at 0x4231567: ISStrideGetInfo (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) ==22745== by 0x678C5DF: ??? 
On Jun 29, 2014, at 11:28 AM, Tabrez Ali wrote: > Hello > > I get the following Valgrind error (below) while using ASM in petsc-dev. No such error occurs with petsc-3.4.x. Any ideas as to what could be wrong? > > Thanks in advance. > > Tabrez > > > ==22745== Memcheck, a memory error detector > ==22745== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. > ==22745== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info > ==22745== Command: ./defmod -f examples/two_quads_qs.inp -log_summary > ==22745== > Reading input ... > Reading mesh data ... > Forming [K] ... > Forming RHS ... > Setting up solver ... > Solving ... > ==22745== Invalid read of size 4 > ==22745== at 0x4231567: ISStrideGetInfo (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x678C5DF: ??? > ==22745== Address 0x678c3ec is 0 bytes after a block of size 12 alloc'd > ==22745== at 0x40278A4: memalign (vg_replace_malloc.c:694) > ==22745== by 0x40FBE89: PetscMallocAlign (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x42288FD: ISCreate_General (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x4221494: ISSetType (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x4227745: ISCreateGeneral (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x45F104F: MatIncreaseOverlap_SeqAIJ (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x473B4F2: MatIncreaseOverlap (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x4A9CFC0: PCSetUp_ASM (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x49BF314: PCSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x4ABAB25: KSPSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x4ABC8A8: KSPSolve (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x4AE2DE0: kspsolve_ (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== > Recovering stress ... > Cleaning up ... > Finished > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > ./defmod on a arch-linux2-c-debug named i5 with 1 processor, by stali Sun Jun 29 11:02:25 2014 > Using Petsc Development GIT revision: v3.4.4-4963-g821251a GIT Date: 2014-06-28 12:39:08 -0500 > > Max Max/Min Avg Total > Time (sec): 2.261e+00 1.00000 2.261e+00 > Objects: 3.100e+01 1.00000 3.100e+01 > Flops: 1.207e+03 1.00000 1.207e+03 1.207e+03 > Flops/sec: 5.339e+02 1.00000 5.339e+02 5.339e+02 > MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 > MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 > MPI Reductions: 0.000e+00 0.00000 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 2.2451e+00 99.3% 1.2070e+03 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option, # > # To get timing results run ./configure # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. 
# > # # > ########################################################## > > > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > ThreadCommRunKer 1 1.0 7.3960e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ThreadCommBarrie 1 1.0 8.2803e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecMDot 1 1.0 1.0480e-02 1.0 2.30e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 0 > VecNorm 2 1.0 1.2341e-02 1.0 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 0 > VecScale 2 1.0 1.2405e-02 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 0 > VecCopy 1 1.0 9.0401e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 10 1.0 4.3447e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 1 1.0 1.1247e-02 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 0 > VecMAXPY 2 1.0 6.3708e-03 1.0 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 0 > VecAssemblyBegin 1 1.0 2.5654e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecAssemblyEnd 1 1.0 9.2661e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 5 1.0 8.8482e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 2 1.0 3.3422e-02 1.0 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 6 0 0 0 1 6 0 0 0 0 > MatMult 1 1.0 1.5871e-02 1.0 2.12e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 1 18 0 0 0 0 > MatSolve 2 1.0 3.1852e-02 1.0 4.24e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 35 0 0 0 1 35 0 0 0 0 > MatLUFactorNum 1 1.0 9.2084e-02 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 4 33 0 0 0 4 33 0 0 0 0 > MatILUFactorSym 1 1.0 5.1554e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > MatAssemblyBegin 2 1.0 6.5899e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 2 1.0 2.8979e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > MatGetRowIJ 1 1.0 2.5041e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrice 1 1.0 2.8049e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > MatGetOrdering 1 1.0 5.0988e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > MatIncreaseOvrlp 1 1.0 3.7770e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > KSPGMRESOrthog 1 1.0 2.9687e-02 1.0 4.70e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 0 > KSPSetUp 2 1.0 1.8465e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > KSPSolve 1 1.0 7.4545e-01 1.0 1.21e+03 1.0 0.0e+00 0.0e+00 0.0e+00 33100 0 0 0 33100 0 0 0 0 > PCSetUp 2 1.0 4.4820e-01 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 20 33 0 0 0 20 33 0 0 0 0 > PCSetUpOnBlocks 1 1.0 2.3395e-01 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 10 33 0 0 0 10 33 0 0 0 0 > PCApply 2 1.0 7.8817e-02 1.0 4.24e+02 1.0 0.0e+00 0.0e+00 0.0e+00 3 35 0 0 0 4 35 0 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. 
> > --- Event Stage 0: Main Stage > > Vector 11 11 10840 0 > Vector Scatter 2 2 784 0 > Matrix 3 3 10120 0 > Index Set 10 10 4872 0 > Krylov Solver 2 2 18316 0 > Preconditioner 2 2 1204 0 > Viewer 1 0 0 0 > ======================================================================================================================== > Average time to get PetscTime(): 0.000123882 > #PETSc Option Table entries: > -f examples/two_quads_qs.inp > -log_summary > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --with-fc=gfortran --with-cc=gcc --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS=-O2 --FOPTFLAGS=-O2 --with-shared-libraries > ----------------------------------------- > Libraries compiled on Sun Jun 29 08:08:16 2014 on i5 > Machine characteristics: Linux-3.2.0-4-686-pae-i686-with-debian-7.5 > Using PETSc directory: /home/stali/petsc-dev > Using PETSc arch: arch-linux2-c-debug > ----------------------------------------- > > Using C compiler: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O2 ${COPTFLAGS} ${CFLAGS} > Using Fortran compiler: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O2 ${FOPTFLAGS} ${FFLAGS} > ----------------------------------------- > > Using include paths: -I/home/stali/petsc-dev/arch-linux2-c-debug/include -I/home/stali/petsc-dev/include -I/home/stali/petsc-dev/include -I/home/stali/petsc-dev/arch-linux2-c-debug/include > ----------------------------------------- > > Using C linker: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpicc > Using Fortran linker: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpif90 > Using libraries: -Wl,-rpath,/home/stali/petsc-dev/arch-linux2-c-debug/lib -L/home/stali/petsc-dev/arch-linux2-c-debug/lib -lpetsc -llapack -lblas -Wl,-rpath,/home/stali/petsc-dev/arch-linux2-c-debug/lib -L/home/stali/petsc-dev/arch-linux2-c-debug/lib -lmetis -lX11 -lpthread -lssl -lcrypto -lm -Wl,-rpath,/usr/lib/gcc/i486-linux-gnu/4.7 -L/usr/lib/gcc/i486-linux-gnu/4.7 -Wl,-rpath,/usr/lib/i386-linux-gnu -L/usr/lib/i386-linux-gnu -Wl,-rpath,/lib/i386-linux-gnu -L/lib/i386-linux-gnu -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > ----------------------------------------- > > ==22745== > ==22745== HEAP SUMMARY: > ==22745== in use at exit: 0 bytes in 0 blocks > ==22745== total heap usage: 3,723 allocs, 3,723 frees, 282,348 bytes allocated > ==22745== > ==22745== All heap blocks were freed -- no leaks are possible > ==22745== > ==22745== For counts of detected and suppressed errors, rerun with: -v > ==22745== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 61 from 8) > From bsmith at mcs.anl.gov Sun Jun 29 12:56:57 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 29 Jun 2014 12:56:57 -0500 Subject: [petsc-users] Valgrind (invalid read) error with ASM in petsc-dev In-Reply-To: References: <53B03EC4.3020504@geology.wisc.edu> Message-ID: <99A4078F-D1F1-439C-AD92-97B05BD9B180@mcs.anl.gov> This is our error. It is harmless, I am fixing it in master now. Barry On Jun 29, 2014, at 12:41 PM, Barry Smith wrote: > > Hmm, what code is calling the ISStrideGetInfo(). 
It is a ISGENERAL so ISStrideGetInfo() shouldn?t get called on it. > > Barry > > ==22745== Invalid read of size 4 > ==22745== at 0x4231567: ISStrideGetInfo (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) > ==22745== by 0x678C5DF: ??? > > > On Jun 29, 2014, at 11:28 AM, Tabrez Ali wrote: > >> Hello >> >> I get the following Valgrind error (below) while using ASM in petsc-dev. No such error occurs with petsc-3.4.x. Any ideas as to what could be wrong? >> >> Thanks in advance. >> >> Tabrez >> >> >> ==22745== Memcheck, a memory error detector >> ==22745== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. >> ==22745== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info >> ==22745== Command: ./defmod -f examples/two_quads_qs.inp -log_summary >> ==22745== >> Reading input ... >> Reading mesh data ... >> Forming [K] ... >> Forming RHS ... >> Setting up solver ... >> Solving ... >> ==22745== Invalid read of size 4 >> ==22745== at 0x4231567: ISStrideGetInfo (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x678C5DF: ??? >> ==22745== Address 0x678c3ec is 0 bytes after a block of size 12 alloc'd >> ==22745== at 0x40278A4: memalign (vg_replace_malloc.c:694) >> ==22745== by 0x40FBE89: PetscMallocAlign (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x42288FD: ISCreate_General (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x4221494: ISSetType (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x4227745: ISCreateGeneral (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x45F104F: MatIncreaseOverlap_SeqAIJ (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x473B4F2: MatIncreaseOverlap (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x4A9CFC0: PCSetUp_ASM (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x49BF314: PCSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x4ABAB25: KSPSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x4ABC8A8: KSPSolve (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== by 0x4AE2DE0: kspsolve_ (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.05.0) >> ==22745== >> Recovering stress ... >> Cleaning up ... >> Finished >> ************************************************************************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >> >> ./defmod on a arch-linux2-c-debug named i5 with 1 processor, by stali Sun Jun 29 11:02:25 2014 >> Using Petsc Development GIT revision: v3.4.4-4963-g821251a GIT Date: 2014-06-28 12:39:08 -0500 >> >> Max Max/Min Avg Total >> Time (sec): 2.261e+00 1.00000 2.261e+00 >> Objects: 3.100e+01 1.00000 3.100e+01 >> Flops: 1.207e+03 1.00000 1.207e+03 1.207e+03 >> Flops/sec: 5.339e+02 1.00000 5.339e+02 5.339e+02 >> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 >> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 >> MPI Reductions: 0.000e+00 0.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flops >> and VecAXPY() for complex vectors of length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts %Total Avg %Total counts %Total >> 0: Main Stage: 2.2451e+00 99.3% 1.2070e+03 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> Avg. len: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was compiled with a debugging option, # >> # To get timing results run ./configure # >> # using --with-debugging=no, the performance will # >> # be generally two or three times faster. 
# >> # # >> ########################################################## >> >> >> Event Count Time (sec) Flops --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> ThreadCommRunKer 1 1.0 7.3960e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ThreadCommBarrie 1 1.0 8.2803e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecMDot 1 1.0 1.0480e-02 1.0 2.30e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 0 >> VecNorm 2 1.0 1.2341e-02 1.0 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 0 >> VecScale 2 1.0 1.2405e-02 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 0 >> VecCopy 1 1.0 9.0401e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 10 1.0 4.3447e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 1 1.0 1.1247e-02 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 0 >> VecMAXPY 2 1.0 6.3708e-03 1.0 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 0 >> VecAssemblyBegin 1 1.0 2.5654e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> VecAssemblyEnd 1 1.0 9.2661e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecScatterBegin 5 1.0 8.8482e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecNormalize 2 1.0 3.3422e-02 1.0 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 6 0 0 0 1 6 0 0 0 0 >> MatMult 1 1.0 1.5871e-02 1.0 2.12e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 1 18 0 0 0 0 >> MatSolve 2 1.0 3.1852e-02 1.0 4.24e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 35 0 0 0 1 35 0 0 0 0 >> MatLUFactorNum 1 1.0 9.2084e-02 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 4 33 0 0 0 4 33 0 0 0 0 >> MatILUFactorSym 1 1.0 5.1554e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> MatAssemblyBegin 2 1.0 6.5899e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatAssemblyEnd 2 1.0 2.8979e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> MatGetRowIJ 1 1.0 2.5041e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetSubMatrice 1 1.0 2.8049e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> MatGetOrdering 1 1.0 5.0988e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> MatIncreaseOvrlp 1 1.0 3.7770e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> KSPGMRESOrthog 1 1.0 2.9687e-02 1.0 4.70e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 0 >> KSPSetUp 2 1.0 1.8465e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> KSPSolve 1 1.0 7.4545e-01 1.0 1.21e+03 1.0 0.0e+00 0.0e+00 0.0e+00 33100 0 0 0 33100 0 0 0 0 >> PCSetUp 2 1.0 4.4820e-01 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 20 33 0 0 0 20 33 0 0 0 0 >> PCSetUpOnBlocks 1 1.0 2.3395e-01 1.0 4.04e+02 1.0 0.0e+00 0.0e+00 0.0e+00 10 33 0 0 0 10 33 0 0 0 0 >> PCApply 2 1.0 7.8817e-02 1.0 4.24e+02 1.0 0.0e+00 0.0e+00 0.0e+00 3 35 0 0 0 4 35 0 0 0 0 >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. 
>> >> --- Event Stage 0: Main Stage >> >> Vector 11 11 10840 0 >> Vector Scatter 2 2 784 0 >> Matrix 3 3 10120 0 >> Index Set 10 10 4872 0 >> Krylov Solver 2 2 18316 0 >> Preconditioner 2 2 1204 0 >> Viewer 1 0 0 0 >> ======================================================================================================================== >> Average time to get PetscTime(): 0.000123882 >> #PETSc Option Table entries: >> -f examples/two_quads_qs.inp >> -log_summary >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >> Configure options: --with-fc=gfortran --with-cc=gcc --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS=-O2 --FOPTFLAGS=-O2 --with-shared-libraries >> ----------------------------------------- >> Libraries compiled on Sun Jun 29 08:08:16 2014 on i5 >> Machine characteristics: Linux-3.2.0-4-686-pae-i686-with-debian-7.5 >> Using PETSc directory: /home/stali/petsc-dev >> Using PETSc arch: arch-linux2-c-debug >> ----------------------------------------- >> >> Using C compiler: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O2 ${COPTFLAGS} ${CFLAGS} >> Using Fortran compiler: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O2 ${FOPTFLAGS} ${FFLAGS} >> ----------------------------------------- >> >> Using include paths: -I/home/stali/petsc-dev/arch-linux2-c-debug/include -I/home/stali/petsc-dev/include -I/home/stali/petsc-dev/include -I/home/stali/petsc-dev/arch-linux2-c-debug/include >> ----------------------------------------- >> >> Using C linker: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpicc >> Using Fortran linker: /home/stali/petsc-dev/arch-linux2-c-debug/bin/mpif90 >> Using libraries: -Wl,-rpath,/home/stali/petsc-dev/arch-linux2-c-debug/lib -L/home/stali/petsc-dev/arch-linux2-c-debug/lib -lpetsc -llapack -lblas -Wl,-rpath,/home/stali/petsc-dev/arch-linux2-c-debug/lib -L/home/stali/petsc-dev/arch-linux2-c-debug/lib -lmetis -lX11 -lpthread -lssl -lcrypto -lm -Wl,-rpath,/usr/lib/gcc/i486-linux-gnu/4.7 -L/usr/lib/gcc/i486-linux-gnu/4.7 -Wl,-rpath,/usr/lib/i386-linux-gnu -L/usr/lib/i386-linux-gnu -Wl,-rpath,/lib/i386-linux-gnu -L/lib/i386-linux-gnu -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl >> ----------------------------------------- >> >> ==22745== >> ==22745== HEAP SUMMARY: >> ==22745== in use at exit: 0 bytes in 0 blocks >> ==22745== total heap usage: 3,723 allocs, 3,723 frees, 282,348 bytes allocated >> ==22745== >> ==22745== All heap blocks were freed -- no leaks are possible >> ==22745== >> ==22745== For counts of detected and suppressed errors, rerun with: -v >> ==22745== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 61 from 8) >> > From zonexo at gmail.com Mon Jun 30 00:00:36 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 30 Jun 2014 13:00:36 +0800 Subject: [petsc-users] How to debug PETSc error Message-ID: <53B0EEF4.8090308@gmail.com> Hi, I have a CFD code which gives an error when solving the momentum eqn at time step = 1109. Using KSPGetConvergedReason give < 0 using optimized build. I retry using debug build and it gives the error below. I sent the job to a job scheduler on 32 procs. So what is best way to debug? 
Should I print out the matrix? It is very big since the grid size is 13 million.

Thanks. Regards.

n12-10:13681] 31 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[n12-10:13681] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[17]PETSC ERROR: ------------------------------------------------------------------------
[17]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero
[17]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[17]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[17]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[17]PETSC ERROR: likely location of problem given in stack below
[17]PETSC ERROR: --------------------- Stack Frames ------------------------------------
[17]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[17]PETSC ERROR: INSTEAD the line number of the start of the function
[17]PETSC ERROR: is given.
[17]PETSC ERROR: [17] VecNorm_MPI line 57 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/impls/mpi/pvec2.c
[17]PETSC ERROR: [17] VecNorm line 224 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/interface/rvector.c
[17]PETSC ERROR: [17] KSPSolve_BCGS line 39 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/impls/bcgs/bcgs.c
[17]PETSC ERROR: [17] KSPSolve line 356 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c

--
Thank you

Yours sincerely,

TAY wee-beng

From bsmith at mcs.anl.gov Mon Jun 30 00:53:45 2014
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 30 Jun 2014 00:53:45 -0500
Subject: [petsc-users] How to debug PETSc error
In-Reply-To: <53B0EEF4.8090308@gmail.com>
References: <53B0EEF4.8090308@gmail.com>
Message-ID: <1768AEF3-BE07-4A07-AFC9-45444D3CB532@mcs.anl.gov>

On Jun 30, 2014, at 12:00 AM, TAY wee-beng wrote:

> Hi,
>
> I have a CFD code which gives an error when solving the momentum eqn at time step = 1109. Using KSPGetConvergedReason give < 0 using optimized build.

What value < 0? It is possible there is no bug. Bi-CG-stab (though it is stabilized) is not always stable and it can come to grief even if the matrix and right hand side are "reasonable". Or the preconditioner may be generating inappropriately huge values (for example if ILU is being used inside it).

Yes, don't try to print the matrix or anything like that.

I would start by trying with KSPBCGSL (manual page below). It is designed to be more stable than Bi-CG-stab. Try it with the default options; you can also increase the ell if it fails.

GMRES is always a good bet but I am thinking you are not using it because it requires too much memory due to restart length.

Barry

KSPBCGSL - Implements a slight variant of the Enhanced BiCGStab(L) algorithm in (3) and (2). The variation concerns cases when either kappa0**2 or kappa1**2 is negative due to round-off. Kappa0 has also been pulled out of the denominator in the formula for ghat.

References:
1. G.L.G. Sleijpen, H.A. van der Vorst, "An overview of approaches for the stable computation of hybrid BiCG methods", Applied Numerical Mathematics: Transactions of IMACS, 19(3), pp 235-54, 1996.
2. G.L.G. Sleijpen, H.A. van der Vorst, D.R. Fokkema, "BiCGStab(L) and other hybrid Bi-CG methods", Numerical Algorithms, 7, pp 75-109, 1994.
3. D.R. Fokkema, "Enhanced implementation of BiCGStab(L) for solving linear systems of equations", preprint from www.citeseer.com.

Contributed by: Joel M.
Malard, email jm.malard at pnl.gov Options Database Keys: + -ksp_bcgsl_ell Number of Krylov search directions, defaults to 2 -- KSPBCGSLSetEll() . -ksp_bcgsl_cxpol - Use a convex function of the MinRes and OR polynomials after the BiCG step instead of default MinRes -- KSPBCGSLSetPol() . -ksp_bcgsl_mrpoly - Use the default MinRes polynomial after the BiCG step -- KSPBCGSLSetPol() . -ksp_bcgsl_xres Threshold used to decide when to refresh computed residuals -- KSPBCGSLSetXRes() - -ksp_bcgsl_pinv - (de)activate use of pseudoinverse -- KSPBCGSLSetUsePseudoinverse() Level: beginner .seealso: KSPCreate(), KSPSetType(), KSPType (for list of available types), KSP, KSPFGMRES, KSPBCGS, KSPSetPCSide(), KSPBCGSLSetEll(), KSPBCGSLSetXRes() > > I retry using debug build and it gives the error below. I sent the job to a job scheduler on 32 procs. So what is best way to debug? Should I print out the matrix but it is very big since grid size is 13 million. > > Thanks. Regards. > > n12-10:13681] 31 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics > [n12-10:13681] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > [17]PETSC ERROR: ------------------------------------------------------------------------ > [17]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero > [17]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [17]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[17]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [17]PETSC ERROR: likely location of problem given in stack below > [17]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [17]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [17]PETSC ERROR: INSTEAD the line number of the start of the function > [17]PETSC ERROR: is given. > [17]PETSC ERROR: [17] VecNorm_MPI line 57 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/impls/mpi/pvec2.c > [17]PETSC ERROR: [17] VecNorm line 224 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/interface/rvector.c > [17]PETSC ERROR: [17] KSPSolve_BCGS line 39 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/impls/bcgs/bcgs.c > [17]PETSC ERROR: [17] KSPSolve line 356 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > From zonexo at gmail.com Mon Jun 30 01:30:22 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 30 Jun 2014 14:30:22 +0800 Subject: [petsc-users] How to debug PETSc error In-Reply-To: <1768AEF3-BE07-4A07-AFC9-45444D3CB532@mcs.anl.gov> References: <53B0EEF4.8090308@gmail.com> <1768AEF3-BE07-4A07-AFC9-45444D3CB532@mcs.anl.gov> Message-ID: <53B103FE.7010602@gmail.com> On 30/6/2014 1:53 PM, Barry Smith wrote: > On Jun 30, 2014, at 12:00 AM, TAY wee-beng wrote: > >> Hi, >> >> I have a CFD code which gives an error when solving the momentum eqn at time step = 1109. Using KSPGetConvergedReason give < 0 using optimized build. > What value < 0? It is possible there is no bug. Bi-CG-stab (though it is stabilized) is not always stable and it can grief even if the matrix and right hand side are ?reasonable?. Or the preconditioner may be generating inappropriately huge values (for example if ILU is being used inside it). > > Yes, don?t try to print the matrix or anything like that. > > I would start by trying with KSPBCGSL (manual page below). It is designed to be more stable than Bi-CG-stab. 
Try it with the default options; you can also increase the ell if it fails.
>
> GMRES is always a good bet but I am thinking you are not using it because it requires too much memory due to restart length.
>
> Barry
>
>
> KSPBCGSL - Implements a slight variant of the Enhanced
> BiCGStab(L) algorithm in (3) and (2). The variation
> concerns cases when either kappa0**2 or kappa1**2 is
> negative due to round-off. Kappa0 has also been pulled
> out of the denominator in the formula for ghat.
>
> References:
> 1. G.L.G. Sleijpen, H.A. van der Vorst, "An overview of
> approaches for the stable computation of hybrid BiCG
> methods", Applied Numerical Mathematics: Transactions
> of IMACS, 19(3), pp 235-54, 1996.
> 2. G.L.G. Sleijpen, H.A. van der Vorst, D.R. Fokkema,
> "BiCGStab(L) and other hybrid Bi-CG methods",
> Numerical Algorithms, 7, pp 75-109, 1994.
> 3. D.R. Fokkema, "Enhanced implementation of BiCGStab(L)
> for solving linear systems of equations", preprint
> from www.citeseer.com.
>
> Contributed by: Joel M. Malard, email jm.malard at pnl.gov
>
> Options Database Keys:
> + -ksp_bcgsl_ell Number of Krylov search directions, defaults to 2 -- KSPBCGSLSetEll()
> . -ksp_bcgsl_cxpol - Use a convex function of the MinRes and OR polynomials after the BiCG step instead of default MinRes -- KSPBCGSLSetPol()
> . -ksp_bcgsl_mrpoly - Use the default MinRes polynomial after the BiCG step -- KSPBCGSLSetPol()
> . -ksp_bcgsl_xres Threshold used to decide when to refresh computed residuals -- KSPBCGSLSetXRes()
> - -ksp_bcgsl_pinv - (de)activate use of pseudoinverse -- KSPBCGSLSetUsePseudoinverse()
>
> Level: beginner
>
> .seealso: KSPCreate(), KSPSetType(), KSPType (for list of available types), KSP, KSPFGMRES, KSPBCGS, KSPSetPCSide(), KSPBCGSLSetEll(), KSPBCGSLSetXRes()

Hi Barry,

I mean that when I run:

KSPGetConvergedReason(ksp_semi_xyz,reason,ierr)

reason < 0.

I forgot to add that the problem happens with my newly modified code. In my old code, it works perfectly. So during my modification, the matrix or vector may have been changed unintentionally. By right, the new and old code should give the same matrix, except for small differences due to truncation error. Based on this info, is there a better way to debug? I will also change to KSPBCGSL as suggested.

Thanks

Regards.

>> I retry using debug build and it gives the error below. I sent the job to a job scheduler on 32 procs. So what is best way to debug? Should I print out the matrix but it is very big since grid size is 13 million.
>>
>> Thanks. Regards.
>>
>> n12-10:13681] 31 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
>> [n12-10:13681] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>> [17]PETSC ERROR: ------------------------------------------------------------------------
>> [17]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero
>> [17]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [17]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[17]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> [17]PETSC ERROR: likely location of problem given in stack below
>> [17]PETSC ERROR: --------------------- Stack Frames ------------------------------------
>> [17]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>> [17]PETSC ERROR: INSTEAD the line number of the start of the function
>> [17]PETSC ERROR: is given.
>> [17]PETSC ERROR: [17] VecNorm_MPI line 57 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/impls/mpi/pvec2.c >> [17]PETSC ERROR: [17] VecNorm line 224 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/interface/rvector.c >> [17]PETSC ERROR: [17] KSPSolve_BCGS line 39 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/impls/bcgs/bcgs.c >> [17]PETSC ERROR: [17] KSPSolve line 356 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c >> >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> From bsmith at mcs.anl.gov Mon Jun 30 01:37:57 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 30 Jun 2014 01:37:57 -0500 Subject: [petsc-users] How to debug PETSc error In-Reply-To: <53B103FE.7010602@gmail.com> References: <53B0EEF4.8090308@gmail.com> <1768AEF3-BE07-4A07-AFC9-45444D3CB532@mcs.anl.gov> <53B103FE.7010602@gmail.com> Message-ID: <57A4E104-5B09-44A3-9046-FF547B68E621@mcs.anl.gov> On Jun 30, 2014, at 1:30 AM, TAY wee-beng wrote: > On 30/6/2014 1:53 PM, Barry Smith wrote: >> On Jun 30, 2014, at 12:00 AM, TAY wee-beng wrote: >> >>> Hi, >>> >>> I have a CFD code which gives an error when solving the momentum eqn at time step = 1109. Using KSPGetConvergedReason give < 0 using optimized build. >> What value < 0? It is possible there is no bug. Bi-CG-stab (though it is stabilized) is not always stable and it can grief even if the matrix and right hand side are ?reasonable?. Or the preconditioner may be generating inappropriately huge values (for example if ILU is being used inside it). >> >> Yes, don?t try to print the matrix or anything like that. >> >> I would start by trying with KSPBCGSL (manual page below). It is designed to be more stable than Bi-CG-stab. Try it with the default options; you can also increase the ell if it fails. >> >> GMRES is always a good bet but I am thinking you are not using it because it requires too much memory due to restart length. >> >> Barry >> >> >> KSPBCGSL - Implements a slight variant of the Enhanced >> BiCGStab(L) algorithm in (3) and (2). The variation >> concerns cases when either kappa0**2 or kappa1**2 is >> negative due to round-off. Kappa0 has also been pulled >> out of the denominator in the formula for ghat. >> >> References: >> 1. G.L.G. Sleijpen, H.A. van der Vorst, "An overview of >> approaches for the stable computation of hybrid BiCG >> methods", Applied Numerical Mathematics: Transactions >> f IMACS, 19(3), pp 235-54, 1996. >> 2. G.L.G. Sleijpen, H.A. van der Vorst, D.R. Fokkema, >> "BiCGStab(L) and other hybrid Bi-CG methods", >> Numerical Algorithms, 7, pp 75-109, 1994. >> 3. D.R. Fokkema, "Enhanced implementation of BiCGStab(L) >> for solving linear systems of equations", preprint >> from www.citeseer.com. >> >> Contributed by: Joel M. Malard, email jm.malard at pnl.gov >> >> Options Database Keys: >> + -ksp_bcgsl_ell Number of Krylov search directions, defaults to 2 -- KSPBCGSLSetEll() >> . -ksp_bcgsl_cxpol - Use a convex function of the MinRes and OR polynomials after the BiCG step instead of default MinRes -- KSPBCGSLSetPol() >> . -ksp_bcgsl_mrpoly - Use the default MinRes polynomial after the BiCG step -- KSPBCGSLSetPol() >> . 
-ksp_bcgsl_xres Threshold used to decide when to refresh computed residuals -- KSPBCGSLSetXRes() >> - -ksp_bcgsl_pinv - (de)activate use of pseudoinverse -- KSPBCGSLSetUsePseudoinverse() >> >> Level: beginner >> >> .seealso: KSPCreate(), KSPSetType(), KSPType (for list of available types), KSP, KSPFGMRES, KSPBCGS, KSPSetPCSide(), KSPBCGSLSetEll(), KSPBCGSLSetXRes() > Hi Barry, > > I mean why I run : > > KSPGetConvergedReason(ksp_semi_xyz,reason,ierr) > > reason < 0. Yes but exactly what value of reason? > > I forgot to add that the problem happens with my newly modified code. In my old code, it works perfectly. So during my modification, the matrix or vector may have been changed unintentionally. By right, the new and old code should give the same matrix, except for small differences due to truncation error. Based on these info, is there a better way to debug? I will also changed to KSPBCGSL as suggested. > If you run the two versions next to each other do they produce very similar results for all those time steps? Can you very slowly change the old code to the new form and run these intermediate versions until you hit upon the change that causes the problem? Unfortunately I don?t have any easy answer. Barry > Thanks > > Regards. >>> I retry using debug build and it gives the error below. I sent the job to a job scheduler on 32 procs. So what is best way to debug? Should I print out the matrix but it is very big since grid size is 13 million. >>> >>> Thanks. Regards. >>> >>> n12-10:13681] 31 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics >>> [n12-10:13681] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>> [17]PETSC ERROR: ------------------------------------------------------------------------ >>> [17]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero >>> [17]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [17]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[17]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [17]PETSC ERROR: likely location of problem given in stack below >>> [17]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [17]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [17]PETSC ERROR: INSTEAD the line number of the start of the function >>> [17]PETSC ERROR: is given. >>> [17]PETSC ERROR: [17] VecNorm_MPI line 57 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/impls/mpi/pvec2.c >>> [17]PETSC ERROR: [17] VecNorm line 224 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/interface/rvector.c >>> [17]PETSC ERROR: [17] KSPSolve_BCGS line 39 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/impls/bcgs/bcgs.c >>> [17]PETSC ERROR: [17] KSPSolve line 356 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c >>> >>> -- >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng
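(To make the two suggestions above concrete, here is a minimal sketch, in C rather than the poster's Fortran, of printing the exact converged-reason value after a solve and retrying once with BiCGStab(L). The routine name SolveAndReport and the ksp/b/x handles are placeholders, not names from the poster's code; at run time the same experiment is just -ksp_type bcgsl -ksp_bcgsl_ell 2 together with -ksp_converged_reason, provided the code calls KSPSetFromOptions().)

#include <petscksp.h>

/* Solve, report the exact KSPConvergedReason value (the number Barry asks for),
   and if the solve diverged retry once with the more robust BiCGStab(L). */
PetscErrorCode SolveAndReport(KSP ksp, Vec b, Vec x)
{
  KSPConvergedReason reason;
  PetscErrorCode     ierr;

  PetscFunctionBegin;
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
  /* -ksp_converged_reason prints a human-readable explanation of the same value */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "KSPConvergedReason = %d\n", (int)reason);CHKERRQ(ierr);
  if (reason < 0) {                               /* diverged: try BiCGStab(L) */
    ierr = KSPSetType(ksp, KSPBCGSL);CHKERRQ(ierr);
    ierr = KSPBCGSLSetEll(ksp, 2);CHKERRQ(ierr);  /* 2 is the default; raise it if this still fails */
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "bcgsl retry: KSPConvergedReason = %d\n", (int)reason);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}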