[petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90

Matthew Knepley knepley at gmail.com
Sun May 18 20:53:19 CDT 2014


On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng <zonexo at gmail.com> wrote:

> Hi Barry,
>
> I am trying to sort out the details so that it is easier to pinpoint the
> error. However, I tried with GNU gfortran and it worked well. With Intel
> ifort, it stopped at one of the DMDAVecGetArrayF90 calls. Does that
> definitely mean it is a bug in ifort? Do you work with both the Intel and
> GNU compilers?
>

Yes it works with Intel. Is this using optimization?

  Matt


>
> Thank you
>
> Yours sincerely,
>
> TAY wee-beng
>
> On 14/5/2014 12:03 AM, Barry Smith wrote:
>
>>    Please send you current code. So we may compile and run it.
>>
>>    Barry
>>
>>
>>    On May 12, 2014, at 9:52 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>
>>  Hi,
>>>
>>> I have sent the entire code a while ago. Is there any answer? I have also
>>> been trying myself, but it works with some Intel compilers and not with
>>> others, and I am still unable to find the cause. The GNU compilers on most
>>> clusters are old versions, so they cannot compile my code since it uses
>>> allocatable structures.
>>>
>>> Thank you.
>>>
>>> Yours sincerely,
>>>
>>> TAY wee-beng
>>>
>>> On 21/4/2014 8:58 AM, Barry Smith wrote:
>>>
>>>>     Please send the entire code. If we can run it and reproduce the
>>>> problem we can likely track down the issue much faster than through endless
>>>> rounds of email.
>>>>
>>>>     Barry
>>>>
>>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>  On 20/4/2014 8:39 AM, TAY wee-beng wrote:
>>>>>
>>>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote:
>>>>>>
>>>>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>> wrote:
>>>>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote:
>>>>>>>
>>>>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>> wrote:
>>>>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote:
>>>>>>>>
>>>>>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote:
>>>>>>>>>
>>>>>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote:
>>>>>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote:
>>>>>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote:
>>>>>>>>>>     Hmm,
>>>>>>>>>>
>>>>>>>>>>         Interface DMDAVecGetArrayF90
>>>>>>>>>>           Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr)
>>>>>>>>>>             USE_DM_HIDE
>>>>>>>>>>             DM_HIDE da1
>>>>>>>>>>             VEC_HIDE v
>>>>>>>>>>             PetscScalar,pointer :: d1(:,:,:)
>>>>>>>>>>             PetscErrorCode ierr
>>>>>>>>>>           End Subroutine
>>>>>>>>>>
>>>>>>>>>>      So the d1 is a F90 POINTER. But your subroutine seems to be
>>>>>>>>>> treating it as a “plain old Fortran array”?
>>>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:)
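>>>>>>>>>>
>>>>>>>>>> (For illustration only: a minimal sketch of declarations that match this
>>>>>>>>>> interface, using the da_u/u_local/u_array names from this thread; the
>>>>>>>>>> actual argument passed to DMDAVecGetArrayF90 is an F90 pointer, not a
>>>>>>>>>> plain array, and the usual PETSc finclude headers are assumed.)
>>>>>>>>>>
>>>>>>>>>>       DM da_u
>>>>>>>>>>       Vec u_local
>>>>>>>>>>       PetscErrorCode ierr
>>>>>>>>>>       ! must be a pointer, matching the d1(:,:,:) dummy in the interface
>>>>>>>>>>       PetscScalar, pointer :: u_array(:,:,:)
>>>>>>>>>>
>>>>>>>>>>       call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>       ! ... work with u_array here ...
>>>>>>>>>>       call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>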
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> So d1 is a pointer, and it makes a difference if I declare it as a
>>>>>>>>>> "plain old Fortran array"? I declare it as a Fortran array, and it works
>>>>>>>>>> without any problem if I only call DMDAVecGetArrayF90 and
>>>>>>>>>> DMDAVecRestoreArrayF90 with "u".
>>>>>>>>>>
>>>>>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u",
>>>>>>>>>> "v" and "w", errors start to happen. I wonder why...
>>>>>>>>>>
>>>>>>>>>> Also, suppose I call:
>>>>>>>>>>
>>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>> u_array ....
>>>>>>>>>>
>>>>>>>>>> v_array .... etc
>>>>>>>>>>
>>>>>>>>>> Now, to restore the arrays, does the sequence in which they are
>>>>>>>>>> restored matter?
>>>>>>>>>>      No, it should not matter. If it does, that is a sign that memory
>>>>>>>>>> has been written to incorrectly earlier in the code.
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Hmm, I have been getting different results with different Intel
>>>>>>>>>> compilers. I'm not sure if MPI plays a part, but I'm only using a single
>>>>>>>>>> processor. In debug mode, things run without problem. In optimized mode,
>>>>>>>>>> in some cases, the code aborts even when doing simple initialization:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr)
>>>>>>>>>>
>>>>>>>>>>      u_array = 0.d0
>>>>>>>>>>
>>>>>>>>>>      v_array = 0.d0
>>>>>>>>>>
>>>>>>>>>>      w_array = 0.d0
>>>>>>>>>>
>>>>>>>>>>      p_array = 0.d0
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>
>>>>>>>>>>      call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr),
>>>>>>>>>> giving a segmentation error, but another version of the Intel compiler
>>>>>>>>>> passes through this part without error. Since the response differs among
>>>>>>>>>> compilers, is this a PETSc bug or an Intel bug? Or MVAPICH or OpenMPI?
>>>>>>>>>>
>>>>>>>>>> We do this in a bunch of examples. Can you reproduce this
>>>>>>>>>> different behavior in src/dm/examples/tutorials/ex11f90.F?
>>>>>>>>>>
>>>>>>>>> Hi Matt,
>>>>>>>>>
>>>>>>>>> Do you mean putting the above lines into ex11f90.F and test?
>>>>>>>>>
>>>>>>>>> It already has DMDAVecGetArray(). Just run it.
>>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> It worked. The differences between my code and ex11f90 are the way the
>>>>>>>> Fortran modules are defined, and that ex11f90 only uses global vectors.
>>>>>>>> Does it make a difference whether global or local vectors are used? The
>>>>>>>> way it accesses x1 only touches the local region.
>>>>>>>>
>>>>>>>> No, the global/local difference should not matter.
>>>>>>>>
>>>>>>>>   Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be called
>>>>>>>> first, is that so? I can't find the equivalent for a local vector, though.
>>>>>>>>
>>>>>>>> DMGetLocalVector()
>>>>>>>>
>>>>>>> Oops, I do not have DMGetLocalVector and DMRestoreLocalVector in my
>>>>>>> code. Does it matter?
>>>>>>>
>>>>>>> If so, when should I call them?
>>>>>>>
>>>>>>> You just need a local vector from somewhere.
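>>>>>>>
>>>>>>> (For reference, a minimal sketch, not taken from the code under
>>>>>>> discussion, of obtaining and using a local vector; names such as da_u and
>>>>>>> u_global are assumed, as are the usual PETSc finclude headers.)
>>>>>>>
>>>>>>>       Vec u_local
>>>>>>>       PetscScalar, pointer :: u_array(:,:,:)
>>>>>>>       PetscErrorCode ierr
>>>>>>>
>>>>>>>       ! borrow a ghosted local vector from the DMDA
>>>>>>>       call DMGetLocalVector(da_u,u_local,ierr)
>>>>>>>       ! fill it (including ghost points) from the global vector u_global
>>>>>>>       call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr)
>>>>>>>       call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr)
>>>>>>>
>>>>>>>       call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>       ! ... use u_array ...
>>>>>>>       call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>
>>>>>>>       ! hand the borrowed vector back to the DMDA
>>>>>>>       call DMRestoreLocalVector(da_u,u_local,ierr)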
>>>>>>>
>>>>>> Hi,
>>>>>
>>>>> Can anyone help with the questions below? I am still trying to find out
>>>>> why my code doesn't work.
>>>>>
>>>>> Thanks.
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I insert part of my error region code into ex11f90:
>>>>>>
>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>           call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>           call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>           call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr)
>>>>>>
>>>>>>      u_array = 0.d0
>>>>>>           v_array = 0.d0
>>>>>>           w_array = 0.d0
>>>>>>           p_array = 0.d0
>>>>>>
>>>>>>      call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr)
>>>>>>
>>>>>>      call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>
>>>>>>      call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>
>>>>>>      call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>
>>>>>> It worked w/o error. I'm going to change the way the modules are
>>>>>> defined in my code.
>>>>>>
>>>>>> My code contains a main program and a number of modules files, with
>>>>>> subroutines inside e.g.
>>>>>>
>>>>>> module solve
>>>>>>                   <- add include file?
>>>>>> subroutine RRK
>>>>>>                   <- add include file?
>>>>>> end subroutine RRK
>>>>>>
>>>>>> end module solve
>>>>>>
>>>>>> So where should the include files (#include <finclude/petscdmda.h90>)
>>>>>> be placed?
>>>>>>
>>>>>> After the module or inside the subroutine?
>>>>>>
>>>>>> Thanks.
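>>>>>>
>>>>>> (For what it's worth, a minimal sketch of one arrangement that follows the
>>>>>> pattern used in ex11f90.F: put the includes at the top of each subroutine
>>>>>> that makes PETSc calls, right after implicit none. The exact set of headers
>>>>>> needed may differ with the PETSc version.)
>>>>>>
>>>>>> module solve
>>>>>>
>>>>>> contains
>>>>>>
>>>>>> subroutine RRK(da_u,u_local)
>>>>>>   implicit none
>>>>>> #include <finclude/petscsys.h>
>>>>>> #include <finclude/petscvec.h>
>>>>>> #include <finclude/petscdm.h>
>>>>>> #include <finclude/petscdmda.h>
>>>>>> #include <finclude/petscdmda.h90>
>>>>>>   DM da_u
>>>>>>   Vec u_local
>>>>>>   PetscScalar, pointer :: u_array(:,:,:)
>>>>>>   PetscErrorCode ierr
>>>>>>
>>>>>>   call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>   ! ... work on u_array ...
>>>>>>   call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>> end subroutine RRK
>>>>>>
>>>>>> end module solve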
>>>>>>
>>>>>>>    Matt
>>>>>>>   Thanks.
>>>>>>>
>>>>>>>>     Matt
>>>>>>>>   Thanks.
>>>>>>>>
>>>>>>>>>     Matt
>>>>>>>>>   Thanks
>>>>>>>>>
>>>>>>>>> Regards.
>>>>>>>>>
>>>>>>>>>>     Matt
>>>>>>>>>>   As in w, then v and u?
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> thanks
>>>>>>>>>>      Note also that the beginning and end indices of u,v,w are
>>>>>>>>>> different for each process; see for example
>>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F
>>>>>>>>>> (and they do not start at 1). That is how to get the loop bounds.
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> In my case, I fixed the u,v,w such that their indices are the
>>>>>>>>>> same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the
>>>>>>>>>> problem lies in my subroutine treating it as a “plain old Fortran array”.
>>>>>>>>>>
>>>>>>>>>> If I declare them as pointers, their indices follow the C 0 start
>>>>>>>>>> convention, is that so?
>>>>>>>>>>      Not really. It is that in each process you need to access them
>>>>>>>>>> using the indices indicated by DMDAGetCorners() for global vectors and
>>>>>>>>>> DMDAGetGhostCorners() for local vectors. So really, C or Fortran doesn't
>>>>>>>>>> make any difference.
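>>>>>>>>>>
>>>>>>>>>> (For illustration, a minimal sketch of that loop-bound pattern, along
>>>>>>>>>> the lines of ex11f90.F; da_u and u_array are names from this thread, the
>>>>>>>>>> rest is assumed.)
>>>>>>>>>>
>>>>>>>>>>       PetscInt xs,ys,zs,xm,ym,zm
>>>>>>>>>>       PetscInt i,j,k
>>>>>>>>>>
>>>>>>>>>>       ! owned region for a global vector; use DMDAGetGhostCorners()
>>>>>>>>>>       ! instead when working with a local (ghosted) vector
>>>>>>>>>>       call DMDAGetCorners(da_u,xs,ys,zs,xm,ym,zm,ierr)
>>>>>>>>>>
>>>>>>>>>>       do k = zs, zs+zm-1
>>>>>>>>>>         do j = ys, ys+ym-1
>>>>>>>>>>           do i = xs, xs+xm-1
>>>>>>>>>>             u_array(i,j,k) = 0.d0
>>>>>>>>>>           end do
>>>>>>>>>>         end do
>>>>>>>>>>       end do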
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So my problem now is that in my old MPI code, u(i,j,k) follows the
>>>>>>>>>> Fortran start-at-1 convention. Is there some way to handle this so that
>>>>>>>>>> I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)?
>>>>>>>>>>     If your code wishes to access them with indices shifted by one from
>>>>>>>>>> the values returned by DMDAGetCorners() for global vectors and
>>>>>>>>>> DMDAGetGhostCorners() for local vectors, then you need to manually
>>>>>>>>>> subtract off the 1.
>>>>>>>>>>
>>>>>>>>>>     Barry
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>     Barry
>>>>>>>>>>
>>>>>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I tried to pinpoint the problem. I reduced my job size so that I can
>>>>>>>>>> run on 1 processor. I tried using valgrind, but perhaps because I'm
>>>>>>>>>> using the optimized version, it didn't catch the error beyond saying
>>>>>>>>>> "Segmentation fault (core dumped)".
>>>>>>>>>>
>>>>>>>>>> However, by re-writing my code, I found out a few things:
>>>>>>>>>>
>>>>>>>>>> 1. if I write my code this way:
>>>>>>>>>>
>>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>> u_array = ....
>>>>>>>>>>
>>>>>>>>>> v_array = ....
>>>>>>>>>>
>>>>>>>>>> w_array = ....
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> The code runs fine.
>>>>>>>>>>
>>>>>>>>>> 2. if I write my code this way:
>>>>>>>>>>
>>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this
>>>>>>>>>> subroutine does the same modification as the above.
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error
>>>>>>>>>>
>>>>>>>>>> where the subroutine is:
>>>>>>>>>>
>>>>>>>>>> subroutine uvw_array_change(u,v,w)
>>>>>>>>>>
>>>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:)
>>>>>>>>>>
>>>>>>>>>> u ...
>>>>>>>>>> v...
>>>>>>>>>> w ...
>>>>>>>>>>
>>>>>>>>>> end subroutine uvw_array_change.
>>>>>>>>>>
>>>>>>>>>> The above will give an error at :
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> 3. Same as above, except I change the order of the last 3 lines
>>>>>>>>>> to:
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>
>>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>> So they are now in reverse order. Now it works.
>>>>>>>>>>
>>>>>>>>>> 4. Same as 2 or 3, except the subroutine is changed to :
>>>>>>>>>>
>>>>>>>>>> subroutine uvw_array_change(u,v,w)
>>>>>>>>>>
>>>>>>>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>>>>>>
>>>>>>>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>>>>>>
>>>>>>>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>>>>>>
>>>>>>>>>> u ...
>>>>>>>>>> v...
>>>>>>>>>> w ...
>>>>>>>>>>
>>>>>>>>>> end subroutine uvw_array_change.
>>>>>>>>>>
>>>>>>>>>> The start_indices and end_indices are simply there to shift the 0-based
>>>>>>>>>> indices of the C convention to the 1-based indices of the Fortran
>>>>>>>>>> convention. This is necessary in my case because most of my code starts
>>>>>>>>>> array counting at 1, hence the "trick".
>>>>>>>>>>
>>>>>>>>>> However, now no matter which order of DMDAVecRestoreArrayF90 I use (as
>>>>>>>>>> in 2 or 3), an error occurs at
>>>>>>>>>> "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)".
>>>>>>>>>>
>>>>>>>>>> So did I violate something and cause memory corruption with the trick
>>>>>>>>>> above? But I can't think of any way other than the "trick" to keep using
>>>>>>>>>> the start-at-1 index convention.
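>>>>>>>>>>
>>>>>>>>>> (One possible alternative to the explicit-shape trick, sketched here for
>>>>>>>>>> illustration and untested: Fortran 2003 pointer lower-bound remapping,
>>>>>>>>>> if the compiler supports it. u1 is a hypothetical name; it views the
>>>>>>>>>> same data as u_array but with lower bounds of 1, and the restore call
>>>>>>>>>> still uses u_array.)
>>>>>>>>>>
>>>>>>>>>>       PetscScalar, pointer :: u_array(:,:,:)   ! from DMDAVecGetArrayF90
>>>>>>>>>>       PetscScalar, pointer :: u1(:,:,:)        ! 1-based view, same data
>>>>>>>>>>
>>>>>>>>>>       call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>       u1(1:,1:,1:) => u_array   ! reset lower bounds only; no copy made
>>>>>>>>>>       ! ... use u1(i,j,k) with i,j,k starting at 1 ...
>>>>>>>>>>       call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)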
>>>>>>>>>>
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote:
>>>>>>>>>>     Try running under valgrind http://www.mcs.anl.gov/petsc/
>>>>>>>>>> documentation/faq.html#valgrind
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Barry,
>>>>>>>>>>
>>>>>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode
>>>>>>>>>> but fails in non-debug mode.
>>>>>>>>>>
>>>>>>>>>> I have attached my code.
>>>>>>>>>>
>>>>>>>>>> Thank you
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote:
>>>>>>>>>>     Please send the code that creates da_w and the declarations
>>>>>>>>>> of w_array
>>>>>>>>>>
>>>>>>>>>>     Barry
>>>>>>>>>>
>>>>>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng
>>>>>>>>>> <zonexo at gmail.com>
>>>>>>>>>>    wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Barry,
>>>>>>>>>>
>>>>>>>>>> I'm not too sure how to do it. I'm running with MPI, so I run:
>>>>>>>>>>
>>>>>>>>>>    mpirun -n 4 ./a.out -start_in_debugger
>>>>>>>>>>
>>>>>>>>>> I got the message below. Before the gdb windows appear (through X11),
>>>>>>>>>> the program aborts.
>>>>>>>>>>
>>>>>>>>>> I also tried running on another cluster and it worked. I also tried the
>>>>>>>>>> current cluster in debug mode and it worked too.
>>>>>>>>>>
>>>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> An MPI process has executed an operation involving a call to the
>>>>>>>>>> "fork()" system call to create a child process.  Open MPI is
>>>>>>>>>> currently
>>>>>>>>>> operating in a condition that could result in memory corruption or
>>>>>>>>>> other system errors; your MPI job may hang, crash, or produce
>>>>>>>>>> silent
>>>>>>>>>> data corruption.  The use of fork() (or system() or other calls
>>>>>>>>>> that
>>>>>>>>>> create child processes) is strongly discouraged.
>>>>>>>>>>
>>>>>>>>>> The process that invoked fork was:
>>>>>>>>>>
>>>>>>>>>>     Local host:          n12-76 (PID 20235)
>>>>>>>>>>     MPI_COMM_WORLD rank: 2
>>>>>>>>>>
>>>>>>>>>> If you are *absolutely sure* that your application will
>>>>>>>>>> successfully
>>>>>>>>>> and correctly survive a call to fork(), you may disable this
>>>>>>>>>> warning
>>>>>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0.
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on
>>>>>>>>>> display localhost:50.0 on machine n12-76
>>>>>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on
>>>>>>>>>> display localhost:50.0 on machine n12-76
>>>>>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on
>>>>>>>>>> display localhost:50.0 on machine n12-76
>>>>>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on
>>>>>>>>>> display localhost:50.0 on machine n12-76
>>>>>>>>>> [n12-76:20232] 3 more processes have sent help message
>>>>>>>>>> help-mpi-runtime.txt / mpi_init:warn-fork
>>>>>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0
>>>>>>>>>> to see all help / error messages
>>>>>>>>>>
>>>>>>>>>> ....
>>>>>>>>>>
>>>>>>>>>>    1
>>>>>>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------
>>>>>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>>>>>>>>>> probably memory access out of range
>>>>>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>>>>>>> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>>>>>> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
>>>>>>>>>> OS X to find memory corruption errors
>>>>>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile,
>>>>>>>>>> link, and run
>>>>>>>>>> [1]PETSC ERROR: to get more information on the crash.
>>>>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown
>>>>>>>>>> directory unknown file (null)
>>>>>>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------
>>>>>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>>>>>>>>>> probably memory access out of range
>>>>>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>>>>>>> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>>>>>> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
>>>>>>>>>> OS X to find memory corruption errors
>>>>>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile,
>>>>>>>>>> link, and run
>>>>>>>>>> [3]PETSC ERROR: to get more information on the crash.
>>>>>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown
>>>>>>>>>> directory unknown file (null)
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote:
>>>>>>>>>>
>>>>>>>>>>     Because I/O doesn't always get flushed immediately, it may not be
>>>>>>>>>> hanging at this point. It is better to use the option -start_in_debugger,
>>>>>>>>>> then type cont in each debugger window; when you think it is "hanging",
>>>>>>>>>> do a control-C in each debugger window and type where to see where each
>>>>>>>>>> process is. You can also look around in the debugger at variables to see
>>>>>>>>>> why it is "hanging" at that point.
>>>>>>>>>>
>>>>>>>>>>      Barry
>>>>>>>>>>
>>>>>>>>>>     These routines don't have any parallel communication in them, so
>>>>>>>>>> they are unlikely to hang.
>>>>>>>>>>
>>>>>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng
>>>>>>>>>>
>>>>>>>>>> <zonexo at gmail.com>
>>>>>>>>>>
>>>>>>>>>>    wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> My code hangs, so I added MPI_Barrier and print statements to catch the
>>>>>>>>>> bug. I found that it hangs after printing "7". Is it because I'm doing
>>>>>>>>>> something wrong? I need to access the u, v, w arrays, so I use
>>>>>>>>>> DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90.
>>>>>>>>>>
>>>>>>>>>>           call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>           call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0)
>>>>>>>>>> print *,"3"
>>>>>>>>>>           call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>           call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0)
>>>>>>>>>> print *,"4"
>>>>>>>>>>           call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>           call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0)
>>>>>>>>>> print *,"5"
>>>>>>>>>>           call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array)
>>>>>>>>>>           call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0)
>>>>>>>>>> print *,"6"
>>>>>>>>>>           call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)  !must be in reverse order
>>>>>>>>>>           call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0)
>>>>>>>>>> print *,"7"
>>>>>>>>>>           call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>           call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0)
>>>>>>>>>> print *,"8"
>>>>>>>>>>           call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>> --
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> <code.txt>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener