[petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90

TAY wee-beng zonexo at gmail.com
Sat Apr 19 19:39:25 CDT 2014


On 20/4/2014 1:02 AM, Matthew Knepley wrote:
> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>
>     On 19/4/2014 11:39 PM, Matthew Knepley wrote:
>>     On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>
>>         On 19/4/2014 10:55 PM, Matthew Knepley wrote:
>>>         On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>
>>>             On 19/4/2014 6:48 PM, Matthew Knepley wrote:
>>>>             On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>                 On 19/4/2014 1:17 PM, Barry Smith wrote:
>>>>
>>>>                     On Apr 19, 2014, at 12:11 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>                         On 19/4/2014 12:10 PM, Barry Smith wrote:
>>>>
>>>>                             On Apr 18, 2014, at 9:57 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>                                 On 19/4/2014 3:53 AM, Barry Smith wrote:
>>>>
>>>>                                     Hmm,
>>>>
>>>>                                       Interface DMDAVecGetArrayF90
>>>>                                         Subroutine DMDAVecGetArrayF903(da1, v, d1, ierr)
>>>>                                           USE_DM_HIDE
>>>>                                           DM_HIDE da1
>>>>                                           VEC_HIDE v
>>>>                                           PetscScalar, pointer :: d1(:,:,:)
>>>>                                           PetscErrorCode ierr
>>>>                                         End Subroutine
>>>>
>>>>                                     So the d1 is a F90 POINTER. But your subroutine seems to be
>>>>                                     treating it as a “plain old Fortran array”?
>>>>
>>>>                                       real(8), intent(inout) :: u(:,:,:), v(:,:,:), w(:,:,:)
>>>>
>>>>                         Hi,
>>>>
>>>>                         So d1 is a pointer, and it's different if I declare it as a "plain old
>>>>                         Fortran array"? Because I declare it as a Fortran array and it works
>>>>                         without any problem if I only call DMDAVecGetArrayF90 and
>>>>                         DMDAVecRestoreArrayF90 with "u".
>>>>
>>>>                         But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u",
>>>>                         "v" and "w", errors start to happen. I wonder why...
>>>>
>>>>                         Also, suppose I call:
>>>>
>>>>                         call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>                         call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>                         call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>
>>>>                         u_array ....
>>>>                         v_array .... etc
>>>>
>>>>                         Now, to restore the arrays, does the sequence in which they are
>>>>                         restored matter?
>>>>
>>>>                     No, it should not matter. If it matters, that is a sign that memory has
>>>>                     been written to incorrectly earlier in the code.
>>>>
>>>>                 Hi,
>>>>
>>>>                 Hmm, I have been getting different results with different Intel compilers.
>>>>                 I'm not sure if MPI plays a part, but I'm only using a single processor.
>>>>                 In debug mode, things run without problem. In optimized mode, in some
>>>>                 cases, the code aborts even doing simple initialization:
>>>>
>>>>                 call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>                 call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>                 call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>                 call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr)
>>>>
>>>>                 u_array = 0.d0
>>>>                 v_array = 0.d0
>>>>                 w_array = 0.d0
>>>>                 p_array = 0.d0
>>>>
>>>>                 call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr)
>>>>                 call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>                 call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>                 call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>
>>>>                 The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr),
>>>>                 giving a segmentation error. But another version of the Intel compiler
>>>>                 passes through this part without error. Since the response differs among
>>>>                 compilers, is this a bug in PETSc or in the Intel compiler? Or in MVAPICH
>>>>                 or OpenMPI?
>>>>
>>>>
>>>>             We do this in a bunch of examples. Can you reproduce this different
>>>>             behavior in src/dm/examples/tutorials/ex11f90.F?
>>>
>>>             Hi Matt,
>>>
>>>             Do you mean putting the above lines into ex11f90.F and testing?
>>>
>>>
>>>         It already has DMDAVecGetArray(). Just run it.
>>
>>         Hi,
>>
>>         It worked. The difference between my code and the example is the way
>>         the Fortran modules are defined, and ex11f90 only uses global vectors.
>>         Does it make a difference whether global or local vectors are used?
>>         Because the way it accesses x1 only touches the local region.
>>
>>
>>     No, the global/local difference should not matter.
>>
>>         Also, before using DMDAVecGetArrayF90, must DMGetGlobalVector be used
>>         first? I can't find the equivalent for a local vector though.
>>
>>
>>     DMGetLocalVector()
>
>     Oops, I do not have DMGetLocalVector and DMRestoreLocalVector in my
>     code. Does that matter?
>
>     If so, when should I call them?
>
>
> You just need a local vector from somewhere.
Hi,

I inserted the part of my code from the error region into ex11f90:

call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
     call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
     call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
     call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr)

     u_array = 0.d0
     v_array = 0.d0
     w_array = 0.d0
     p_array = 0.d0

     call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr)
     call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
     call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
     call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)

It worked without error. I'm going to change the way the modules are defined
in my code.
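
For reference, a minimal sketch of how the local vectors themselves might be
obtained before the calls above, following Matt's pointer to DMGetLocalVector()
(u_global here is only a placeholder name for wherever the global data actually
lives in my code):

     call DMGetLocalVector(da_u,u_local,ierr)
     call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr)
     call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr)

     call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
     ! ... work on u_array ...
     call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)

     call DMRestoreLocalVector(da_u,u_local,ierr)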

My code contains a main program and a number of module files, with
subroutines inside, e.g.

module solve
                  <- add include file?
subroutine RRK
                  <- add include file?
end subroutine RRK

end module solve

So where should the include file (#include <finclude/petscdmda.h90>) be
placed?

After the module statement, or inside the subroutine?
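
For concreteness, a rough sketch of the two placements I am asking about (this
only spells out the outline above; it is not a claim about which placement is
the correct one):

module solve
! (a) include here, in the module's specification part, so that every
!     contained subroutine sees the interfaces through host association
#include <finclude/petscdmda.h90>

contains

subroutine RRK
! (b) or include here, inside each subroutine that needs it
end subroutine RRK

end module solve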

Thanks.
>
>   Matt
>
>     Thanks.
>>
>>        Matt
>>
>>         Thanks.
>>>
>>>            Matt
>>>
>>>             Thanks
>>>
>>>             Regards.
>>>>
>>>>                Matt
>>>>
>>>>                         As in w, then v and u?
>>>>
>>>>                         call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>                         call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>                         call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>
>>>>                         thanks
>>>>
>>>>                                     Note also that the beginning and end indices of u, v, w are
>>>>                                     different for each process; see for example
>>>>                                     http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F
>>>>                                     (and they do not start at 1). This is how to get the loop bounds.
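
For reference, a minimal sketch of getting and using those per-process bounds
(assuming a 3D DMDA and the variable names used elsewhere in this thread; the
loop indices are the global indices PETSc expects):

     PetscInt xs,ys,zs,xm,ym,zm
     PetscInt i,j,k

     call DMDAGetCorners(da_u,xs,ys,zs,xm,ym,zm,ierr)
     do k = zs, zs+zm-1
        do j = ys, ys+ym-1
           do i = xs, xs+xm-1
              u_array(i,j,k) = 0.d0
           end do
        end do
     end do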
>>>>
>>>>                                 Hi,
>>>>
>>>>                                 In my case, I fixed u, v, w such that their indices are the
>>>>                                 same. I also checked using DMDAGetCorners and
>>>>                                 DMDAGetGhostCorners. Now the problem lies in my subroutine
>>>>                                 treating them as “plain old Fortran arrays”.
>>>>
>>>>                                 If I declare them as pointers, their indices follow the C
>>>>                                 0-start convention, is that so?
>>>>
>>>>                             Not really. It is that in each process you need to access them
>>>>                             from the indices indicated by DMDAGetCorners() for global vectors
>>>>                             and DMDAGetGhostCorners() for local vectors. So really C or
>>>>                             Fortran doesn’t make any difference.
>>>>
>>>>
>>>>                                 So my problem now is that in my old MPI code, u(i,j,k) follows
>>>>                                 the Fortran 1-start convention. Is there some way to manipulate
>>>>                                 things such that I do not have to change my u(i,j,k) to
>>>>                                 u(i-1,j-1,k-1)?
>>>>
>>>>                                If your code wishes to access them with indices plus one from
>>>>                             the values returned by DMDAGetCorners() for global vectors and
>>>>                             DMDAGetGhostCorners() for local vectors, then you need to
>>>>                             manually subtract off the 1.
>>>>
>>>>                                Barry
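
In other words, a short sketch of that manual offset (same names as in the
sketch above; only the +1/-1 shift is the point being illustrated):

     call DMDAGetCorners(da_u,xs,ys,zs,xm,ym,zm,ierr)
     do k = zs+1, zs+zm                    ! the rest of the code counts from 1
        do j = ys+1, ys+ym
           do i = xs+1, xs+xm
              u_array(i-1,j-1,k-1) = 0.d0  ! subtract off the 1 when touching the PETSc array
           end do
        end do
     end do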
>>>>
>>>>                                 Thanks.
>>>>
>>>>                                        Barry
>>>>
>>>>                                     On Apr 18, 2014, at 10:58 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>                                         Hi,
>>>>
>>>>                                         I tried to pinpoint the problem. I reduced my job size so
>>>>                                         that I can run on 1 processor. I tried using valgrind, but
>>>>                                         perhaps because I'm using the optimized version, it didn't
>>>>                                         catch the error, beyond saying "Segmentation fault (core
>>>>                                         dumped)".
>>>>
>>>>                                         However, by re-writing my code, I found out a few things:
>>>>
>>>>                                         1. If I write my code this way:
>>>>
>>>>                                         call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>                                         call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>                                         call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>
>>>>                                         u_array = ....
>>>>                                         v_array = ....
>>>>                                         w_array = ....
>>>>
>>>>                                         call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>                                         call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>                                         call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>
>>>>                                         the code runs fine.
>>>>
>>>>                                         2. If I write my code this way:
>>>>
>>>>                                         call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>                                         call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>                                         call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>
>>>>                                         call uvw_array_change(u_array,v_array,w_array)
>>>>                                         -> this subroutine does the same modification as the above
>>>>
>>>>                                         call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>                                         call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>                                         call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>                                         -> error
>>>>
>>>>                                         where the subroutine is:
>>>>
>>>>                                         subroutine uvw_array_change(u,v,w)
>>>>
>>>>                                         real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:)
>>>>
>>>>                                         u ...
>>>>                                         v ...
>>>>                                         w ...
>>>>
>>>>                                         end subroutine uvw_array_change
>>>>
>>>>                                         The above will give an error at:
>>>>
>>>>                                         call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>
>>>>                                         3. Same as above, except I change the order of the last
>>>>                                         3 lines to:
>>>>
>>>>                                         call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>                                         call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>                                         call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>
>>>>                                         So they are now in reversed order. Now it works.
>>>>
>>>>                                         4. Same as 2 or 3, except the subroutine is changed to:
>>>>
>>>>                                         subroutine uvw_array_change(u,v,w)
>>>>
>>>>                                         real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>                                         real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>                                         real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>
>>>>                                         u ...
>>>>                                         v ...
>>>>                                         w ...
>>>>
>>>>                                         end subroutine uvw_array_change
>>>>
>>>>                                         The start_indices and end_indices simply shift the 0-based
>>>>                                         indices of the C convention to the 1-based indices of the
>>>>                                         Fortran convention. This is necessary in my case because
>>>>                                         most of my code starts array counting at 1, hence the
>>>>                                         "trick".
>>>>
>>>>                                         However, now no matter which order the DMDAVecRestoreArrayF90
>>>>                                         calls are in (as in 2 or 3), an error occurs at "call
>>>>                                         DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)".
>>>>
>>>>                                         So did I violate something and cause memory corruption due
>>>>                                         to the trick above? But I can't think of any way other than
>>>>                                         the "trick" to continue using the 1-based indexing
>>>>                                         convention.
>>>>
>>>>                                         Thank you.
>>>>
>>>>                                         Yours sincerely,
>>>>
>>>>                                         TAY wee-beng
>>>>
>>>>                                         On 15/4/2014 8:00 PM, Barry Smith wrote:
>>>>
>>>>                                                Try running under valgrind:
>>>>                                                http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
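
A typical invocation, roughly along the lines of that FAQ entry (the exact
recommended flags are on the page above; -your_options is only a placeholder,
and a debug build of PETSc gives far more useful output than an optimized one):

     mpirun -n 1 valgrind --tool=memcheck -q --num-callers=20 ./a.out -your_options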
>>>>
>>>>
>>>>                                             On Apr 14, 2014, at 9:47 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>                                                 Hi Barry,
>>>>
>>>>                                                 As I mentioned earlier, the code works fine in PETSc
>>>>                                                 debug mode but fails in non-debug mode.
>>>>
>>>>                                                 I have attached my code.
>>>>
>>>>                                                 Thank you
>>>>
>>>>                                                 Yours sincerely,
>>>>
>>>>                                                 TAY wee-beng
>>>>
>>>>                                                 On 15/4/2014 2:26 AM, Barry Smith wrote:
>>>>
>>>>                                                     Please send the code that creates da_w and the
>>>>                                                     declarations of w_array.
>>>>
>>>>                                                        Barry
>>>>
>>>>                                                     On Apr 14, 2014, at 9:40 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>
>>>>                                                     Hi Barry,
>>>>
>>>>                                                     I'm not too sure how to do it. I'm running MPI, so I run:
>>>>
>>>>                                                       mpirun -n 4 ./a.out -start_in_debugger
>>>>
>>>>                                                     I got the message below. Before the gdb windows appear
>>>>                                                     (through X11), the program aborts.
>>>>
>>>>                                                     I also tried running on another cluster and it worked.
>>>>                                                     I also tried the current cluster in debug mode and it
>>>>                                                     worked too.
>>>>
>>>>                                                     mpirun -n 4 ./a.out -start_in_debugger
>>>>                                                     --------------------------------------------------------------------------
>>>>                                                     An MPI process has executed an operation involving a call to the
>>>>                                                     "fork()" system call to create a child process.  Open MPI is currently
>>>>                                                     operating in a condition that could result in memory corruption or
>>>>                                                     other system errors; your MPI job may hang, crash, or produce silent
>>>>                                                     data corruption.  The use of fork() (or system() or other calls that
>>>>                                                     create child processes) is strongly discouraged.
>>>>
>>>>                                                     The process that invoked fork was:
>>>>
>>>>                                                       Local host:          n12-76 (PID 20235)
>>>>                                                       MPI_COMM_WORLD rank: 2
>>>>
>>>>                                                     If you are *absolutely sure* that your application will successfully
>>>>                                                     and correctly survive a call to fork(), you may disable this warning
>>>>                                                     by setting the mpi_warn_on_fork MCA parameter to 0.
>>>>                                                     --------------------------------------------------------------------------
>>>>                                                     [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76
>>>>                                                     [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76
>>>>                                                     [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76
>>>>                                                     [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76
>>>>                                                     [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
>>>>                                                     [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>>>
>>>>                                                     ....
>>>>
>>>>                                                       1
>>>>                                                     [1]PETSC ERROR: ------------------------------------------------------------------------
>>>>                                                     [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>>                                                     [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>                                                     [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>                                                     [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>>                                                     [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>>>                                                     [1]PETSC ERROR: to get more information on the crash.
>>>>                                                     [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>>>                                                     [3]PETSC ERROR: ------------------------------------------------------------------------
>>>>                                                     [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>>                                                     [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>                                                     [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>                                                     [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>>                                                     [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>>>                                                     [3]PETSC ERROR: to get more information on the crash.
>>>>                                                     [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>>>
>>>>                                                     ...
>>>>                                                     Thank you.
>>>>
>>>>                                                     Yours sincerely,
>>>>
>>>>                                                     TAY wee-beng
>>>>
>>>>                                                     On 14/4/2014 9:05 PM, Barry Smith wrote:
>>>>
>>>>                                                         Because IO doesn’t always get flushed immediately, it
>>>>                                                         may not be hanging at this point. It is better to use
>>>>                                                         the option -start_in_debugger, then type cont in each
>>>>                                                         debugger window, and then, when you think it is
>>>>                                                         “hanging”, do a control-C in each debugger window and
>>>>                                                         type where to see where each process is. You can also
>>>>                                                         look around in the debugger at variables to see why it
>>>>                                                         is “hanging” at that point.
>>>>
>>>>                                                            Barry
>>>>
>>>>                                                         These routines don’t have any parallel communication
>>>>                                                         in them, so they are unlikely to hang.
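
Roughly, the workflow being described above, as a transcript sketch:

     mpirun -n 4 ./a.out -start_in_debugger
       # in each gdb window that appears:
       (gdb) cont
       # when the run seems stuck, hit Ctrl-C in each window, then:
       (gdb) where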
>>>>
>>>>                                                         On Apr 14, 2014, at 6:52 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>                                                             Hi,
>>>>
>>>>                                                             My code hangs, and I added mpi_barrier and print
>>>>                                                             statements to catch the bug. I found that it hangs
>>>>                                                             after printing "7". Is it because I'm doing something
>>>>                                                             wrong? I need to access the u, v, w arrays, so I use
>>>>                                                             DMDAVecGetArrayF90. After access, I use
>>>>                                                             DMDAVecRestoreArrayF90.
>>>>
>>>>                                                             call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>                                                             call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"3"
>>>>                                                             call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>                                                             call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"4"
>>>>                                                             call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>                                                             call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"5"
>>>>                                                             call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array)
>>>>                                                             call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"6"
>>>>                                                             call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)  !must be in reverse order
>>>>                                                             call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"7"
>>>>                                                             call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>                                                             call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"8"
>>>>                                                             call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>                                                             --
>>>>                                                             Thank you.
>>>>
>>>>                                                             Yours sincerely,
>>>>
>>>>                                                             TAY wee-beng
>>>>
>>>>
>>>>
>>>>                                                 <code.txt>
>>>>
>>>>
>>>
>>
>
>
>
>
> -- 
> What most experimenters take for granted before they begin their 
> experiments is infinitely more interesting than any results to which 
> their experiments lead.
> -- Norbert Wiener
