[petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90

TAY wee-beng zonexo at gmail.com
Sun Apr 20 19:49:07 CDT 2014


On 20/4/2014 8:39 AM, TAY wee-beng wrote:
> On 20/4/2014 1:02 AM, Matthew Knepley wrote:
>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng <zonexo at gmail.com 
>> <mailto:zonexo at gmail.com>> wrote:
>>
>>     On 19/4/2014 11:39 PM, Matthew Knepley wrote:
>>>     On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng <zonexo at gmail.com
>>>     <mailto:zonexo at gmail.com>> wrote:
>>>
>>>         On 19/4/2014 10:55 PM, Matthew Knepley wrote:
>>>>         On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng
>>>>         <zonexo at gmail.com <mailto:zonexo at gmail.com>> wrote:
>>>>
>>>>             On 19/4/2014 6:48 PM, Matthew Knepley wrote:
>>>>>             On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng
>>>>>             <zonexo at gmail.com <mailto:zonexo at gmail.com>> wrote:
>>>>>
>>>>>                 On 19/4/2014 1:17 PM, Barry Smith wrote:
>>>>>
>>>>>                     On Apr 19, 2014, at 12:11 AM, TAY wee-beng
>>>>>                     <zonexo at gmail.com <mailto:zonexo at gmail.com>>
>>>>>                     wrote:
>>>>>
>>>>>                         On 19/4/2014 12:10 PM, Barry Smith wrote:
>>>>>
>>>>>                             On Apr 18, 2014, at 9:57 PM, TAY
>>>>>                             wee-beng <zonexo at gmail.com
>>>>>                             <mailto:zonexo at gmail.com>> wrote:
>>>>>
>>>>>                                 On 19/4/2014 3:53 AM, Barry Smith
>>>>>                                 wrote:
>>>>>
>>>>>                    Hmm,
>>>>>
>>>>>                        Interface DMDAVecGetArrayF90
>>>>>                          Subroutine DMDAVecGetArrayF903(da1, v, d1, ierr)
>>>>>                            USE_DM_HIDE
>>>>>                            DM_HIDE da1
>>>>>                            VEC_HIDE v
>>>>>                            PetscScalar, pointer :: d1(:,:,:)
>>>>>                            PetscErrorCode ierr
>>>>>                          End Subroutine
>>>>>
>>>>>                    So d1 is an F90 POINTER. But your subroutine seems to be
>>>>>                    treating it as a "plain old Fortran array"?
>>>>>
>>>>>                        real(8), intent(inout) :: u(:,:,:), v(:,:,:), w(:,:,:)
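For illustration, a minimal sketch (not from the original thread) of a pointer-dummy
variant of the subroutine being discussed; real(8) is used to match the quoted code,
assuming PetscScalar is double-precision real in this build, and the routine needs an
explicit interface (e.g. it should live in a module):

      subroutine uvw_array_change(u, v, w)
        ! Pointer dummies keep the bounds that DMDAVecGetArrayF90 gave the
        ! actual arguments; assumed-shape dummies such as u(:,:,:) would
        ! instead start at 1 in every dimension.
        real(8), pointer :: u(:,:,:), v(:,:,:), w(:,:,:)

        u = 0.d0        ! whole-array assignments work either way
        v = 0.d0
        w = 0.d0
      end subroutine uvw_array_change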
>>>>>
>>>>>                    Hi,
>>>>>
>>>>>                    So d1 is a pointer, and it's different if I declare it as a
>>>>>                    "plain old Fortran array"? I declared it as a Fortran array
>>>>>                    and it works without any problem if I only call
>>>>>                    DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u".
>>>>>
>>>>>                    But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90
>>>>>                    with "u", "v" and "w", errors start to happen. I wonder why...
>>>>>
>>>>>                    Also, suppose I call:
>>>>>
>>>>>                        call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>                        call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>                        call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>
>>>>>                        u_array ....
>>>>>                        v_array .... etc
>>>>>
>>>>>                    Now, to restore the arrays, does the order in which they are
>>>>>                    restored matter?
>>>>>
>>>>>                         No it should not matter. If it matters
>>>>>                     that is a sign that memory has been written to
>>>>>                     incorrectly earlier in the code.
>>>>>
>>>>>                 Hi,
>>>>>
>>>>>                 Hmm, I have been getting different results with different
>>>>>                 Intel compilers. I'm not sure if MPI plays a part, but I'm
>>>>>                 only using a single processor. In debug mode, things run
>>>>>                 without problems. In optimized mode, in some cases, the
>>>>>                 code aborts even when doing simple initialization:
>>>>>
>>>>>
>>>>>                 call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>
>>>>>                     call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>
>>>>>                     call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>
>>>>>                     call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr)
>>>>>
>>>>>                     u_array = 0.d0
>>>>>
>>>>>                     v_array = 0.d0
>>>>>
>>>>>                     w_array = 0.d0
>>>>>
>>>>>                     p_array = 0.d0
>>>>>
>>>>>
>>>>>                     call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr)
>>>>>
>>>>>                     call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>
>>>>>                     call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>
>>>>>                     call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>
>>>>>                 The code aborts at call
>>>>>                 DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) with a
>>>>>                 segmentation error, but another version of the Intel
>>>>>                 compiler passes through this part without error. Since the
>>>>>                 behaviour differs between compilers, is this a PETSc bug or
>>>>>                 an Intel compiler bug? Or an MVAPICH or OpenMPI one?
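For illustration, a minimal sketch (not from the original thread) of checking ierr
after each call, so a failure is reported at the failing call rather than only as a
later segfault; the bundled PETSc Fortran examples use CHKERRQ(ierr) for this, a
plain test is shown here:

      call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)
      if (ierr /= 0) then
         print *, 'DMDAVecGetArrayF90(da_u) returned ierr = ', ierr
         stop
      end if

      call DMDAVecGetArrayF90(da_v, v_local, v_array, ierr)
      if (ierr /= 0) then
         print *, 'DMDAVecGetArrayF90(da_v) returned ierr = ', ierr
         stop
      end if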
>>>>>
>>>>>
>>>>>             We do this in a bunch of examples. Can you reproduce this
>>>>>             different behavior in src/dm/examples/tutorials/ex11f90.F?
>>>>
>>>>             Hi Matt,
>>>>
>>>>             Do you mean putting the above lines into ex11f90.F and
>>>>             test?
>>>>
>>>>
>>>>         It already has DMDAVecGetArray(). Just run it.
>>>
>>>         Hi,
>>>
>>>         It worked. The differences between my code and the example are
>>>         the way the Fortran modules are defined, and that ex11f90 only
>>>         uses global vectors. Does it make a difference whether global or
>>>         local vectors are used? The way it accesses x1 only touches the
>>>         local region.
>>>
>>>
>>>     No the global/local difference should not matter.
>>>
>>>         Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be
>>>         called first, is that right? I can't find the equivalent for a
>>>         local vector though.
>>>
>>>
>>>     DMGetLocalVector()
>>
>>     Oops, I do not have DMGetLocalVector and DMRestoreLocalVector in
>>     my code. Does it matter?
>>
>>     If so, when should I call them?
>>
>>
>> You just need a local vector from somewhere.
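For illustration, a minimal sketch (not from the original thread) of getting a local
vector, filling it from a global vector, and then accessing it; da_u is assumed to
have been created elsewhere (e.g. with DMDACreate3d), u_global is a hypothetical
global vector of da_u, and the usual petsc-3.4 includes such as
#include "finclude/petscdmda.h90" are assumed to be visible:

      Vec u_local
      PetscScalar, pointer :: u_array(:,:,:)
      PetscErrorCode ierr

      call DMGetLocalVector(da_u, u_local, ierr)
      ! fill the local (ghosted) vector from the global one
      call DMGlobalToLocalBegin(da_u, u_global, INSERT_VALUES, u_local, ierr)
      call DMGlobalToLocalEnd(da_u, u_global, INSERT_VALUES, u_local, ierr)

      call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)
      ! ... work on u_array over the ghosted index range ...
      call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)

      call DMRestoreLocalVector(da_u, u_local, ierr)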

Hi,

Can anyone help with the questions below? I'm still trying to find out why my 
code doesn't work.

Thanks.
> Hi,
>
> I insert part of my error region code into ex11f90:
>
> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>
>     call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>
>     call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>
>     call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr)
>
>     u_array = 0.d0
>
>     v_array = 0.d0
>
>     w_array = 0.d0
>
>     p_array = 0.d0
>
>     call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr)
>
>     call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>
>     call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>
>     call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>
> It worked w/o error. I'm going to change the way the modules are 
> defined in my code.
>
> My code contains a main program and a number of module files, with 
> subroutines inside, e.g.
>
> module solve
>                  <- add include file?
> subroutine RRK
>                  <- add include file?
> end subroutine RRK
>
> end module solve
>
> So where should the include files (#include <finclude/petscdmda.h90>) 
> be placed?
>
> After the module or inside the subroutine?
>
> Thanks.
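For illustration, a minimal sketch of one common petsc-3.4 arrangement (an
assumption, not a quote from the PETSc documentation): put the *def.h headers where
the module needs the PETSc types, and the full headers plus the .h90 interface file
inside each subroutine that calls the F90 array routines.

      module solve
#include "finclude/petscsysdef.h"
#include "finclude/petscvecdef.h"
#include "finclude/petscdmdef.h"
        implicit none
      contains

        subroutine RRK(da_u, u_local)
#include "finclude/petscsys.h"
#include "finclude/petscvec.h"
#include "finclude/petscdmda.h"
#include "finclude/petscdmda.h90"
          DM  da_u
          Vec u_local
          PetscScalar, pointer :: u_array(:,:,:)
          PetscErrorCode ierr

          call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)
          ! ... use u_array ...
          call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)
        end subroutine RRK

      end module solve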
>>
>>   Matt
>>
>>     Thanks.
>>>
>>>        Matt
>>>
>>>         Thanks.
>>>>
>>>>            Matt
>>>>
>>>>             Thanks
>>>>
>>>>             Regards.
>>>>>
>>>>>                Matt
>>>>>
>>>>>                        As in w, then v and u?
>>>>>
>>>>>                            call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>                            call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>                            call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>
>>>>>                        thanks
>>>>>
>>>>>                        Note also that the beginning and end indices of u, v, w
>>>>>                        are different for each process; see for example
>>>>>                        http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F
>>>>>                        (and they do not start at 1). This is how to get the
>>>>>                        loop bounds.
>>>>>
>>>>>                        Hi,
>>>>>
>>>>>                        In my case, I fixed u, v, w such that their indices are
>>>>>                        the same. I also checked using DMDAGetCorners and
>>>>>                        DMDAGetGhostCorners. Now the problem lies in my
>>>>>                        subroutine treating them as a "plain old Fortran array".
>>>>>
>>>>>                        If I declare them as pointers, their indices follow the
>>>>>                        C 0-start convention, is that so?
>>>>>
>>>>>                        Not really. It is that on each process you need to
>>>>>                        access them using the indices indicated by
>>>>>                        DMDAGetCorners() for global vectors and
>>>>>                        DMDAGetGhostCorners() for local vectors. So really C or
>>>>>                        Fortran doesn't make any difference.
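For illustration, a minimal sketch (not from the original thread) of querying the
index range and looping over it with the DMDA's zero-based convention; da_u, u_local
and u_array are the names used elsewhere in this thread:

      PetscInt i, j, k, xs, ys, zs, xm, ym, zm
      PetscErrorCode ierr
      PetscScalar, pointer :: u_array(:,:,:)

      ! ghosted range for a local vector; use DMDAGetCorners for a global one
      call DMDAGetGhostCorners(da_u, xs, ys, zs, xm, ym, zm, ierr)
      call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)
      do k = zs, zs + zm - 1
         do j = ys, ys + ym - 1
            do i = xs, xs + xm - 1
               u_array(i, j, k) = 0.d0
            end do
         end do
      end do
      call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)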
>>>>>
>>>>>
>>>>>                        So my problem now is that in my old MPI code, u(i,j,k)
>>>>>                        follows the Fortran 1-start convention. Is there some
>>>>>                        way to manipulate things such that I do not have to
>>>>>                        change my u(i,j,k) to u(i-1,j-1,k-1)?
>>>>>
>>>>>                        If your code wishes to access them with indices plus
>>>>>                        one relative to the values returned by DMDAGetCorners()
>>>>>                        for global vectors and DMDAGetGhostCorners() for local
>>>>>                        vectors, then you need to manually subtract off the 1.
>>>>>
>>>>>                                Barry
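For illustration, a minimal sketch (not from the original thread) of the "subtract
off the 1" idea: loop with 1-based counters and apply the offset when indexing the
DMDA array (names as used elsewhere in this thread):

      PetscInt i, j, k, xs, ys, zs, xm, ym, zm
      PetscErrorCode ierr
      PetscScalar, pointer :: u_array(:,:,:)

      call DMDAGetGhostCorners(da_u, xs, ys, zs, xm, ym, zm, ierr)
      do k = 1, zm
         do j = 1, ym
            do i = 1, xm
               ! i, j, k run from 1; the DMDA bounds start at xs, ys, zs
               u_array(xs + i - 1, ys + j - 1, zs + k - 1) = 0.d0
            end do
         end do
      end do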
>>>>>
>>>>>                                 Thanks.
>>>>>
>>>>>                                        Barry
>>>>>
>>>>>                            On Apr 18, 2014, at 10:58 AM, TAY wee-beng
>>>>>                            <zonexo at gmail.com <mailto:zonexo at gmail.com>> wrote:
>>>>>
>>>>>                                Hi,
>>>>>
>>>>>                                I tried to pinpoint the problem. I reduced my job size and
>>>>>                                hence I can run on 1 processor. I tried using valgrind, but
>>>>>                                perhaps because I'm using the optimized version it didn't
>>>>>                                catch the error, beyond saying "Segmentation fault (core
>>>>>                                dumped)".
>>>>>
>>>>>                                However, by re-writing my code, I found out a few things:
>>>>>
>>>>>                                1. If I write my code this way:
>>>>>
>>>>>                                    call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>                                    call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>                                    call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>
>>>>>                                    u_array = ....
>>>>>                                    v_array = ....
>>>>>                                    w_array = ....
>>>>>
>>>>>                                    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>                                    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>                                    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>
>>>>>                                The code runs fine.
>>>>>
>>>>>                                2. If I write my code this way:
>>>>>
>>>>>                                    call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>                                    call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>                                    call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>
>>>>>                                    call uvw_array_change(u_array,v_array,w_array)
>>>>>                                    -> this subroutine does the same modification as the above
>>>>>
>>>>>                                    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>                                    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>                                    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)   -> error
>>>>>
>>>>>                                where the subroutine is:
>>>>>
>>>>>                                    subroutine uvw_array_change(u,v,w)
>>>>>
>>>>>                                    real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:)
>>>>>
>>>>>                                    u ...
>>>>>                                    v ...
>>>>>                                    w ...
>>>>>
>>>>>                                    end subroutine uvw_array_change
>>>>>
>>>>>                                The above will give an error at:
>>>>>
>>>>>                                    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>
>>>>>                                3. Same as above, except I change the order of the last 3
>>>>>                                lines to:
>>>>>
>>>>>                                    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>                                    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>                                    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>
>>>>>                                So they are now in reversed order. Now it works.
>>>>>
>>>>>                                4. Same as 2 or 3, except the subroutine is changed to:
>>>>>
>>>>>                                    subroutine uvw_array_change(u,v,w)
>>>>>
>>>>>                                    real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>                                    real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>                                    real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>
>>>>>                                    u ...
>>>>>                                    v ...
>>>>>                                    w ...
>>>>>
>>>>>                                    end subroutine uvw_array_change
>>>>>
>>>>>                                The start_indices and end_indices simply shift the 0-based
>>>>>                                indices of the C convention to the 1-based indices of the
>>>>>                                Fortran convention. This is necessary in my case because
>>>>>                                most of my code starts array counting at 1, hence the
>>>>>                                "trick".
>>>>>
>>>>>                                However, now no matter which order the
>>>>>                                DMDAVecRestoreArrayF90 calls are in (as in 2 or 3), an
>>>>>                                error occurs at "call
>>>>>                                DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)".
>>>>>
>>>>>                                So did I violate something and cause memory corruption with
>>>>>                                the trick above? But I can't think of any way other than
>>>>>                                the "trick" to continue using the 1-based index convention.
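For illustration, a minimal sketch (not from the original thread, and assuming a
compiler that supports Fortran 2003 pointer bounds remapping) of one way to get
1-based indices without passing the arrays to a dummy declared with different bounds:

      PetscScalar, pointer :: u_array(:,:,:)   ! filled by DMDAVecGetArrayF90
      PetscScalar, pointer :: u1(:,:,:)        ! 1-based view of the same data
      PetscErrorCode ierr

      call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)

      ! remap the lower bounds to 1; u1 points at the same storage, so
      ! u1(1,1,1) is the first entry of this process's (ghosted) patch
      u1(1:, 1:, 1:) => u_array

      ! ... use u1(i,j,k) with 1-based i, j, k ...

      call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)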
>>>>>
>>>>>                                         Thank you.
>>>>>
>>>>>                                         Yours sincerely,
>>>>>
>>>>>                                         TAY wee-beng
>>>>>
>>>>>                                On 15/4/2014 8:00 PM, Barry Smith wrote:
>>>>>
>>>>>                                    Try running under valgrind
>>>>>                                    http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>
>>>>>                                    On Apr 14, 2014, at 9:47 PM, TAY wee-beng
>>>>>                                    <zonexo at gmail.com <mailto:zonexo at gmail.com>> wrote:
>>>>>
>>>>>                                        Hi Barry,
>>>>>
>>>>>                                        As I mentioned earlier, the code works fine in PETSc
>>>>>                                        debug mode but fails in non-debug mode.
>>>>>
>>>>>                                        I have attached my code.
>>>>>
>>>>>                                        Thank you
>>>>>
>>>>>                                        Yours sincerely,
>>>>>
>>>>>                                        TAY wee-beng
>>>>>
>>>>>                                        On 15/4/2014 2:26 AM, Barry Smith wrote:
>>>>>
>>>>>                                            Please send the code that creates da_w and the
>>>>>                                            declarations of w_array
>>>>>
>>>>>                                               Barry
>>>>>
>>>>>                                            On Apr 14, 2014, at 9:40 AM, TAY wee-beng
>>>>>                                            <zonexo at gmail.com <mailto:zonexo at gmail.com>> wrote:
>>>>>
>>>>>
>>>>>                                                Hi Barry,
>>>>>
>>>>>                                                I'm not too sure how to do it. I'm running MPI. So I run:
>>>>>
>>>>>                                                    mpirun -n 4 ./a.out -start_in_debugger
>>>>>
>>>>>                                                I got the msg below. Before the gdb windows appear (thru
>>>>>                                                x11), the program aborts.
>>>>>
>>>>>                                                Also I tried running on another cluster and it worked. I
>>>>>                                                also tried the current cluster in debug mode and it
>>>>>                                                worked too.
>>>>>
>>>>>                                                mpirun -n 4 ./a.out -start_in_debugger
>>>>>                                                --------------------------------------------------------------------------
>>>>>                                                An MPI process has executed an operation involving a call to the
>>>>>                                                "fork()" system call to create a child process.  Open MPI is currently
>>>>>                                                operating in a condition that could result in memory corruption or
>>>>>                                                other system errors; your MPI job may hang, crash, or produce silent
>>>>>                                                data corruption.  The use of fork() (or system() or other calls that
>>>>>                                                create child processes) is strongly discouraged.
>>>>>
>>>>>                                                The process that invoked fork was:
>>>>>
>>>>>                                                  Local host:          n12-76 (PID 20235)
>>>>>                                                  MPI_COMM_WORLD rank: 2
>>>>>
>>>>>                                                If you are *absolutely sure* that your application will successfully
>>>>>                                                and correctly survive a call to fork(), you may disable this warning
>>>>>                                                by setting the mpi_warn_on_fork MCA parameter to 0.
>>>>>                                                --------------------------------------------------------------------------
>>>>>                                                [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76
>>>>>                                                [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76
>>>>>                                                [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76
>>>>>                                                [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76
>>>>>                                                [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
>>>>>                                                [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>>>>
>>>>>                                                         ....
>>>>>
>>>>>                                                  1
>>>>>                                                [1]PETSC ERROR: ------------------------------------------------------------------------
>>>>>                                                [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>>>                                                [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>>                                                [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>                                                [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>>>                                                [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>>>>                                                [1]PETSC ERROR: to get more information on the crash.
>>>>>                                                [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>>>>                                                [3]PETSC ERROR: ------------------------------------------------------------------------
>>>>>                                                [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>>>                                                [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>>                                                [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>                                                [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>>>                                                [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>>>>                                                [3]PETSC ERROR: to get more information on the crash.
>>>>>                                                [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>>>>
>>>>>                                                         ...
>>>>>                                                         Thank you.
>>>>>
>>>>>                                                         Yours
>>>>>                                                         sincerely,
>>>>>
>>>>>                                                         TAY wee-beng
>>>>>
>>>>>                                                On 14/4/2014 9:05 PM, Barry Smith wrote:
>>>>>
>>>>>                                                    Because IO doesn't always get flushed immediately, it
>>>>>                                                    may not be hanging at this point. It is better to use
>>>>>                                                    the option -start_in_debugger, then type cont in each
>>>>>                                                    debugger window, and then, when you think it is
>>>>>                                                    "hanging", do a control-C in each debugger window and
>>>>>                                                    type where to see where each process is. You can also
>>>>>                                                    look around in the debugger at variables to see why it
>>>>>                                                    is "hanging" at that point.
>>>>>
>>>>>                                                       Barry
>>>>>
>>>>>                                                    These routines don't have any parallel communication
>>>>>                                                    in them so they are unlikely to hang.
>>>>>
>>>>>                                                    On Apr 14, 2014, at 6:52 AM, TAY wee-beng
>>>>>                                                    <zonexo at gmail.com <mailto:zonexo at gmail.com>> wrote:
>>>>>
>>>>>
>>>>>
>>>>>                                                        Hi,
>>>>>
>>>>>                                                        My code hangs, so I added in mpi_barrier and print
>>>>>                                                        statements to catch the bug. I found that it hangs
>>>>>                                                        after printing "7". Is it because I'm doing something
>>>>>                                                        wrong? I need to access the u, v, w arrays, so I use
>>>>>                                                        DMDAVecGetArrayF90. After access, I use
>>>>>                                                        DMDAVecRestoreArrayF90.
>>>>>
>>>>>                                                            call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>                                                            call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"3"
>>>>>
>>>>>                                                            call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>                                                            call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"4"
>>>>>
>>>>>                                                            call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>                                                            call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"5"
>>>>>
>>>>>                                                            call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array)
>>>>>                                                            call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"6"
>>>>>
>>>>>                                                            call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)   !must be in reverse order
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  call
>>>>>                                                                 MPI_Barrier(MPI_COMM_WORLD,ierr);
>>>>>                                                                  if (myid==0)
>>>>>                                                                 print
>>>>>                                                                 *,"7"
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  call
>>>>>                                                                 DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  call
>>>>>                                                                 MPI_Barrier(MPI_COMM_WORLD,ierr);
>>>>>                                                                  if (myid==0)
>>>>>                                                                 print
>>>>>                                                                 *,"8"
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  
>>>>>                                                                  call
>>>>>                                                                 DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>                                                                 -- 
>>>>>                                                                 Thank
>>>>>                                                                 you.
>>>>>
>>>>>                                                                 Yours
>>>>>                                                                 sincerely,
>>>>>
>>>>>                                                                 TAY wee-beng
>>>>>
>>>>>
>>>>>
>>>>>                                                 <code.txt>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             -- 
>>>>>             What most experimenters take for granted before they
>>>>>             begin their experiments is infinitely more interesting
>>>>>             than any results to which their experiments lead.
>>>>>             -- Norbert Wiener
>
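The quoted sequence above obtains u_array, v_array and w_array with DMDAVecGetArrayF90, hands them to I_IIB_uv_initial_1st_dm, and restores them afterwards. Below is a minimal sketch, not the poster's actual routine, of one way the receiving subroutine could declare those arguments. It assumes the caller already knows the local ghosted index range (for example from DMDAGetGhostCorners) and that u, v and w share the same local box; the routine name and the zero fill are hypothetical, and plain integer/real(8) declarations stand in for PetscInt/PetscScalar. The point is the explicit-bounds declaration, which keeps the global (i,j,k) indexing carried by the DMDAVecGetArrayF90 pointers instead of letting an assumed-shape dummy argument re-index everything from 1.

      ! Sketch only: hypothetical routine, not I_IIB_uv_initial_1st_dm itself.
      ! xs:xe, ys:ye, zs:ze are assumed to be the local ghosted index range of
      ! the DMDA, so u(i,j,k) below uses the same global indices as the
      ! pointer returned by DMDAVecGetArrayF90 on this process.
      subroutine init_uvw_sketch(xs, xe, ys, ye, zs, ze, u, v, w)
        implicit none
        integer, intent(in)    :: xs, xe, ys, ye, zs, ze
        real(8), intent(inout) :: u(xs:xe, ys:ye, zs:ze)
        real(8), intent(inout) :: v(xs:xe, ys:ye, zs:ze)
        real(8), intent(inout) :: w(xs:xe, ys:ye, zs:ze)
        integer :: i, j, k

        do k = zs, ze
           do j = ys, ye
              do i = xs, xe
                 u(i,j,k) = 0.d0   ! placeholder work; real code would go here
                 v(i,j,k) = 0.d0
                 w(i,j,k) = 0.d0
              end do
           end do
        end do
      end subroutine init_uvw_sketch

On the calling side the bounds could come from something like

      call DMDAGetGhostCorners(da_u, xs, ys, zs, xm, ym, zm, ierr)
      xe = xs + xm - 1; ye = ys + ym - 1; ze = zs + zm - 1
      call init_uvw_sketch(xs, xe, ys, ye, zs, ze, u_array, v_array, w_array)

with a separate set of bounds per array if da_u, da_v and da_w have different local sizes.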
