[petsc-users] TaoSolve() not working for multiple processors

Matthew Knepley knepley at gmail.com
Thu Oct 16 15:07:12 CDT 2014


On Thu, Oct 16, 2014 at 3:02 PM, Justin Chang <jychang48 at gmail.com> wrote:

> Actually, I have some related issues:
>
> In a previous thread, I saw that to output as HDF5/xmf you basically need
> the following two runtime options:
>
> -dm_view hdf5:sol.h5 -snes_view_solution hdf5:sol.h5::append
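>
> For concreteness, a full run would then look roughly like this (the
> executable name here is just a placeholder for my test code):
>
>   mpiexec -n 2 ./test -dm_view hdf5:sol.h5 -snes_view_solution hdf5:sol.h5::append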
>

Can you send the code and we will get it to output. It's easiest to have
something to work from, and we have not tried this out yet. You can send it to
petsc-maint at mcs.anl.gov, which is private.

  Thanks,

     Matt


> 1) For the code that I showed earlier in this thread (test.c), it works
> serially if I use the SNES solver instead of the Tao solver. However, when
> I run it on multiple processors I get the following error:
>
> HDF5-DIAG: Error detected in HDF5 (1.8.12) MPI-process 0:
>   #000: H5F.c line 2061 in H5Fclose(): decrementing file ID failed
>     major: Object atom
>     minor: Unable to close file
>   #001: H5I.c line 1479 in H5I_dec_app_ref(): can't decrement ID ref count
>     major: Object atom
>     minor: Unable to decrement reference count
>   #002: H5F.c line 1838 in H5F_close(): can't close file
>     major: File accessibilty
>     minor: Unable to close file
>   #003: H5F.c line 2000 in H5F_try_close(): problems closing file
>     major: File accessibilty
>     minor: Unable to close file
>   #004: H5F.c line 1145 in H5F_dest(): low level truncate failed
>     major: File accessibilty
>     minor: Write failed
>   #005: H5FD.c line 1897 in H5FD_truncate(): driver truncate request failed
>     major: Virtual File Layer
>     minor: Can't update object
>   #006: H5FDmpio.c line 1989 in H5FD_mpio_truncate(): MPI_File_set_size
> failed
>     major: Internal error (too specific to document in detail)
>     minor: Some MPI function failed
>   #007: H5FDmpio.c line 1989 in H5FD_mpio_truncate(): Invalid argument,
> error stack:
> MPI_FILE_SET_SIZE(74): Inconsistent arguments to collective routine
>     major: Internal error (too specific to document in detail)
>     minor: MPI Error String
> HDF5-DIAG: Error detected in HDF5 (1.8.12) MPI-process 1:
>   #000: H5F.c line 2061 in H5Fclose(): decrementing file ID failed
>     major: Object atom
>     minor: Unable to close file
>   #001: H5I.c line 1479 in H5I_dec_app_ref(): can't decrement ID ref count
>     major: Object atom
>     minor: Unable to decrement reference count
>   #002: H5F.c line 1838 in H5F_close(): can't close file
>     major: File accessibilty
>     minor: Unable to close file
>   #003: H5F.c line 2000 in H5F_try_close(): problems closing file
>     major: File accessibilty
>     minor: Unable to close file
>   #004: H5F.c line 1145 in H5F_dest(): low level truncate failed
>     major: File accessibilty
>     minor: Write failed
>   #005: H5FD.c line 1897 in H5FD_truncate(): driver truncate request failed
>     major: Virtual File Layer
>     minor: Can't update object
>   #006: H5FDmpio.c line 1989 in H5FD_mpio_truncate(): MPI_File_set_size
> failed
>     major: Internal error (too specific to document in detail)
>     minor: Some MPI function failed
>   #007: H5FDmpio.c line 1989 in H5FD_mpio_truncate(): Invalid argument,
> error stack:
> MPI_FILE_SET_SIZE(74): Inconsistent arguments to collective routine
>     major: Internal error (too specific to document in detail)
>     minor: MPI Error String
>
>
> I compared my code with what is in SNES ex12 and I don't understand why I
> am getting this error.
>
> 2) Is there a way to output as HDF5/xmf via the Tao solver? I tried using
> -ksp_view_solution hdf5:sol.h5::append but it gave me no solution. The
> multiple-processor issue above also applies to this.
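>
> Something like the following is what I have in mind (a rough sketch only,
> assuming a Tao object named tao and that the mesh was already written to
> sol.h5 with -dm_view):
>
>   PetscViewer viewer;
>   Vec         x;
>   ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "sol.h5", FILE_MODE_APPEND, &viewer);CHKERRQ(ierr);
>   ierr = TaoGetSolutionVector(tao, &x);CHKERRQ(ierr);
>   ierr = PetscObjectSetName((PetscObject) x, "solution");CHKERRQ(ierr); /* dataset name in the file */
>   ierr = VecView(x, viewer);CHKERRQ(ierr);
>   ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);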
>
>
> Any help is appreciated. Thanks.
>
> On Tue, Oct 14, 2014 at 11:58 PM, Justin Chang <jychang48 at gmail.com>
> wrote:
>
>> Hi Jason,
>>
>> Both of those algorithms resolved my problem. Thanks!
>>
>> Justin
>>
>> On Tue, Oct 14, 2014 at 10:12 PM, Jason Sarich <jason.sarich at gmail.com>
>> wrote:
>>
>>> Hi Justin,
>>>
>>> If the only constraints you have are bounds on the variables, then you
>>> are much better off using the TAO algorithms tron or blmvm. IPM builds its
>>> own KKT matrix for a KSP solve that is undoubtedly much less efficient than
>>> what will be used in tron (blmvm avoids KSP solves altogether).
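>>>
>>> For example, a minimal sketch (tao here is your Tao object and xl/xu are
>>> your bound vectors):
>>>
>>>   ierr = TaoSetType(tao, TAOBLMVM);CHKERRQ(ierr);   /* or TAOTRON */
>>>   ierr = TaoSetVariableBounds(tao, xl, xu);CHKERRQ(ierr);
>>>
>>> or equivalently -tao_type blmvm / -tao_type tron on the command line.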
>>>
>>> That being said, it looks like there is a bug in IPM, and I'll try to
>>> track it down. Thanks for the report.
>>>
>>> Jason
>>>
>>>
>>> On Tue, Oct 14, 2014 at 9:34 PM, Justin Chang <jychang48 at gmail.com>
>>> wrote:
>>>
>>>>   Attached is the source code and the makefile used to compile/run the
>>>> code.
>>>>
>>>>  The source code is basically a dumbed-down version of SNES ex12 plus
>>>> the necessary TAO routines.
>>>>
>>>>  Thanks
>>>>
>>>> On Tue, Oct 14, 2014 at 8:39 PM, Matthew Knepley <knepley at gmail.com>
>>>> wrote:
>>>>
>>>>>  On Tue, Oct 14, 2014 at 8:04 PM, Justin Chang <jychang48 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>  Hi all,
>>>>>>
>>>>>>  So I am writing a non-negative diffusion solver using DMPlex's FEM
>>>>>> functionality and Tao's SetVariableBounds(). My code works perfectly when
>>>>>> I run it with one processor. However, once I use 2 or more processors, I
>>>>>> get this error:
>>>>>>
>>>>>
>>>>>  It looks like the problem is in the TAO definition, but you can
>>>>> check by just solving your problem with, for instance, BJacobi-LU in
>>>>> parallel.
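>>>>>
>>>>>  For example:
>>>>>
>>>>>    -pc_type bjacobi -sub_pc_type lu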
>>>>>
>>>>>
>>>>>>  [0]PETSC ERROR: --------------------- Error Message
>>>>>> --------------------------------------------------------------
>>>>>> [0]PETSC ERROR: Nonconforming object sizes
>>>>>> [0]PETSC ERROR: Vector wrong size 89 for scatter 88 (scatter reverse
>>>>>> and vector to != ctx from size)
>>>>>> [1]PETSC ERROR: --------------------- Error Message
>>>>>> --------------------------------------------------------------
>>>>>> [1]PETSC ERROR: Nonconforming object sizes
>>>>>> [1]PETSC ERROR: Vector wrong size 87 for scatter 88 (scatter reverse
>>>>>> and vector to != ctx from size)
>>>>>> [1]PETSC ERROR: See
>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>>>>>> shooting.
>>>>>> [1]PETSC ERROR: Petsc Development GIT revision: v3.5.2-526-gfaecc80
>>>>>> GIT Date: 2014-10-04 20:10:35 -0500
>>>>>> [1]PETSC ERROR: [0]PETSC ERROR: See
>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>>>>>> shooting.
>>>>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.5.2-526-gfaecc80
>>>>>> GIT Date: 2014-10-04 20:10:35 -0500
>>>>>> [0]PETSC ERROR: ./bin/diff2D on a arch-linux2-c-debug named pacotaco
>>>>>> by justin Tue Oct 14 19:48:50 2014
>>>>>> [0]PETSC ERROR: ./bin/diff2D on a arch-linux2-c-debug named pacotaco
>>>>>> by justin Tue Oct 14 19:48:50 2014
>>>>>> [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++
>>>>>> --with-fc=gfortran --download-fblaslapack --download-mpich
>>>>>> --with-debugging=1 --download-metis --download-parmetis --download-triangle
>>>>>> --with-cmake=cmake --download-ctetgen --download-superlu
>>>>>> --download-scalapack --download-mumps --download-hdf5 --with-valgrind=1
>>>>>> -with-cmake=cmake
>>>>>> [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++
>>>>>> --with-fc=gfortran --download-fblaslapack --download-mpich
>>>>>> --with-debugging=1 --download-metis --download-parmetis --download-triangle
>>>>>> --with-cmake=cmake --download-ctetgen --download-superlu
>>>>>> --download-scalapack --download-mumps --download-hdf5 --with-valgrind=1
>>>>>> -with-cmake=cmake
>>>>>> [0]PETSC ERROR: #1 VecScatterBegin() line 1713 in
>>>>>> /home/justin/petsc-master/src/vec/vec/utils/vscat.c
>>>>>> #1 VecScatterBegin() line 1713 in
>>>>>> /home/justin/petsc-master/src/vec/vec/utils/vscat.c
>>>>>> [1]PETSC ERROR: [0]PETSC ERROR: #2 MatMultTranspose_MPIAIJ() line
>>>>>> 1010 in /home/justin/petsc-master/src/mat/impls/aij/mpi/mpiaij.c
>>>>>> [0]PETSC ERROR: #2 MatMultTranspose_MPIAIJ() line 1010 in
>>>>>> /home/justin/petsc-master/src/mat/impls/aij/mpi/mpiaij.c
>>>>>> [1]PETSC ERROR: #3 MatMultTranspose() line 2242 in
>>>>>> /home/justin/petsc-master/src/mat/interface/matrix.c
>>>>>> #3 MatMultTranspose() line 2242 in
>>>>>> /home/justin/petsc-master/src/mat/interface/matrix.c
>>>>>> [0]PETSC ERROR: #4 IPMComputeKKT() line 616 in
>>>>>> /home/justin/petsc-master/src/tao/constrained/impls/ipm/ipm.c
>>>>>> [1]PETSC ERROR: #4 IPMComputeKKT() line 616 in
>>>>>> /home/justin/petsc-master/src/tao/constrained/impls/ipm/ipm.c
>>>>>> [1]PETSC ERROR: [0]PETSC ERROR: #5 TaoSolve_IPM() line 50 in
>>>>>> /home/justin/petsc-master/src/tao/constrained/impls/ipm/ipm.c
>>>>>> [0]PETSC ERROR: #5 TaoSolve_IPM() line 50 in
>>>>>> /home/justin/petsc-master/src/tao/constrained/impls/ipm/ipm.c
>>>>>> [1]PETSC ERROR: #6 TaoSolve() line 190 in
>>>>>> /home/justin/petsc-master/src/tao/interface/taosolver.c
>>>>>> #6 TaoSolve() line 190 in
>>>>>> /home/justin/petsc-master/src/tao/interface/taosolver.c
>>>>>> [0]PETSC ERROR: #7 main() line 341 in
>>>>>> /home/justin/Dropbox/Research_Topics/Petsc_Nonneg_diffusion/src/diff2D.c
>>>>>> [1]PETSC ERROR: #7 main() line 341 in
>>>>>> /home/justin/Dropbox/Research_Topics/Petsc_Nonneg_diffusion/src/diff2D.c
>>>>>> [1]PETSC ERROR: [0]PETSC ERROR: ----------------End of Error Message
>>>>>> -------send entire error message to petsc-maint at mcs.anl.gov----------
>>>>>> ----------------End of Error Message -------send entire error message
>>>>>> to petsc-maint at mcs.anl.gov----------
>>>>>> application called MPI_Abort(MPI_COMM_WORLD, 60) - process 0
>>>>>> application called MPI_Abort(MPI_COMM_WORLD, 60) - process 1
>>>>>> [cli_1]: aborting job:
>>>>>> application called MPI_Abort(MPI_COMM_WORLD, 60) - process 1
>>>>>> [cli_0]: aborting job:
>>>>>> application called MPI_Abort(MPI_COMM_WORLD, 60) - process 0
>>>>>>
>>>>>>
>>>>>>  I have no idea how or why I am getting this error. What does this
>>>>>> mean?
>>>>>>
>>>>>
>>>>>   It looks like one dof that should live on proc 1 is being given to
>>>>> proc 0. We have to look at the divisions made in the KKT solver. Can
>>>>> you send a small example?
>>>>>
>>>>>    Matt
>>>>>
>>>>>
>>>>>>  My code is essentially built off of SNES ex12.c. The Jacobian
>>>>>> matrix, residual vector, and solution vector were created using DMPlex and
>>>>>> its built-in FEM functions. The Hessian matrix and gradient vector
>>>>>> were created with simple MatMult() operations on the Jacobian and residual.
>>>>>> The lower bounds vector was created by duplicating the solution vector
>>>>>> (initial guess set to zero). My FormFunctionGradient() is basically the
>>>>>> same as in the maros.c example. I can give more information if needed.
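>>>>>>
>>>>>> Roughly, the bounds setup looks like this (simplified, names illustrative;
>>>>>> the PETSC_INFINITY upper bound is just a placeholder since only the lower
>>>>>> bound matters here):
>>>>>>
>>>>>>   Vec xl, xu;
>>>>>>   ierr = VecDuplicate(u, &xl);CHKERRQ(ierr);        /* u: solution vector from the DM */
>>>>>>   ierr = VecDuplicate(u, &xu);CHKERRQ(ierr);
>>>>>>   ierr = VecSet(xl, 0.0);CHKERRQ(ierr);             /* non-negativity constraint */
>>>>>>   ierr = VecSet(xu, PETSC_INFINITY);CHKERRQ(ierr);  /* effectively unbounded above */
>>>>>>   ierr = TaoSetVariableBounds(tao, xl, xu);CHKERRQ(ierr);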
>>>>>>
>>>>>> Thanks,
>>>>>> Justin
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>  --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which their
>>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>
>>>>
>>>
>>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener