[petsc-users] Problem with PETSc + HDF5 VecView

Håkon Strandenes haakon at hakostra.net
Tue Nov 25 14:34:00 CST 2014

I have done a whole lot of debugging on this issue. In my first post I 
wrote that "The VecLoad problems can rest for now, but the VecView are 
more serious.". I now believe that these problems are in essence the 
same. I have, however, not yet found out where the problem lies: in PETSc, 
HDF5, MPT or (perhaps unlikely) the Lustre file system.

The VecLoad bug is as follows: I have a properly formatted HDF5-file 
with data corresponding to a vector on a DMDA grid. If I try to load 
this data from the HDF5-file and into a Vec, it works with no error 
messages. The data is however severely corrupted. This is shown in the 
attached figure fig-COLLECTIVE.png, where the left column of figures 
shows the three components of the vector as actually stored in the HDF5 
file, while the right column shows how PETSc's VecLoad() reads the data. 
This bug does not cause any error messages whatsoever, and is again 
dependent on the decomposition pattern.

Another strange phenomenon related to VecLoad() is that the occurrence of 
this bug seems to depend on the chunking of the dataset being loaded. 
Different chunking patterns seem to produce different results. 
This might be an important lead.
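
One way to explore the chunking lead offline is to count how many chunks each rank's subdomain overlaps for a given decomposition. The sketch below is only a diagnostic aid: the names Range1D, block_range, chunks_touched_1d and chunks_touched_3d are hypothetical, and the block split (remainder points go to the lowest ranks) is an assumption for illustration, not something read from PETSc or HDF5.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open index range [start, start + count) owned by one rank on one axis. */
typedef struct {
    size_t start;
    size_t count;
} Range1D;

/* Split n points over nranks ranks; the first (n % nranks) ranks get one
 * extra point. This layout is an assumption made for illustration. */
Range1D block_range(size_t n, size_t nranks, size_t rank)
{
    assert(nranks > 0 && rank < nranks);
    size_t base = n / nranks;
    size_t rem = n % nranks;
    Range1D r;
    r.count = base + (rank < rem ? 1 : 0);
    r.start = rank * base + (rank < rem ? rank : rem);
    return r;
}

/* Number of chunks of length `chunk` that the range overlaps on one axis. */
size_t chunks_touched_1d(Range1D r, size_t chunk)
{
    assert(chunk > 0);
    if (r.count == 0) return 0;
    size_t first = r.start / chunk;
    size_t last = (r.start + r.count - 1) / chunk;
    return last - first + 1;
}

/* Chunks overlapped by a 3D box: the product of the per-axis counts. */
size_t chunks_touched_3d(const Range1D box[3], const size_t chunk[3])
{
    size_t total = 1;
    for (int d = 0; d < 3; ++d) {
        total *= chunks_touched_1d(box[d], chunk[d]);
    }
    return total;
}
```

For example, 10 points split over 4 ranks with a chunk length of 4 gives rank 1 the range [3, 6), which straddles two chunks, while rank 0 stays inside a single chunk. Tabulating these counts for the failing and working decompositions may show whether the corruption correlates with subdomains that cross chunk boundaries.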

My workaround is to replace the two occurrences of
   H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE)
with
   H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_INDEPENDENT)
in the file src/dm/impls/da/gr2.c. These two occurrences are found in the 
functions VecView_MPI_HDF5_DA() and VecLoad_HDF5_DA().
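
For readers unfamiliar with this call, the following is a minimal sketch of how a dataset transfer property list is typically created and switched between the two modes. It assumes an HDF5 build with parallel (MPI) support. The helper name make_transfer_plist is hypothetical and is not part of PETSc or HDF5.

```c
#include <hdf5.h>

/* Build a dataset transfer property list for parallel I/O.
 * use_collective != 0 selects H5FD_MPIO_COLLECTIVE, otherwise
 * H5FD_MPIO_INDEPENDENT. Returns a negative value on failure.
 * The caller releases the list with H5Pclose(). */
hid_t make_transfer_plist(int use_collective)
{
    hid_t plist_id = H5Pcreate(H5P_DATASET_XFER);
    if (plist_id < 0) return plist_id;

    H5FD_mpio_xfer_t mode = use_collective ? H5FD_MPIO_COLLECTIVE
                                           : H5FD_MPIO_INDEPENDENT;
    if (H5Pset_dxpl_mpio(plist_id, mode) < 0) {
        H5Pclose(plist_id);
        return -1;
    }
    return plist_id;
}
```

The returned identifier is what gets passed as the transfer property list argument of H5Dwrite() or H5Dread(), so a switch like this makes it possible to compare both modes without editing the call sites.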

This solves all problems (at least so far). Both VecLoad() and VecView() 
work as expected. See attached figure fig-INDEPENDENT.png. The left 
column is again input and the right column is output. I have actually 
made a small example program that performs two tasks: it loads a Vec from 
an HDF5 file and writes the same Vec back again to a new file. When 
comparing the data inside these files, they are exactly identical for all 
combinations of decomposition, number of processes and chunking patterns 
in the input file I have tried.
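
The comparison step of such a round-trip test can be sketched as a bitwise check of the two arrays once they have been read back into memory. The function name first_mismatch is hypothetical, and reading the datasets from the two files is left out.

```c
#include <stddef.h>
#include <string.h>

/* Return the index of the first element whose bytes differ between a and b,
 * or n if the first n elements are bit-identical. The comparison is bitwise,
 * so +0.0 and -0.0 count as different and NaNs with the same bit pattern
 * count as equal. */
size_t first_mismatch(const double *a, const double *b, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (memcmp(&a[i], &b[i], sizeof(double)) != 0) {
            return i;
        }
    }
    return n;
}
```

A bitwise test is stricter than a tolerance-based one, which suits a pure load-and-store round trip where no arithmetic is performed on the data.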

This leads me to an important question I cannot answer because I don't 
have enough knowledge of and insight into HDF5: Is PETSc using the 
collective/independent flags correctly? Can you use collective IO in all 
cases for all DMDAs and all variants of decomposition?

The following document describes some features of parallel HDF5:

Does PETSc currently satisfy the conditions for using collective IO and 
chunking as described on pages 5-6? Perhaps the reason why this problem 
does not occur on my workstation is that HDF5 recognises that my file 
system is serial and falls back to some simple serial IO routines? My 
cluster is equipped with a parallel Lustre file system, and as far as I 
know HDF5 handles these differently. The same document also mentions that 
HDF5 does indeed create some internal datatypes to accomplish collective 
IO on chunked storage (top of page 6), as Matthew suggested.

Any comments?

Thanks for your time.

Best regards,
Håkon Strandenes

On 25. nov. 2014 18:47, Matthew Knepley wrote:
> On Mon, Nov 24, 2014 at 1:10 PM, Håkon Strandenes <haakon at hakostra.net
> <mailto:haakon at hakostra.net>> wrote:
>     Hi,
>     I have some problems with PETSc and HDF5 VecLoad/VecView. The
>     VecLoad problems can rest for now, but the VecView are more serious.
>     In short: I have a 3D DMDA and some vectors that I want to save
>     to a HDF5 file. This works perfectly on my workstation, but not on
>     the compute cluster I have access to. I have attached a typical
>     error message.
>     I have also attached a piece of code that can trigger the error.
>     The code is merely a 2D->3D rewrite of DMDA ex 10
>     (http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html),
>     nothing else is done.
>     The program typically works on a small number of processes. I have
>     successfully executed the attached program on up to 32 processes.
>     That works. Always. I have never had a single success when trying to
>     run on 64 processes. Always same error.
>     The computer I am struggling with is an SGI machine with SLES 11sp1
>     and Intel CPUs, hence I have used Intel's compilers. I have tried
>     the 2013, 2014 and 2015 versions of the compilers, so that's
>     probably not the cause. I have also tried GCC 4.9.1, just to be
>     safe, same error there. The same compiler is used for both HDF5 and
>     PETSc. The same error message occurs for both debug and release
>     builds. I have tried HDF5 versions 1.8.11 and 1.8.13. I have tried
>     PETSc version 3.4.1 and the latest from Git. The MPI implementation
>     on the machine is SGI's MPT, and I have tried both 2.06 and 2.10.
>     Always same error. Other MPI implementations are unfortunately not
>     available.
>     What really drives me mad is that this works like a charm on my
>     workstation with Linux Mint... I have successfully executed the
>     attached example on 254 processes (my machine breaks down if I try
>     anything more than that).
>     Does any of you have any tips on how to attack this problem and find
>     out what's wrong?
> This does sound like a pain to track down. It seems to be complaining
> about an MPI datatype:
> #005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io():
> MPI_Type_struct failed
>      major: Internal error (too specific to document in detail)
>      minor: Some MPI function failed
>    #006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid
> datatype argument
>      major: Internal error (too specific to document in detail)
>      minor: MPI Error String
> In this call, we pass in 'scalartype', which is H5T_NATIVE_DOUBLE
> (unless you configured for
> single precision). This was used successfully to create the dataspace,
> so it is unlikely to be
> the problem. I am guessing that HDF5 creates internal MPI datatypes to
> use in the MPI/IO
> routines (maybe using MPI_Type_struct).
> I believe we have seen type creation routines fail in some MPI
> implementations if you try to
> create too many of them. Right now, this looks a lot like a bug in MPT,
> although it might be
> an HDF5 bug with forgetting to release MPI types that they do not need.
>    Thanks,
>      Matt
>     Regards,
>     Håkon Strandenes
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fig-COLLECTIVE.png
Type: image/png
Size: 454244 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20141125/26d4c62e/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fig-INDEPENDENT.png
Type: image/png
Size: 330622 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20141125/26d4c62e/attachment-0003.png>
