<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Nov 25, 2014 at 2:34 PM, Håkon Strandenes <span dir="ltr"><<a href="mailto:haakon@hakostra.net" target="_blank">haakon@hakostra.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I have done a whole lot of debugging on this issue. In my first post I wrote that "The VecLoad problems can rest for now, but the VecView are more serious.". I now believe that these problems in essence are the same. I have however not yet found out where the problem is, in PETSc, HDF5, MPT or (perhaps unlikely) the Lustre file system.<br>
<br>
The VecLoad bug is as follows: I have a properly formatted HDF5 file with data corresponding to a vector on a DMDA grid. If I try to load this data from the HDF5 file into a Vec, it works with no error messages. The data is, however, severely corrupted. This is shown in the attached figure fig-COLELCTIVE.png, where the left column of figures is the three components of the vector that is actually in the HDF5 file, while the right column is how PETSc's VecLoad() reads the data. This bug does not cause any error messages whatsoever, and is again dependent on the decomposition pattern.<br>
<br>
Another strange phenomenon related to VecLoad() is that the occurrence of this bug seems to depend on the chunking in the dataset being loaded. Different chunking patterns seem to produce different results. This might be an important lead.<br>
<br>
My workaround is to replace the two occurrences of<br>
H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE)<br>
with<br>
H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_INDEPENDENT)<br>
in the file src/dm/impls/da/gr2.c. These two occurrences are found in the functions VecView_MPI_HDF5_DA() and VecLoad_HDF5_DA().<br>
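For anyone wanting to reproduce the workaround, the change amounts to something like the following sketch (the variable name and the surrounding error checking in gr2.c are approximations, not the exact PETSc source):

```c
/* Sketch of the workaround in src/dm/impls/da/gr2.c (names and error
 * checking approximated). A dataset transfer property list is created
 * and the MPI-IO mode is set on it before H5Dread()/H5Dwrite(). */
hid_t plist_id = H5Pcreate(H5P_DATASET_XFER);
/* Originally: H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE); */
H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_INDEPENDENT);
/* ... pass plist_id as the xfer_plist argument of H5Dread()/H5Dwrite() ... */
H5Pclose(plist_id);
```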
<br>
This solves all problems (at least so far). Both VecLoad() and VecView() work as expected. See the attached figure fig-INDEPENDENT.png. The left column is again input and the right column is output. I have also made a small example program that performs two tasks: it loads a Vec from an HDF5 file and writes the same Vec back to a new file. When comparing the data inside these files, it is exactly identical for all combinations of decomposition, number of processes and chunking patterns in the input file that I have tried.<br>
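For concreteness, that round-trip test can be sketched as follows with PETSc's HDF5 viewer (the filenames and the dataset name "data" are placeholders, and the DMDA `da` is assumed to exist already):

```c
/* Hypothetical sketch of the round-trip test: load a Vec from one
 * HDF5 file and write it back to another. "input.h5", "output.h5"
 * and the dataset name "data" are placeholders. */
Vec         u;
PetscViewer viewer;

DMCreateGlobalVector(da, &u);
PetscObjectSetName((PetscObject)u, "data");  /* HDF5 dataset name */

PetscViewerHDF5Open(PETSC_COMM_WORLD, "input.h5", FILE_MODE_READ, &viewer);
VecLoad(u, viewer);
PetscViewerDestroy(&viewer);

PetscViewerHDF5Open(PETSC_COMM_WORLD, "output.h5", FILE_MODE_WRITE, &viewer);
VecView(u, viewer);
PetscViewerDestroy(&viewer);
```

The two files can then be compared with h5diff.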
<br>
This leads me to an important question I cannot answer because I don't have enough knowledge of and insight into HDF5: Is PETSc using the collective/independent flags correctly? Can you use collective IO in all cases, for all DMDAs and all variants of decomposition?<br>
<br>
The following document describes some features of parallel HDF5:<br>
<a href="http://www.hdfgroup.org/HDF5/PHDF5/parallelhdf5hints.pdf" target="_blank">http://www.hdfgroup.org/HDF5/PHDF5/parallelhdf5hints.pdf</a><br>
<br>
Does PETSc currently satisfy the conditions for using collective IO and chunking as described on pages 5-6? Perhaps the reason this does not occur on my workstation is that HDF5 recognises that my file system is serial and falls back to some simple serial IO routines? My cluster is equipped with a parallel Lustre file system, and as far as I know HDF5 handles these differently. The same document also mentions that HDF5 does indeed create some internal datatypes to accomplish collective IO on chunked storage (top of page 6), as Matthew suggested.<br>
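One way to check this, assuming HDF5 1.8.10 or newer, is to ask HDF5 after a transfer what it actually did; these query routines exist precisely because a collective request can be silently broken:

```c
/* Sketch: query what HDF5 actually did with a collective request.
 * plist_id is the transfer property list that was passed to
 * H5Dread()/H5Dwrite(). Requires HDF5 >= 1.8.10 for the
 * no-collective-cause query. */
H5D_mpio_actual_io_mode_t io_mode;
uint32_t local_cause, global_cause;

H5Pget_mpio_actual_io_mode(plist_id, &io_mode);
H5Pget_mpio_no_collective_cause(plist_id, &local_cause, &global_cause);
if (io_mode == H5D_MPIO_NO_COLLECTIVE)
  printf("HDF5 fell back to independent IO, cause 0x%x\n",
         (unsigned)global_cause);
```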
<br>
Any comments?<br></blockquote><div><br></div><div>First, this is great debugging.</div><div><br></div><div>Second, my reading of the HDF5 document you linked to says that either selection should be valid:</div><div><br></div><div> "For non-regular hyperslab selection, parallel HDF5 uses independent IO internally for this option."</div><div><br></div><div>so it ought to fall back to the INDEPENDENT model if it can't do collective calls correctly. However,</div><div>it appears that the collective call has bugs.</div><div><br></div><div>My conclusion: Since you have determined that changing the setting to INDEPENDENT produces</div><div>correct input/output in all the test cases, and since my understanding of the HDF5 documentation is</div><div>that we should always be able to use COLLECTIVE as an option, this is an HDF5 or MPT bug.</div><div><br></div><div>Does anyone else see the HDF5 differently? Also, it really looks to me like HDF5 messed up the MPI</div><div>data type in the COLLECTIVE picture below, since it appears to be sliced incorrectly.</div><div><br></div><div>Possible Remedies:</div><div><br></div><div> 1) We can allow you to turn off <span style="color:rgb(51,51,51);font-family:Consolas,Menlo,'Liberation Mono',Courier,monospace;font-size:12px;line-height:1.4;background-color:initial">H5Pset_dxpl_mpio()</span></div><div><span style="color:rgb(51,51,51);font-family:Consolas,Menlo,'Liberation Mono',Courier,monospace;font-size:12px;line-height:1.4;background-color:initial"><br></span></div><div> 2) Send this test case to the MPI/IO people at ANL</div><div><br></div><div>If you think 1) is what you want, we can do it. If you can package this work for 2), it would be really valuable.</div><div><br></div><div> Thanks,</div><div><br></div><div> Matt</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Thanks for your time.<br>
<br>
Best regards,<br>
Håkon Strandenes<span class=""><br>
<br>
<br>
On 25. nov. 2014 18:47, Matthew Knepley wrote:<br>
</span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class="">
On Mon, Nov 24, 2014 at 1:10 PM, Håkon Strandenes <<a href="mailto:haakon@hakostra.net" target="_blank">haakon@hakostra.net</a><br></span><span class="">
<mailto:<a href="mailto:haakon@hakostra.net" target="_blank">haakon@hakostra.net</a>>> wrote:<br>
<br>
Hi,<br>
<br>
I have some problems with PETSc and HDF5 VecLoad/VecView. The<br>
VecLoad problems can rest for now, but the VecView are more serious.<br>
<br>
In short: I have a 3D DMDA and some vectors that I want to save<br>
to an HDF5 file. This works perfectly on my workstation, but not on<br>
the compute cluster I have access to. I have attached a typical<br>
error message.<br>
<br>
I have also attached a piece of code that can trigger the error.<br>
The code is merely a 2D->3D rewrite of DMDA ex 10<br></span>
(<a href="http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html" target="_blank">http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html</a>),<div><div class="h5"><br>
nothing else is done.<br>
<br>
The program typically works on a small number of processes. I have<br>
successfully executed the attached program on up to 32 processes.<br>
That works. Always. I have never had a single success when trying to<br>
run on 64 processes. Always same error.<br>
<br>
The computer I am struggling with is an SGI machine with SLES 11sp1<br>
and Intel CPUs, hence I have used Intel's compilers. I have tried<br>
the 2013, 2014 and 2015 versions of the compilers, so that's<br>
probably not the cause. I have also tried GCC 4.9.1, just to be<br>
safe, same error there. The same compiler is used for both HDF5 and<br>
PETSc. The same error message occurs for both debug and release<br>
builds. I have tried HDF5 versions 1.8.11 and 1.8.13. I have tried<br>
PETSc versions 3.4.1 and the latest from Git. The MPI implementation<br>
on the machine is SGI's MPT, and I have tried both 2.06 and 2.10.<br>
Always the same error. Other MPI implementations are unfortunately not<br>
available.<br>
<br>
What really drives me mad is that this works like a charm on my<br>
workstation with Linux Mint... I have successfully executed the<br>
attached example on 254 processes (my machine breaks down if I try<br>
anything more than that).<br>
<br>
Does any of you have any tips on how to attack this problem and find<br>
out what's wrong?<br>
<br>
<br>
This does sound like a pain to track down. It seems to be complaining<br>
about an MPI datatype:<br>
<br>
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io():<br>
MPI_Type_struct failed<br>
major: Internal error (too specific to document in detail)<br>
minor: Some MPI function failed<br>
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid<br>
datatype argument<br>
major: Internal error (too specific to document in detail)<br>
minor: MPI Error String<br>
<br>
In this call, we pass in 'scalartype', which is H5T_NATIVE_DOUBLE<br>
(unless you configured for<br>
single precision). This was used successfully to create the dataspace,<br>
so it is unlikely to be<br>
the problem. I am guessing that HDF5 creates internal MPI datatypes to<br>
use in the MPI/IO<br>
routines (maybe using MPI_Type_struct).<br>
<br>
I believe we have seen type creation routines fail in some MPI<br>
implementations if you try to<br>
create too many of them. Right now, this looks a lot like a bug in MPT,<br>
although it might be<br>
an HDF5 bug with forgetting to release MPI types that they do not need.<br>
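As a self-contained illustration of that failure mode (MPI_Type_struct is the deprecated name; MPI_Type_create_struct is the modern one): every committed derived type holds implementation resources until MPI_Type_free() is called, so a library that keeps creating types without freeing them can eventually make the creation call itself fail.

```c
#include <mpi.h>

/* Build a derived type describing n contiguous doubles via a struct
 * type, as a stand-in for the chunk types HDF5 builds internally.
 * The caller owns the type and must MPI_Type_free() it; leaking many
 * such types is the suspected failure mode. */
static MPI_Datatype make_chunk_type(int n)
{
  int          blocklens[1] = { n };
  MPI_Aint     displs[1]    = { 0 };
  MPI_Datatype oldtypes[1]  = { MPI_DOUBLE };
  MPI_Datatype newtype;

  MPI_Type_create_struct(1, blocklens, displs, oldtypes, &newtype);
  MPI_Type_commit(&newtype);
  return newtype;
}

/* Usage:
 *   MPI_Datatype t = make_chunk_type(1024);
 *   ... use t in communication or MPI-IO ...
 *   MPI_Type_free(&t);   // without this, types accumulate
 */
```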
<br>
Thanks,<br>
<br>
Matt<br>
<br>
Regards,<br>
Håkon Strandenes<br>
<br>
<br>
<br>
<br>
--<br>
What most experimenters take for granted before they begin their<br>
experiments is infinitely more interesting than any results to which<br>
their experiments lead.<br>
-- Norbert Wiener<br>
</div></div></blockquote>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature">What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div>
</div></div>