[Darshan-users] HDF5 version support

Snyder, Shane ssnyder at mcs.anl.gov
Fri Apr 29 10:00:59 CDT 2022


I see, thanks for the details.

Since you are providing --with-hdf5=/path/to/hdf5, you are correct that Darshan's configure script should look for h5dump in '/path/to/hdf5/bin', so you would think that would override any sort of system-provided HDF5...

Would you mind sharing the 'darshan-runtime/config.log' file generated in your build directory? There's got to be something not quite right at configure time -- the error message you are triggering in Darshan is conditionally compiled in based on the results of these configure tests, so it has to be the case that configure is somehow thinking it is a version <1.10.
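
For a quick look before sending the whole file, a minimal Python sketch such as the following can pull the HDF5-related lines out of config.log. It is not part of Darshan; the helper names are hypothetical, and because the exact wording configure writes is not assumed, it simply matches "h5dump" or "hdf5" case-insensitively.

```python
import re
from pathlib import Path

# Case-insensitive match for anything HDF5-related. The exact wording that
# Darshan's configure writes to config.log is NOT assumed here.
HDF5_PATTERN = re.compile(r"h5dump|hdf5", re.IGNORECASE)


def hdf5_lines(log_text):
    """Return (line_number, line) pairs from config.log text that mention HDF5."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if HDF5_PATTERN.search(line)
    ]


def hdf5_lines_from_file(path="darshan-runtime/config.log"):
    """Read a config.log and filter it (the default path is an assumption)."""
    return hdf5_lines(Path(path).read_text(errors="replace"))
```

Printing the result of hdf5_lines_from_file() from the build directory should show which h5dump configure picked up, if it recorded one.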

--Shane
________________________________
From: Jiří Nádvorník <nadvornik.ji at gmail.com>
Sent: Friday, April 29, 2022 4:32 AM
To: Snyder, Shane <ssnyder at mcs.anl.gov>
Cc: darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
Subject: Re: [Darshan-users] HDF5 version support

Aha,

See inline.




On Thu, Apr 28, 2022, 23:20 Snyder, Shane <ssnyder at mcs.anl.gov> wrote:
Hi Jiri,

HDF5 should be generally supported for any version greater than 1.8.

The Darshan error you're seeing indicates that Darshan believes it was configured with a version less than 1.10 (which it then complains about since it sees version 1.12 at runtime). You should see a line like this in the configure output for darshan-runtime:
           HDF5          module support  - no
In my case above, HDF5 support wasn't enabled, but if HDF5 is enabled it will indicate what version it found at configure time. We should make sure that says 1.12 as a starting point.

I believe this was indeed 1.12; it definitely says HDF5 module support yes, and if I had gotten the hdf5 path wrong it would fail on not finding the "hdf5.h" header. So I'm fairly confident it actually links against the given path correctly as well.

Darshan is indirectly determining the version by running 'h5dump --version', so might be good to sanity check that 'h5dump' in PATH is associated with the 1.12 HDF5 install you are trying to use.
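
To make that sanity check concrete, here is a minimal Python sketch (not part of Darshan) that compares the h5dump found on PATH with the one under a given HDF5 prefix. The helper names are hypothetical, and the parser assumes the version output contains a dotted version such as "h5dump: Version 1.12.0".

```python
import re
import shutil
import subprocess
from pathlib import Path


def parse_major_minor(version_output):
    """Extract (major, minor) from text such as 'h5dump: Version 1.12.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def h5dump_version(executable):
    """Run '<executable> --version' and parse whatever it prints."""
    result = subprocess.run(
        [str(executable), "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_major_minor(result.stdout + result.stderr)


def compare_h5dump(hdf5_prefix):
    """Report the h5dump on PATH and the one under hdf5_prefix/bin."""
    on_path = shutil.which("h5dump")
    in_prefix = Path(hdf5_prefix) / "bin" / "h5dump"
    return {
        "path_h5dump": on_path,
        "path_version": h5dump_version(on_path) if on_path else None,
        "prefix_h5dump": str(in_prefix),
        "prefix_version": h5dump_version(in_prefix) if in_prefix.exists() else None,
    }
```

Calling compare_h5dump("/path/to/hdf5") on the build machine returns both locations and their parsed versions; if the two versions differ, the h5dump on PATH likely belongs to a different HDF5 install.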

This is the issue. You're right, I'm building my own HDF5 and h5py, and the system-preinstalled HDF5 is different. How can I tell Darshan that? The behavior is a little inconsistent: I'm passing my custom HDF5 path to the darshan-runtime ./configure via --with-hdf5=, so I would expect it to check the version there.

Btw, I now have quite a good understanding of how building h5py and HDF5 works, as getting it working properly was a lengthy process indeed :). So if you need to know more about the process, I should be able to clarify.



I should note that in my experience, getting h5py working right with Darshan was a challenge. I tried once and had some issues, but one of our Python contributors indicated they have gotten it to work in the past -- in their case, they recommended building h5py from source according to these directions: https://docs.h5py.org/en/stable/build.html#custom-installation

Maybe you are already doing that? The trick is to use an external HDF5 install (i.e., not one packaged with h5py python wheels) that you then use to build both h5py and Darshan. I could try to get further myself if we can't figure out how to get it working for you.
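
One way to confirm that h5py was built against the external HDF5 install is to compare the HDF5 version h5py reports with the version of that install. A minimal sketch, assuming h5py is importable in the target environment; same_major_minor and h5py_hdf5_version are hypothetical helper names, while h5py.version.hdf5_version is h5py's own attribute.

```python
def same_major_minor(version_a, version_b):
    """True if two dotted version strings share major and minor numbers."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


def h5py_hdf5_version():
    """Return the HDF5 version string that h5py was built against."""
    # Imported lazily so same_major_minor can be used without h5py installed.
    import h5py

    return h5py.version.hdf5_version
```

For example, same_major_minor(h5py_hdf5_version(), "1.12.0") should hold when h5py was built against a 1.12.x HDF5; a mismatch would suggest h5py is using a bundled or system HDF5 instead.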

Thanks,
--Shane
________________________________
From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on behalf of Jiří Nádvorník <nadvornik.ji at gmail.com>
Sent: Wednesday, April 27, 2022 12:29 PM
To: darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
Subject: [Darshan-users] HDF5 version support

Hi all,

I was able to successfully run the Darshan runtime without any warnings when built without HDF5.

Now when I compile with HDF5 (the results above are without it):
./configure --with-log-path=/gpfs/raid/darshan-logs --with-jobid-env=PBS_JOBID CC=mpicc --enable-hdf5-mod --with-hdf5=/gpfs/raid/SDSSCube/ext_lib//hdf5-1.12.0/hdf5/

It messes up my runtime and causes python to crash:
mpirun -x DARSHAN_CONFIG_PATH=/gpfs/raid/SDSSCube/darshan.conf -x LD_PRELOAD=/gpfs/raid/shared_libs/darshan/darshan-runtime/lib/.libs/libdarshan.so:/gpfs/raid/SDSSCube/ext_lib/hdf5-1.12.0/hdf5/lib/libhdf5.so -np 65 --hostfile hosts --map-by node /gpfs/raid/SDSSCube/venv_par/bin/python hisscube.py --truncate ../sdss_data/ results/SDSS_cube_c_par.h5

Resulting in:
INFO:rank[0]:Rank 0 pid: 137058
Darshan HDF5 module error: runtime library version (1.12) incompatible with Darshan module (1.10-).
Traceback (most recent call last):
  File "hisscube.py", line 74, in <module>
    writer.ingest(fits_image_path, fits_spectra_path, truncate_file=args.truncate)
  File "/gpfs/raid/SDSSCube/hisscube/ParallelWriterMWMR.py", line 45, in ingest
    self.process_metadata(image_path, image_pattern, spectra_path, spectra_pattern, truncate_file)
  File "/gpfs/raid/SDSSCube/hisscube/CWriter.py", line 150, in process_metadata
    h5_file = self.open_h5_file_serial(truncate_file)
  File "/gpfs/raid/SDSSCube/hisscube/CWriter.py", line 170, in open_h5_file_serial
    return h5py.File(self.h5_path, 'w', fs_strategy="page", fs_page_size=4096, libver="latest")
  File "/gpfs/raid/SDSSCube/venv_par/lib/python3.8/site-packages/h5py-3.6.0-py3.8-linux-x86_64.egg/h5py/_hl/files.py", line 533, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "/gpfs/raid/SDSSCube/venv_par/lib/python3.8/site-packages/h5py-3.6.0-py3.8-linux-x86_64.egg/h5py/_hl/files.py", line 232, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 126, in h5py.h5f.create
  File "h5py/defs.pyx", line 693, in h5py.defs.H5Fcreate
RuntimeError: Unspecified error in H5Fcreate (return value <0)

You are saying that Darshan should be compatible with HDF5 > 1.8, which 1.12 should satisfy, right? I checked the source, and there is a hardcoded check: if the minor version is > 10, it is not supported.

Thanks for explaining what's supported.

Cheers,

Jiri Nadvornik