[Darshan-users] Module contains incomplete data

Jiří Nádvorník nadvornik.ji at gmail.com
Wed Apr 27 06:37:38 CDT 2022


Aha! I just realized there is an obvious "prepare.sh" script that I didn't
run. I found that out by trial and error, though; it could be better documented :).
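
For reference, the rough sequence that got me building after that (the exact
location of prepare.sh in the checkout may differ, and I believe it just
regenerates the configure script) was:

./prepare.sh
./configure --with-log-path=/gpfs/raid/darshan-logs \
    --with-jobid-env=PBS_JOBID CC=mpicc
make && make install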

Now I'm getting further. With this config file:
MAX_RECORDS     102400     POSIX,MPI-IO,STDIO
MODMEM  1024
APP_EXCLUDE     git,ls

When I run:
darshan-parser --show-incomplete \
    caucau_python_id127447-127447_4-27-48556-1842455298968263838_1.darshan \
    | grep incomplete

the output is:
# *WARNING*: The POSIX module contains incomplete data!
# *WARNING*: The STDIO module contains incomplete data!
Warning: no log utility handlers defined for module (null), SKIPPING.

I don't think my poor tiny Python script touches more than 100,000 files,
right?

By the way, I've encountered another problem; I'm not sure whether to put it
in another thread. If I compile with HDF5 (the results above are without it):
./configure --with-log-path=/gpfs/raid/darshan-logs
--with-jobid-env=PBS_JOBID CC=mpicc --enable-hdf5-mod
--with-hdf5=/gpfs/raid/SDSSCube/ext_lib//hdf5-1.12.0/hdf5/

It messes up my runtime and causes Python to crash:
mpirun -x DARSHAN_CONFIG_PATH=/gpfs/raid/SDSSCube/darshan.conf -x
LD_PRELOAD=/gpfs/raid/shared_libs/darshan/darshan-runtime/lib/.libs/libdarshan.so:/gpfs/raid/SDSSCube/ext_lib/hdf5-1.12.0/hdf5/lib/libhdf5.so
-np 65 --hostfile hosts --map-by node
/gpfs/raid/SDSSCube/venv_par/bin/python hisscube.py --truncate
../sdss_data/ results/SDSS_cube_c_par.h5

Resulting in:
INFO:rank[0]:Rank 0 pid: 137058
Darshan HDF5 module error: runtime library version (1.12) incompatible with
Darshan module (1.10-).
Traceback (most recent call last):
  File "hisscube.py", line 74, in <module>
    writer.ingest(fits_image_path, fits_spectra_path,
truncate_file=args.truncate)
  File "/gpfs/raid/SDSSCube/hisscube/ParallelWriterMWMR.py", line 45, in
ingest
    self.process_metadata(image_path, image_pattern, spectra_path,
spectra_pattern, truncate_file)
  File "/gpfs/raid/SDSSCube/hisscube/CWriter.py", line 150, in
process_metadata
    h5_file = self.open_h5_file_serial(truncate_file)
  File "/gpfs/raid/SDSSCube/hisscube/CWriter.py", line 170, in
open_h5_file_serial
    return h5py.File(self.h5_path, 'w', fs_strategy="page",
fs_page_size=4096, libver="latest")
  File
"/gpfs/raid/SDSSCube/venv_par/lib/python3.8/site-packages/h5py-3.6.0-py3.8-linux-x86_64.egg/h5py/_hl/files.py",
line 533, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File
"/gpfs/raid/SDSSCube/venv_par/lib/python3.8/site-packages/h5py-3.6.0-py3.8-linux-x86_64.egg/h5py/_hl/files.py",
line 232, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 126, in h5py.h5f.create
  File "h5py/defs.pyx", line 693, in h5py.defs.H5Fcreate
RuntimeError: Unspecified error in H5Fcreate (return value <0)

You are saying that Darshan should be compatible with HDF5 > 1.8, which 1.12
is, right?
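
In case it helps with diagnosing the mismatch, this is how I would double-check
which HDF5 each side actually resolves to (just my guess at a sanity check;
the paths are the ones from my commands above):

# HDF5 library that the preloaded Darshan runtime links against
ldd /gpfs/raid/shared_libs/darshan/darshan-runtime/lib/.libs/libdarshan.so | grep -i hdf5
# version of the HDF5 install that --with-hdf5 pointed at
grep -i "HDF5 Version" /gpfs/raid/SDSSCube/ext_lib/hdf5-1.12.0/hdf5/lib/libhdf5.settings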

Thanks for your help!

Cheers,

Jiri


On Wed, 27 Apr 2022 at 8:43, Jiří Nádvorník <nadvornik.ji at gmail.com>
wrote:

> Hi,
>
> I think I will chew through the documentation just fine, but two things are
> not clear:
>
>    1. Does the darshan library provide its own config file that I need to
>    change or do I need to always create my own?
>    2. How can I build the git version? I didn't find any instructions and
>    the usual autoconf just throws:
>       root at kub-b1:/gpfs/raid/shared_libs/darshan/darshan-runtime# autoconf
>       configure.ac:19: error: possibly undefined macro: AC_CONFIG_MACRO_DIRS
>             If this token and others are legitimate, please use m4_pattern_allow.
>             See the Autoconf documentation.
>       configure.ac:21: error: possibly undefined macro: AM_INIT_AUTOMAKE
>       configure.ac:22: error: possibly undefined macro: AM_SILENT_RULES
>       configure.ac:23: error: possibly undefined macro: AM_MAINTAINER_MODE
>       configure.ac:713: error: possibly undefined macro: AM_CONDITIONAL
>       root at kub-b1:/gpfs/raid/shared_libs/darshan/darshan-runtime# ./configure
>       configure: error: cannot find install-sh, install.sh, or shtool in ../maint/scripts "."/../maint/scripts
>
> Thanks for help.
>
> Cheers,
>
> Jiri
>
> On Tue, 26 Apr 2022 at 17:43, Snyder, Shane <ssnyder at mcs.anl.gov>
> wrote:
>
>> Hi Jiri,
>>
>> For some background, Darshan enforces some internal memory limits to
>> avoid ballooning memory usage at runtime. Specifically, all of our
>> instrumentation modules should pre-allocate file records for up to 1,024
>> files opened by the app -- if your app opens more than 1,024 files
>> per-process, Darshan stops instrumenting and issues those warning messages
>> when parsing the log file.
>>
>> We have users hit this issue pretty frequently now, and we actually just
>> wrapped up development of some new mechanisms to help out with this. They
>> were just merged into our main branch, and we will be formally releasing a
>> pre-release version of this code in the next week or so. For the time
>> being, you should be able to use the 'main' branch of our repo (
>> https://github.com/darshan-hpc/darshan) to leverage this new
>> functionality.
>>
>> There are 2 new mechanisms that can help out, both of which require you
>> to provide a configuration file to Darshan at runtime:
>>
>>    - The MAX_RECORDS setting can be used to bump up the number of
>>    pre-allocated records for different modules. In your case, you might try to
>>    bump up the default number of records for the POSIX, MPI-IO, and STDIO
>>    modules by setting something like this in your config file (this would
>>    allow you to instrument up to 4,000 files per process for each of these
>>    modules):
>>       - MAX_RECORDS    4000    POSIX,MPI-IO,STDIO
>>    - An alternative (or complementary) approach to bumping up the record
>>    limit is to restrict instrumentation to particular files. You can use the
>>    NAME_EXCLUDE setting to avoid instrumenting specific directory paths, file
>>    extensions, etc. by specifying regular expressions. For example, the following
>>    settings would avoid instrumenting files with a .so suffix or files located
>>    in a directory we don't care about, for all modules (* denotes all modules;
>>    see the combined example after this list):
>>       - NAME_EXCLUDE    .so$    *
>>       - NAME_EXCLUDE    ^/path/to/avoid    *
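>>
>> To tie the two together, a minimal example config file (the record limit and
>> the excluded path here are just placeholders) could look like:
>>
>>    MAX_RECORDS     4000               POSIX,MPI-IO,STDIO
>>    NAME_EXCLUDE    .so$               *
>>    NAME_EXCLUDE    ^/path/to/avoid    *
>>
>> and gets passed to Darshan at runtime by pointing the DARSHAN_CONFIG_PATH
>> environment variable at it, e.g.:
>>
>>    export DARSHAN_CONFIG_PATH=/path/to/darshan.conf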
>>
>> I'm attaching the updated runtime documentation for Darshan for your
>> reference. Section 8 provides a ton of detail on how to provide a config
>> file to Darshan, which should help fill in any gaps in my description
>> above.
>>
>> Please let us know if you have any further questions or issues, though!
>>
>> Thanks,
>> --Shane
>> ------------------------------
>> *From:* Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on
>> behalf of Jiří Nádvorník <nadvornik.ji at gmail.com>
>> *Sent:* Sunday, April 24, 2022 3:00 PM
>> *To:* darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
>> *Subject:* [Darshan-users] Module contains incomplete data
>>
>> Hi All,
>>
>> I just tried out Darshan, and the potential output seems perfect for my
>> HDF5 MPI application! However, I'm not able to get there :(.
>>
>> I have a log that has a big stamp "This darshan log contains incomplete
>> data".
>>
>> When I run:
>> darshan-parser --show-incomplete mylog.darshan | grep incomplete
>> Output is:
>> # *WARNING*: The POSIX module contains incomplete data!
>> # *WARNING*: The MPI-IO module contains incomplete data!
>> # *WARNING*: The STDIO module contains incomplete data!
>>
>> Would you be able to point me to some setting that would improve the
>> measurements? Can I actually rely on the profiling results if it says the
>> data is incomplete in some of the categories?
>>
>> Thank you very much for your help!
>>
>> Cheers,
>>
>> Jiri
>>
>>

