[Darshan-users] Error in job_summary

Jeffrey Layton laytonjb at gmail.com
Mon Jul 26 09:15:13 CDT 2021


Good morning,

I'm post-processing a Darshan log from a TensorFlow training run of a simple
model (CIFAR-10). The post-processing completes just fine, but I see a
warning on the first page of the job summary:


WARNING: This Darshan log contains incomplete data. This happens when a
module runs out of memory to store new record data. Please run
darshan-parser on the log file for more information.

So I ran darshan-parser on the file, and I see the following at the end:


# *******************************************************
# POSIX module data
# *******************************************************

# *ERROR*: The POSIX module contains incomplete data!
#            This happens when a module runs out of
#            memory to store new record data.

# To avoid this error, consult the darshan-runtime
# documentation and consider setting the
# DARSHAN_EXCLUDE_DIRS environment variable to prevent
# Darshan from instrumenting unecessary files.

# You can display the (incomplete) data that is
# present in this log using the --show-incomplete
# option to darshan-parser.


I already have a number of directories excluded: /proc, /etc, /dev, /sys,
/snap, and /run.

How can I get a list of the files that Darshan tracked? Also, is there a way
to increase the amount of memory a module has for storing record data?
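
For the first question, the rough sketch below is the kind of thing I have in
mind. It assumes that each record line in the darshan-parser text output is
tab-separated with the module name in the first column and the file name in
the sixth, and that comment lines start with '#'. I have not confirmed that
layout against the documentation, so treat it as an assumption. The function
name tracked_files and the sample records are made up for illustration.

```python
def tracked_files(parser_output, module="POSIX"):
    """Return the sorted, unique file names recorded for one module.

    Assumed (unverified) record layout, tab-separated:
    module, rank, record id, counter, value, file name, mount point, fs type
    """
    names = set()
    for line in parser_output.splitlines():
        # Skip blank lines and the '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Ignore anything that does not look like a record for this module.
        if len(fields) < 6 or fields[0] != module:
            continue
        names.add(fields[5])
    return sorted(names)


# Hypothetical sample records, not taken from a real log.
SAMPLE = "\n".join([
    "# POSIX module data",
    "\t".join(["POSIX", "0", "111", "POSIX_OPENS", "1",
               "/data/cifar-10/data_batch_1.bin", "/", "ext4"]),
    "\t".join(["POSIX", "0", "111", "POSIX_READS", "40",
               "/data/cifar-10/data_batch_1.bin", "/", "ext4"]),
    "\t".join(["POSIX", "0", "222", "POSIX_OPENS", "1",
               "/home/user/model.ckpt", "/", "ext4"]),
    "\t".join(["STDIO", "0", "333", "STDIO_OPENS", "1",
               "/home/user/train.log", "/", "ext4"]),
])

for name in tracked_files(SAMPLE):
    print(name)
```

In practice the text passed to tracked_files would be the saved output of
darshan-parser run with the --show-incomplete option on the log file, rather
than the sample above.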

Thanks!

Jeff