<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi Jeff,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I also ran job-summary on your log using our current main branch (which is essentially just Darshan 3.3.1), and it worked fine. I'm not exactly sure what the problem is, but it doesn't appear to be a general bug. You might find some hints
about what's going wrong by running job-summary again with the '--verbose' flag, which preserves the temporary directory Darshan uses for creating the PDF files, including pdflatex logs, etc. The 'summary.log'
file there may contain error messages that indicate what's failing or hanging. Not the most straightforward debugging strategy, but I don't really have much else to suggest...<br>
</div>
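<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
If it helps, here is a rough sketch of one way to scan that preserved temporary directory for likely error lines. It only assumes that pdflatex marks its errors with a leading '!' in its logs; the helper name and the keyword list are purely illustrative and not part of Darshan, so adjust them to whatever you actually see in the files.<br>
</div>
<pre>
```python
from pathlib import Path

# Hypothetical helper, not part of Darshan: collect suspicious lines from
# every *.log file under the temporary directory that --verbose preserves.
# The keyword list is an assumption; extend it as needed.
KEYWORDS = ("error", "fatal", "not found", "undefined")


def find_suspect_lines(tmp_dir):
    """Return (file name, line number, text) for lines that look like errors."""
    hits = []
    for log in sorted(Path(tmp_dir).rglob("*.log")):
        text = log.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            # pdflatex prefixes its error messages with "!"
            if line.startswith("!") or any(k in lowered for k in KEYWORDS):
                hits.append((log.name, number, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python3 scan_logs.py /path/to/preserved/tmpdir
    if len(sys.argv) == 2:
        for name, number, line in find_suspect_lines(sys.argv[1]):
            print(f"{name}:{number}: {line}")
```
</pre>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Point it at the directory that job-summary reports when run with '--verbose'. An empty result would suggest the hang happens before pdflatex gets far enough to log anything.<br>
</div>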
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
As a side note, we are in the middle of developing new Darshan analysis tools based on PyDarshan that will hopefully be available before too long. There's a lot more development momentum on our end behind these new PyDarshan-based analysis tools, and the
older tools will likely be deprecated once they are available. I mention this so that you and other users are aware that help is on the way and that we aren't completely ignoring these issues.<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thanks,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
--Shane<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Darshan-users &lt;darshan-users-bounces@lists.mcs.anl.gov&gt; on behalf of Jeffrey Layton &lt;laytonjb@gmail.com&gt;<br>
<b>Sent:</b> Tuesday, July 13, 2021 2:00 PM<br>
<b>To:</b> Latham, Robert J. &lt;robl@mcs.anl.gov&gt;<br>
<b>Cc:</b> darshan-users@lists.mcs.anl.gov &lt;darshan-users@lists.mcs.anl.gov&gt;<br>
<b>Subject:</b> Re: [Darshan-users] Hang on post-process</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div>Thanks Rob!! I appreciate the pdf (at least I won't look like a slacker, since I actually produced something).</div>
<div><br>
</div>
<div>What steps do you want to take to debug the issue? I'm guessing it's a configuration issue or dependency issue on my side. BTW - I'm running Ubuntu 20.04 on an AMD system. I built Darshan 3.3.1 using gcc 9.3.0 (Ubuntu 20.04 version).</div>
<div><br>
</div>
<div>Thanks!</div>
<div><br>
</div>
<div>Jeff</div>
<div><br>
</div>
</div>
<br>
<div class="x_gmail_quote">
<div dir="ltr" class="x_gmail_attr">On Tue, Jul 13, 2021 at 2:46 PM Latham, Robert J. &lt;<a href="mailto:robl@mcs.anl.gov">robl@mcs.anl.gov</a>&gt; wrote:<br>
</div>
<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
Howdy Jeff: thanks for sending the log file<br>
<br>
It looks like a legitimate log file to me. `darshan-parser`, which<br>
simply dumps the counters and such to stdout, gives me<br>
reasonable-looking output. Here's the header:<br>
<br>
# darshan log version: 3.21<br>
# compression method: ZLIB<br>
# exe: python3 cifar10-4-checkpoint.py <br>
# uid: 1000<br>
# jobid: 6041<br>
# start_time: 1626196275<br>
# start_time_asci: Tue Jul 13 12:11:15 2021<br>
# end_time: 1626196561<br>
# end_time_asci: Tue Jul 13 12:16:01 2021<br>
# nprocs: 1<br>
# run time: 287<br>
# metadata: lib_ver = 3.3.1<br>
# metadata: h = romio_no_indep_rw=true;cb_nodes=4<br>
<br>
# log file regions<br>
# -------------------------------------------------------<br>
# header: 360 bytes (uncompressed)<br>
# job data: 543 bytes (compressed)<br>
# record table: 18164 bytes (compressed)<br>
# POSIX module: 41682 bytes (compressed), ver=4<br>
# STDIO module: 230 bytes (compressed), ver=2<br>
<br>
And a darshan-job-summary.pl that I built back in August 2020 generates<br>
a pdf for me in a few seconds. I've attached it for you, but really we<br>
should figure out what's going on in your environment.<br>
<br>
<br>
==rob<br>
<br>
On Tue, 2021-07-13 at 13:41 -0400, Jeffrey Layton wrote:<br>
> Good afternoon,<br>
> <br>
> Apologies for posting yet another problem :) I'm trying to use<br>
> Darshan on a Tensorflow/Keras script. It's a simple model operating<br>
> on the CIFAR-10 data set (fairly small). Darshan produces the output<br>
> files but when I try to post-process one using<br>
> darshan-job-summary.pl, it hangs and I end up having to kill the<br>
> process (I waited about an hour - just to be sure).<br>
> <br>
> I run the script using the following:<br>
> <br>
> export DARSHAN_EXCLUDE_DIRS=/proc,/etc,/dev,/sys<br>
> env LD_PRELOAD=/home/laytonjb/bin/darshan-3.3.1/lib/libdarshan.so<br>
> python3 cifar10-4-checkpoint.py<br>
> <br>
> (I can provide the script if needed). It produces four files:<br>
> <br>
> $ ls -s<br>
> total 72<br>
> 4 laytonjb_ptxas_id6210-6210_7-13-47480-2131301613401632697_1.darshan<br>
> 4 laytonjb_ptxas_id6211-6211_7-13-47480-2131301613401632697_1.darshan<br>
> 60 laytonjb_python3_id6041-6041_7-13-47475-2131301613401632697_1.darshan<br>
> 4 laytonjb_uname_id6056-6056_7-13-47475-2131301613401632697_1.darshan<br>
> <br>
> <br>
> I chose to post-process the "python3" output but this is where it<br>
> hangs. I'm attaching the darshan output file if that is of any help.<br>
> <br>
> Thanks for any help.<br>
> <br>
> Jeff<br>
> <br>
> <br>
> <br>
> <br>
> _______________________________________________<br>
> Darshan-users mailing list<br>
> <a href="mailto:Darshan-users@lists.mcs.anl.gov" target="_blank">Darshan-users@lists.mcs.anl.gov</a><br>
> <a href="https://lists.mcs.anl.gov/mailman/listinfo/darshan-users" rel="noreferrer" target="_blank">
https://lists.mcs.anl.gov/mailman/listinfo/darshan-users</a><br>
</blockquote>
</div>
</div>
</body>
</html>