[Darshan-users] darshan-job-summary.pl reports non-stop IO

Rob Latham robl at mcs.anl.gov
Wed May 8 08:09:26 CDT 2013


On Tue, May 07, 2013 at 04:34:47PM -0600, David Shrader wrote:
> Hello Kevin,
> 
> Thank you for the explanation. It seems I didn't think all the way
> through what "first to last access" meant in the context of messing
> with the same files multiple times.

This is one of those areas where Darshan's lightweight statistics
collection has a big impact on how Darshan can report activity.  

Darshan is trying to balance "interesting I/O behavior" with "store
the least amount of data".  So, things like "time of first i/o" and
"time of last i/o" are nice: a fixed amount of data (tens of bytes)
that can tell you a bit about how an application behaves.

If Darshan were to keep a more detailed trace of every I/O action,
we could end up with a very large log file indeed, even at only 8
bytes stored per I/O operation.
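
To make the trade-off concrete, here is a rough Python sketch. It is
not Darshan's actual record layout or API: the AccessWindow class and
the timings are made up for illustration. It keeps only the first and
last access time per file, so a log-style file touched once a minute
reports a window covering the whole run, while a checkpoint-style file
reports a short one. The last line shows how a per-operation trace
would grow with the number of operations instead of staying fixed.

```python
# Sketch only: AccessWindow and the timings below are hypothetical,
# not Darshan's actual record layout or API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessWindow:
    """Fixed-size summary: only the first and last access times are kept."""
    first: Optional[float] = None
    last: Optional[float] = None
    ops: int = 0

    def record(self, t: float) -> None:
        # Update the window edges; nothing about the gaps in between is stored.
        if self.first is None or t < self.first:
            self.first = t
        if self.last is None or t > self.last:
            self.last = t
        self.ops += 1

    def span(self) -> float:
        return 0.0 if self.first is None else self.last - self.first


# Checkpoint-style file: opened and written once, in one short burst.
checkpoint = AccessWindow()
for t in (100.0, 100.5, 101.0):
    checkpoint.record(t)

# Log-style file: one small update every 60 s over a one-hour run.
logfile = AccessWindow()
for t in range(0, 3601, 60):
    logfile.record(float(t))

BYTES_PER_TRACED_OP = 8  # the per-operation figure mentioned above
trace_bytes = BYTES_PER_TRACED_OP * logfile.ops

print(checkpoint.span())         # short window for the checkpoint file
print(logfile.span())            # window covers the entire run
print(logfile.ops, trace_bytes)  # trace size scales with operation count
```

The log file is idle for nearly all of that hour, but with only two
timestamps per file there is no way to tell that from the summary.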

==rob


> 
> Thanks again!
> David
> 
> On 05/07/2013 04:26 PM, Kevin Harms wrote:
> >>Does anyone have any insight on why the job summary seems to depict continuous IO operations? Could the fact that the files are held open even though no IO to disk is actually happening be the reason the graphs suggest continuous IO operations?
> >   Yes, the graph shows the first-to-last access time. So if you open a file, write to it, close it, and do lots of other stuff, then re-open, update, and close it again, you will see a continuous line. This graph "works" for checkpoint-type files, which are opened and used once, so you will see a clean split between the IO phases. Log-type files that are modified periodically during the run will look like what you have.
> >
> >kevin
> >

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

