[Darshan-users] Summary report of several darshan logfiles of a non-MPI job ?

Carns, Phil carns at mcs.anl.gov
Fri May 30 10:02:32 CDT 2025


Hi Niels,

I apologize for the delayed response.  We don't have a comprehensive solution for merging logs, but there are a few things you can try:


  * The latest Darshan release includes some new command line utilities to aid in summarizing data for a large number of log files.  They do not produce a report in the same form as a Darshan summary, but rather tables that summarize and rank logs or files in various ways.  They may provide the information you are looking for, or at least help you to focus on the most important logs.  You can find information about these tools at https://www.mcs.anl.gov/research/projects/darshan/docs/pydarshan/usage.html#other-darshan-cli-tools .
  * If you want a single summary report (as you would get from a single log), you might be able to use the "darshan-merge" command line tool to join multiple logs into a single log.  The darshan-merge utility was designed for a different use case, though (one in which a job terminates abruptly and we want to join fragments of a single log back together), and depending on the nature of the logs it won't necessarily behave as expected when joining truly independent logs.  If it works, though, the resulting unified log can be analyzed using the usual summary tool.
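
For the second option, here is a rough sketch of how the two steps could be scripted. It is only an illustration, not a tested recipe: the helper names (build_merge_command, build_summary_command, merge_and_summarize) are made up for this example, the "*.darshan" file pattern is an assumption about how your logs are named, and you should check "darshan-merge --help" on your installation to confirm that the "--output" option matches your version. By default the sketch only returns the commands it would run, so you can inspect them first.

```python
import glob
import os
import subprocess
import sys


def build_merge_command(log_paths, merged_path):
    """Return the argv list that joins several logs with darshan-merge."""
    if not log_paths:
        raise ValueError("no Darshan logs to merge")
    # Sorting keeps the command reproducible from run to run.
    return ["darshan-merge", "--output", merged_path, *sorted(log_paths)]


def build_summary_command(log_path):
    """Return the argv list for the usual PyDarshan summary report."""
    return [sys.executable, "-m", "darshan", "summary", log_path]


def merge_and_summarize(log_dir, merged_path, dry_run=True):
    """Merge every *.darshan log in log_dir, then summarize the result.

    With dry_run=True (the default) nothing is executed; the two
    commands are only returned for inspection.
    """
    merged_abs = os.path.abspath(merged_path)
    log_paths = [
        path
        for path in glob.glob(os.path.join(log_dir, "*.darshan"))
        # Skip a previous merge result if it lives in the same directory.
        if os.path.abspath(path) != merged_abs
    ]
    commands = [
        build_merge_command(log_paths, merged_path),
        build_summary_command(merged_path),
    ]
    if not dry_run:
        for command in commands:
            # check=True stops at the first failing step, for example if
            # darshan-merge rejects logs that come from independent processes.
            subprocess.run(command, check=True)
    return commands
```

Calling merge_and_summarize("logs", "merged.darshan") returns the two commands without running anything; pass dry_run=False to execute them. As noted above, darshan-merge may not behave as expected on truly independent logs, so it is worth comparing the merged report against one or two of the per-process reports.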

Thank you for pointing out this use case though; we have received questions about this before.  It would be a great feature to add to Darshan, ideally by generalizing the darshan-merge utility to make sure that it handles more cases.

thanks,
-Phil

________________________________
From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on behalf of OGER Niels <niels.oger at meteo.fr>
Sent: Friday, May 23, 2025 1:56 PM
To: darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
Subject: [Darshan-users] Summary report of several darshan logfiles of a non-MPI job ?

Hello,

I'm trying to use Darshan for the first time to get the I/O profile of several kinds of jobs (Fortran and Python).
For the Fortran jobs I have no issue: I get one Darshan logfile per execution, and I can build a report with the command python -m darshan summary "log.darshan".

For the Python jobs, I use export DARSHAN_ENABLE_NONMPI=1 before running the job.
My Python script uses multiprocessing to read several files in parallel. I get one darshan logfile for each sub-process at the end of the execution.

I have tried to use "python -m darshan summary" to get a report merging all the logfiles, but I did not manage to find the right syntax. I am wondering whether what I want to do is even possible with PyDarshan.
I tried using the "--include_names" argument but it seems to be a filter on the files accessed by the job and not on the log files themselves.

Can someone confirm whether or not it is possible to 'merge' several logfiles into one report?

best regards,
Niels
--
----- Météo-France -----
OGER NIELS
DSI/D - Chef de projet Calcul Intensif
niels.oger at meteo.fr
Fixe : +33 561078198