[Darshan-users] profiling DASK application using darshan

Snyder, Shane ssnyder at mcs.anl.gov
Wed Oct 21 15:37:42 CDT 2020


Hi Razvan,

Just wanted to clarify the note you asked about in your initial email: I think all Python programs count as dynamically linked, because the Python interpreter itself is a dynamically-linked executable. For example:

shane at shane-x1-carbon ~ $ ldd `which python`
linux-vdso.so.1 (0x00007ffcadf26000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa3756eb000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa3756c8000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa3756c2000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fa3756bd000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fa3756a1000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fa375552000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa375caa000)

So, I think all of these Python frameworks should be the same in this regard and should work with Darshan's non-MPI instrumentation. That caveat in the documentation probably applies more to C applications that have been statically linked, in which case the Darshan library has no hooks to interpose itself between the application and the I/O libraries.
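
The same check can be sketched from Python. This is only an illustration, assuming a Linux system with glibc's ldd on the PATH. The function names are hypothetical, and the two message strings are the ones glibc's ldd usually prints for static binaries, so treat them as an assumption on other platforms.

```python
import subprocess
import sys


def is_dynamically_linked(ldd_output: str) -> bool:
    """Return True if ldd output lists at least one shared-object dependency."""
    text = ldd_output.lower()
    # Assumed messages: what glibc's ldd usually prints for static binaries.
    if "not a dynamic executable" in text or "statically linked" in text:
        return False
    # Dynamic binaries list shared objects, e.g. "libc.so.6 => /lib/... (0x...)".
    return any(".so" in line for line in ldd_output.splitlines())


def check_interpreter(path: str = sys.executable) -> bool:
    """Run ldd on a binary (defaults to the running Python interpreter)."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    # ldd may report static binaries on stderr, so inspect both streams.
    return is_dynamically_linked(result.stdout + result.stderr)
```

Calling check_interpreter() with no arguments inspects the interpreter that is running the script, which is the binary LD_PRELOAD has to hook into.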

Thanks,
--Shane
________________________________
From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on behalf of Razvan Stefanescu <razvan.stefanescu at spire.com>
Sent: Sunday, October 18, 2020 4:33 PM
To: Sandra A. Mendez <smendez.fi.unju at gmail.com>
Cc: darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
Subject: Re: [Darshan-users] profiling DASK application using darshan

Hello Sandra,

This is really helpful. Thank you so much!

Razvan

On Sun, Oct 18, 2020 at 1:01 PM Sandra A. Mendez <smendez.fi.unju at gmail.com> wrote:
Hi Razvan,
My Python programs were non-MPI, so I set the DARSHAN_ENABLE_NONMPI variable.
For a simple Python program on my local machine, I ran it as follows:
export LD_PRELOAD=$PATH_INSTALL/darshan/3.2.1-master/lib64/libdarshan.so
export DARSHAN_ENABLE_NONMPI=1
python images-cla.py

Another example in a cluster (tensorflow benchmark):
# I use my own Darshan installation, so I need to set LD_PRELOAD in the job submission.
export LD_PRELOAD=$PATH_INSTALL/darshan-runtime/3.2.1-master/lib/libdarshan.so
export DARSHAN_ENABLE_NONMPI=1
horovodrun -np $SLURM_NTASKS $HOSTS_FLAG --network-interface ib0 --gloo python3.7 scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model resnet101 --batch_size 64 --variable_update horovod
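
The two exports above can also be applied to a single child process from a small Python launcher. This is a minimal sketch and not part of Darshan: the helper names are hypothetical, and the library path is whatever your own installation provides.

```python
import os
import subprocess
import sys


def darshan_env(lib_path, base_env=None):
    """Return a copy of the environment with the non-MPI Darshan settings."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep any existing preloads; ld.so accepts colon-separated entries.
    existing = env.get("LD_PRELOAD")
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    env["DARSHAN_ENABLE_NONMPI"] = "1"
    return env


def run_instrumented(script, lib_path, *args):
    """Run a script under the current interpreter with Darshan preloaded."""
    cmd = [sys.executable, script, *args]
    # The modified environment is passed to the child process only.
    return subprocess.run(cmd, env=darshan_env(lib_path), check=False).returncode
```

For example, run_instrumented("images-cla.py", lib_path) with lib_path pointing at your libdarshan.so. Because the variables are set only for the child, the launcher itself and any other commands in the job script are not instrumented.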

Thanks,
Sandra.-

On Sun, 18 Oct 2020 at 20:30, Razvan Stefanescu <razvan.stefanescu at spire.com> wrote:
Hello Sandra,

What command did you actually use for tracing your python code? Did you use something like

mpiexec -n 4 -env LD_PRELOAD $PATH/libdarshan.so python script.py  ?

Thank you,

Razvan

On Sun, Oct 18, 2020 at 7:30 AM Sandra A. Mendez <smendez.fi.unju at gmail.com> wrote:
Hi Razvan,
I have traced a Python program using Darshan. You only need to set the DARSHAN_ENABLE_NONMPI variable before running your application:
export DARSHAN_ENABLE_NONMPI=1
One comment: the maximum number of files Darshan can trace is 1024 per parallel task. For Python applications, Darshan traces every file that is opened, not only the application's own files but also the files opened by the Python libraries. So take care if your application exceeds that limit.
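
To get a rough idea of how close a Python program comes to the 1024-file limit mentioned above, the files it opens can be counted from inside the interpreter. The sketch below uses Python's audit hooks (Python 3.8 or newer). The helper names are hypothetical. It only sees opens that go through Python's own audited functions, while Darshan intercepts calls at the C library level, so the count is an estimate and most likely a lower bound.

```python
import os
import sys

# Per-process file limit quoted in this thread; treated here as an assumption.
DARSHAN_FILE_LIMIT = 1024


def install_open_tracker():
    """Record every path passed to an audited open() call in this process."""
    seen = set()

    def hook(event, args):
        if event != "open":
            return
        try:
            seen.add(os.path.abspath(os.fsdecode(args[0])))
        except (TypeError, ValueError):
            pass  # e.g. open() called on an integer file descriptor

    sys.addaudithook(hook)  # hooks cannot be removed once installed
    return seen


def remaining_budget(seen, limit=DARSHAN_FILE_LIMIT):
    """Distinct files still available under the limit (negative if exceeded)."""
    return limit - len(seen)
```

Call install_open_tracker() at the very top of the script, before the heavy imports, and print remaining_budget(seen) at the end of the run. Files opened by library imports are counted too, which matches the behaviour described above.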
Thanks,
Sandra.-


On Sun, 18 Oct 2020 at 15:09, Razvan Stefanescu <razvan.stefanescu at spire.com> wrote:
Hello All,

Following the documentation note saying that Darshan instrumentation of non-MPI applications is only possible with dynamically-linked applications, it seems I have to convert the multiprocessing and multithreading DASK code to a dynamically-linked code. I know some folks were able to profile Spark and TensorFlow codes with Darshan, so I wonder if you could provide some suggestions about creating such dynamically-linked code.

Thank you,

Razvan

--
RAZVAN STEFANESCU
Head of Statistics and Machine Learning Branch

Senior Data Assimilation and Data Scientist

Spire Global, Inc.

1050 Walnut Street, Suite 402, Boulder, CO 80302 USA

+1-720-643-2231

+1-850-443-1718
_______________________________________________
Darshan-users mailing list
Darshan-users at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/darshan-users

