[Darshan-users] profiling DASK application using darshan

Sandra A. Mendez smendez.fi.unju at gmail.com
Sun Oct 18 14:00:57 CDT 2020


Hi Razvan,
My Python programs were non-MPI; for this reason I set the
DARSHAN_ENABLE_NONMPI variable.
For a simple Python program on my local machine, I ran it as follows:

export LD_PRELOAD=$PATH_INSTALL/darshan/3.2.1-master/lib64/libdarshan.so
export DARSHAN_ENABLE_NONMPI=1
python images-cla.py
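
For context, the whole session end to end might look like the sketch below. The library path, the log directory, and the location of the darshan-parser utility all depend on how darshan-runtime and darshan-util were configured at your site, so treat those paths as placeholders:

```shell
# Preload the Darshan runtime library (path depends on your installation)
export LD_PRELOAD=$PATH_INSTALL/darshan/3.2.1-master/lib64/libdarshan.so

# Enable instrumentation of non-MPI executables
export DARSHAN_ENABLE_NONMPI=1

# Run the (dynamically-linked) Python interpreter under Darshan
python images-cla.py

# A *.darshan log is written to the log directory chosen when
# darshan-runtime was configured; inspect it with darshan-parser
darshan-parser /path/to/darshan-logs/2020/10/18/*.darshan | less
```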


Another example on a cluster (a TensorFlow benchmark):
# I use my own Darshan installation, so I need to set LD_PRELOAD in the job
# submission script.

export LD_PRELOAD=$PATH_INSTALL/darshan-runtime/3.2.1-master/lib/libdarshan.so
export DARSHAN_ENABLE_NONMPI=1
horovodrun -np $SLURM_NTASKS $HOSTS_FLAG --network-interface ib0 --gloo \
    python3.7 scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py \
    --model resnet101 --batch_size 64 --variable_update horovod
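
One more note on the per-task file limit I mentioned in my earlier message (quoted below): because Darshan records every file the process touches, including those opened by the interpreter and its libraries, a Python application can approach the limit faster than you might expect. On Linux, a quick way to see that a bare interpreter already holds open files of its own is to list /proc/self/fd:

```python
import os

# On Linux, /proc/self/fd lists the file descriptors this process
# currently holds open; even a bare interpreter holds several
# (stdin/stdout/stderr plus anything kept open by libraries).
fds = os.listdir("/proc/self/fd")
print(f"descriptors currently open: {len(fds)}")
```

This only shows descriptors open at one moment, not everything opened over the run (which is what Darshan counts), but it illustrates that your script's own files are not the whole picture.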


Thanks,
Sandra.-

On Sun, 18 Oct 2020 at 20:30, Razvan Stefanescu <razvan.stefanescu at spire.com>
wrote:

> Hello Sandra,
>
> What command did you actually use for tracing your python code? Did you
> use something like
>
> mpiexec -n 4 -env LD_PRELOAD $PATH/libdarshan.so python script.py  ?
>
> Thank you,
>
> Razvan
>
> On Sun, Oct 18, 2020 at 7:30 AM Sandra A. Mendez <
> smendez.fi.unju at gmail.com> wrote:
>
>> Hi Razvan,
>> I have traced a Python program using Darshan. You only need to set the
>> DARSHAN_ENABLE_NONMPI variable before running your application:
>>
>> export DARSHAN_ENABLE_NONMPI=1
>>
>> One comment: the maximum number of files Darshan traces is 1024 per
>> parallel task. For Python applications, Darshan traces all files opened,
>> not only the application's own files but also the files opened by the
>> Python libraries. So take care that your application does not exceed
>> that limit.
>> Thanks,
>> Sandra.-
>>
>>
>> On Sun, 18 Oct 2020 at 15:09, Razvan Stefanescu <
>> razvan.stefanescu at spire.com> wrote:
>>
>>> Hello All,
>>>
>>> Following the documentation note saying that Darshan instrumentation of
>>> non-MPI applications is only possible for dynamically-linked applications,
>>> it seems I have to convert my multiprocessing and multithreading DASK code
>>> into a dynamically-linked executable. I know some folks were able to
>>> profile Spark and TensorFlow codes with Darshan, so I wonder if you could
>>> provide some suggestions on creating such a dynamically-linked executable.
>>>
>>> Thank you,
>>>
>>> Razvan
>>>
>>> --
>>> RAZVAN STEFANESCU
>>> Head of Statistics and Machine Learning Branch
>>>
>>> Senior Data Assimilation and Data Scientist
>>>
>>> Spire Global, Inc.
>>>
>>> 1050 Walnut Street, Suite 402, Boulder, CO 80302 USA
>>>
>>> +1-720-643-2231
>>> +1-850-443-1718
>>> _______________________________________________
>>> Darshan-users mailing list
>>> Darshan-users at lists.mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users
>>>
>>
>
> --
> RAZVAN STEFANESCU
> Head of Statistics and Machine Learning Branch
>
> Senior Data Assimilation and Data Scientist
>
> Spire Global, Inc.
>
> 1050 Walnut Street, Suite 402, Boulder, CO 80302 USA
>
> +1-720-643-2231
> +1-850-443-1718
>

