[Darshan-users] profiling DASK application using darshan

Razvan Stefanescu razvan.stefanescu at spire.com
Sun Oct 18 16:33:24 CDT 2020


Hello Sandra,

This is really helpful. Thank you so much!

Razvan

On Sun, Oct 18, 2020 at 1:01 PM Sandra A. Mendez <smendez.fi.unju at gmail.com>
wrote:

> Hi Razvan,
> My Python programs were non-MPI, so I set the DARSHAN_ENABLE_NONMPI
> variable.
> For a simple Python program on my local machine, I ran the following:
>
> export LD_PRELOAD=$PATH_INSTALL/darshan/3.2.1-master/lib64/libdarshan.so
> export DARSHAN_ENABLE_NONMPI=1
> python images-cla.py
>
>
> Another example on a cluster (TensorFlow benchmark):
> # I use my own Darshan installation, so I need to set LD_PRELOAD in the
> # job submission.
>
> export LD_PRELOAD=$PATH_INSTALL/darshan-runtime/3.2.1-master/lib/libdarshan.so
> export DARSHAN_ENABLE_NONMPI=1
> horovodrun -np $SLURM_NTASKS $HOSTS_FLAG --network-interface ib0 --gloo \
>     python3.7 scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py \
>     --model resnet101 --batch_size 64 --variable_update horovod
>
>
> Thanks,
> Sandra.-
>
> On Sun, 18 Oct 2020 at 20:30, Razvan Stefanescu <
> razvan.stefanescu at spire.com> wrote:
>
>> Hello Sandra,
>>
>> What command did you actually use for tracing your python code? Did you
>> use something like
>>
>> mpiexec -n 4 -env LD_PRELOAD $PATH/libdarshan.so python script.py  ?
>>
>> Thank you,
>>
>> Razvan
>>
>> On Sun, Oct 18, 2020 at 7:30 AM Sandra A. Mendez <
>> smendez.fi.unju at gmail.com> wrote:
>>
>>> Hi Razvan,
>>> I have traced a python program by using Darshan. You only need to set up
>>> the DARSHAN_ENABLE_NONMPI variable as follows:
>>>
>>> export DARSHAN_ENABLE_NONMPI=1
>>>
>>> before running your application. One caveat: Darshan can trace at most
>>> 1024 files per parallel task. For Python applications, Darshan traces
>>> every file that is opened, including files opened by the Python
>>> libraries and not only the application's own files. So take care if
>>> your application exceeds that limit.
>>> Thanks,
>>> Sandra.-
>>>
>>>
>>> On Sun, 18 Oct 2020 at 15:09, Razvan Stefanescu <
>>> razvan.stefanescu at spire.com> wrote:
>>>
>>>> Hello All,
>>>>
>>>> Following the documentation note saying that Darshan instrumentation of
>>>> non-MPI applications is only possible with dynamically-linked applications,
>>>> it seems I have to convert the multiprocessing and multithreading DASK code
>>>> to a dynamically-linked code. I know some folks were able to profile Spark
>>>> and TensorFlow codes with Darshan, so I wonder if you could provide some
>>>> suggestions about creating such dynamically-linked code.
>>>>
>>>> Thank you,
>>>>
>>>> Razvan
>>>>
>>>> --
>>>> *RAZVAN STEFANESCU *
>>>> Head of Statistics and Machine Learning Branch
>>>>
>>>> Senior Data Assimilation and Data Scientist
>>>>
>>>> *Spire Global, Inc.*
>>>>
>>>> 1050 Walnut Street, Suite 402, Boulder, CO 80302 USA
>>>>
>>>> +1-720-643-2231
>>>> +1-850-443-1718
>>>> _______________________________________________
>>>> Darshan-users mailing list
>>>> Darshan-users at lists.mcs.anl.gov
>>>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users
>>>>
>>>
>>
>
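
Both recipes quoted above export LD_PRELOAD for the whole shell session, so
Darshan is also preloaded into every dynamically linked command run
afterwards in that shell. A minimal sketch of a narrower variant follows,
assuming a POSIX shell. The names run_with_darshan and DARSHAN_LIB are
hypothetical helpers for this sketch and are not part of Darshan; the default
path simply reuses the one from the first example in the thread.

```shell
# Location of the Darshan runtime library. DARSHAN_LIB is a hypothetical
# variable for this sketch; the default reuses the path quoted in the thread.
DARSHAN_LIB="${DARSHAN_LIB:-$PATH_INSTALL/darshan/3.2.1-master/lib64/libdarshan.so}"

# run_with_darshan CMD [ARGS...]
# Runs a single command with Darshan preloaded in non-MPI mode and leaves
# the environment of the calling shell untouched.
run_with_darshan() {
    # Fail early with a clear message if the library path is wrong.
    if [ ! -f "$DARSHAN_LIB" ]; then
        echo "libdarshan.so not found at: $DARSHAN_LIB" >&2
        return 1
    fi
    # env sets both variables for this one command only.
    env LD_PRELOAD="$DARSHAN_LIB" DARSHAN_ENABLE_NONMPI=1 "$@"
}

# Example (script name taken from the thread):
#   run_with_darshan python images-cla.py
```

Setting the variables through env rather than export keeps the
instrumentation limited to the one program being profiled, which may also
help with the per-task file limit mentioned earlier in the thread, since
unrelated helper commands are not traced.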
