[Darshan-users] [EXTERNAL] Multiple MPI libraries on one system?
Christopher J. Morrone
morrone2 at llnl.gov
Fri Mar 6 15:54:10 CST 2020
Hi Philip,
That could very well be our best option. We already have local custom
mpicc wrappers and force everything to be rpath'ed. As far as I
understand it so far, that is how we manage to let MPI applications run
correctly even when the lmod environment is set wrong at run time: the
environment is largely ignored because of heavy rpath usage.
Thank you!
Chris
On 3/6/20 8:31 AM, Carns, Philip H. wrote:
> If you are interested in experimenting, I have another (not well tested)
> idea that you could try out.
>
> If you can get Darshan link options in the link command line early
> enough, it might be possible to inject Darshan instrumentation into
> dynamically linked executables, and tie it to a particular darshan build
> that matches the MPI used at link time, without using LD_PRELOAD at all.
>
> Under the covers (within the mpi compiler wrapper or similar) this
> would look something like this:
>
> cc foo.c -o foo \
> -Wl,-rpath=<darshan lib dir path> \
> -L<darshan lib dir path> \
> -ldarshan \
> <whatever other link options and libs would normally be there>
>
>
> Putting the darshan library path in rpath for the executable will ensure
> that when the executable is run it pulls in the appropriate darshan
> library build that was loaded at link time (regardless of the runtime
> environment).
>
> If -ldarshan is placed before the other libraries then (hopefully; this
> is the part that particularly needs testing) Darshan's symbols for
> open(), MPI_File_open(), etc. will be chosen by the loader at run time
> *before* the underlying system implementations of those functions.
> Darshan will then find the correct "real" functions to pass through to
> using dlsym(RTLD_NEXT, ...). If Darshan is too late in the link command
> line then I believe it would miss intercepting the function calls (the
> app will still run, so it is not dangerous, but it also would not
> capture the desired instrumentation).
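
For readers unfamiliar with the pass-through technique described above,
here is a minimal sketch of an interposed open() that forwards to the
next definition in search order via dlsym(RTLD_NEXT, ...). It is not
Darshan's actual code: the counter wrapped_open_calls and the typedef
open_fn are hypothetical names used only for illustration, and the
sketch assumes a glibc-style dynamic linker.

```c
#define _GNU_SOURCE          /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical names for this sketch only; not part of Darshan. */
typedef int (*open_fn)(const char *, int, ...);
long wrapped_open_calls = 0;

/* Because this definition is found before libc's in symbol search
 * order, callers reach it first; it then forwards to the "real" one. */
int open(const char *path, int flags, ...)
{
    static open_fn real_open = NULL;
    mode_t mode = 0;

    /* The third argument is only present when a file may be created. */
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    /* Look up the next definition of open() after this object. */
    if (real_open == NULL)
        real_open = (open_fn)dlsym(RTLD_NEXT, "open");
    if (real_open == NULL) {
        errno = ENOSYS;      /* no underlying implementation was found */
        return -1;
    }

    wrapped_open_calls++;    /* stand-in for real instrumentation */
    return real_open(path, flags, mode);
}
```

Built into a shared library (for example with cc -shared -fPIC, plus
-ldl on older glibc) and listed ahead of libc on the link line, a
wrapper like this would be picked up without LD_PRELOAD, which is the
ordering question raised above.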
>
> You or someone else on the list might be able to point out why this
> won't work or find a counter example. I've only tried a few cases so
> far by manually constructing link lines on one system, definitely not
> enough to have confidence it will work on other systems or in other
> scenarios.
>
> At any rate, I would like to find a clean solution for this (either
> getting Darshan in at link time, or finding a reasonably portable way to
> get LD_PRELOAD right in the job environment) on dynamic linking
> systems. We haven't settled on a great general purpose solution yet;
> it's been mostly system-specific.
>
> thanks,
> -Phil
>
>
> ------------------------------------------------------------------------
> *From:* Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on
> behalf of Christopher J. Morrone <morrone2 at llnl.gov>
> *Sent:* Thursday, March 5, 2020 6:14 PM
> *To:* Curry, Matthew Leon <mlcurry at sandia.gov>;
> darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
> *Subject:* Re: [Darshan-users] [EXTERNAL] Multiple MPI libraries on one
> system?
>
> Crud, it sounds like our applications here often do not have their lmod
> module environment set up to match what they were built against. So it
> looks like I'll need to set things up the hard way.
>
> Chris
>
> On 3/5/20 2:37 PM, Christopher J. Morrone wrote:
>> Oh, great! Hopefully I can exploit lmod in the same way. Thanks!
>>
>> Chris
>>
>> On 3/5/20 1:56 PM, Curry, Matthew Leon wrote:
>>> Hi Chris,
>>>
>>> Our ThunderX2 systems have openmpi3, openmpi4, and hpempi. The general strategy is as you say: We build Darshan's runtime with each MPI (and some recent version of the GNU compiler). In the end, we have created one darshan-runtime package for each MPI, and a single darshan-util package that includes the analysis tools.
>>>
>>> Our lmod modules for our MPIs change the MODULEPATH to point to software compiled for that MPI. When we do a "module swap" to change MPIs, lmod is smart enough to change out the darshan module for one that is in the new MODULEPATH. We manage LD_PRELOAD in the Darshan module files, so it gets updated automatically.
>>>
>>> Matthew
>>>
>>>
>>> -----Original Message-----
>>> From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> On Behalf Of Christopher J. Morrone
>>> Sent: Thursday, March 5, 2020 2:48 PM
>>> To: darshan-users at lists.mcs.anl.gov
>>> Subject: [EXTERNAL] [Darshan-users] Multiple MPI libraries on one system?
>>>
>>> Has anyone out there dealt with using darshan on a system that has multiple MPI libraries installed? Since various MPIs are not necessarily ABI compatible, I presume that I'll need multiple builds of darshan, and a mechanism to select the correct darshan LD_PRELOAD based on which MPI any particular application is using.
>>>
>>> Any tips for making this easier?
>>>
>>> Chris
>>> _______________________________________________
>>> Darshan-users mailing list
>>> Darshan-users at lists.mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users
>