[Darshan-users] [EXTERNAL] Multiple MPI libraries on one system?
Carns, Philip H.
carns at mcs.anl.gov
Fri Mar 6 15:03:58 CST 2020
Another quick note. This part probably isn't that helpful, but I'll throw it out there anyway in case it spurs an idea. If you can link in Darshan as a dynamic library but can't guarantee its order in the link line, then you can link it in any order and promote it at run time like this:
LD_PRELOAD=libdarshan.so
Note that there is no path there, meaning that there is no need to make sure the link-time library matches the run-time library. It just promotes whatever libdarshan.so is already in the library path for that executable (including its rpath).
The reason it's not that helpful is what happens if you set that LD_PRELOAD on an executable that does not have libdarshan.so in its link path (i.e. just a normal MPI executable). In that case ld.so will print a warning to stderr. If it failed quietly, that would be better for our use case, but I understand why that's not the ld.so default 🙂
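
One possible way around that warning, as a rough sketch only: check the executable's dependencies first and set LD_PRELOAD only when libdarshan.so is actually among them. The helper names has_darshan and run_with_darshan are hypothetical, and the use of ldd for the check is an assumption (ldd should not be run on untrusted binaries).

```shell
#!/bin/sh
# Sketch only: has_darshan and run_with_darshan are hypothetical helper names.

# Succeeds if the text on stdin (for example `ldd` output) mentions libdarshan.so.
has_darshan() {
    grep -q 'libdarshan\.so'
}

# Run a command, promoting libdarshan.so only if the executable already links it.
# The first argument is expected to be a path to the executable.
run_with_darshan() {
    if ldd "$1" 2>/dev/null | has_darshan; then
        LD_PRELOAD=libdarshan.so "$@"
    else
        "$@"
    fi
}

# Example: run_with_darshan ./foo arg1 arg2
```

An executable that was not linked against Darshan simply runs unmodified, so ld.so never sees a preload it cannot resolve.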
thanks,
-Phil
________________________________
From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on behalf of Carns, Philip H. <carns at mcs.anl.gov>
Sent: Friday, March 6, 2020 11:31 AM
To: Christopher J. Morrone <morrone2 at llnl.gov>; Curry, Matthew Leon <mlcurry at sandia.gov>; darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
Subject: Re: [Darshan-users] [EXTERNAL] Multiple MPI libraries on one system?
If you are interested in experimenting, I have another (not well tested) idea that you could try out.
If you can get the Darshan link options into the link command line early enough, it might be possible to inject Darshan instrumentation into dynamically linked executables, and tie it to a particular Darshan build that matches the MPI used at link time, without using LD_PRELOAD at all.
Under the covers (within the MPI compiler wrapper or similar) this would look something like this:
cc foo.c -o foo \
-Wl,-rpath=<darshan lib dir path> \
-L<darshan lib dir path> \
-ldarshan \
<whatever other link options and libs would normally be there>
Putting the Darshan library path in the rpath of the executable will ensure that when the executable is run it pulls in the same Darshan library build that was used at link time (regardless of the runtime environment).
If -ldarshan comes before the other libraries (this is the part that particularly needs testing), then hopefully Darshan's symbols for open(), MPI_File_open(), etc. will be chosen by the loader at run time *before* the underlying system implementations of those functions. Darshan will then find the correct "real" functions to pass through to using dlsym(RTLD_NEXT, ...). If Darshan is too late in the link command line, then I believe it would miss intercepting the function call (the app will still run, so it's not dangerous, but it also won't capture the desired instrumentation).
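
One way to inspect what the linker actually recorded, as a sketch under assumptions: read the dynamic section with `readelf -d` and check that the libdarshan NEEDED entry comes before the MPI library and libc entries. The function name darshan_before_mpi is hypothetical, and the library name patterns (libmpi, libc.so) are assumptions that may not match every MPI. This only reports the recorded order; whether that order is enough for interception is exactly the part that still needs testing.

```shell
#!/bin/sh
# Sketch only: darshan_before_mpi is a hypothetical helper name.

# Reads `readelf -d <exe>` output on stdin. Succeeds only if a libdarshan
# NEEDED entry exists and precedes the first libmpi or libc NEEDED entry.
darshan_before_mpi() {
    awk '
        /\(NEEDED\)/ && /\[libdarshan/        { if (!d) d = NR }
        /\(NEEDED\)/ && /\[(libmpi|libc\.so)/ { if (!m) m = NR }
        END { exit !(d && (!m || d < m)) }
    '
}

# Example:
#   readelf -d ./foo | darshan_before_mpi && echo "darshan is listed first"
```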
You or someone else on the list might be able to point out why this won't work or find a counterexample. I've only tried a few cases so far by manually constructing link lines on one system, which is definitely not enough to have confidence it will work on other systems or in other scenarios.
At any rate, I would like to find a clean solution for this (either getting Darshan in at link time, or finding a reasonably portable way to get LD_PRELOAD right in the job environment) on dynamic linking systems. We haven't settled on a great general purpose solution yet; it's been mostly system-specific.
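
For the LD_PRELOAD route, a site-specific sketch of choosing the Darshan build from the MPI library an executable resolves to might look like the following. Everything about the layout is an assumption: the function name darshan_lib_for, the DARSHAN_ROOT default, and the idea that each MPI install path contains its name (openmpi3, openmpi4, hpempi, borrowed from the systems mentioned elsewhere in this thread). It matches on the resolved path rather than the library file name because different MPI builds can share the same library name.

```shell
#!/bin/sh
# Sketch only: the install layout and all names below are hypothetical.
DARSHAN_ROOT=${DARSHAN_ROOT:-/opt/darshan}

# Reads `ldd <exe>` output on stdin and prints the path of the Darshan
# library built for the MPI that the executable resolves to.
# Returns non-zero if no known MPI is found.
darshan_lib_for() {
    mpi_path=$(awk '/libmpi/ && $2 == "=>" { print $3; exit }')
    case $mpi_path in
        */openmpi3/*) echo "$DARSHAN_ROOT/openmpi3/lib/libdarshan.so" ;;
        */openmpi4/*) echo "$DARSHAN_ROOT/openmpi4/lib/libdarshan.so" ;;
        */hpempi/*)   echo "$DARSHAN_ROOT/hpempi/lib/libdarshan.so" ;;
        *)            return 1 ;;
    esac
}

# Example:
#   if lib=$(ldd ./app | darshan_lib_for); then
#       LD_PRELOAD=$lib ./app
#   else
#       ./app
#   fi
```

Note that ldd resolves libraries using the current environment, so this inherits the same weakness when the job environment does not match the build environment.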
thanks,
-Phil
________________________________
From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on behalf of Christopher J. Morrone <morrone2 at llnl.gov>
Sent: Thursday, March 5, 2020 6:14 PM
To: Curry, Matthew Leon <mlcurry at sandia.gov>; darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
Subject: Re: [Darshan-users] [EXTERNAL] Multiple MPI libraries on one system?
Crud, it sounds like here our applications often do not have their lmod
module environment set up to match what their application was built
against. So it looks like I'll need to set things up the hard way.
Chris
On 3/5/20 2:37 PM, Christopher J. Morrone wrote:
> Oh, great! Hopefully I can exploit lmod in the same way. Thanks!
>
> Chris
>
> On 3/5/20 1:56 PM, Curry, Matthew Leon wrote:
>> Hi Chris,
>>
>> Our ThunderX2 systems have openmpi3, openmpi4, and hpempi. The general strategy is as you say: We build Darshan's runtime with each MPI (and some recent version of the GNU compiler). In the end, we have created one darshan-runtime package for each MPI, and a single darshan-util package that includes the analysis tools.
>>
>> Our lmod modules for our MPIs change the MODULEPATH to point to software compiled for that MPI. When we do a "module swap" to change MPIs, lmod is smart enough to change out the darshan module for one that is in the new MODULEPATH. We manage LD_PRELOAD in the Darshan module files, so it gets updated automatically.
>>
>> Matthew
>>
>>
>> -----Original Message-----
>> From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> On Behalf Of Christopher J. Morrone
>> Sent: Thursday, March 5, 2020 2:48 PM
>> To: darshan-users at lists.mcs.anl.gov
>> Subject: [EXTERNAL] [Darshan-users] Multiple MPI libraries on one system?
>>
>> Has anyone out there dealt with using darshan on a system that has multiple MPI libraries installed? Since various MPIs are not necessarily ABI compatible, I presume that I'll need multiple builds of darshan, and a mechanism to select the correct darshan LD_PRELOAD based on which MPI any particular application is using.
>>
>> Any tips for making this easier?
>>
>> Chris
>> _______________________________________________
>> Darshan-users mailing list
>> Darshan-users at lists.mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users