[Darshan-users] LD_PRELOAD system() call

Cristian Simarro cristian.simarro at ecmwf.int
Thu Apr 30 02:07:08 CDT 2015


Hi Phil,

Actually, any command run through system() triggers the problem. The spawned process does not finish, so the task that issued the call hangs in waitpid().

This example hangs when Darshan is loaded through the LD_PRELOAD mechanism:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main (int argc, char *argv[])
{
  int rank, size;
  int ret = 0;

  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  MPI_Comm_size (MPI_COMM_WORLD, &size);
  if (rank == 0) {
    /* The child spawned here inherits LD_PRELOAD; with Darshan preloaded
       it never finishes, so this call does not return. */
    ret = system("echo calling system");
    printf("system() returned %d\n", ret);
  }
  printf("Hello world from process %d of %d\n", rank, size);
  MPI_Finalize();
  return 0;
}
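
One possible application-side workaround, sketched here as an assumption and not tested against Darshan, is to clear LD_PRELOAD just around the system() call so the child does not inherit it. The helper name system_without_preload is hypothetical and not part of Darshan or MPI:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() and unsetenv() */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: run `command` through system() with LD_PRELOAD
 * removed from the environment, then restore the previous value.
 * Returns whatever system() returns, or -1 if memory allocation fails. */
int system_without_preload(const char *command)
{
    const char *current = getenv("LD_PRELOAD");
    char *saved = NULL;
    int status;

    if (current != NULL) {
        /* Copy first: the pointer from getenv() may become invalid
         * once the variable is unset. */
        saved = malloc(strlen(current) + 1);
        if (saved == NULL)
            return -1;
        strcpy(saved, current);
        unsetenv("LD_PRELOAD");
    }

    status = system(command);

    if (saved != NULL) {
        /* Put the original value back for the calling process. */
        setenv("LD_PRELOAD", saved, 1);
        free(saved);
    }
    return status;
}
```

In the example above, the call on rank 0 would become ret = system_without_preload("echo calling system");. The libraries already loaded into the calling process are not affected by changing the variable; only the child's environment differs.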

Thanks,
Cristian

------------------------------------------------------------------
Cristian Simarro
Analyst, User Support Section
European Centre for Medium-Range Weather Forecasts (ECMWF)
Shinfield Park, Reading, RG2 9AX, United Kingdom
Tel:    (+44 118) 9499315                Fax:    (+44 118) 9869450
E-mail: Cristian.Simarro at ecmwf.int            http://www.ecmwf.int
------------------------------------------------------------------

----- Original Message -----
From: "Phil Carns" <carns at mcs.anl.gov>
To: darshan-users at lists.mcs.anl.gov
Sent: Wednesday, 29 April, 2015 10:13:54 PM
Subject: Re: [Darshan-users] LD_PRELOAD system() call

On 04/29/2015 02:54 PM, Phil Carns wrote:
> On 04/29/2015 12:17 PM, Cristian Simarro wrote:
>> Hello,
>>
>> We have been facing some problems with system() calls inside some 
>> C/Fortran codes on our Cray machine.
>>
>> The method used here is to compile dynamically and then use LD_PRELOAD. 
>> When the code calls system(command), execution hangs if Darshan is 
>> preloaded, because it is trying to instrument an internal system 
>> read() with no initialization.
>>
>> The solution we have designed is to unset LD_PRELOAD (if set before) 
>> in the darshan_mpi_initialize function.
>>
>> Has anybody found a similar problem with LD_PRELOAD + system() calls?
> Hi Cristian,
>
> I don't think I've seen this exact combination before, but it seems 
> like something we should be able to reproduce and isolate.
>
> If I understand correctly, it sounds like the underlying process 
> spawned by system() is inheriting the LD_PRELOAD environment variable 
> from the parent program, and it is the underlying process that is 
> getting hung?  If so, does it matter what you run in the system() call 
> or does it seem like pretty much anything triggers it?
>
> thanks,
> -Phil

The solution you have suggested (unsetting LD_PRELOAD programmatically 
during Darshan initialization) might not be a bad long term solution, 
maybe with some extra safety logic to make sure we don't accidentally 
unset unrelated LD_PRELOAD entries.  I imagine that once the application 
has gotten to darshan initialization, then the loader has already 
processed the LD_PRELOAD environment variable and we don't need to keep 
it set any longer.  That would help keep it from interfering with child 
processes.
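
As a rough illustration of that safety logic, the sketch below drops only the LD_PRELOAD entries that contain a given substring and leaves the rest in place. It is not Darshan's actual code: the function names filter_preload_list and scrub_ld_preload are hypothetical, and matching on a substring such as "libdarshan" is an assumption about how the Darshan entry could be recognized.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() and unsetenv() */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `value` (an LD_PRELOAD-style list whose entries
 * are separated by colons or spaces) into a newly allocated string, dropping
 * every entry that contains `needle`. Kept entries are joined with ':'.
 * The caller frees the result. Returns NULL if allocation fails. */
char *filter_preload_list(const char *value, const char *needle)
{
    size_t nlen = strlen(needle);
    char *out = malloc(strlen(value) + 1);  /* result is never longer */
    size_t used = 0;
    const char *p = value;

    if (out == NULL)
        return NULL;

    while (*p != '\0') {
        size_t n = strcspn(p, ": ");        /* length of the current entry */
        if (n > 0) {
            int match = 0;
            size_t i;
            /* Search for needle inside this entry only. */
            for (i = 0; nlen > 0 && i + nlen <= n; i++) {
                if (memcmp(p + i, needle, nlen) == 0) {
                    match = 1;
                    break;
                }
            }
            if (!match) {
                if (used > 0)
                    out[used++] = ':';
                memcpy(out + used, p, n);
                used += n;
            }
        }
        p += n;
        if (*p != '\0')
            p++;                            /* skip the separator */
    }
    out[used] = '\0';
    return out;
}

/* Hypothetical helper: remove entries containing `needle` from LD_PRELOAD
 * in the current environment; unset the variable if nothing is left.
 * Returns 0 on success, -1 on failure. */
int scrub_ld_preload(const char *needle)
{
    const char *current = getenv("LD_PRELOAD");
    char *filtered;
    int rc;

    if (current == NULL)
        return 0;
    filtered = filter_preload_list(current, needle);
    if (filtered == NULL)
        return -1;
    rc = (filtered[0] == '\0') ? unsetenv("LD_PRELOAD")
                               : setenv("LD_PRELOAD", filtered, 1);
    free(filtered);
    return rc;
}
```

Something like scrub_ld_preload("libdarshan") could be called during initialization; unrelated preloaded libraries would then still be inherited by child processes.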

We would definitely need to do some testing to confirm, though.

thanks,
-Phil

