[Darshan-users] LD_PRELOAD system() call
Phil Carns
carns at mcs.anl.gov
Sun May 31 08:40:57 CDT 2015
Hi Cristian,
I just tried to narrow this down further again, but Edison is now
running craype version 2.3.1 and the behavior seems to have changed.
There is no longer a hang, and the system() call executes correctly. I
get the following error messages in stderr, though (in addition to the
wrapper output):
Sun May 31 06:30:30 2015: [unset]:_pmi_alps_sync:alps response not OKAY
Sun May 31 06:30:30 2015: [unset]:_pmiu_daemon:_pmi_alps_sync failed
Sun May 31 06:30:30 2015: [PE_0]:_pmi_daemon_barrier:PE pipe read failed from daemon errno = Success
Sun May 31 06:30:30 2015: [PE_0]:_pmi_init:_pmi_daemon_barrier returned -1
-Phil
On 05/06/2015 04:47 PM, Carns, Philip H. wrote:
> Hi Cristian,
>
> Here are the results from an execution on Edison, with that fprintf
> added just before the existing one. I think line 260 is probably the
> last valid line before the hang (the rest are messages printed after the
> scheduler starts trying to kill the job).
>
> thanks,
> -Phil
>
> On 05/06/2015 02:33 PM, Phil Carns wrote:
>> On 05/05/2015 07:09 AM, Cristian Simarro wrote:
>>> Hi Phil,
>>>
>>> I have tried your test on our Cray but I cannot reproduce the
>>> behaviour. In fact, the test program finishes properly.
>> Interesting! I assume it is printing output showing that the wrappers
>> were triggered, though, right?
>>
>> I'm attaching my job script just in case there is any difference there
>> in how we are executing the test case. I think I probably compiled
>> the example program with Intel compilers, while the read-wrapper
>> library was compiled with GNU, though I wouldn't think that part would
>> matter here. The system is running craype 2.2.1.
>>
>>
>>
>>> Can you add traceback information about the call?
>>> Dl_info dli;
>>>
>>> original_read = dlsym(RTLD_NEXT, "read");
>>> /* dladdr() returns 0 on failure and may leave dli_sname NULL */
>>> if (dladdr(original_read, &dli) != 0)
>>>     fprintf(stderr, "debug trace [%d]: %s "
>>>         "called by %p [ %s(%p) %s(%p) ].\n",
>>>         getpid(), __func__,
>>>         __builtin_return_address(0),
>>>         strrchr(dli.dli_fname, '/') ?
>>>             strrchr(dli.dli_fname, '/') + 1 :
>>>             dli.dli_fname,
>>>         dli.dli_fbase,
>>>         dli.dli_sname ? dli.dli_sname : "?",
>>>         dli.dli_saddr);
>> Sure, I'll give that a try and report back. Thanks for the example; I
>> was thinking that a backtrace might be very helpful but I wasn't sure
>> how to best go about collecting it :)
>>
>> thanks,
>> -Phil
>>
>>> Best regards,
>>> Cristian
>>>
>>> ------------------------------------------------------------------
>>> Cristian Simarro
>>> Analyst, User Support Section
>>> European Centre for Medium-Range Weather Forecasts (ECMWF)
>>> Shinfield Park, Reading, RG2 9AX, United Kingdom
>>> Tel: (+44 118) 9499315 Fax: (+44 118) 9869450
>>> E-mail: Cristian.Simarro at ecmwf.int http://www.ecmwf.int
>>> ------------------------------------------------------------------
>>>
>>> ----- Original Message -----
>>> From: "Phil Carns" <carns at mcs.anl.gov>
>>> To: darshan-users at lists.mcs.anl.gov
>>> Sent: Sunday, 3 May, 2015 3:35:46 PM
>>> Subject: Re: [Darshan-users] LD_PRELOAD system() call
>>>
>>> Hi Cristian,
>>>
>>> This is definitely the same problem that Kalyana and I were looking at
>>> earlier with fork(). I built a very small example reproducer library to
>>> try to simplify the problem (see Makefile and read-wrapper.c). The
>>> read-wrapper.c isn't doing anything except intercepting the read()
>>> function, printing some information, then calling the real read()
>>> function. I've been building this library with PrgEnv-gnu.
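
A minimal sketch of the kind of wrapper described above, not the actual read-wrapper.c attached to the original message (which is not reproduced in this archive). It assumes glibc-style `dlsym(RTLD_NEXT, ...)` lookup; the `wrapped_read_calls` counter and the message format are hypothetical additions for illustration.

```c
#define _GNU_SOURCE             /* needed for RTLD_NEXT */
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

typedef ssize_t (*read_fn)(int, void *, size_t);

/* Hypothetical counter: how many times the wrapper has been entered. */
unsigned long wrapped_read_calls;

/* Same signature as libc's read(), so the dynamic linker resolves calls
 * to this definition first when the library is preloaded. */
ssize_t read(int fd, void *buf, size_t count)
{
    static read_fn real_read;

    if (real_read == NULL) {
        /* Look up the next definition of read() in load order (libc's). */
        real_read = (read_fn)dlsym(RTLD_NEXT, "read");
        assert(real_read != NULL);
    }

    wrapped_read_calls++;
    fprintf(stderr, "read wrapper [%d]: fd=%d count=%zu\n",
            (int)getpid(), fd, count);

    /* Hand the request to the real implementation unchanged. */
    return real_read(fd, buf, count);
}
```

Built as a shared object (for example with `-shared -fPIC`, plus `-ldl` on older glibc versions) and named in LD_PRELOAD, a wrapper like this is also loaded into every child process that inherits the variable, which is the situation under discussion in this thread.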
>>>
>>> The test.c is your example program, and the
>>> test-preload-read-wrapper.pbs.e* is an example stderr file from trying
>>> to run it with the example read wrapper library preloaded.
>>>
>>> There isn't any Darshan code involved here, but the example still
>>> hangs. It looks like you could trigger it with *any* wrapper on the
>>> read() function in the Cray environment in conjunction with a fork() or
>>> system() call. Maybe there is some sort of recursion here?
>>>
>>> I'll keep thinking about this some, but I thought I would share what I'm
>>> seeing with the list in case anyone else has an idea.
>>>
>>> thanks,
>>> -Phil
>>>
>>> On 05/01/2015 12:06 PM, Carns, Philip H. wrote:
>>>> Hi Cristian,
>>>>
>>>> I was testing on Edison, an XC30 system at NERSC. I compiled with
>>>> cray-mpich 7.1.1, and I think it is using Torque as the batch system.
>>>> FYI, to run this example program I have to launch the executable using
>>>> aprun (otherwise MPI won't initialize properly). I think this will be
>>>> reproducible with non-MPI programs as well, though.
>>>>
>>>> thanks,
>>>> -Phil
>>>>
>>>> On 05/01/2015 04:53 AM, Cristian Simarro wrote:
>>>>> Hi Phil,
>>>>>
>>>>> Could you please tell me which batch system you are using on
>>>>> your Cray machine? Is the MPI implementation cray-mpich?
>>>>>
>>>>> Thanks,
>>>>> Cristian
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Phil Carns" <carns at mcs.anl.gov>
>>>>> To: "Cristian Simarro" <cristian.simarro at ecmwf.int>
>>>>> Cc: darshan-users at lists.mcs.anl.gov
>>>>> Sent: Thursday, 30 April, 2015 8:45:11 PM
>>>>> Subject: Re: [Darshan-users] LD_PRELOAD system() call
>>>>>
>>>>> Thanks for the test program, Cristian. I can confirm that it hangs
>>>>> with LD_PRELOAD on a Cray, but not on a Linux workstation. I'm not
>>>>> exactly sure what the underlying difference is in this case, but it
>>>>> is definitely 100% reproducible in the Cray environment.
>>>>>
>>>>> Kalyana Chadalavada has actually observed something very similar when
>>>>> using fork() directly; I imagine that it is the underlying fork()
>>>>> within the system() call that is causing the problem.
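
For reference, a simplified sketch of what system() does underneath: fork, exec `/bin/sh -c`, then block in waitpid(). This is not glibc's implementation (the real one also adjusts signal handling, which is omitted here). The function name `run_shell` and the `clear_preload` flag are hypothetical; the flag only illustrates that the child inherits LD_PRELOAD unless something removes it before the exec.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `cmd` through /bin/sh and return the raw wait status, or -1 on error.
 * If clear_preload is nonzero, LD_PRELOAD is removed in the child only. */
int run_shell(const char *cmd, int clear_preload)
{
    pid_t pid;
    int status;

    assert(cmd != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: starts with a copy of the parent's environment, so
         * LD_PRELOAD is still set here unless we drop it.
         * Note: unsetenv() after fork() is only safe if the parent
         * was single-threaded. */
        if (clear_preload)
            unsetenv("LD_PRELOAD");
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);             /* exec failed */
    }

    /* Parent: this is where the caller sits if the child never exits. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

With a preloaded wrapper in place, both the shell and the command it runs load the wrapper too, so a hang in the child shows up in the parent as a call that never returns from waitpid().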
>>>>>
>>>>> thanks,
>>>>> -Phil
>>>>>
>>>>> On 04/30/2015 03:07 AM, Cristian Simarro wrote:
>>>>>> Hi Phil,
>>>>>>
>>>>>> Actually, any command run under a system() call triggers the
>>>>>> problem. The spawned process does not finish, and the task that
>>>>>> issued the call then hangs in waitpid().
>>>>>>
>>>>>> This example hangs when we use the LD_PRELOAD mechanism:
>>>>>>
>>>>>> #include <stdio.h>
>>>>>> #include <mpi.h>
>>>>>> #include <stdlib.h>
>>>>>>
>>>>>> int main (int argc, char *argv[])
>>>>>> {
>>>>>>     int rank, size;
>>>>>>     int ret;
>>>>>>
>>>>>>     MPI_Init (&argc, &argv);
>>>>>>     MPI_Comm_rank (MPI_COMM_WORLD, &rank);
>>>>>>     MPI_Comm_size (MPI_COMM_WORLD, &size);
>>>>>>     if(rank == 0) {
>>>>>>         ret = system("echo calling system");
>>>>>>     }
>>>>>>     printf( "Hello world from process %d of %d\n", rank, size );
>>>>>>     MPI_Finalize();
>>>>>>     return 0;
>>>>>> }
>>>>>>
>>>>>> Thanks,
>>>>>> Cristian
>>>>>>
>>>>>> ------------------------------------------------------------------
>>>>>> Cristian Simarro
>>>>>> Analyst, User Support Section
>>>>>> European Centre for Medium-Range Weather Forecasts (ECMWF)
>>>>>> Shinfield Park, Reading, RG2 9AX, United Kingdom
>>>>>> Tel: (+44 118) 9499315 Fax: (+44 118) 9869450
>>>>>> E-mail: Cristian.Simarro at ecmwf.int http://www.ecmwf.int
>>>>>> ------------------------------------------------------------------
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: "Phil Carns" <carns at mcs.anl.gov>
>>>>>> To: darshan-users at lists.mcs.anl.gov
>>>>>> Sent: Wednesday, 29 April, 2015 10:13:54 PM
>>>>>> Subject: Re: [Darshan-users] LD_PRELOAD system() call
>>>>>>
>>>>>> On 04/29/2015 02:54 PM, Phil Carns wrote:
>>>>>>> On 04/29/2015 12:17 PM, Cristian Simarro wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> We have been facing some problems with system() calls inside some
>>>>>>>> C/Fortran codes on our Cray machine.
>>>>>>>>
>>>>>>>> The method used here is to compile dynamically and then use
>>>>>>>> LD_PRELOAD. When the code calls system(command), execution hangs
>>>>>>>> if Darshan is preloaded, because Darshan tries to instrument an
>>>>>>>> internal system read() without having been initialized.
>>>>>>>>
>>>>>>>> The solution we have designed is to unset LD_PRELOAD (if it was
>>>>>>>> set) in the darshan_mpi_initialize function.
>>>>>>>>
>>>>>>>> Has anybody found a similar problem with LD_PRELOAD + system()
>>>>>>>> calls?
>>>>>>> Hi Cristian,
>>>>>>>
>>>>>>> I don't think I've seen this exact combination before, but it seems
>>>>>>> like something we should be able to reproduce and isolate.
>>>>>>>
>>>>>>> If I understand correctly, it sounds like the underlying process
>>>>>>> spawned by system() is inheriting the LD_PRELOAD environment
>>>>>>> variable from the parent program, and it is the underlying process
>>>>>>> that is getting hung? If so, does it matter what you run in the
>>>>>>> system() call, or does it seem like pretty much anything triggers
>>>>>>> it?
>>>>>>>
>>>>>>> thanks,
>>>>>>> -Phil
>>>>>> The solution you have suggested (unsetting LD_PRELOAD programmatically
>>>>>> during Darshan initialization) might not be a bad long-term solution,
>>>>>> maybe with some extra safety logic to make sure we don't accidentally
>>>>>> unset unrelated LD_PRELOAD entries. I imagine that once the
>>>>>> application has gotten to Darshan initialization, the loader has
>>>>>> already processed the LD_PRELOAD environment variable and we don't
>>>>>> need to keep it set any longer. That would help keep it from
>>>>>> interfering with child processes.
>>>>>>
>>>>>> We would definitely need to do some testing to confirm, though.
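
One way the "extra safety logic" mentioned above could look, as a sketch only and not Darshan's actual code: drop just the entries that match a given substring and leave any other preloaded libraries in place. The function names `filter_preload` and `scrub_ld_preload`, the 4096-byte buffer, and the idea of matching on a substring such as "libdarshan" are all assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy `value` into `out`, dropping every entry that contains `needle`.
 * LD_PRELOAD entries may be separated by spaces or colons; the output
 * uses colons. Returns 0 on success, -1 if `out` is too small. */
int filter_preload(const char *value, const char *needle,
                   char *out, size_t outlen)
{
    size_t used = 0;
    size_t nlen;

    assert(value != NULL && needle != NULL && out != NULL);
    if (outlen == 0)
        return -1;
    out[0] = '\0';
    nlen = strlen(needle);

    while (*value != '\0') {
        size_t len = strcspn(value, ": ");  /* length of the next entry */
        int match = 0;
        size_t i;

        /* Substring search limited to this entry. */
        for (i = 0; nlen > 0 && i + nlen <= len; i++) {
            if (strncmp(value + i, needle, nlen) == 0) {
                match = 1;
                break;
            }
        }
        if (len > 0 && !match) {
            size_t need = len + (used > 0 ? 1 : 0);
            if (used + need + 1 > outlen)
                return -1;
            if (used > 0)
                out[used++] = ':';
            memcpy(out + used, value, len);
            used += len;
            out[used] = '\0';
        }
        value += len;
        if (*value != '\0')
            value++;                        /* skip the separator */
    }
    return 0;
}

/* Remove matching entries from this process's LD_PRELOAD so that child
 * processes no longer inherit them. Leaves the variable untouched if the
 * filtered value does not fit in the local buffer. */
void scrub_ld_preload(const char *needle)
{
    const char *cur = getenv("LD_PRELOAD");
    char buf[4096];

    if (cur == NULL || filter_preload(cur, needle, buf, sizeof(buf)) != 0)
        return;
    if (buf[0] == '\0')
        unsetenv("LD_PRELOAD");
    else
        setenv("LD_PRELOAD", buf, 1);
}
```

Calling something like `scrub_ld_preload("libdarshan")` during initialization would only affect processes started afterwards; the loader has already acted on the variable for the current process by that point.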
>>>>>>
>>>>>> thanks,
>>>>>> -Phil
>>>>>> _______________________________________________
>>>>>> Darshan-users mailing list
>>>>>> Darshan-users at lists.mcs.anl.gov
>>>>>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users