<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Myriam,<br>
<br>
I was able to reproduce the problem here and confirm my initial
suspicion. The malloc initialization hooks in Open MPI call
stat() before malloc is fully set up. That conflicts with the
Darshan stat() wrapper, which resolves the real function through
dlsym(), and dlsym() in turn needs a working calloc().<br>
<br>
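To make the failure mode concrete, here is a minimal sketch of the
lazy symbol lookup that an LD_PRELOAD-style wrapper typically
performs. It is not the code from darshan-posix.c: the names
real_access and wrapped_access are hypothetical, and it wraps
access() rather than stat() only so that it can be run as an
ordinary program. The dlsym() call marked in the comment corresponds
to frame #2 of the backtrace quoted below, where dlsym() ends up in
calloc() while ptmalloc_init() is still running. On older glibc
versions this needs to be linked with -ldl.<br>
<br>
```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical names for illustration; not taken from Darshan. */
typedef int (*access_fn)(const char *path, int mode);
static access_fn real_access = NULL;

/* Forward to the next definition of access() in library search order,
 * resolving it on first use. Returns -1 if the symbol cannot be found. */
int wrapped_access(const char *path, int mode)
{
    assert(path != NULL);

    if (real_access == NULL) {
        /* dlsym() may call calloc() internally, so this lookup is only
         * safe once malloc has finished initializing. */
        real_access = (access_fn)dlsym(RTLD_NEXT, "access");
        if (real_access == NULL)
            return -1;
    }

    /* An instrumentation library would record the call here. */
    return real_access(path, mode);
}
```
<br>
Calling wrapped_access("/", F_OK) from a normal program works
because malloc is already initialized by then. The sketch is not
thread-safe and does nothing to avoid the early-initialization
problem; it only shows where the dependency on calloc() comes
from.<br>
<br>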
I've attached a patch that works around this problem. Could you
try it out and report back?<br>
<br>
I'm not sure what to do about this in the long run. I would be
nervous about integrating that particular patch into the official
code base because it will be quite fragile if the Open MPI malloc
init hooks ever change in the future. I'll keep thinking about it
and see if I can come up with any other solution. Our best bet
might be to provide a patch to the Open MPI team that converts
those stat() calls into access() calls. Darshan does not
intercept access() calls, and the Open MPI code doesn't really
need the results of the stat operation. They are just checking
for the existence of particular files.<br>
<br>
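As a rough sketch of the kind of change I have in mind (the helper
names are hypothetical, and this is not the actual Open MPI hooks
code), an existence check written with stat() can usually be
expressed with access() and F_OK instead:<br>
<br>
```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper names; not taken from the Open MPI sources. */

/* Existence check via stat(): fills a struct stat that is then ignored. */
int exists_via_stat(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    return stat(path, &sb) == 0;
}

/* Same check via access(): F_OK tests for existence only. */
int exists_via_access(const char *path)
{
    assert(path != NULL);
    return access(path, F_OK) == 0;
}
```
<br>
Both helpers return 1 if the path exists and 0 otherwise, so a check
such as exists_via_stat("/dev/ummunotify") could become
exists_via_access("/dev/ummunotify") without changing the caller.<br>
<br>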
-Phil<br>
<br>
On 01/02/2013 01:14 PM, Phil Carns wrote:<br>
</div>
<blockquote cite="mid:50E47917.9000209@mcs.anl.gov" type="cite">
<div class="moz-cite-prefix">Hi Myriam,<br>
<br>
Thank you for the detailed bug report. We'll try to reproduce
this and get back to you. I assume that Open MPI is configured
to use InfiniBand (IB) in this environment?<br>
<br>
I think the issue here is that Open MPI sets up its own malloc
wrappers very early on, and it happens to make a stat() or
fstat() call as part of that process. This is problematic
because Darshan wants to intercept the stat() calls, but it
needs a working malloc (as part of the symbol resolution
process) before it can intercept any functions via LD_PRELOAD.
I'm not yet sure how to handle this, but we'll have a look at
it.<br>
<br>
-Phil <br>
<br>
<br>
On 12/28/2012 06:00 AM, <a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:myriam.botalla@bull.net">myriam.botalla@bull.net</a>
wrote:<br>
</div>
<blockquote
cite="mid:OF02E1FAC6.6D59C096-ONC1257AE1.0039265C-C1257AE2.003D00C1@bull.net"
type="cite"><font size="2" face="sans-serif">Hi,</font> <br>
<font size="2" face="sans-serif">When I use LD_PRELOAD to
instrument an application with Darshan at run time, a
segmentation fault occurs. </font> <br>
<font size="2" face="sans-serif">The behaviour is the same
when simply running mpicc to get the version, and the
generated core dump shows the same stack, which seems to
point to the MPI wrappers as the likely suspect.</font>
<br>
<br>
<font size="2" face="sans-serif"># which mpicc</font> <br>
<font size="2" face="sans-serif">/home_nfs/papaureg/workspace/openmpi-1.6.2/current_AE2__2/x86_64_bullxlinux6.1.1/bin/mpicc</font>
<br>
<font size="2" face="sans-serif"># mpicc --showme:version</font>
<br>
<font size="2" face="sans-serif">Segmentation fault (core
dumped)</font> <br>
<font size="2" face="sans-serif"># gdb mpicc core.14109</font> <br>
<font size="2" face="sans-serif">GNU gdb (GDB) bullx Linux
(7.2-50.bl6.Bull.1.20120306)</font> <br>
<font size="2" face="sans-serif">Copyright (C) 2010 Free
Software Foundation, Inc.</font> <br>
<font size="2" face="sans-serif">License GPLv3+: GNU GPL version
3 or later &lt;</font><a moz-do-not-send="true"
href="http://gnu.org/licenses/gpl.html"><font size="2"
face="sans-serif">http://gnu.org/licenses/gpl.html</font></a><font
size="2" face="sans-serif">&gt;</font> <br>
<font size="2" face="sans-serif">This is free software: you are
free to change and redistribute it.</font> <br>
<font size="2" face="sans-serif">Reading symbols from
/home_nfs/papaureg/workspace/openmpi-1.6.2/current_AE2__2/x86_64_bullxlinux6.1.1/bin/mpicc...done.</font><br>
<font size="2" face="sans-serif">[New Thread 14109]</font> <br>
<font size="2" face="sans-serif">Missing separate debuginfo for
/home_nfs/botallam/install/darshan.4/lib/libdarshan.so</font>
<br>
<font size="2" face="sans-serif">....</font> <br>
<font size="2" face="sans-serif">Reading symbols from
/home_nfs/papaureg/workspace/openmpi-1.6.2//current_AE2__2/x86_64_bullxlinux6.1.1/lib/libmpi.so.1...done.</font><br>
<font size="2" face="sans-serif">Loaded symbols for
/home_nfs/papaureg/workspace/openmpi-1.6.2//current_AE2__2/x86_64_bullxlinux6.1.1/lib/libmpi.so.1</font><br>
<font size="2" face="sans-serif">Core was generated by
`/home_nfs/papaureg/workspace/openmpi-1.6.2/current_AE2__2/x86_64_bullxlinux6.1.'.</font><br>
<font size="2" face="sans-serif">Program terminated with signal
11, Segmentation fault.</font> <br>
<font size="2" face="sans-serif">#0 0x00000036b7879446 in
calloc () from /lib64/libc.so.6</font> <br>
<font size="2" face="sans-serif">Missing separate debuginfos,
use: debuginfo-install glibc-2.12-1.47.bl6_2.9.x86_64
libgcc-4.4.5-6.bl6.x86_64 numactl-2.0.3-9.bl6.x86_64
zlib-1.2.3-25.bl6.x86_64</font> <br>
<font size="2" face="sans-serif">(gdb) where</font> <br>
<font size="2" face="sans-serif">#0 0x00000036b7879446 in
calloc () from /lib64/libc.so.6</font> <br>
<font size="2" face="sans-serif">#1 0x00000036b7c01310 in
_dlerror_run () from /lib64/libdl.so.2</font> <br>
<font size="2" face="sans-serif">#2 0x00000036b7c0107a in dlsym
() from /lib64/libdl.so.2</font> <br>
<font size="2" face="sans-serif">#3 0x00007f50f98c3487 in
__xstat (vers=1, path=0x7f50f966ffac "/dev/ummunotify",
buf=0x7fff6b6f08c0) at lib/darshan-posix.c:711</font> <br>
<font size="2" face="sans-serif">#4 0x00007f50f9661a64 in
opal_memory_linux_malloc_init_hook () at hooks.c:756</font> <br>
<font size="2" face="sans-serif">#5 0x00000036b7875b63 in
ptmalloc_init () from /lib64/libc.so.6</font> <br>
<font size="2" face="sans-serif">#6 0x00000036b7879987 in
malloc_hook_ini () from /lib64/libc.so.6</font> <br>
<font size="2" face="sans-serif">#7 0x00000036b78a6da1 in
__alloc_dir () from /lib64/libc.so.6</font> <br>
<font size="2" face="sans-serif">#8 0x00000036b94053cd in ?? ()
from /usr/lib64/libnuma.so.1</font> <br>
<font size="2" face="sans-serif">#9 0x00000036b740e515 in
_dl_init_internal () from /lib64/ld-linux-x86-64.so.2</font> <br>
<font size="2" face="sans-serif">#10 0x00000036b7400b3a in
_dl_start_user () from /lib64/ld-linux-x86-64.so.2</font> <br>
<font size="2" face="sans-serif">#11 0x0000000000000002 in ?? ()</font>
<br>
<font size="2" face="sans-serif">#12 0x00007fff6b6f1a43 in ?? ()</font>
<br>
<font size="2" face="sans-serif">#13 0x00007fff6b6f1a9e in ?? ()</font>
<br>
<font size="2" face="sans-serif">#14 0x0000000000000000 in ?? ()</font>
<br>
<font size="2" face="sans-serif">(gdb)</font> <br>
<br>
<font size="2" face="sans-serif">Can someone help me
understand the issue?</font> <br>
<font size="2" face="sans-serif">Thanks,</font> <br>
<font size="2" face="sans-serif">Myriam.</font> <br>
<br>
<br>
<br>
<font size="2" face="sans-serif">Here is the environment:</font>
<br>
<br>
<font size="2" face="sans-serif">LD_PRELOAD=/home_nfs/botallam/install/darshan.4/lib/libdarshan.so</font>
<br>
<br>
<font size="2" face="sans-serif">The Darshan library was
configured and built with the environment variable OMPI_CC=gcc</font>
<br>
<font size="2" face="sans-serif">using mpicc: Open MPI 1.6.2
(Language: C)</font> <br>
<br>
<font size="2" face="sans-serif"># which mpicc</font> <br>
<font size="2" face="sans-serif">/home_nfs/papaureg/workspace/openmpi-1.6.2/current_AE2__2/x86_64_bullxlinux6.1.1/bin/mpicc</font>
<br>
<font size="2" face="sans-serif"># CC=mpicc CFLAGS=-g
./configure --prefix=/home_nfs/botallam/install/darshan.4
--with-mem-align=16 --with-log-path-by-env=DARSHAN_LOGPATH
--with-jobid-env=SLURM_JOBID</font> <br>
<br>
<font size="2" face="sans-serif"># ldd
/home_nfs/botallam/install/darshan.4/lib/libdarshan.so</font>
<br>
<font size="2" face="sans-serif"> linux-vdso.so.1 =>
(0x00007fffe9e93000)</font> <br>
<font size="2" face="sans-serif"> libdl.so.2 =>
/lib64/libdl.so.2 (0x00007f8a753bd000)</font> <br>
<font size="2" face="sans-serif"> libpthread.so.0 =>
/lib64/libpthread.so.0 (0x00007f8a751a0000)</font> <br>
<font size="2" face="sans-serif"> librt.so.1 =>
/lib64/librt.so.1 (0x00007f8a74f98000)</font> <br>
<font size="2" face="sans-serif"> libz.so.1 =>
/lib64/libz.so.1 (0x00007f8a74d83000)</font> <br>
<font size="2" face="sans-serif"> libmpi.so.1 =>
/home_nfs/papaureg/workspace/openmpi-1.6.2//current_AE2__2/x86_64_bullxlinux6.1.1/lib/libmpi.so.1 (0x00007f8a74967000)</font>
<br>
<font size="2" face="sans-serif"> libm.so.6 =>
/lib64/libm.so.6 (0x00007f8a746e3000)</font> <br>
<font size="2" face="sans-serif"> libnuma.so.1 =>
/usr/lib64/libnuma.so.1 (0x00007f8a744db000)</font> <br>
<font size="2" face="sans-serif"> libnsl.so.1 =>
/lib64/libnsl.so.1 (0x00007f8a742c1000)</font> <br>
<font size="2" face="sans-serif"> libutil.so.1 =>
/lib64/libutil.so.1 (0x00007f8a740be000)</font> <br>
<font size="2" face="sans-serif"> libc.so.6 =>
/lib64/libc.so.6 (0x00007f8a73d2e000)</font> <br>
<font size="2" face="sans-serif">
/lib64/ld-linux-x86-64.so.2 (0x00000036b7400000)</font> <br>
<font size="2" face="sans-serif"> libimf.so =>
/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64/libimf.so
(0x00007f8a73871000)</font> <br>
<font size="2" face="sans-serif"> libsvml.so =>
/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64/libsvml.so
(0x00007f8a72fa3000)</font> <br>
<font size="2" face="sans-serif"> libgcc_s.so.1 =>
/lib64/libgcc_s.so.1 (0x00007f8a72d8d000)</font> <br>
<font size="2" face="sans-serif"> libintlc.so.5 =>
/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64/libintlc.so.5
(0x00007f8a72b3e000)</font> <br>
<font size="2" face="sans-serif">#</font> <br>
<font size="2" face="sans-serif">#</font> <br>
<font size="2" face="sans-serif"># nm
/home_nfs/botallam/install/darshan.4/lib/libdarshan.so|grep
mpi</font> <br>
<font size="2" face="sans-serif">0000000000227020 B
__real_ncmpi_close</font> <br>
<font size="2" face="sans-serif">0000000000227010 B
__real_ncmpi_create</font> <br>
<font size="2" face="sans-serif">0000000000227018 B
__real_ncmpi_open</font> <br>
<font size="2" face="sans-serif">00000000000048cc T
darshan_mpi_initialize</font> <br>
<font size="2" face="sans-serif">00000000000179d2 T ncmpi_close</font>
<br>
<font size="2" face="sans-serif">000000000001763c T ncmpi_create</font>
<br>
<font size="2" face="sans-serif">0000000000017807 T ncmpi_open</font>
<br>
<font size="2" face="sans-serif"> U
ompi_mpi_byte</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_char</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_comm_world</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_double</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_info_null</font> <br>
<font size="2" face="sans-serif"> U ompi_mpi_int</font>
<br>
<font size="2" face="sans-serif"> U
ompi_mpi_long</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_op_land</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_op_lor</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_op_max</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_op_null</font> <br>
<font size="2" face="sans-serif"> U
ompi_mpi_op_sum</font> <br>
<font size="2" face="sans-serif">0000000000016444 T
resolve_mpi_symbols</font> <br>
<font size="2" face="sans-serif">#</font> <br>
<br>
<br>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Darshan-users mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Darshan-users@lists.mcs.anl.gov">Darshan-users@lists.mcs.anl.gov</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="https://lists.mcs.anl.gov/mailman/listinfo/darshan-users">https://lists.mcs.anl.gov/mailman/listinfo/darshan-users</a>
</pre>
</blockquote>
<br>
<br>
</blockquote>
<br>
</body>
</html>