[Darshan-users] Darshan v3.2.1 hangs with mvapich2 2.3.3 (IOR)
Latham, Robert J.
robl at mcs.anl.gov
Tue Sep 22 14:35:14 CDT 2020
On Fri, 2020-09-18 at 13:31 -0500, Cormac Garvey wrote:
> The IOR job hangs at the end if I export the darshan LD_PRELOAD (it
> runs correctly if I remove the LD_PRELOAD).
If at all possible, can you attach a debugger to some of the hung IOR
jobs and give us a backtrace? It will be really helpful to know what
operation these processes are stuck in.
I don't know your exact environment, but attaching to processes probably
looks like
- ask PBS where your jobs are running
- ssh to a client node
- get the pid of the ior process on that node
- "gdb -ex where -ex quit -p 1234" (or whatever that process id is)
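As a rough sketch of the last two steps, assuming the benchmark binary is
named "ior" and that pgrep and gdb are installed on the compute node, a
small helper like the one below prints a backtrace for every matching
process you own. The function name dump_backtraces is only illustrative,
and "thread apply all bt" is used in place of "where" so that helper
threads show up too.

```shell
#!/bin/sh
# Sketch only: dump gdb backtraces for all of your processes with a given name.
# Assumptions: pgrep and gdb exist on the node, and you may ptrace your own jobs.
dump_backtraces() {
    name="$1"
    # -x matches the process name exactly; -u limits the search to your own user
    for pid in $(pgrep -u "$(id -un)" -x "$name"); do
        echo "=== $name pid $pid on $(hostname) ==="
        # -batch makes gdb exit (and detach) once the commands have run
        gdb -batch -ex "thread apply all bt" -p "$pid" 2>&1
    done
}
```

Run on the client node as, for example, dump_backtraces ior > bt.txt and
send us the resulting file. If no matching process is found, the output is
simply empty.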
==rob
>
> When I kill the job, I get the following information in the PBS stderr
> file.
>
> "darshan_library_warning: unable to write header to file
> /share/home/hpcuser/darshan_logs/hpcuser_ior_id26_9-18-65156-
> 2174583250033636718.darshan_partial."
>
> The IOR benchmark completes (i.e., I can see the I/O stats), but does not
> appear to exit correctly (it remains in a hung state until I kill it).
>
>
> Running a similar job with IOR+mpich or IOR+OpenMPI works fine with
> darshan.
>
> Any ideas what I am missing?
>
> Thanks for your support.
>
> Regards,
> Cormac.