[Darshan-users] Darshan v3.2.1 hangs with mvapich2 2.3.3 (IOR)

Latham, Robert J. robl at mcs.anl.gov
Tue Sep 22 14:35:14 CDT 2020


On Fri, 2020-09-18 at 13:31 -0500, Cormac Garvey wrote:
> The IOR job hangs at the end if I export the darshan LD_PRELOAD (it runs
> correctly if I remove the LD_PRELOAD).

If at all possible, can you attach a debugger to some of the hung IOR
processes and give us a backtrace?  It will be really helpful to know what
operation these processes are stuck in.

I don't know your exact environment, but attaching to processes probably
looks like

- ask PBS where your jobs are running
- ssh to a client node
- get the pid of the ior process on that node
- "gdb -ex where -ex quit -p 1234" (or whatever that process id is)
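If several ranks are hung on a node, the steps above can be wrapped in a
small script. This is only a sketch under a few assumptions: the benchmark
binary is named exactly "ior", gdb and pgrep are installed on the compute
node, and you are allowed to ptrace your own processes there. The function
names backtrace_pid and dump_ior_backtraces are made up for illustration.
With PBS, "qstat -f <jobid>" should list the nodes in its exec_host field.

```shell
#!/bin/sh
# Hypothetical helper: print a backtrace of every thread in one process.
# "thread apply all bt" is used instead of "where" so that helper threads
# show up too; -batch makes gdb detach and exit when it is done.
backtrace_pid() {
    gdb -batch -ex "thread apply all bt" -p "$1"
}

# Hypothetical helper: run backtrace_pid on each of your own processes
# whose name is exactly "ior" (an assumption; adjust for your binary).
dump_ior_backtraces() {
    for pid in $(pgrep -u "$(id -un)" -x ior); do
        echo "=== pid $pid on $(hostname) ==="
        backtrace_pid "$pid"
    done
}
```

After ssh-ing to a compute node, source the file and run
"dump_ior_backtraces > bt.$(hostname).txt 2>&1", then send the output from
two or three nodes. Comparing ranks usually shows whether they are all
waiting in the same call.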

==rob

> 
> When I kill the job, I get the following information in the PBS stderr
> file.
> 
> "darshan_library_warning: unable to write header to file
> /share/home/hpcuser/darshan_logs/hpcuser_ior_id26_9-18-65156-
> 2174583250033636718.darshan_partial."
> 
> The IOR benchmark completes (i.e., I can see the I/O stats), but does not
> appear to exit correctly (it remains in a hung state until I kill it).
> 
> 
> Running a similar job with IOR+mpich or IOR+OpenMPI works fine with
> darshan.
> 
> Any ideas what I am missing?
> 
> Thanks for your support.
> 
> Regards,
> Cormac.