[Darshan-users] Darshan unable to write log files

Gunter, David O dog at lanl.gov
Thu Feb 26 16:30:19 CST 2015


I’m using Open-MPI 1.6.5 (it is the latest version installed on this system) which I believe has ROMIO Version 2008-03-09. I imagine that it is a really buggy, old version.

-david
--
David Gunter
HPC-5: Applications Readiness Team




> On Feb 26, 2015, at 2:54 PM, Rob Latham <robl at mcs.anl.gov> wrote:
> 
> 
> 
> On 02/26/2015 03:16 PM, Gunter, David O wrote:
>> Good call, Rob. Thanks!
>> 
>> Saving to a non-Panasas directory worked, as did writing to Panasas with the ufs: prefix.
> 
> Ok, super strange.  When darshan writes the log files, it does it through MPI-IO, so if one can write via MPI-IO to your Panasas file system, why can't darshan?
> 
> the error "MPI_ERR_IO: input/output error" is not very helpful.  It's the error you get if it was not ENAMETOOLONG, ENOENT, ENOTDIR, ELOOP, EACCES, EROFS.
> 
> Last summer I committed some code to ROMIO to also catch EDQUOT, ENOSPC, and EEXIST.
> 
> But even in the "all other cases" case, ROMIO's supposed to call strerror() and give you something -- anything! -- more helpful than "Uh, some IO error happened"
> 
> What MPI implementation are you using?
> 
> ==rob
> 
> 
> 
>> 
>> -david
>> --
>> David Gunter
>> HPC-5: Applications Readiness Team
>> 
>> 
>> 
>> 
>>> On Feb 26, 2015, at 2:04 PM, Rob Latham <robl at mcs.anl.gov> wrote:
>>> 
>>> 
>>> 
>>> On 02/26/2015 02:54 PM, Gunter, David O wrote:
>>> 
>>>> My app writes to a Panasas file system, /scratch/dog/test_prob and I have set
>>>> DARSHAN_LOGPATH to /scratch/dog/darshan_logs/
>>> 
>>> this bit, about Panasas, is the only thing that looks out of the ordinary to me.
>>> 
>>> Can you try a non-Panasas file system?  If not, can you try prefixing the file with ufs: (DARSHAN_LOGPATH=ufs:/scratch/dog/darshan_logs/)?
>>> 
>>> 
>>> Did you set up the year/month/day directories?
>>> (darshan-runtime/darshan-mk-log-dirs.pl)
>>> 
>>> ==rob
>>> 
>>>> 
>>>> The permissions on the directory are good. My mpi job runs to completion and then I get the error message.
>>>> 
>>>> $ mpirun -n 16 ./higrad_driver_noCBE ./params.16pe.in
>>>> 
>>>> Total time for simulation = 69.468311
>>>> HiGrad simulation is complete!!
>>>> Shutting down MPI environment!
>>>> darshan library warning: unable to open log file /scratch/dog/darshan_logs/dog_higrad_driver_noCBE_id501442_2-26-49816-14771330631875741401.darshan_partial: MPI_ERR_IO: input/output error
>>>> darshan library warning: unable to write log file /scratch/dog/darshan_logs/dog_higrad_driver_noCBE_id501442_2-26-49816-14771330631875741401.darshan_partial
>>>> 
>>>> --
>>>> David Gunter
>>>> HPC-5: Applications Readiness Team
>>>> 
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Darshan-users mailing list
>>>> Darshan-users at lists.mcs.anl.gov
>>>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users
>>>> 
>>> 
>>> --
>>> Rob Latham
>>> Mathematics and Computer Science Division
>>> Argonne National Lab, IL USA
>> 
>> 
> 
> -- 
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
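
To illustrate the errno handling Rob describes in the thread: the following is a minimal sketch in C, not ROMIO's actual code. The helper name describe_open_error is hypothetical, and the pairing of each errno value with an MPI error class name is an assumption made for illustration. It reports a specific class name for the errno values listed in the thread and falls back to strerror() for everything else, so that even the "all other cases" branch yields something more informative than a generic I/O error.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not part of ROMIO or Darshan): return a short
 * description for an errno value left behind by a failed open().
 * The class names are returned as plain strings so that the sketch
 * compiles without mpi.h; the errno-to-class pairing is an assumption. */
static const char *describe_open_error(int err)
{
    switch (err) {
    case ENAMETOOLONG:
    case ENOTDIR:
    case ELOOP:
        return "MPI_ERR_BAD_FILE";
    case ENOENT:
        return "MPI_ERR_NO_SUCH_FILE";
    case EACCES:
        return "MPI_ERR_ACCESS";
    case EROFS:
        return "MPI_ERR_READ_ONLY";
    case EDQUOT:
        return "MPI_ERR_QUOTA";
    case ENOSPC:
        return "MPI_ERR_NO_SPACE";
    case EEXIST:
        return "MPI_ERR_FILE_EXISTS";
    default:
        /* The "all other cases" branch: hand back the system's own
         * message instead of a bare "input/output error". */
        return strerror(err);
    }
}
```

Calling describe_open_error(errno) immediately after a failed open() would give either one of the class names above or the operating system's own message for that errno value.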


