[Darshan-users] Darshan unable to write log files
Gunter, David O
dog at lanl.gov
Thu Feb 26 16:30:19 CST 2015
I’m using Open MPI 1.6.5 (the latest version installed on this system), which I believe bundles ROMIO version 2008-03-09. I imagine that is a really old, buggy version.
-david
--
David Gunter
HPC-5: Applications Readiness Team
> On Feb 26, 2015, at 2:54 PM, Rob Latham <robl at mcs.anl.gov> wrote:
>
>
>
> On 02/26/2015 03:16 PM, Gunter, David O wrote:
>> Good call, Rob. Thanks!
>>
>> Saving to a non-Panasas directory worked, as did writing to Panasas with the ufs: prefix.
>
> Ok, super strange. When Darshan writes the log files, it does so through MPI-IO, so if one can write via MPI-IO to your Panasas file system, why can't Darshan?
>
> The error "MPI_ERR_IO: input/output error" is not very helpful. It's the catch-all you get when the underlying errno was none of ENAMETOOLONG, ENOENT, ENOTDIR, ELOOP, EACCES, or EROFS.
>
> Last summer I committed some code to ROMIO to also catch EDQUOT, ENOSPC, and EEXIST.
>
> But even in the "all other cases" case, ROMIO's supposed to call strerror() and give you something -- anything! -- more helpful than "Uh, some IO error happened"
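
The classification described above can be sketched roughly as follows. This is an illustration in Python, not ROMIO's code (ROMIO is written in C): the errno values are the ones named in the message, the class names are standard MPI error classes, but the exact pairing and the helper name `describe_open_error` are assumptions made for the example.

```python
import errno
import os

# errno values mentioned in the thread, paired with standard MPI error
# classes. The pairing is an assumption for illustration only.
SPECIFIC_CLASSES = {
    errno.ENAMETOOLONG: "MPI_ERR_BAD_FILE",
    errno.ENOENT: "MPI_ERR_NO_SUCH_FILE",
    errno.ENOTDIR: "MPI_ERR_BAD_FILE",
    errno.ELOOP: "MPI_ERR_BAD_FILE",
    errno.EACCES: "MPI_ERR_ACCESS",
    errno.EROFS: "MPI_ERR_READ_ONLY",
    errno.EDQUOT: "MPI_ERR_QUOTA",
    errno.ENOSPC: "MPI_ERR_NO_SPACE",
    errno.EEXIST: "MPI_ERR_FILE_EXISTS",
}


def describe_open_error(err):
    """Return a specific error class if one is known for this errno.

    Otherwise fall back to the generic MPI_ERR_IO class, but append the
    system's own strerror() text so the message still says what went wrong.
    """
    cls = SPECIFIC_CLASSES.get(err)
    if cls is not None:
        return cls
    return "MPI_ERR_IO: " + os.strerror(err)


if __name__ == "__main__":
    print(describe_open_error(errno.ENOENT))
    print(describe_open_error(errno.EIO))
```

The point of the fallback branch is that even an unclassified errno carries a system message, which is more useful than a bare "input/output error".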
>
> What MPI implementation are you using?
>
> ==rob
>
>
>
>>
>> -david
>> --
>> David Gunter
>> HPC-5: Applications Readiness Team
>>
>>
>>
>>
>>> On Feb 26, 2015, at 2:04 PM, Rob Latham <robl at mcs.anl.gov> wrote:
>>>
>>>
>>>
>>> On 02/26/2015 02:54 PM, Gunter, David O wrote:
>>>
>>>> My app writes to a Panasas file system, /scratch/dog/test_prob, and I have set
>>>> DARSHAN_LOGPATH to /scratch/dog/darshan_logs/
>>>
>>> this bit, about Panasas, is the only thing that looks out of the ordinary to me.
>>>
>>> Can you try a non-Panasas file system? If not, can you try prefixing the path with ufs: (DARSHAN_LOGPATH=ufs:/scratch/dog/darshan_logs/)?
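
A file-system prefix such as ufs: tells ROMIO which driver to use explicitly instead of letting it detect the file system type itself. A minimal sketch of how such a prefix could be separated from the path is shown below; the function name `split_fs_prefix` and the exact parsing rule are assumptions for illustration, not ROMIO's implementation.

```python
def split_fs_prefix(path):
    """Split 'ufs:/some/path' into ('ufs', '/some/path').

    Returns (None, path) when there is no prefix. Assumed rule: a prefix
    is the text before the first ':' provided it contains no '/' and is
    longer than one character.
    """
    head, sep, rest = path.partition(":")
    if sep and "/" not in head and len(head) > 1:
        return head, rest
    return None, path


if __name__ == "__main__":
    print(split_fs_prefix("ufs:/scratch/dog/darshan_logs/"))
    print(split_fs_prefix("/scratch/dog/darshan_logs/"))
```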
>>>
>>>
>>> Did you set up the year/month/day directories?
>>> (darshan-runtime/darshan-mk-log-dirs.pl )
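
For reference, creating one dated log directory by hand might look like the sketch below. It assumes a layout of <base>/<year>/<month>/<day> with numbers that are not zero-padded; that layout and the helper name `make_log_dir` are assumptions, and the script named above remains the supported way to set the directories up. Note that the failing path quoted in this thread sits directly under DARSHAN_LOGPATH, with no dated subdirectories.

```python
import os
import tempfile
from datetime import date


def make_log_dir(base, day):
    """Create <base>/<year>/<month>/<day> if missing and return its path.

    The year/month/day layout without zero padding is an assumption.
    """
    path = os.path.join(base, str(day.year), str(day.month), str(day.day))
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    # Use a throwaway directory so the demo does not touch real log paths.
    with tempfile.TemporaryDirectory() as base:
        print(make_log_dir(base, date(2015, 2, 26)))
```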
>>>
>>> ==rob
>>>
>>>>
>>>> The permissions on the directory are good. My mpi job runs to completion and then I get the error message.
>>>>
>>>> $ mpirun -n 16 ./higrad_driver_noCBE ./params.16pe.in
>>>>
>>>> Total time for simulation = 69.468311
>>>> HiGrad simulation is complete!!
>>>> Shutting down MPI environment!
>>>> darshan library warning: unable to open log file /scratch/dog/darshan_logs/dog_higrad_driver_noCBE_id501442_2-26-49816-14771330631875741401.darshan_partial: MPI_ERR_IO: input/output error
>>>> darshan library warning: unable to write log file /scratch/dog/darshan_logs/dog_higrad_driver_noCBE_id501442_2-26-49816-14771330631875741401.darshan_partial
>>>>
>>>> --
>>>> David Gunter
>>>> HPC-5: Applications Readiness Team
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Darshan-users mailing list
>>>> Darshan-users at lists.mcs.anl.gov
>>>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users
>>>>
>>>
>>> --
>>> Rob Latham
>>> Mathematics and Computer Science Division
>>> Argonne National Lab, IL USA
>>
>>
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA