[Darshan-users] [Ext] Re: Issues with Fio benchmark tool being profiled with Darshan
Neeraj Rajesh
nrajesh at hawk.iit.edu
Fri Aug 7 11:24:35 CDT 2020
Hello Shane
Thank you for investigating this and providing workarounds. I really appreciate it.
Thank you for your time, stay safe
Yours sincerely
Neeraj Rajesh
Aug 7, 2020 07:21:30 Snyder, Shane <ssnyder at mcs.anl.gov>:
> Hi Neeraj,
>
> Apologies for the delay, but I finally had a chance to dig into this further.
>
> It turns out the issue is that fio is using the fork() call to spawn another child process that does the actual I/O to the file, with the parent process just using stat a few times on that file (which is why we only see the stat calls of the parent in the log files).
>
> I recently had another Darshan use case that was using fork(), and I re-worked Darshan a bit so that it would actually generate logs for both the parent and the child processes, which I thought would be all that we would need to get fio working...but it didn't actually work for me when I tried it. Digging further, the child process that does the I/O calls _exit() (rather than exit(), as processes generally do when they terminate), which short-circuits a lot of the shutdown procedures processes will go through, including Darshan's shutdown procedures that generate the log files.
>
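The effect described above can be reproduced outside of fio and Darshan. The following is only a minimal sketch in Python (Unix only, since it relies on os.fork()), not Darshan's actual shutdown mechanism: the atexit hook is a hypothetical stand-in for a profiler's log-writing step, and the names SCRIPT and run_demo are made up for illustration. The script is run in a fresh interpreter so that the fork does not duplicate whatever process is driving the demo.

```python
import subprocess
import sys
import textwrap

# Script run in a fresh interpreter: the parent forks, and the child registers
# a shutdown hook and then exits in one of two ways.
SCRIPT = textwrap.dedent("""
    import atexit, os, sys

    pid = os.fork()
    if pid == 0:
        # Hypothetical stand-in for a profiler's log-writing shutdown step.
        atexit.register(lambda: print("child: shutdown hook ran", flush=True))
        if sys.argv[1] == "_exit":
            os._exit(0)  # immediate termination: exit hooks are skipped
        sys.exit(0)      # normal shutdown: exit hooks run
    os.waitpid(pid, 0)
    print("parent: done", flush=True)
""")


def run_demo(mode: str) -> str:
    """Run SCRIPT with mode '_exit' or 'exit' and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT, mode],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling run_demo("exit") should include the child's hook message in the output, while run_demo("_exit") should show only the parent's line, which mirrors a log that contains the parent's activity but nothing from the child.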
> There's not really a way to work around this within Darshan itself, unfortunately. As a matter of best practice, child processes are supposed to call _exit(), so fio is not really doing anything wrong; we just lack the hooks to get instrumentation out of the child. You could manually modify the fio code so that the child calls exit() instead if you really want instrumentation; it shouldn't cause anything to explode. You could also consider running fio with the '--thread' option, which uses pthreads to do the I/O to the file rather than forking another process -- Darshan has no problems instrumenting different threads like that, and I confirmed the logs look normal in that case.
>
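As a rough illustration of why the threaded approach sidesteps the problem, here is a small Python sketch under the same assumptions as before: the atexit hook and the counters dictionary are hypothetical stand-ins for a profiler's shutdown step and its per-process counters, not anything taken from Darshan. Because the worker thread lives in the same process that later exits normally, the hook runs and sees the thread's writes.

```python
import subprocess
import sys
import textwrap

# Script run in a fresh interpreter: a worker thread does the writes and
# updates a counter that is shared with the rest of the process.
THREAD_SCRIPT = textwrap.dedent("""
    import atexit, os, threading

    counters = {"writes": 0}  # hypothetical stand-in for profiler counters

    def report():
        print(f"shutdown hook saw {counters['writes']} writes", flush=True)

    atexit.register(report)

    def worker():
        fd = os.open(os.devnull, os.O_WRONLY)
        for _ in range(8):
            os.write(fd, b"x" * 4096)
            counters["writes"] += 1
        os.close(fd)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    # Falling off the end is a normal exit, so the hook runs after the join.
""")


def run_thread_demo() -> str:
    """Run THREAD_SCRIPT in a new interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", THREAD_SCRIPT],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Printing run_thread_demo() should show the hook reporting all eight writes made by the worker thread.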
> Thanks,
>
> --Shane
>
>
> ----------------------------------------
> From: Neeraj Rajesh <nrajesh at hawk.iit.edu>
> Sent: Wednesday, July 22, 2020 5:33 PM
> To: Snyder, Shane <ssnyder at mcs.anl.gov>
> Cc: darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
> Subject: Re: [Ext] Re: [Darshan-users] Issues with Fio benchmark tool being profiled with Darshan
>
> Thank you Shane. I really appreciate that.
>
> Thank you, stay safe
> Yours sincerely
> Neeraj Rajesh
>
> Jul 22, 2020 08:35:15 Snyder, Shane <ssnyder at mcs.anl.gov>:
>
>> Thanks for the update, Neeraj, and thanks for taking the time to test that out for us. That's unfortunate that the problem wasn't as simple as we hoped, but let me take an action item to do some testing with FIO to see if I can reproduce and find a workaround for the problem.
>>
>> I'll let you know what I find.
>>
>> --Shane
>>
>>
>> ----------------------------------------
>> From: Neeraj Rajesh <nrajesh at hawk.iit.edu>
>> Sent: Tuesday, July 21, 2020 12:47 PM
>> To: Snyder, Shane <ssnyder at mcs.anl.gov>
>> Cc: darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
>> Subject: Re: [Ext] Re: [Darshan-users] Issues with Fio benchmark tool being profiled with Darshan
>>
>> Hello Shane
>>
>> Sorry for the delay. It looks like there is still an issue with profiling Fio with darshan-master: the counters are still 0.
>>
>> Thank you, stay safe
>>
>> Yours sincerely
>>
>> Neeraj Rajesh
>>
>> On Wed, Jul 15, 2020 at 1:43 PM Neeraj Rajesh <nrajesh at hawk.iit.edu> wrote:
>>
>>> Hello Shane
>>>
>>> I am using the release version from the Darshan website (3.2.1). I will check out the master branch to see if I can generate accurate profiles, and will report back on it.
>>>
>>> Thank you, Stay safe
>>>
>>> Yours Sincerely
>>>
>>> Neeraj Rajesh
>>>
>>> On Wed, Jul 15, 2020 at 1:35 PM Snyder, Shane <ssnyder at mcs.anl.gov> wrote:
>>>
>>>> Hi Neeraj,
>>>>
>>>> It looks like FIO might be using the 'openat()' call to open the files it is doing I/O to, which we only recently added support for. Assuming you aren't already, would it be possible for you to try the master branch of Darshan to see whether you can generate accurate profiles using that?
>>>>
>>>> Thanks,
>>>>
>>>> --Shane
>>>>
>>>>
>>>> ----------------------------------------
>>>> From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on behalf of Neeraj Rajesh <nrajesh at hawk.iit.edu>
>>>> Sent: Tuesday, July 14, 2020 3:15 PM
>>>> To: darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
>>>> Subject: [Darshan-users] Issues with Fio benchmark tool being profiled with Darshan
>>>>
>>>> Hello
>>>>
>>>> I am trying to profile an Fio command, where I do roughly a gigabyte of reads or writes or mixed at a fixed block size.
>>>>
>>>> e.g.
>>>>
>>>> ```
>>>>
>>>> fio --filename=./test --size=1024MB --direct=1 --rw=randrw --bs=4k --ioengine=sync --iodepth=256 --numjobs=1 --name=test-job --minimal --stats=0
>>>>
>>>> ```
>>>>
>>>> I have tested it with the following 'ioengines':
>>>>
>>>> - sync
>>>>
>>>> - psync
>>>>
>>>> - posixaio
>>>>
>>>> - pvsync
>>>>
>>>> I do get a Darshan log file, but all of the POSIX counters are 0, except for POSIX_F_META_TIME, POSIX_STATS, POSIX_FILE_ALIGNMENT and POSIX_MMAPS. I have ensured the operation is not simultaneously doing both POSIX and STDIO.
>>>>
>>>> Given my configuration, the counters should not be 0, especially POSIX_SIZE_WRITE_1K_10K, POSIX_WRITES, POSIX_READS, POSIX_BYTES_WRITTEN, POSIX_BYTES_READ, etc.
>>>>
>>>> Kindly advise what to do next.
>>>>
>>>> Thank you for your time, Stay safe
>>>>
>>>> Yours Sincerely
>>>>
>>>> Neeraj Rajesh
>>>>