[ClusterLabs] Antw: [EXT] Re: Q: About a false negative of storage_mon

Andrei Borzenkov arvidjaar at gmail.com
Wed Aug 3 02:58:29 EDT 2022


On 03.08.2022 09:02, Ulrich Windl wrote:
>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 02.08.2022 at 16:09 in
> message
> <0a2125a43bbfc09d2ca5bad1a693710f00e33731.camel at redhat.com>:
>> On Tue, 2022-08-02 at 19:13 +0900, 井上和徳 wrote:
>>> Hi,
>>>
>>> Since O_DIRECT is not specified in open() [1], the read may be served
>>> from the buffer cache and result in a false negative. I fear this
>>> becomes more likely in environments with a large buffer cache that run
>>> disk-reading applications such as databases.
>>>
>>> So I think it would be better to specify O_RDONLY|O_DIRECT. What do you
>>> think? (In that case, the lseek() processing is unnecessary.)
>>>
>>> # I am ready to create a patch that works with O_DIRECT. I also would
>>> # not mind a change that adds a new O_DIRECT inspection mode (as an
>>> # option to storage_mon) while keeping the current inspection process.
>>>
>>> [1] https://github.com/ClusterLabs/resource-agents/blob/main/tools/storage_mon.c#L47-L90
>>>
>>> Best Regards,
>>> Kazunori INOUE
>>
>> I agree, it makes sense to use O_DIRECT when available. I don't think
>> an option is necessary.
>>
>> However, O_DIRECT is not available on all OSes, so the configure script
>> should detect support. Also, it is not supported by all filesystems, so
>> if the open fails, we should retry without O_DIRECT.
> 
> I just looked it up: it seems POSIX has O_RSYNC, O_SYNC and O_DSYNC
> instead.

That is something entirely different. O_SYNC etc. operate at the *file
system level*, while O_DIRECT operates at the *device* level. O_DIRECT
makes the process talk directly to the device. It is unclear whether this
is a side effect of the implementation or intentional.

> The buffer cache handling may be different though.
> 

Synchronous operation does not actually imply media access.

O_RSYNC: "the operation has been completed or diagnosed if unsuccessful.
The read is complete only when an image of the data has been
successfully transferred to the requesting process". Returning buffered
data satisfies this definition. Besides, Linux does not support O_RSYNC.

O_DSYNC: "the operation has been completed or diagnosed if unsuccessful.
The write is complete only when the data specified in the write request
is successfully transferred and all file system information required to
retrieve the data is successfully transferred". Writing to a journal
located on an external device seems to comply with this definition.

O_SYNC simply adds filesystem metadata update completion.

So no, O_SYNC & Co cannot replace O_DIRECT.
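
For illustration, here is a rough sketch of the O_DIRECT approach discussed
above: try O_DIRECT first, fall back to a plain open if the filesystem
rejects it, and read into an aligned buffer. This is not the storage_mon
code. The helper names open_prefer_direct() and probe_read() and the 4096
byte alignment are assumptions made for the example, and retrying only on
EINVAL is one possible reading of "if the open fails".

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment and size; the real requirement depends on the device. */
#define PROBE_ALIGN 4096

/* Hypothetical helper: open path read-only, preferring O_DIRECT.
 * *direct is set to 1 if O_DIRECT is in effect, 0 otherwise. */
int open_prefer_direct(const char *path, int *direct)
{
    int fd;

    *direct = 0;
#ifdef O_DIRECT
    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd >= 0) {
        *direct = 1;
        return fd;
    }
    /* open(2) reports EINVAL when the filesystem does not support
     * O_DIRECT; any other error is returned to the caller as is. */
    if (errno != EINVAL)
        return -1;
#endif
    /* O_DIRECT unavailable or unsupported here: retry through the cache. */
    return open(path, O_RDONLY);
}

/* Hypothetical helper: read one aligned block from offset 0.
 * Returns the number of bytes read, or -1 on error. */
ssize_t probe_read(const char *path, int *direct)
{
    void *buf = NULL;
    ssize_t n;
    int fd = open_prefer_direct(path, direct);

    if (fd < 0)
        return -1;
    /* O_DIRECT transfers generally need an aligned user buffer. */
    if (posix_memalign(&buf, PROBE_ALIGN, PROBE_ALIGN) != 0) {
        close(fd);
        return -1;
    }
    n = read(fd, buf, PROBE_ALIGN);
    free(buf);
    close(fd);
    return n;
}
```

A caller could log the value of *direct so that it is visible when the
check fell back to a buffered read and may again be answered from cache.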