[ClusterLabs] Antw: Re: Antw: [EXT] Re: Q: About a false negative of storage_mon

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed Aug 3 04:34:41 EDT 2022


>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 03.08.2022 at 08:58 in
message <caa85e40-6511-09a7-da60-b17d30a0aa9e at gmail.com>:
> On 03.08.2022 09:02, Ulrich Windl wrote:
>>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 02.08.2022 at 16:09 in
>> message
>> <0a2125a43bbfc09d2ca5bad1a693710f00e33731.camel at redhat.com>:
>>> On Tue, 2022-08-02 at 19:13 +0900, 井上和徳 wrote:
>>>> Hi,
>>>>
>>>> Since O_DIRECT is not specified in open() [1], storage_mon reads from
>>>> the buffer cache and may produce a false negative. I fear that this
>>>> becomes more likely in environments with a large buffer cache that run
>>>> disk-reading applications such as databases.
>>>>
>>>> So I think it would be better to specify O_RDONLY|O_DIRECT. What do
>>>> you think?
>>>> (In that case, the lseek() processing is unnecessary.)
>>>>
>>>> # I am ready to create a patch that works with O_DIRECT. I also
>>>> # wouldn't mind a change that adds a new inspection mode using O_DIRECT
>>>> # (as an option to storage_mon) while keeping the current inspection
>>>> # process.
>>>>
>>>> [1] https://github.com/ClusterLabs/resource-agents/blob/main/tools/storage_mon.c#L47-L90
>>>>
>>>> Best Regards,
>>>> Kazunori INOUE
>>>
>>> I agree, it makes sense to use O_DIRECT when available. I don't think
>>> an option is necessary.
>>>
>>> However, O_DIRECT is not available on all OSes, so the configure script
>>> should detect support. Also, it is not supported by all filesystems, so
>>> if the open fails, we should retry without O_DIRECT.
>> 
>> I just looked it up: It seems POSIX has O_RSYNC, O_SYNC and O_DSYNC
>> instead.
> 
> That is something entirely different. O_SYNC etc. operate at the *file
> system level*, while O_DIRECT operates at the *device* level. O_DIRECT
> makes the process talk directly to the device. It is unclear whether this
> is a side effect of the implementation or intentional.

Well, the process still uses the filesystem API to open the device, so I wonder
whether it makes a big difference whether O_SYNC operates on the filesystem
level or the device level.
Also, if the filesystem level does not pass the O_DIRECT flag on to the device,
it'll be kind of useless. OTOH, it's debatable whether an external SAN disk
system should bypass its internal cache and read the data anew from some RAID
disks if O_DIRECT is in effect.
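
For what it's worth, Ken's suggestion quoted above (try O_DIRECT, and retry
without it if the open fails) could look roughly like the sketch below.
open_for_check() is a made-up name, not something from storage_mon.c, and
treating only EINVAL as "O_DIRECT is not supported here" is my assumption based
on what Linux documents for open(2); other systems may report it differently.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc only exposes O_DIRECT with this */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of storage_mon.c): open a device or file
 * read-only, preferring O_DIRECT. *used_direct tells the caller which
 * variant was actually opened.
 */
static int open_for_check(const char *path, int *used_direct)
{
    int fd;

    assert(path != NULL && used_direct != NULL);
    *used_direct = 0;

#ifdef O_DIRECT
    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd >= 0) {
        *used_direct = 1;
        return fd;
    }
    /* Assumption: EINVAL means "no O_DIRECT here"; anything else is real. */
    if (errno != EINVAL)
        return -1;
#endif

    /* O_DIRECT unavailable at build time or rejected at run time. */
    fd = open(path, O_RDONLY);
    return fd;
}
```

Reads on a descriptor opened with O_DIRECT usually need a suitably aligned
buffer, offset and length, so the read path has to know which variant it got.
That is why the sketch hands back used_direct.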

> 
>> The buffer cache handling may be different though.
>> 
> 
> Synchronous operation does not actually imply media access.

I'm unsure: I thought that if I write with O_DSYNC and then the system has a
power failure, it's guaranteed that the data is on permanent storage (otherwise
it would be kind of useless).

> 
> O_RSYNC: "the operation has been completed or diagnosed if unsuccessful.
> The read is complete only when an image of the data has been
> successfully transferred to the requesting process". Returning buffered
> data satisfies this definition. Besides, Linux does not support O_RSYNC.

OK, that's a weak definition, especially as all normal reads are synchronous
anyway.
> 
> O_DSYNC: "the operation has been completed or diagnosed if unsuccessful.
> The write is complete only when the data specified in the write request
> is successfully transferred and all file system information required to
> retrieve the data is successfully transferred". Writing to a journal
> located on an external device seems to comply with this definition.
> 
> O_SYNC simply adds filesystem metadata update completion.
> 
> So no, O_SYNC & Co cannot replace O_DIRECT.

I rather meant to say: "It's better than nothing if O_DIRECT is missing"
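
As a sketch of what I mean: the flags could be picked at compile time, falling
back to the POSIX synchronized-I/O flags only where O_DIRECT does not exist.
check_open_flags() is a hypothetical helper, and whether O_RSYNC|O_DSYNC buys
anything for a pure read check is exactly what is disputed above, since a read
served from the buffer cache may still satisfy the definition.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc only exposes O_DIRECT with this */
#endif
#include <assert.h>
#include <fcntl.h>

/* Hypothetical helper: choose the open() flags for the read check. */
static int check_open_flags(void)
{
    int flags = O_RDONLY;

#if defined(O_DIRECT)
    flags |= O_DIRECT;            /* preferred: bypass the buffer cache */
#elif defined(O_RSYNC) && defined(O_DSYNC)
    flags |= O_RSYNC | O_DSYNC;   /* weaker: reads may still hit the cache */
#endif

    /* The check must never open the device for writing. */
    assert((flags & O_ACCMODE) == O_RDONLY);
    return flags;
}
```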

Regards,
Ulrich
