[ClusterLabs] Antw: [EXT] sbd v1.4.2

Klaus Wenninger kwenning at redhat.com
Tue Dec 8 07:12:33 EST 2020


On 12/8/20 11:51 AM, Klaus Wenninger wrote:
> On 12/3/20 9:29 AM, Reid Wahl wrote:
>> On Thu, Dec 3, 2020 at 12:03 AM Ulrich Windl
>> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>>> Hi!
>>>
>>> See comments inline...
>>>
>>>>>> Klaus Wenninger <kwenning at redhat.com> wrote on 02.12.2020 at 22:05 in
>>> message <1b29fa92-b1b7-2315-fbcf-0787ec0e1e68 at redhat.com>:
>>>> Hi sbd ‑ developers & users!
>>>>
>>>> Thanks to everybody for contributing to tests and
>>>> further development.
>>>>
>>>> Improvements in build/CI‑friendliness and
>>>> added robustness against misconfiguration
>>>> justify labeling the repo v1.4.2.
>>>>
>>>> I tried to quickly summarize the changes in the
>>>> repo since it was labeled v1.4.1:
>>>>
>>>> ‑ improve build/CI‑friendliness
>>>>
>>>>   * travis: switch to F32 as build‑host
>>>>             switch to F32 & leap‑15.2
>>>>             changes for mock‑2.0
>>>>             turn off loop‑devices & device‑mapper on x86_64 targets because
>>>>             of changes in GCE
>>>>   * regressions.sh: get timeouts from disk‑header to go with proper
>>>>                     defaults for architecture
>>>>   * use configure for watchdog‑default‑timeout & others
>>>>   * ship sbd.pc with basic sbd build information for downstream packages
>>>>     to use
>>>>   * add number of commits since version‑tag to build‑counter
>>>>
>>>> ‑ add robustness against misconfiguration / improve documentation
>>>>
>>>>   * add environment section to man‑page previously just available in
>>>>     template‑config
>>>>   * inform the user to restart the sbd service after disk‑initialization
>>> I thought with adding UUIDs sbd automatically detects a header change.
> You have a valid point here.
> Actually a disk-init on an operational cluster should be
> quite safe. (A very small race between header and slot
> read does exist.)
> It might make sense to consider taking the message back
> or revising it.
Yan Gao just pointed out to me that the timeout configuration is
not updated when it changes in the header.
I guess the message remains a good idea until that is tackled
one way or another.

Klaus
>>>>   * refuse to start if any of the configured device names is invalid
>>> Is this a good idea? Assume you configured two devices, and one device fails.
>>> Do you really want to prevent sbd startup then?
>> AFAICT, it's just making sure the device name is of a valid format.
>>
>> https://github.com/ClusterLabs/sbd/blob/master/src/sbd-inquisitor.c#L830-L833
>> -> https://github.com/ClusterLabs/sbd/blob/master/src/sbd-inquisitor.c#L65-L78
>> -- --> https://github.com/ClusterLabs/sbd/blob/master/src/sbd-common.c#L1189-L1220
>>
>>>>   * add handshake to sync startup/shutdown with pacemakerd
>>>>     Previously sbd just waited for the cib‑connection to show up/go away,
>>>>     which isn't robust at all.
>>>>     The new feature needs the new pacemakerd‑api as its counterpart.
>>>>     Thus the build checks for the presence of pacemakerd‑api.
>>>>     To simplify downstream adoption, the behavior is configurable at runtime
>>>>     via configure‑file with a build‑time‑configurable default.
>>>>   * refuse to start if qdevice‑sync_timeout doesn't match watchdog‑timeout
>>>>     Needed in particular as qdevice‑sync_timeout delays quorum‑state‑update
>>>>     and has a default of 30s that doesn't match the 5s watchdog‑timeout
>>>>     default.
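To illustrate the qdevice check described above, here is a minimal sketch. It assumes that "match" means the watchdog-timeout has to be strictly larger than qdevice-sync_timeout; that rule, and the function names, are illustrative assumptions and not taken from the sbd sources.

```python
from typing import Optional


def timeouts_compatible(watchdog_timeout_s: int, qdevice_sync_timeout_s: int) -> bool:
    """Assumed rule: the watchdog must not fire before a delayed
    quorum-state-update from qdevice can arrive."""
    return watchdog_timeout_s > qdevice_sync_timeout_s


def startup_error(watchdog_timeout_s: int, qdevice_sync_timeout_s: int) -> Optional[str]:
    """Return a refusal message, or None if startup may proceed."""
    if timeouts_compatible(watchdog_timeout_s, qdevice_sync_timeout_s):
        return None
    # Hypothetical wording, not the message sbd actually logs.
    return (
        f"watchdog-timeout ({watchdog_timeout_s}s) does not match "
        f"qdevice-sync_timeout ({qdevice_sync_timeout_s}s); refusing to start"
    )
```

With the defaults mentioned above (5s watchdog-timeout, 30s sync_timeout) `startup_error(5, 30)` returns a message, so under this assumed rule one of the two values would have to be adjusted.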
>>>>
>>>> ‑ Fix: sbd‑pacemaker: handle new no_quorum_demote + robustness against new
>>>>                       policies added
>>>> ‑ Fix: agent: correctly compare string values when calculating timeout
>>>> ‑ Fix: scheduling: overhaul the whole thing
>>>>   * prevent possible lockup when format in proc changes
>>>>   * properly get and handle scheduler policy & prio
>>>>   * on SCHED_RR failing push to the max with SCHED_OTHER
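The fallback in the last bullet could look roughly like the sketch below. It is written in Python for brevity and is not the sbd code (which is C). Reading "push to the max with SCHED_OTHER" as "set the nice value to -20" is an assumption, and all function names are made up for illustration.

```python
import os
from typing import Callable


def set_rr() -> None:
    """Request SCHED_RR at the highest priority the kernel offers."""
    prio = os.sched_get_priority_max(os.SCHED_RR)
    os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(prio))


def set_best_nice() -> None:
    """Stay in SCHED_OTHER but ask for the most favourable nice value."""
    os.setpriority(os.PRIO_PROCESS, 0, -20)


def make_realtime(try_rr: Callable[[], None] = set_rr,
                  try_nice: Callable[[], None] = set_best_nice) -> str:
    """Try SCHED_RR first; if that is refused, fall back to SCHED_OTHER."""
    try:
        try_rr()
        return "SCHED_RR"
    except OSError:  # includes PermissionError (EPERM)
        pass
    try:
        try_nice()
        return "SCHED_OTHER, nice -20"
    except OSError:
        return "SCHED_OTHER, unchanged"
```

Calling `make_realtime()` without arguments acts on the current process. The two steps are passed in as callables only so the fallback order can be exercised without root privileges.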
>>> Do you also mess with ioprio/ionice?
> Yes, IOPRIO_CLASS_RT.
> But a good reminder to check to what extent the hacky code doing this
> is still state of the art and whether it is even effective when using AIO.
>
> Klaus
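For reference, requesting IOPRIO_CLASS_RT amounts to the sketch below. The constants follow the Linux ioprio interface. The priority level 1, the function names and the x86_64-only syscall number are assumptions made for illustration. The sketch does not say anything about whether the setting is effective with AIO.

```python
import ctypes
import os
import platform

IOPRIO_CLASS_SHIFT = 13   # class lives in the upper bits of the value
IOPRIO_CLASS_RT = 1       # realtime I/O class
IOPRIO_WHO_PROCESS = 1    # "who" argument: a single process


def ioprio_value(io_class: int, level: int) -> int:
    """Combine class and level the way IOPRIO_PRIO_VALUE() does."""
    if not 0 <= level <= 7:
        raise ValueError("level must be in 0..7 (0 is highest)")
    return (io_class << IOPRIO_CLASS_SHIFT) | level


def set_rt_ioprio(level: int = 1) -> None:
    """Call ioprio_set(2) for the current process (needs privileges)."""
    if platform.machine() != "x86_64":
        raise NotImplementedError("syscall number below is x86_64-only")
    sys_ioprio_set = 251  # x86_64 syscall number for ioprio_set
    libc = ctypes.CDLL(None, use_errno=True)
    value = ioprio_value(IOPRIO_CLASS_RT, level)
    if libc.syscall(sys_ioprio_set, IOPRIO_WHO_PROCESS, 0, value) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

`set_rt_ioprio()` is not called here. Without the needed privileges it is expected to fail with EPERM.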
>>> Regards,
>>> Ulrich
>>>
>>>> Regards,
>>>> Klaus
>>>>
>>>> _______________________________________________
>>>> Manage your subscription:
>>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>>
>>>> ClusterLabs home: https://www.clusterlabs.org/
>>>
>>


