[ClusterLabs] Antw: Re: 2-Node Cluster Pointless?
Digimer
lists at alteeve.ca
Sat Apr 22 19:02:31 CEST 2017
On 22/04/17 04:39 AM, Andrei Borzenkov wrote:
> 22.04.2017 11:31, Klaus Wenninger wrote:
>>>>>
>>>> I wonder how SBD fits into this discussion. It is marketed as stonith
>>>> agent, but it is based on committing suicide so relies on well-behaving
>>>> nodes. Which we by definition cannot trust to behave well, otherwise
>>>> we'd not need stonith in the first place.
>>> The logic, when using a watchdog timer, is that if the node is alive
>>> enough to kick the watchdog, it's alive enough to not do something dumb
>>> to the cluster. If it's not able to kick the timer, the watchdog timer
>>> will reset the machine. This works *if* all resources hang when messages
>>> stop coming back from the peer (a side effect of corosync's virtual
>>> synchrony).
>>
>> In fact watchdog-implementations (meaning the software that
>> kicks the hardware-watchdog) are a little bit smarter - and
>> so is SBD.
>> By having the watchdog-kicking and observation-code in a
>> simple loop that is executed periodically you don't need the
>> 'if it is alive enough to do the kicking it will behave well'
>> paradigm.
>> This boils down to making the critical part of the code very
>> small. On top of that, hard-to-control failures that result in
>> any kind of hang don't bother us.
>>
>>>
>>> So as I understand it, for SBD to be safe, it requires a hardware
>>> watchdog timer and a properly configured cluster.
>>
>> Yes, yes and yes ... as important as fencing I would say ;-)
>>
>
> So I gather that for SBD to be reasonably safe, it needs real hardware
> watchdog. I often see SBD recommended as stonith agent inside a VM,
> where we do not have "hardware watchdog" by definition. I still wonder
> whether it can be trusted in this case.
I suppose it depends. The fact that it requires some measure of
predictable behaviour is concerning to me. That said, I have the same
reservation about IPMI itself. So to me, "proper" fencing requires a
totally external backup option, like a pair of switched PDUs. Of
course, I'm more paranoid than most.

Having SBD properly configured is *massively* safer than no fencing at
all. So for people who have no other fence method available, for
whatever reason, SBD is the way to go.
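
To illustrate the "simple loop" idea from the quoted text above, here is a
rough sketch in Python. It is a simulation only, not how SBD is actually
implemented: SimulatedWatchdog, run_loop and the check functions are
hypothetical names made up for this example, and a simulated clock stands
in for a real hardware timer. The point is that the loop only kicks the
watchdog when every check passes, so a failed check (or a hung loop) lets
the timer run out and the node gets reset.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SimulatedWatchdog:
    """Hypothetical stand-in for a hardware watchdog timer.

    A real watchdog resets the machine when it expires; this one only
    reports that it would have done so.
    """
    timeout: float
    last_kick: float = 0.0

    def kick(self, now: float) -> None:
        # Restart the countdown.
        self.last_kick = now

    def expired(self, now: float) -> bool:
        # True once more than `timeout` has passed since the last kick.
        return now - self.last_kick > self.timeout


def run_loop(
    checks: List[Callable[[float], bool]],
    watchdog: SimulatedWatchdog,
    interval: float,
    ticks: int,
) -> Optional[float]:
    """Run the observe-then-kick loop on a simulated clock.

    Returns the simulated time at which the watchdog would have reset
    the node, or None if it never expired within `ticks` iterations.
    """
    now = 0.0
    for _ in range(ticks):
        if watchdog.expired(now):
            return now  # the hardware would have reset the node here
        # Kick only when every observation says the node is healthy.
        if all(check(now) for check in checks):
            watchdog.kick(now)
        now += interval
    return None


def always_healthy(now: float) -> bool:
    return True


def healthy_until_three(now: float) -> bool:
    # Assumed failure for the example: the check starts failing at t=3.
    return now < 3.0


if __name__ == "__main__":
    ok = run_loop([always_healthy], SimulatedWatchdog(timeout=5.0), 1.0, 20)
    bad = run_loop([healthy_until_three], SimulatedWatchdog(timeout=5.0), 1.0, 20)
    print("healthy node reset at:", ok)
    print("failing node reset at:", bad)
```

Because the kick and the checks live in the same small loop, nothing has to
trust the rest of the node to behave well; the only thing that keeps the
node alive is the loop repeatedly proving the checks still pass.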
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould