[ClusterLabs] fence_apc delay?
Dan Swartzendruber
dswartz at druber.com
Sat Sep 3 13:50:10 UTC 2016
On 2016-09-03 08:41, Marek Grac wrote:
> Hi,
>
> There are two problems mentioned in the email.
>
> 1) power-wait
>
> Power-wait is quite an advanced option, and there are only a few fence
> devices/agents where it makes sense, and only because the HW/firmware
> on the device is somewhat broken. Basically, when we execute a power
> ON/OFF operation, we wait for power-wait seconds before we send the next
> command. I don't remember any issue of this kind with APC.
>
> 2) the only theory I could come up with was that maybe the fencing
> operation was considered complete too quickly?
>
> That is virtually impossible. Even when power ON/OFF is
> asynchronous, we test the status of the device, and the fence agent waits
> until the status of the plug/VM/... matches what the user wants.
I think you misunderstood my point (possibly I wasn't clear). I'm not
saying anything is wrong with either the fencing agent or the PDU.
Rather, my theory is that if the agent switches the power off and then
back on, and the interval it stays off is 'too short', a host like the
R905 can possibly continue to operate for a couple of seconds, continuing
to write data to the disks past the point where the other node begins to
do likewise. If power_wait is not the right way to wait, say, 10 seconds
to make 100% sure node A is dead as a doornail, what *is* the right way?
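
To make the question concrete, here is a rough sketch of the sequence I have in mind. It is not the fence_apc code: FakePlug, reboot, and their parameters are hypothetical names made up for illustration, and I am only assuming the extra hold-off would sit between the confirmed OFF and the ON.

```python
import time


class FakePlug:
    """Hypothetical stand-in for a PDU outlet; not the fence_apc API."""

    def __init__(self):
        self.state = "on"

    def set_power(self, state):
        self.state = state

    def get_power(self):
        return self.state


def reboot(plug, power_wait=0, poll_interval=0.01, timeout=5, sleep=time.sleep):
    """Power-cycle the outlet, holding it off for at least power_wait seconds."""
    plug.set_power("off")

    # Poll until the outlet itself reports "off" (the status check described above).
    waited = 0.0
    while plug.get_power() != "off":
        if waited >= timeout:
            raise TimeoutError("outlet never reported off")
        sleep(poll_interval)
        waited += poll_interval

    # The extra hold-off I am asking about: give the host time to really die.
    sleep(power_wait)

    plug.set_power("on")
    return plug.get_power()
```

With something like reboot(FakePlug(), power_wait=10), the outlet would stay off for at least 10 seconds after the PDU confirms OFF, which is the behaviour I am after.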