[ClusterLabs] question about fence-virsh
Digimer
lists at alteeve.ca
Fri May 19 18:53:13 EDT 2017
On 19/05/17 05:30 PM, Ken Gaillot wrote:
> On 05/19/2017 03:47 PM, Andrew Kerber wrote:
>> What I am trying to say here is when I get one of the virtual machines
>> in a bad state, I can still log in and reboot it with the reboot
>> command. But I need my fencing resource to handle that reboot.
>>
>> On Fri, May 19, 2017 at 1:32 PM, Andrew Kerber
>> <andrew.kerber at gmail.com> wrote:
>>
>> Thanks for the answer, but that's not the problem. I don't have
>> access to the console; it's a security issue. I only have access
>> within the virtual machines, so I want to send the reboot command
>> within the virtual machine, not to the console. Typically our
>> hangups are such that the reboot command works, and the machine
>> hangs at starting back up, and I get an admin to go hit the console.
>
> What you're asking for is an "ssh" fence agent. While such can be found,
> they are not considered reliable fence agents.
>
> Your *typical* problem may be solvable with running "reboot" inside the
> VM, but there are situations in which that won't work (kernel panic,
> loss of network connectivity in the VM, crippling load, etc.). Only
> access to the hypervisor can provide a reliable fence mechanism for the VM.
>
> If you're lucky, whoever is providing your VM can also provide you an
> API to use to request a hard reboot of the VM at the hypervisor level.
> Then, you can see if there is a fence agent already written for that
> API, or modify an existing one to handle it.
>
> If you can't even get API access to the hypervisor, then you're not
> going to get full HA. You could search for an ssh fence agent, but be
> aware that's a partial solution at best, and you won't be able to
> recover from certain failure scenarios.
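
Ken is correct. Fencing must work no matter what state the victim is in.
You can see this for yourself on a test cluster: run
'echo c > /proc/sysrq-trigger' as root on one node to cause a kernel
panic. A fence agent that depends on the guest OS (such as one that
logs in over ssh) cannot reach the panicked node, so fencing fails and
your cluster will hang.

You need to talk to your security team to get access to the hypervisor.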
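
The kernel panic test mentioned above could be scripted roughly as follows. This is only a sketch for a disposable test cluster: the helper names (run, crash_victim, check_fencing), the DRY_RUN switch, and the node name node2 are illustrative assumptions and do not come from this thread. It assumes a Pacemaker cluster where crm_mon and stonith_admin are installed.

```shell
#!/bin/sh
# Sketch: check that fencing works when the victim cannot cooperate.
# DRY_RUN=1 (the default here) only prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

# Step 1, as root on the victim node: panic the kernel.
# The node stops immediately, with no clean shutdown.
crash_victim() {
    run sh -c 'echo c > /proc/sysrq-trigger'
}

# Step 2, on a surviving node: print the cluster status once, then the
# fencing history recorded for the victim (node name passed as $1).
check_fencing() {
    run crm_mon -1
    run stonith_admin --history "$1"
}
```

After sourcing the file, calling crash_victim on the victim and check_fencing node2 on a survivor prints the commands without running them. Set DRY_RUN=0 only on a cluster you can afford to crash. If the surviving node never reports a successful fence of the victim, the fence method cannot be relied on.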
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould