[Pacemaker] Problem with stonith in rhel7 + pacemaker 1.1.10 + fence_virsh

Digimer lists at alteeve.ca
Mon Dec 23 21:31:40 UTC 2013


On 23/12/13 02:31 PM, David Vossel wrote:
>
> ----- Original Message -----
>> From: "Digimer" <lists at alteeve.ca>
>> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
>> Sent: Monday, December 23, 2013 12:42:23 PM
>> Subject: Re: [Pacemaker] Problem with stonith in rhel7 + pacemaker 1.1.10 + fence_virsh
>>
>> On 23/12/13 01:30 PM, David Vossel wrote:
>>> ----- Original Message -----
>>>> From: "Digimer" <lists at alteeve.ca>
>>>> To: "The Pacemaker cluster resource manager"
>>>> <pacemaker at oss.clusterlabs.org>
>>>> Sent: Saturday, December 21, 2013 2:39:46 PM
>>>> Subject: [Pacemaker] Problem with stonith in rhel7 + pacemaker 1.1.10 +
>>>> 	fence_virsh
>>>>
>>>> Hi all,
>>>>
>>>>      I'm trying to learn pacemaker (still) using a pair of RHEL 7 beta
>>>> VMs. I've got stonith configured and it technically works (crashed node
>>>> reboots), but pacemaker hangs...
>>>>
>>>> Here is the config:
>>>>
>>>> ====
>>>> Cluster Name: rhel7-pcmk
>>>> Corosync Nodes:
>>>>     rhel7-01.alteeve.ca rhel7-02.alteeve.ca
>>>> Pacemaker Nodes:
>>>>     rhel7-01.alteeve.ca rhel7-02.alteeve.ca
>>>>
>>>> Resources:
>>>>
>>>> Stonith Devices:
>>>>     Resource: fence_n01_virsh (class=stonith type=fence_virsh)
>>>>      Attributes: pcmk_host_list=rhel7-01 ipaddr=lemass action=reboot
>>>> login=root passwd_script=/root/lemass.pw delay=15 port=rhel7_01
>>>>      Operations: monitor interval=60s
>>>>      (fence_n01_virsh-monitor-interval-60s)
>>>>     Resource: fence_n02_virsh (class=stonith type=fence_virsh)
>>>>      Attributes: pcmk_host_list=rhel7-02 ipaddr=lemass action=reboot
>>>
>>>
>>> When using fence_virt, the easiest way to configure everything is to name
>>> the actual virtual machines the same as what their corosync node names are
>>> going to be.
>>>
>>> If you run this command in a virtual machine, you can see the names
>>> fence_virt uses for each of the nodes.
>>> fence_xvm -o list
>>> node1          c4dbe904-f51a-d53f-7ef0-2b03361c6401 on
>>> node2          c4dbe904-f51a-d53f-7ef0-2b03361c6402 on
>>> node3          c4dbe904-f51a-d53f-7ef0-2b03361c6403 on
>>>
>>> If you name the vm the same as the node name, you don't even have to list
>>> the static host list. Stonith will do all that magic behind the scenes. If
>>> the node names do not match, try the 'pcmk_host_map' option. I believe you
>>> should be able to map the corosync node name to the vm's name using that
>>> option.
>>>
>>> Hope that helps :)
>>>
>>> -- Vossel
>>
>> Hi David,
>>
>>     I'm using fence_virsh,
>
> ah sorry, missed that.
>
>> not fence_virtd/fence_xvm. For reasons I've
>> not been able to resolve, fence_xvm has been unreliable on Fedora for
>> some time now.
>
> the multicast bug :(

That's the one.

I'm rebuilding the nodes now with VM/virsh names that match the host 
name. Will see if that helps/makes a difference.
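
If renaming turns out not to be practical, the 'pcmk_host_map' route
David mentioned might look roughly like the sketch below. It assumes the
corosync node name is rhel7-01.alteeve.ca and the libvirt domain is
rhel7_01 (both taken from the config quoted above), and it only prints
the pcs command as a dry run. Treat it as untested on this cluster.

```shell
# Assumption: corosync node name and libvirt domain name for node 1,
# copied from the quoted config (node list and the fence_virsh "port").
node="rhel7-01.alteeve.ca"
domain="rhel7_01"

# pcmk_host_map takes "nodename:port" pairs, separated by ";" if several.
host_map="${node}:${domain}"

# Dry run: print the command instead of running it.
# Remove the leading "echo" to actually update the stonith resource.
echo pcs stonith update fence_n01_virsh "pcmk_host_map=${host_map}"
```

The same pattern would apply to fence_n02_virsh with that node's own
names.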

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



