[Pacemaker] Reg: Pacemaker, DRBD, CMAN : Stopping pacemaker service hangs indefinitely and waits for kvm guest to shutdown

Lohit Valleru lohitv at gwmail.gwu.edu
Sat Nov 9 22:10:19 UTC 2013


Hello all,

I have brought up a test Pacemaker cluster with:

Pacemaker 1.1.8-7.el6 (Build: 394e906)
DRBD 8.4.4
CMAN 3.0.12

I am trying to make KVM guests highly available using 2 server hosts. The
KVM guests are not part of the pacemaker cluster; only the KVM hosts are
cluster nodes.

The Dummy service and DRBD seem to migrate cleanly when I shut down the
pacemaker service on one of the physical nodes. However, the same thing
does not happen for the VirtualDomain resource.

I am using the ocf:heartbeat:VirtualDomain resource to manage the KVM guest
(rsc_lvpvm01), and tried to include colocation and order constraints
with the DRBD block device (VmData2Clone), as below:



pcs -f fs_cfg constraint colocation add rsc_lvpvm01 VmData2Clone INFINITY with-rsc-role=Master

pcs -f fs_cfg constraint order promote VmData2Clone then start rsc_lvpvm01
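
Since the constraints above are written to the fs_cfg file while the resource is created in kvm_cfg, it may be worth confirming that both constraints actually reached the live CIB. The following is only a minimal sketch: the function name check_constraints is hypothetical, it assumes cibadmin (shipped with Pacemaker) is on the PATH of a cluster node, and it performs nothing more than a string match on the resource IDs.

```shell
# Sketch: check that the live CIB's constraints section mentions both
# resource IDs used in the colocation and order constraints.
# check_constraints is a hypothetical helper name, not part of pcs or Pacemaker.
check_constraints() {
    # Query only the <constraints> section of the live CIB.
    cib_constraints=$(cibadmin -Q -o constraints) || return 1
    for id in rsc_lvpvm01 VmData2Clone; do
        if ! printf '%s\n' "$cib_constraints" | grep -q "$id"; then
            echo "no constraint mentions $id"
            return 1
        fi
    done
    echo "constraints reference both resources"
}
```

Calling check_constraints on either node after the configuration files have been pushed prints one line and returns non-zero if either ID is absent. It does not validate the constraint semantics (roles, scores, ordering).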


I have created the VirtualDomain resource as below :

pcs -f kvm_cfg resource create rsc_lvpvm01 ocf:heartbeat:VirtualDomain \
    hypervisor="qemu:///system" config="/etc/libvirt/qemu/lvpvm01.xml" \
    meta allow-migrate="true" op monitor timeout="30" interval="10" \
    op start timeout="120s" op stop timeout="120s"


I have just included IMM fencing for the two physical nodes.


This is the behaviour I observed:

1. When I shut down the pacemaker service on a physical node, it waits
   indefinitely at "waiting for managed resources to shutdown....".

2. As soon as I stop the resource using "pcs resource rsc_lvpvm01 stop",
   the pending service shutdown completes cleanly and I see that my
   services are migrated. However, the VirtualDomain resource does not
   start on the other physical node unless I manually start it there
   using "pcs resource rsc_lvpvm01 start".

3. If I forcefully shut down my KVM physical host, I cannot start
   VirtualDomain on the other node, which I would expect to work.

4. I am able to start VirtualDomain on the other node only if I
   manually stop the resource as in step 2, instead of abruptly
   powering off the physical host.
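
To make the behaviour in the steps above easier to compare between the two nodes, a one-shot snapshot of cluster, DRBD and libvirt state can be captured before and after each step. This is a rough sketch, not a prescribed procedure: cluster_snapshot is a hypothetical helper name, and it assumes crm_mon, crm_simulate and virsh are installed on the host and that DRBD 8.4 exposes its state in /proc/drbd.

```shell
# Sketch: print a snapshot of cluster, DRBD and libvirt state.
# cluster_snapshot is a hypothetical helper name.
cluster_snapshot() {
    echo "== cluster status (one-shot, inactive resources, fail counts) =="
    crm_mon -1rf
    echo "== DRBD state =="
    if [ -r /proc/drbd ]; then
        cat /proc/drbd
    else
        echo "/proc/drbd not readable"
    fi
    echo "== libvirt domains =="
    virsh --connect qemu:///system list --all
    echo "== allocation scores from the live CIB =="
    crm_simulate -sL
}
```

Running cluster_snapshot on both hosts while the pacemaker shutdown is waiting would show which node holds the DRBD Master role, whether the guest is still running under libvirt, and what scores the policy engine assigns to rsc_lvpvm01 on each node.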


It might be that I need to add better constraints, delays and fencing
for VirtualDomain, but I do not understand where exactly the problem is.

May I please ask for some help with this issue?

Please find my CIB dump attached.

Thanks,

Lohit
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dump_cfg.xml
Type: text/xml
Size: 15697 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20131109/48f3561f/attachment-0003.xml>