[Pacemaker] Problem with Xen live migration

Vladislav Bogdanov bubble at hoster-ok.com
Tue Jan 18 08:16:08 EST 2011


18.01.2011 14:45, Vadym Chepkov wrote:
> 
> On Jan 17, 2011, at 6:44 PM, Jean-Francois Malouin wrote:
> 
>> Back again to setup an active/passive cluster for Xen with live migration
>> but so far, no go. Xen DomU is shutdown and restarted when I move the
>> Xen resource.
>>
>> I'm using Debian Squeeze, pacemaker 1.0.9.1, corosync 1.2.1-4 with Xen 4.0.1-1
>> and kernel 2.6.32-30. DRBD is at 8.3.8.
>>
>> A logical volume 'xen_vg' is sitting on top of a drbd block device and an OCFS2
>> filesystem is created on the LV to hold the disk image for the Xen guest:
>>
>> [ drbd resDRBDr1 ] -> [ LVM resLVM ] -> [ OCFS2 resOCFS2 ] 
>>
>> The cluster logic is (timeouts, etc removed) something along those
>> lines:
>>
>> primitive resDRBDr1 ocf:linbit:drbd params drbd_resource="r1" ...
>> primitive resLVM ocf:heartbeat:LVM params volgrpname="xen_vg" ...
>> primitive resOCFS2 ocf:heartbeat:Filesystem fstype="ocfs2" ...
>> primitive resXen1 ocf:heartbeat:Xen \
>>        params xmfile="/etc/xen/xen1cfg" name="xen1" \
>>        meta allow-migrate="true"
>> group groLVM-OCFS resLVM resOCFS2 
>> ms msDRBDr1 resDRBDr1 \
>>        meta notify="true" master-max="2" interleave="true"
>> colocation colLVM-OCFS-on-DRBDr1Master inf: groLVM-OCFS msDRBDr1:Master
>> colocation colXen-with-OcfsXen inf: resXen1 groLVM-OCFS
>> order ordDRBDr1-before-LVM inf: msDRBDr1:promote groLVM-OCFS:start
>> order ordLVM-OCFS-before-Xen inf: groLVM-OCFS:start resXen1:start
>>
>> DRBD is configured with 'allow-two-primaries'.
>>
>> When I try to live migrate 'crm resource move' the Xen guest I get:
>>
>> pengine: [11978]: notice: check_stack_element: Cannot migrate resXen1
>> due to dependency on group groLVM-OCFS (coloc)
>>
>> and the guest is shutdown and restarted on the other node.
>>
>> What am I missing? Is it something obvious, or does the cluster logic
>> as it stands simply not permit live Xen migration?
>>
>> I have verified that Xen live migration itself works: without pacemaker
>> in the picture, on the now-passive node I can manually promote the drbd
>> block device, run vgscan to find the logical volume, 'lvchange -ay' to
>> make it available, mount the OCFS2 filesystem, run 'xm migrate --live' on
>> the active node, and the DomU comes up on the other node.
>>
>> Any help or examples very much appreciated!
>> jf
> 
> 
> I have tried it myself, and concluded it's impossible to do reliably with the current code.
> For the live migration to work you have to remove any colocation constraints (group included) with the Xen resource.
> drbd code includes a "helper" script - /etc/xen/scripts/block-drbd - but this script can't be used in a pacemaker environment,
> because it is not cluster-aware. And pacemaker does not handle this scenario at the moment:
> when Xen on drbd is stopped, both drbd nodes are secondary, which makes pacemaker "unhappy".
> You need both drbd nodes to be primary during migration only,
> but if you specify master-max="2", then both drbd nodes are primary all the time - a disaster waiting to happen.

Unless clustered LVM locking is enabled and working:
# sed -ri 's/^([ \t]+locking_type).*/\1 = 3/' /etc/lvm/lvm.conf
# sed -ri 's/^([ \t]+fallback_to_local_locking).*/\1 = 1/' /etc/lvm/lvm.conf
# vgchange -cy VG_NAME
# service clvmd start
# vgs | grep VG_NAME

Of course, this may vary from one distro to another.

A further step can be declaring clvmd as a pacemaker clone resource (the
stock Fedora clvmd LSB script is poorly suited for this, so I use a
slightly modified version of the OCF RA published by Novell).
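A minimal crm sketch of that step, assuming the Novell/SUSE ocf:lvm2:clvmd
RA and the ocf:pacemaker:controld RA (for the DLM, which clvmd needs) are
installed; the resource names here are illustrative:

```
# DLM and clvmd must run on every node that activates the VG
primitive resDLM ocf:pacemaker:controld
primitive resClvmd ocf:lvm2:clvmd
group groDLM-Clvmd resDLM resClvmd
clone cl-DLM-Clvmd groDLM-Clvmd \
       meta interleave="true"
```

The cLVM-backed storage resources would then be ordered after (and
colocated with) this clone.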

BTW, the config above already has master-max="2"...
The only thing missing there is a clone wrapping groLVM-OCFS (or clones of
the individual primitives plus extra colocations/orders).
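Note that for master-max="2" to actually work, DRBD itself must also
permit dual-primary. A sketch of the relevant drbd.conf fragment for
resource r1 (the surrounding sections and options are omitted here and
will differ per setup):

```
resource r1 {
  net {
    # required for both nodes to be Primary at once (DRBD 8.3 syntax)
    allow-two-primaries;
  }
}
```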

Jean-Francois, you can try (assuming you set up cLVM):
primitive resDRBDr1 ocf:linbit:drbd params drbd_resource="r1" ...
primitive resLVM ocf:heartbeat:LVM params volgrpname="xen_vg" ...
primitive resOCFS2 ocf:heartbeat:Filesystem fstype="ocfs2" ...
primitive resXen1 ocf:heartbeat:Xen \
       params xmfile="/etc/xen/xen1cfg" name="xen1" \
       meta allow-migrate="true"
group groLVM-OCFS resLVM resOCFS2
clone cl-groLVM-OCFS groLVM-OCFS \
       meta interleave="true"
ms msDRBDr1 resDRBDr1 \
       meta notify="true" master-max="2" interleave="true"
colocation colLVM-OCFS-on-DRBDr1Master inf: cl-groLVM-OCFS msDRBDr1:Master
colocation colXen-with-OcfsXen inf: resXen1 cl-groLVM-OCFS
order ordDRBDr1-before-LVM inf: msDRBDr1:promote cl-groLVM-OCFS:start
order ordLVM-OCFS-before-Xen inf: cl-groLVM-OCFS:start resXen1:start

This should help (I use something similar for KVM with live migration).
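With that in place, a migration can be triggered through pacemaker and the
resulting location constraint cleaned up afterwards (the node name node2
is illustrative):

```
# move the DomU; pacemaker should invoke the Xen RA's migrate_to/migrate_from
crm resource migrate resXen1 node2
# once the guest is confirmed up, clear the constraint the move created
crm resource unmigrate resXen1
```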

Best,
Vladislav



