[Pacemaker] Problem with Xen live migration

Jean-Francois Malouin Jean-Francois.Malouin at bic.mni.mcgill.ca
Tue Jan 18 16:18:31 EST 2011


* Vladislav Bogdanov <bubble at hoster-ok.com> [20110118 08:41]:
> > Unless clustered LVM locking is enabled and working:
> > # sed -ri 's/^([ \t]+locking_type).*/    locking_type = 3/' /etc/lvm/lvm.conf
> > # sed -ri 's/^([ \t]+fallback_to_local_locking).*/    fallback_to_local_locking = 1/' /etc/lvm/lvm.conf
> > # vgchange -cy VG_NAME
> > # service clvmd start
> > # vgs|grep VG_NAME
> > 
> > Of course, this may vary from one distro to another.
> > 
> > A further step can be declaring clvmd as a Pacemaker clone resource (but the
> > stock Fedora clvmd LSB script is poorly suited to this, so I use a
> > slightly modified version of the OCF RA published by Novell).
> > 
> > BTW config above has master-max="2"...
> > The only thing missing there is a clone for groLVM-OCFS (or clones for
> > individual primitives + extra colocations/orders).
> > 
> > Jean-Francois, you can try (assuming you set up cLVM):
> > primitive resDRBDr1 ocf:linbit:drbd params drbd_resource="r1" ...
> > primitive resLVM ocf:heartbeat:LVM params volgrpname="xen_vg" ...
> > primitive resOCFS2 ocf:heartbeat:Filesystem fstype="ocfs2" ...
> > primitive resXen1 ocf:heartbeat:Xen \
> >        params xmfile="/etc/xen/xen1cfg" name="xen1" \
> >        meta allow-migrate="true"
> > group groLVM-OCFS resLVM resOCFS2
> > clone cl-groLVM-OCFS groLVM-OCFS \
> >        meta interleave="true"
> > ms msDRBDr1 resDRBDr1 \
> >        meta notify="true" master-max="2" interleave="true"
> > colocation colLVM-OCFS-on-DRBDr1Master inf: cl-groLVM-OCFS msDRBDr1:Master
> > colocation colXen-with-OcfsXen inf: resXen1 cl-groLVM-OCFS
> > order ordDRBDr1-before-LVM inf: msDRBDr1:promote cl-groLVM-OCFS:start
> > order ordLVM-OCFS-before-Xen inf: cl-groLVM-OCFS:start resXen1:start
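
Before cloning the LVM group, it can help to confirm that the lvm.conf
edits quoted above actually took effect on every node. The following is
only a minimal sketch: the function names are made up for illustration,
and it assumes the setting appears as a plain "locking_type = N" line,
as the sed commands above produce.

```shell
#!/bin/sh
# Print the locking_type value set in an lvm.conf-style file.
# The last uncommented assignment wins; prints nothing if none is found.
lvm_locking_type() {
    conf="${1:-/etc/lvm/lvm.conf}"
    sed -nE 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*([0-9]+).*/\1/p' "$conf" | tail -n 1
}

# Return 0 only if the file selects locking type 3 (the clustered
# locking value used in the sed command above).
lvm_is_clustered() {
    [ "$(lvm_locking_type "$1")" = "3" ]
}
```

Running "lvm_is_clustered || echo 'clustered locking not enabled'" on
each node is one way to use it; with no argument it reads
/etc/lvm/lvm.conf.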

I had a go at this (cloning the LVM-OCFS group and setting up the
colocation and order constraints), but in that case the Xen resource would
not even start.

> 
> Ah, you should also run dlm_controld before clvmd (or dlm_controld.pcmk,
> depending on which stack your cluster uses) and ocfs2_controld(.pcmk)
> in parallel with clvmd. This is outlined in the Pacemaker docs.

Yes, I've read that
(http://www.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2),
but my attempt to have the dlm and o2cb services (Debian Squeeze comes
with dlm_controld.pcmk and ocfs2_controld.pcmk) managed by the cluster
failed. I suspect some hooks are not present, or at least not enabled, for
the DLM resource to get started. From my notes, here are the relevant
logs:

udevd-work[18267]: kernel-provided name 'dlm-monitor' and NAME= 'misc/dlm-monitor' disagree, please use SYMLINK+= or change the kernel to provide the proper name 
udevd-work[19491]: kernel-provided name 'dlm-control' and NAME= 'misc/dlm-control' disagree, please use SYMLINK+= or change the kernel to provide the proper name
udevd-work[18268]: kernel-provided name 'dlm_plock' and NAME= 'misc/dlm_plock' disagree, please use SYMLINK+= or change the kernel to provide the proper name
kernel: [11754.231771] DLM (built Sep 17 2010 21:58:47) installed
lrmd: [5728]: info: RA output: (resDLM:0:probe:stderr) dlm_controld.pcmk: no process found
o2cb[5752]: INFO: configfs not laoded
o2cb[6224]: ERROR: ocfs2_controld.pcmk did not come up
corosync[5715]:   [pcmk  ] info: pcmk_notify: Enabling node notifications for child 10291 (0x7f282000c250)
ocfs2_controld.pcmk: Unable to connect to CKPT: Object does not exist
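
The o2cb line about configfs suggests one prerequisite that can be
checked by hand before the cluster tries to start the resource. A rough
sketch follows, with hypothetical function names; it assumes a
Linux-style mounts table in which the filesystem type is the third
field, and the mount command in the hint is the usual one for configfs,
not something taken from these logs.

```shell
#!/bin/sh
# Return 0 if a filesystem of the given type is listed in a mounts table
# (default /proc/mounts, where the type is the third field).
fstype_mounted() {
    fstype="$1"
    mounts="${2:-/proc/mounts}"
    awk -v t="$fstype" '$3 == t { found = 1 } END { exit !found }' "$mounts"
}

# Report whether configfs, which the o2cb agent complained about, is mounted.
check_configfs() {
    if fstype_mounted configfs "$1"; then
        echo "configfs is mounted"
    else
        echo "configfs is not mounted; try: mount -t configfs none /sys/kernel/config"
        return 1
    fi
}
```

Calling check_configfs with no argument inspects /proc/mounts on the
local node.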

This thread shed a little light on the missing ckpt:
http://www.gossamer-threads.com/lists/linuxha/pacemaker/65702

but I still didn't manage to get it working properly when I added the
ckpt to corosync.
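
For anyone retracing this, one way to check whether a given service is
declared in the corosync configuration is sketched below. The function
name is made up, and the service name and file locations in the example
comment (openais_ckpt, /etc/corosync/service.d/) are assumptions about a
typical install rather than details from this thread.

```shell
#!/bin/sh
# Return 0 if a corosync-style config file contains a "name: <service>" line.
corosync_has_service() {
    name="$1"
    file="$2"
    grep -Eq "^[[:space:]]*name:[[:space:]]*${name}[[:space:]]*\$" "$file"
}

# Example (service name and paths are assumptions, adjust for your install):
#   for f in /etc/corosync/corosync.conf /etc/corosync/service.d/*; do
#       corosync_has_service openais_ckpt "$f" && echo "ckpt declared in $f"
#   done
```

This only confirms that the stanza is present in the file; it does not
show whether corosync actually loaded the service.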

thanks for the help,
jf


> 
> > 
> > This should help (I use something similar for KVM with live migration).
> > 
> > Best,
> > Vladislav
> > 
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker

-- 
<° >< Jean-François Malouin          McConnell Brain Imaging Centre        
Systems/Network Administrator       Montréal Neurological Institute
3801 Rue University, Suite WB219          Montréal, Québec, H3A 2B4
Phone: 514-398-8924                               Fax: 514-398-8948



