[ClusterLabs] Beginner Question about VirtualDomain

Digimer lists at alteeve.ca
Wed Aug 19 01:10:08 EDT 2020

On 2020-08-17 8:40 a.m., Sameer Dhiman wrote:
> Hi,
> I am a beginner using pacemaker and corosync. I am trying to set up
> a cluster of HA KVM guests as described by Alteeve wiki (CentOS-6) but
> in CentOS-8.2. My R&D  setup is described below
> Physical Host running CentOS-8.2 with Nested Virtualization
> 2 x CentOS-8.2 guest machines as Cluster Node 1 and 2.
> WinXP as a HA guest.
> 1. drbd --> dlm --> lvmlockd --> LVM-activate --> gfs2 (guest machine
> definitions)
> 2. drbd --> dlm --> lvmlockd --> LVM-activate --> raw-lv (guest machine HDD)
> Question(s):
> 1. How to prevent guest startup until gfs2 and raw-lv are available? In
> CentOS-6 Alteeve used autostart=0 in the <vm > tag. Is there any similar
> option in pacemaker because I did not find it in the documentation?
> 2. Suppose, If I configure constraint order gfs2 and raw-lv then guest
> machine. Stopping the guest machine would also stop the complete service
> tree so how can I prevent this?
> -- 
> Sameer Dhiman

Hi Sameer,

  I'm the author of that wiki. It's quite out of date, as you noted, and
we're actively developing a new release for EL8. It won't be ready until
near the end of the year, though.

  There are a few changes we've made that you might want to consider:

1. We never were too happy with DLM, and so we've reworked things to no
longer need it. We now use normal LVM backing DRBD resources: one
resource per VM, one volume per virtual disk, each backed by an LV. Our
tools will automate this, but you can easily enough create them manually
if your environment is fairly stable.

2. To get around GFS2, we create a
/mnt/shared/{provision,definitions,files,archive} directory tree (note
/shared -> /mnt/shared to be more LFS friendly). We'll again automate
management of files in Striker, but you can copy the files manually and
rsync out changes as needed (again, if your environment doesn't change
much).

3. We changed DRBD from v8.4 to 9.0, and this meant a few things had to
change. We will integrate support for short-throw DR hosts (async "third
node" in DRBD that is outside pacemaker). We configure the resources to
allow only a single primary normally, with auto-promote enabled. For
live migration, we temporarily enable dual-primary, promote the
target, migrate, demote the old host and disable dual-primary. This
makes it safer as it's far less likely that someone could accidentally
start a VM on the passive node (not that it ever happened as our tools
prevented it, but it was _possible_, so we wanted to improve that).
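
For anyone doing this by hand, the live-migration sequence in #3 can be
sketched roughly as below. This is only an illustration, not the code in
our RA: the resource, domain and node names are made up, the plan is only
built and never executed, and the drbdadm/virsh arguments are my
assumption of typical usage, so check them against your installed
versions first.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    host: str        # node the command would be run on
    argv: List[str]  # command and its arguments


def migration_plan(resource: str, domain: str,
                   source: str, target: str) -> List[Step]:
    """Build the ordered commands for one live migration.

    Nothing is executed here; the caller decides how to run each step
    (ssh, a resource agent, by hand, ...).
    """
    # Assumption: dual-primary is toggled at runtime on both nodes.
    dual_on = ["drbdadm", "net-options", "--allow-two-primaries=yes", resource]
    dual_off = ["drbdadm", "net-options", "--allow-two-primaries=no", resource]
    return [
        # 1. Temporarily allow two primaries.
        Step(source, dual_on),
        Step(target, dual_on),
        # 2. Promote the migration target.
        Step(target, ["drbdadm", "primary", resource]),
        # 3. Live-migrate the guest.
        Step(source, ["virsh", "migrate", "--live", domain,
                      f"qemu+ssh://{target}/system"]),
        # 4. Demote the old host.
        Step(source, ["drbdadm", "secondary", resource]),
        # 5. Return to single-primary.
        Step(source, dual_off),
        Step(target, dual_off),
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for step in migration_plan("srv01", "srv01", "node1", "node2"):
        print(f"[{step.host}] {' '.join(step.argv)}")
```

A real implementation would stop at the first failed step and still make
sure dual-primary gets switched off again.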

To handle #3, we've written our own custom RA (ocf:alteeve:server
[1]). This RA is smart enough to watch/wait for things to become ready
before starting. It also handles the DRBD stuff I mentioned, and the
virsh call to do the migration. So it means the pacemaker config is
extremely simple. Note though it depends on the rest of our tools so it
won't work outside the Anvil!. That said, if you wanted to use it before
we release Anvil! M3, you could probably adapt it easily enough.
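
On your first question, the "watch/wait for things to become ready" idea
is the core of it. A minimal sketch of that pattern is below. It is not
taken from ocf:alteeve:server; the function names, the paths and the
timeout values are all hypothetical, and a real agent would also have to
stay inside the start timeout pacemaker gives it.

```python
import os
import time
from typing import Callable, Iterable


def wait_until(check: Callable[[], bool],
               timeout: float = 60.0,
               interval: float = 2.0) -> bool:
    """Poll check() until it returns True or the timeout expires.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def storage_ready(paths: Iterable[str]) -> bool:
    """True when every path the guest depends on exists."""
    return all(os.path.exists(p) for p in paths)


if __name__ == "__main__":
    # Hypothetical paths: the guest definition and its backing device.
    needed = ["/mnt/shared/definitions/srv01.xml", "/dev/drbd0"]
    if wait_until(lambda: storage_ready(needed), timeout=10, interval=1):
        print("storage ready, safe to start the guest")
    else:
        print("storage not ready, refusing to start the guest")
```

With a check like this in the start action, the guest simply doesn't
boot until its storage is there, which is what autostart=0 was working
around on EL6.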

If you have any questions, please let me know and I'll help as best I can.



(Note: during development, this code base is kept outside of
Clusterlabs. We'll move it in when it reaches beta).
1. https://github.com/digimer/anvil/blob/master/ocf/alteeve/server

Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
