[ClusterLabs] volume group won't start in a nested DRBD setup
Jean-Francois Malouin
Jean-Francois.Malouin at bic.mni.mcgill.ca
Mon Oct 28 15:44:50 EDT 2019
Hi,
Is there any new magic that I'm unaware of that needs to be added to a
pacemaker cluster using a nested DRBD setup? This is pacemaker 2.0.x and
DRBD 8.4.10 on Debian/Buster, on a 2-node cluster with stonith.
Eventually this will host a bunch of Xen VMs.
I had this sort of thing running for years with pacemaker 1.x and DRBD 8.4.x
without a hitch, and now with pacemaker 2.0 and DRBD 8.4.10 it gives me errors
when trying to start the volume group vg0 on this chain:
(VG)          (LV)          (PV)        (VG)
vmspace ----> xen_lv0 ----> drbd0 ----> vg0
Only drbd0 and everything after it are managed by pacemaker.
Here's what I have configured so far (stonith is configured but is not shown below):
---
primitive p_lvm_vg0 ocf:heartbeat:LVM \
        params volgrpname=vg0 \
        op monitor timeout=30s interval=10s \
        op_params interval=10s
primitive resDRBDr0 ocf:linbit:drbd \
        params drbd_resource=r0 \
        op start interval=0 timeout=240s \
        op stop interval=0 timeout=100s \
        op monitor interval=29s role=Master timeout=240s \
        op monitor interval=31s role=Slave timeout=240s \
        meta migration-threshold=3 failure-timeout=120s
ms ms_drbd_r0 resDRBDr0 \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
colocation c_lvm_vg0_on_drbd_r0 inf: p_lvm_vg0 ms_drbd_r0:Master
order o_drbd_r0_before_lvm_vg0 Mandatory: ms_drbd_r0:promote p_lvm_vg0:start
---
/etc/lvm/lvm.conf has global_filter set to:
global_filter = [ "a|/dev/drbd.*|", "a|/dev/md.*|", "a|/dev/md/.*|", "r|.*|" ]
But I'm not sure if it's sufficient. I seem to be missing some crucial ingredient.
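
As a rough illustration of how a filter list like the one above is evaluated,
here is a minimal Python sketch. It assumes the documented LVM behaviour that
entries are tried in order, the first matching pattern decides, and a path
matching nothing is accepted. It is a simplified model, not LVM's own code:
it uses Python's re module instead of LVM's regex engine, only handles entries
whose opening and closing delimiter are the same character, and ignores how
LVM treats multiple path aliases of one device. The helper names and the
candidate device paths are hypothetical.

```python
import re

def parse_filter(entries):
    """Turn entries such as 'a|/dev/drbd.*|' into (action, compiled regex) pairs."""
    rules = []
    for entry in entries:
        # Shortest valid entry is action + delimiter + delimiter, e.g. 'a||'.
        if len(entry) < 3 or entry[0] not in ("a", "r") or entry[-1] != entry[1]:
            raise ValueError(f"malformed filter entry: {entry!r}")
        rules.append((entry[0], re.compile(entry[2:-1])))
    return rules

def is_accepted(path, rules):
    """First matching rule decides; a path that matches no rule is accepted."""
    for action, pattern in rules:
        if pattern.search(path):  # patterns are unanchored unless they say otherwise
            return action == "a"
    return True

# The global_filter quoted in the post.
GLOBAL_FILTER = ["a|/dev/drbd.*|", "a|/dev/md.*|", "a|/dev/md/.*|", "r|.*|"]
RULES = parse_filter(GLOBAL_FILTER)

# Hypothetical device paths, chosen only to exercise each rule.
CANDIDATES = [
    "/dev/drbd0",
    "/dev/md0",
    "/dev/sda1",
    "/dev/vmspace/xen_lv0",
    "/dev/mapper/vmspace-xen_lv0",
]

if __name__ == "__main__":
    for path in CANDIDATES:
        verdict = "accept" if is_accepted(path, RULES) else "reject"
        print(f"{verdict}  {path}")
```

Under this model the DRBD and md device paths are accepted and everything
else, including the backing LV paths, is rejected by the final "r|.*|" entry.
It only models the filter and says nothing about why activation of vg0 fails.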
syslog on the DC shows the following when trying to start vg0:
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: Activating volume group vg0
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: Reading all physical volumes. This may take a while... Found volume group "vmspace" using metadata type lvm2 Found volume group "freespace" using metadata type lvm2 Found volume group "vg0" using metadata type lvm2
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: 0 logical volume(s) in volume group "vg0" now active
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: ERROR: LVM Volume vg0 is not available (stopped)
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: ERROR: LVM: vg0 did not activate correctly
Oct 28 14:42:56 node2 pacemaker-execd[27054]: notice: p_lvm_vg0_start_0:8775:stderr [ Configuration node global/use_lvmetad not found ]
Oct 28 14:42:56 node2 pacemaker-execd[27054]: notice: p_lvm_vg0_start_0:8775:stderr [ ocf-exit-reason:LVM: vg0 did not activate correctly ]
Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Result of start operation for p_lvm_vg0 on node2: 7 (not running)
Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: node2-p_lvm_vg0_start_0:77 [ Configuration node global/use_lvmetad not found\nocf-exit-reason:LVM: vg0 did not activate correctly\n ]
Oct 28 14:42:56 node2 pacemaker-controld[27057]: warning: Action 42 (p_lvm_vg0_start_0) on node2 failed (target: 0 vs. rc: 7): Error
Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Transition 602 aborted by operation p_lvm_vg0_start_0 'modify' on node2: Event failed
Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Transition 602 (Complete=28, Pending=0, Fired=0, Skipped=0, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-39.bz2): Complete
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: On loss of quorum: Ignore
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node1: not running
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node1 after 1000000 failures (max=1000000)
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: * Recover p_lvm_vg0 ( node2 )
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: Calculated transition 603, saving inputs in /var/lib/pacemaker/pengine/pe-input-40.bz2
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: On loss of quorum: Ignore
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node1: not running
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node2 after 1000000 failures (max=1000000)
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node1 after 1000000 failures (max=1000000)
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: * Stop p_lvm_vg0 ( node2 ) due to node availability
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: Calculated transition 604, saving inputs in /var/lib/pacemaker/pengine/pe-input-41.bz2
Oct 28 14:42:57 node2 pacemaker-controld[27057]: notice: Initiating stop operation p_lvm_vg0_stop_0 locally on node2
Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: Deactivating volume group vg0
Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: 0 logical volume(s) in volume group "vg0" now active
Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: LVM Volume vg0 is not available (stopped)
Any help gratefully accepted!
jf