[ClusterLabs] volume group won't start in a nested DRBD setup
Andrei Borzenkov
arvidjaar at gmail.com
Tue Oct 29 00:30:16 EDT 2019
On 28.10.2019 22:44, Jean-Francois Malouin wrote:
> Hi,
>
> Is there any new magic that I'm unaware of that needs to be added to a
> pacemaker cluster using a DRBD nested setup? pacemaker 2.0.x and DRBD 8.4.10 on
> Debian/Buster on a 2-node cluster with stonith.
> Eventually this will host a bunch of Xen VMs.
>
> I had this sort of thing running for years with pacemaker 1.x and DRBD 8.4.x
> without a hitch, and now with pacemaker 2.0 and DRBD 8.4.10 it gives me errors
> when trying to start the volume group vg0 on this chain:
>
> (VG)          (LV)          (PV)        (VG)
> vmspace ----> xen_lv0 ----> drbd0 ----> vg0
>
> Only drbd0 and after are managed by pacemaker.
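>
> (For reference, the stack is assembled roughly like this; the size and the
> drbd.conf wiring below are placeholders, not the real values:)
>
> lvcreate -L 100G -n xen_lv0 vmspace      # backing LV carved out of the outer VG
> # resource r0 in the DRBD config points at /dev/vmspace/xen_lv0; once synced:
> drbdadm up r0 && drbdadm primary r0
> pvcreate /dev/drbd0                      # the DRBD device is the only PV...
> vgcreate vg0 /dev/drbd0                  # ...of the inner VG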
>
> Here's what I have configured so far (stonith is configured but is not shown below):
>
> ---
> primitive p_lvm_vg0 ocf:heartbeat:LVM \
> params volgrpname=vg0 \
> op monitor timeout=30s interval=10s \
> op_params interval=10s
>
> primitive resDRBDr0 ocf:linbit:drbd \
> params drbd_resource=r0 \
> op start interval=0 timeout=240s \
> op stop interval=0 timeout=100s \
> op monitor interval=29s role=Master timeout=240s \
> op monitor interval=31s role=Slave timeout=240s \
> meta migration-threshold=3 failure-timeout=120s
>
> ms ms_drbd_r0 resDRBDr0 \
> meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>
> colocation c_lvm_vg0_on_drbd_r0 inf: p_lvm_vg0 ms_drbd_r0:Master
>
> order o_drbd_r0_before_lvm_vg0 Mandatory: ms_drbd_r0:promote p_lvm_vg0:start
> ---
>
> /etc/lvm/lvm.conf has global_filter set to:
> global_filter = [ "a|/dev/drbd.*|", "a|/dev/md.*|", "a|/dev/md/.*|", "r|.*|" ]
>
> But I'm not sure if it's sufficient. I seem to be missing some crucial ingredient.
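>
> (One quick sanity check, in case it helps: a plain LVM query showing which PVs
> and VGs survive the filter; output not reproduced here:)
>
> pvs -o pv_name,vg_name
> vgs -o vg_name,pv_count,lv_count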
>
> syslog on the DC shows the following when trying to start vg0:
>
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: Activating volume group vg0
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: Reading all physical volumes. This may take a while... Found volume group "vmspace" using metadata type lvm2 Found volume group "freespace" using metadata type lvm2 Found volume group "vg0" using metadata type lvm2
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: 0 logical volume(s) in volume group "vg0" now active
The resource agent really just does "vgchange vg0" here. Does it work when
you run it manually?
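
For example (a rough sketch; this assumes r0 is already Primary on that node
and the agent's default "-a y" activation options):

vgchange -a y vg0    # roughly what the agent's start action does
vgs vg0              # is the VG visible through the filter at all?
lvs vg0              # did any LVs actually activate?
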
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: ERROR: LVM Volume vg0 is not available (stopped)
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: ERROR: LVM: vg0 did not activate correctly
> Oct 28 14:42:56 node2 pacemaker-execd[27054]: notice: p_lvm_vg0_start_0:8775:stderr [ Configuration node global/use_lvmetad not found ]
> Oct 28 14:42:56 node2 pacemaker-execd[27054]: notice: p_lvm_vg0_start_0:8775:stderr [ ocf-exit-reason:LVM: vg0 did not activate correctly ]
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Result of start operation for p_lvm_vg0 on node2: 7 (not running)
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: node2-p_lvm_vg0_start_0:77 [ Configuration node global/use_lvmetad not found\nocf-exit-reason:LVM: vg0 did not activate correctly\n ]
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: warning: Action 42 (p_lvm_vg0_start_0) on node2 failed (target: 0 vs. rc: 7): Error
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Transition 602 aborted by operation p_lvm_vg0_start_0 'modify' on node2: Event failed
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Transition 602 (Complete=28, Pending=0, Fired=0, Skipped=0, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-39.bz2): Complete
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: On loss of quorum: Ignore
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node1: not running
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node1 after 1000000 failures (max=1000000)
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: * Recover p_lvm_vg0 ( node2 )
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: Calculated transition 603, saving inputs in /var/lib/pacemaker/pengine/pe-input-40.bz2
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: On loss of quorum: Ignore
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node1: not running
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node2 after 1000000 failures (max=1000000)
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node1 after 1000000 failures (max=1000000)
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: * Stop p_lvm_vg0 ( node2 ) due to node availability
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: Calculated transition 604, saving inputs in /var/lib/pacemaker/pengine/pe-input-41.bz2
> Oct 28 14:42:57 node2 pacemaker-controld[27057]: notice: Initiating stop operation p_lvm_vg0_stop_0 locally on node2
> Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: Deactivating volume group vg0
> Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: 0 logical volume(s) in volume group "vg0" now active
> Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: LVM Volume vg0 is not available (stopped)
>
> Any help gratefully accepted!
> jf