[ClusterLabs] volume group won't start in a nested DRBD setup
Andrei Borzenkov
arvidjaar at gmail.com
Tue Oct 29 00:30:16 EDT 2019
On 28.10.2019 22:44, Jean-Francois Malouin wrote:
> Hi,
>
> Is there any new magic that I'm unaware of that needs to be added to a
> pacemaker cluster using a DRBD nested setup? pacemaker 2.0.x and DRBD 8.4.10 on
> Debian/Buster on a 2-node cluster with stonith.
> Eventually this will host a bunch of Xen VMs.
>
> I had this sort of thing running for years with pacemaker 1.x and DRBD 8.4.x
> without a hitch, and now with pacemaker 2.0 and DRBD 8.4.10 it gives me errors
> when trying to start the volume group vg0 on this chain:
>
> (VG)          (LV)          (PV)        (VG)
> vmspace ----> xen_lv0 ----> drbd0 ----> vg0
>
> Only drbd0 and after are managed by pacemaker.
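>
> (For reference, the stack is assembled roughly like this; the size and the
> drbd.conf wiring below are placeholders, not the real values:)
>
> lvcreate -L 100G -n xen_lv0 vmspace      # backing LV carved out of the outer VG
> # resource r0 in the DRBD config points at /dev/vmspace/xen_lv0; once synced:
> drbdadm up r0 && drbdadm primary r0
> pvcreate /dev/drbd0                      # the DRBD device is the only PV...
> vgcreate vg0 /dev/drbd0                  # ...of the inner VG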
>
> Here's what I have configured so far (stonith is configured but is not shown below):
>
> ---
> primitive p_lvm_vg0 ocf:heartbeat:LVM \
> params volgrpname=vg0 \
> op monitor timeout=30s interval=10s \
> op_params interval=10s
>
> primitive resDRBDr0 ocf:linbit:drbd \
> params drbd_resource=r0 \
> op start interval=0 timeout=240s \
> op stop interval=0 timeout=100s \
> op monitor interval=29s role=Master timeout=240s \
> op monitor interval=31s role=Slave timeout=240s \
> meta migration-threshold=3 failure-timeout=120s
>
> ms ms_drbd_r0 resDRBDr0 \
> meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>
> colocation c_lvm_vg0_on_drbd_r0 inf: p_lvm_vg0 ms_drbd_r0:Master
>
> order o_drbd_r0_before_lvm_vg0 Mandatory: ms_drbd_r0:promote p_lvm_vg0:start
> ---
>
> /etc/lvm/lvm.conf has global_filter set to:
> global_filter = [ "a|/dev/drbd.*|", "a|/dev/md.*|", "a|/dev/md/.*|", "r|.*|" ]
>
> But I'm not sure if it's sufficient. I seem to be missing some crucial ingredient.
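>
> (One quick sanity check, in case it helps: a plain LVM query showing which PVs
> and VGs survive the filter; output not reproduced here:)
>
> pvs -o pv_name,vg_name
> vgs -o vg_name,pv_count,lv_count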
>
> syslog on the DC shows the following when trying to start vg0:
>
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: Activating volume group vg0
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: Reading all physical volumes. This may take a while... Found volume group "vmspace" using metadata type lvm2 Found volume group "freespace" using metadata type lvm2 Found volume group "vg0" using metadata type lvm2
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: 0 logical volume(s) in volume group "vg0" now active
The resource agent really just does "vgchange vg0" here. Does it work when
you run it manually?
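
For example (a rough sketch; this assumes r0 is already Primary on that node
and the agent's default "-a y" activation options):

vgchange -a y vg0    # roughly what the agent's start action does
vgs vg0              # is the VG visible through the filter at all?
lvs vg0              # did any LVs actually activate?
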
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: ERROR: LVM Volume vg0 is not available (stopped)
> Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: ERROR: LVM: vg0 did not activate correctly
> Oct 28 14:42:56 node2 pacemaker-execd[27054]: notice: p_lvm_vg0_start_0:8775:stderr [ Configuration node global/use_lvmetad not found ]
> Oct 28 14:42:56 node2 pacemaker-execd[27054]: notice: p_lvm_vg0_start_0:8775:stderr [ ocf-exit-reason:LVM: vg0 did not activate correctly ]
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Result of start operation for p_lvm_vg0 on node2: 7 (not running)
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: node2-p_lvm_vg0_start_0:77 [ Configuration node global/use_lvmetad not found\nocf-exit-reason:LVM: vg0 did not activate correctly\n ]
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: warning: Action 42 (p_lvm_vg0_start_0) on node2 failed (target: 0 vs. rc: 7): Error
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Transition 602 aborted by operation p_lvm_vg0_start_0 'modify' on node2: Event failed
> Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Transition 602 (Complete=28, Pending=0, Fired=0, Skipped=0, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-39.bz2): Complete
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: On loss of quorum: Ignore
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node1: not running
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node1 after 1000000 failures (max=1000000)
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: * Recover p_lvm_vg0 ( node2 )
> Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: Calculated transition 603, saving inputs in /var/lib/pacemaker/pengine/pe-input-40.bz2
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: On loss of quorum: Ignore
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node1: not running
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node2 after 1000000 failures (max=1000000)
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node1 after 1000000 failures (max=1000000)
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: * Stop p_lvm_vg0 ( node2 ) due to node availability
> Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: Calculated transition 604, saving inputs in /var/lib/pacemaker/pengine/pe-input-41.bz2
> Oct 28 14:42:57 node2 pacemaker-controld[27057]: notice: Initiating stop operation p_lvm_vg0_stop_0 locally on node2
> Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: Deactivating volume group vg0
> Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: 0 logical volume(s) in volume group "vg0" now active
> Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: LVM Volume vg0 is not available (stopped)
>
> Any help gratefully accepted!
> jf