[ClusterLabs] Help needed getting DRBD cluster working

Ken Gaillot kgaillot at redhat.com
Mon Oct 5 14:05:29 UTC 2015


On 10/05/2015 08:09 AM, Gordon Ross wrote:
> I’m trying to set up a simple DRBD cluster on Ubuntu 14.04 LTS using Pacemaker & Corosync. My problem is getting the resource to start.
> 
> I’ve set up the DRBD side fine. Checking /proc/drbd, I can see that my test DRBD device is fully synced and OK.
> 
> Following the examples from the “Clusters From Scratch” document, I built the following cluster configuration:
> 
> property \
> 	stonith-enabled="false" \
> 	no-quorum-policy="stop" \
> 	symmetric-cluster="false"
> node ct1
> node ct2
> node ct3 attributes standby="on"
> primitive drbd_disc0 ocf:linbit:drbd \
> 	params drbd_resource="disc0"
> primitive drbd_disc0_fs ocf:heartbeat:Filesystem \
> 	params fstype="ext4" device="/dev/drbd0" directory="/replicated/disc0"
> ms ms_drbd0 drbd_disc0 \
> 	meta master-max="1" master-node-max="1" clone-max="2" clone-node-max=“1” \
>    notify="true" target-role="Master"
> colocation filesystem_with_disc inf: drbd_disc0_fs ms_drbd0:Master
> 
> ct1 & ct2 are the main DRBD servers, with ct3 being a witness server to avoid split-brain problems.
> 
> When I look at the cluster status, I get:
> 
> crm(live)# status
> Last updated: Mon Oct  5 14:04:12 2015
> Last change: Thu Oct  1 17:31:35 2015 via cibadmin on ct2
> Current DC: ct2 (739377523) - partition with quorum
> 3 Nodes configured
> 3 Resources configured
> 
> 
> Node ct3 (739377524): standby
> Online: [ ct1 ct2 ]
> 
> 
> Failed actions:
>     drbd_disc0_monitor_0 (node=ct1, call=5, rc=6, status=complete, last-rc-change=Thu Oct  1 16:42:11 2015, queued=60ms, exec=0ms): not configured
>     drbd_disc0_monitor_0 (node=ct2, call=5, rc=6, status=complete, last-rc-change=Thu Oct  1 16:17:17 2015, queued=67ms, exec=0ms): not configured
>     drbd_disc0_monitor_0 (node=ct3, call=5, rc=6, status=complete, last-rc-change=Thu Oct  1 16:42:10 2015, queued=54ms, exec=0ms): not configured
> 
> What have I done wrong?

The "rc=6" in the failed actions is the OCF return code for "not
configured" (OCF_ERR_CONFIGURED). It means the resource's Pacemaker
configuration is invalid. For the full list of OCF return codes, see
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-ocf-return-codes
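
For quick reference when reading status output, the standard OCF return
codes can be mapped to their names with a small shell helper. This is
only a sketch: the function name ocf_rc_name is made up for illustration
and is not part of Pacemaker, and the documentation linked above remains
the authority on what each code means.

```shell
# Hypothetical helper (not part of Pacemaker): print the OCF name
# for a numeric resource agent return code.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        8) echo "OCF_RUNNING_MASTER" ;;
        9) echo "OCF_FAILED_MASTER" ;;
        *) echo "unknown" ;;
    esac
}

ocf_rc_name 6   # prints OCF_ERR_CONFIGURED
```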

The "_monitor_0" means that this was the initial probe that Pacemaker
runs before trying to start the resource, to make sure it's not already
running. As an aside, you probably want to add recurring monitors as
well; otherwise Pacemaker won't notice if the resource fails. For
example, added to the drbd_disc0 primitive definition:

    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"

As to why the probe is failing, it's hard to tell from the status output
alone. Double-check your configuration to make sure that disc0 is the
exact DRBD resource name, that Pacemaker can read the DRBD configuration
file, and so on. You can also try running the DRBD resource agent's
"status" command manually to see if it prints a more detailed error
message.
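
A manual run of the agent might look something like the sketch below.
Several things here are assumptions rather than facts about your system:
the helper name run_drbd_agent is made up, /usr/lib/ocf is only the
usual OCF root, and the clone-max environment variable is set by hand on
the assumption that the agent expects the meta attributes Pacemaker
would normally pass for a master/slave resource.

```shell
# Hypothetical helper: run one action of the linbit DRBD agent by hand.
# Usage: run_drbd_agent [action] [drbd_resource]
run_drbd_agent() {
    action="${1:-status}"
    res="${2:-disc0}"
    root="${OCF_ROOT:-/usr/lib/ocf}"      # assumed default OCF root
    agent="$root/resource.d/linbit/drbd"

    if [ ! -x "$agent" ]; then
        echo "agent not found or not executable: $agent" >&2
        return 5                          # mirrors OCF_ERR_INSTALLED
    fi

    # Resource parameters reach an agent as OCF_RESKEY_<name>.
    # The CRM_meta value imitates clone-max="2" from the ms resource
    # (an assumption about what the agent checks).
    OCF_ROOT="$root" \
    OCF_RESKEY_drbd_resource="$res" \
    OCF_RESKEY_CRM_meta_clone_max="2" \
    "$agent" "$action"
}
```

Run as root on one of the DRBD nodes, for example
run_drbd_agent status disc0; echo "rc=$?", the agent's own messages and
the exit code should give more detail than the one-line summary in the
cluster status.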



