[ClusterLabs] DRBD: both nodes stuck in secondary mode

Tue Jun 14 14:30:55 UTC 2016

On 06/13/2016 10:13 PM, Kevin THIERRY wrote:
> Hello,
> 
> I've been trying to setup pacemaker to get a HA system for a web
> application but I am having a hard time with DRBD. I have two nodes and
> drbd is configured in single primary mode. The issue I am facing is that
> I always end up with the two nodes being secondary/slaves, no node gets
> promoted to primary/master. I don't have that issue when using drbd by
> itself, without pacemaker.
> 
> I mainly followed this documentation:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html
> 
> 
> Note that I am running on an up-to-date CentOS 7.2.1511 with all
> packages coming from the repos (some from epel):
> drbd 8.9.5
> pacemaker 1.1.13
> pcs 0.9.143
> corosync 2.3.4
> 
> ########################################################################
> Here is the configuration I used:
> ########################################################################
> 
> pcs cluster cib webapp_cfg
> 
> # VIP
> pcs -f webapp_cfg resource create vip ocf:heartbeat:IPaddr2 \
>     ip=10.5.200.30 cidr_netmask=24 op monitor interval=30s
> pcs -f webapp_cfg constraint location vip prefers billing-primary=INFINITY
> pcs -f webapp_cfg resource move vip billing-primary-sync

The above two commands have the same effect -- setting a score of
INFINITY on the specified node. Since these are the only nodes and you
have a symmetric cluster (where any node can run any resource by
default), the constraints have no effect (both nodes have the same score).

If you want to prefer one node over the other, use the "prefers" command
with a numeric score (not INFINITY), and don't do the move. (By the way,
the move stays in effect until you run "pcs resource clear", so you
should run the clear once to remove it.)
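
As a rough sketch of what I mean, assuming billing-primary-sync (the
node name shown in your "pcs status" output) is the node you want to
prefer, and with 50 as an arbitrary example score:

```shell
# Prefer one node with a finite score; 50 is only an example value
pcs -f webapp_cfg constraint location vip prefers billing-primary-sync=50

# The earlier "move" has already been pushed, so remove the constraint
# it left behind from the live cluster
pcs resource clear vip
```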

> # DRBD
> pcs -f webapp_cfg resource create drbd ocf:linbit:drbd \
>     drbd_resource=drbd0 \
>     op monitor interval=60s

I see we need to update the documentation to include separate monitors
on the DRBD master and slave instances. Ideally, the monitor part above
would be "op monitor interval=60s role=Slave op monitor interval=59s
role=Master" (the intervals are slightly different because that's how
pacemaker distinguishes different operations of the same type).
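
Written out in full, the create command would look something like this
(everything except the two monitor operations is copied from your
configuration; treat it as a sketch, not tested on your setup):

```shell
# One monitor per role; the intervals must differ so pacemaker can
# tell the two operations apart
pcs -f webapp_cfg resource create drbd ocf:linbit:drbd \
    drbd_resource=drbd0 \
    op monitor interval=60s role=Slave \
    op monitor interval=59s role=Master
```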

> pcs -f webapp_cfg resource master drbd-master drbd \
>     master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
> pcs -f webapp_cfg constraint location drbd-master prefers
> billing-primary=INFINITY

It's fine to use INFINITY here, but I'd personally prefer a numeric
score. INFINITY is intended for when something is logically mandatory;
numeric scores are intended for preferences that can be overridden by
other conditions. In practice, INFINITY is just a large numeric value,
so they are treated similarly, but following the logical meaning helps
make the configuration more readable.

> pcs -f webapp_cfg constraint colocation add drbd-master with vip
> INFINITY with-rsc-role=Master

The above says to colocate drbd-master with the VIP's master instance.
However, the VIP is not a master/slave resource, so this won't have an
effect.

To configure a mandatory colocation of the master instance of
drbd-master with VIP, do:

pcs -f webapp_cfg constraint colocation add master drbd-master with vip

> pcs -f webapp_cfg constraint order promote vip then promote drbd-master

Similarly, the above has no effect because vip can't be promoted (it
isn't a master/slave resource). The first action should be "start vip"
instead of "promote vip".
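
A sketch of the corrected ordering constraint, using the resource names
from your configuration:

```shell
# Start the VIP first, then promote the DRBD master instance
pcs -f webapp_cfg constraint order start vip then promote drbd-master
```

Once the new configuration has been pushed, "crm_simulate -sL" run on
the live cluster prints the scores pacemaker computes for each node,
which can help show why a promotion is or isn't happening.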

> pcs cluster cib-push webapp_cfg
> 
> ########################################################################
> Status
> ########################################################################
> 
> # pcs status
> Cluster name: billing-cluster
> Last updated: Tue Jun 14 08:59:57 2016        Last change: Tue Jun 14
> 08:46:41 2016 by root via cibadmin on billing-primary-sync
> Stack: corosync
> Current DC: billing-primary-sync (version 1.1.13-10.el7_2.2-44eb2dd) -
> partition with quorum
> 2 nodes and 3 resources configured
> 
> Online: [ billing-backup-sync billing-primary-sync ]
> 
> Full list of resources:
> 
>  vip    (ocf::heartbeat:IPaddr2):    Started billing-primary-sync
>  Master/Slave Set: drbd-master [drbd]
>      Slaves: [ billing-backup-sync billing-primary-sync ]
> 
> PCSD Status:
>   billing-primary-sync: Online
>   billing-backup-sync: Online
> 
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled
> 
> ########################################################################
> Constraints
> ########################################################################
> 
> # pcs constraint
> Location Constraints:
>   Resource: drbd-master
>     Enabled on: billing-primary (score:INFINITY)
>     Enabled on: billing-primary-sync (score:INFINITY)
>   Resource: vip
>     Enabled on: billing-primary (score:INFINITY)
>     Enabled on: billing-primary-sync (score:INFINITY) (role: Started)
> Ordering Constraints:
>   promote vip then promote drbd-master (kind:Mandatory)
> Colocation Constraints:
>   drbd-master with vip (score:INFINITY) (with-rsc-role:Master)
> 
> ########################################################################
> 
> I found some logs related to the issue in /var/log/messages:
> 
> error: drbd:1 and vip are both allocated but to different nodes:
> billing-backup-sync vs. billing-primary-sync
> notice: Promote drbd:0#011(Slave -> Master billing-primary-sync - blocked)
> 
> Full logs:
> https://zerobin.net/?de9ecb3b7052f6dd#gc1f4j9Z+NqaiYYowA9xiOLCDAMJqmfW1uQpr0d2VpA=
> 
> 
> I didn't see anything suspect in other log files but since it is the
> first time I use pacemaker maybe I missed something so if you need other
> logs, please ask me and I will provide them.
> 
> If you have any idea about what can be wrong with my configuration, I
> would love to hear it :)
> Thanks
> 
> Best regards,
>