[ClusterLabs] Cluster is not promoting DRBD resource to master

Sun Oct 1 12:43:55 EDT 2017

On Fri, 2017-09-29 at 11:36 +0300, Octavian Ciobanu wrote:
> Hello all.
> I've encountered another strange behavior after updating to CentOS
> 7.4. The cluster no longer promotes the DRBD resource to Master on
> either node; instead, both nodes are stuck in Slave.
> 
> The configuration is based on 2 nodes running DRBD as a cluster
> resource. The resource is configured as Master/Slave.
> 
> The resource is configured with the following commands:
> pcs resource create Storage-DRBD_Resource ocf:linbit:drbd
> drbd_resource="ClusterDisk" op start interval="0" timeout="240" op
> stop interval="0" timeout="120" op monitor role="Master"
> interval="10" timeout="20" op monitor role="Slave" interval="20"
> timeout="20"
> pcs resource master Storage-DRBD Storage-DRBD_Resource master-max="1" 
> master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> 
> If I run drbdadm status, it reports both resources as UpToDate. If I
> stop the cluster and start the resource manually, it starts with no
> errors, and if I promote one node to Primary it is promoted without
> any fuss. So the issue, from what I see, is not with DRBD.
> 
> Any hints on how to make it work are welcome.

Whichever node was the DC (designated controller) at the time will have
more "pengine:" messages in its logs. Those messages include a list of
actions to take (or not). In this case, you'll see that it decided to
"Leave" the resource in Slave mode on both nodes. At the end of that
list, it logs a PE file name (pe-input/pe-warn/pe-error).
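
To avoid scrolling through the log by hand, something like the sketch
below could pull out the most recent PE file name. It assumes the name
appears in the log as a whitespace-free path ending in pe-input-N,
pe-warn-N or pe-error-N (optionally with a .bz2 suffix); last_pe_file
is a made-up helper name, and the log file to pass in depends on how
logging is set up on your nodes.

```shell
#!/bin/sh
# Print the most recent PE file path mentioned in a Pacemaker log.
# Assumption: the path contains no whitespace and ends in
# pe-input-N, pe-warn-N or pe-error-N, optionally followed by .bz2.
last_pe_file() {
    # $1: log file to search (whichever log the DC writes to)
    grep -Eo '[^[:space:]]*pe-(input|warn|error)-[0-9]+(\.bz2)?' "$1" \
        | tail -n 1   # keep only the last (newest) match
}
```

Run it on the DC against the log that contains the "pengine:"
messages; it prints nothing if no PE file name is found.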

If you grab that file, you can run "crm_simulate -Sx $FILENAME" to get
more information about the cluster's decision-making. If you use "-Ssx",
it will also show the scores behind each decision. It's not very
user-friendly though; if you can't make sense of it, attach the PE file
here.
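
Since the question here is why nothing gets promoted, one way to cut
down the "-Ssx" output is to keep only the promotion-score lines. This
is a rough sketch: promotion_scores is a made-up helper name, and it
assumes the score lines contain the literal text "promotion score",
which may differ between Pacemaker versions.

```shell
#!/bin/sh
# Keep only the promotion-score lines from crm_simulate output on stdin.
# Assumption: those lines contain the literal text "promotion score".
promotion_scores() {
    grep -F 'promotion score'
}

# Example use, with $FILENAME being the PE file from the DC's log:
#   crm_simulate -Ssx "$FILENAME" | promotion_scores
```

If every instance shows a negative score, or no score lines appear at
all, that would be consistent with both nodes being left in Slave.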