[ClusterLabs] big trouble with a DRBD resource

Lentes, Bernd bernd.lentes at helmholtz-muenchen.de
Mon Aug 7 15:16:47 EDT 2017

----- On Aug 4, 2017, at 10:19 PM, kgaillot kgaillot at redhat.com wrote:

> The cluster reacted promptly:
> crm(live)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd params drbd_resource=idcc-devel \
>    > op monitor interval=60
> WARNING: prim_drbd_idcc_devel: default timeout 20s for start is smaller than the advised 240
> WARNING: prim_drbd_idcc_devel: default timeout 20s for stop is smaller than the advised 100
> WARNING: prim_drbd_idcc_devel: action monitor not advertised in meta-data, it may not be supported by the RA

> Why is it complaining about a missing clone-max? This is a meta attribute for a clone, not for a simple resource!?
> This message is constantly repeated; it still appears although the cluster has been in standby for three days.

> The "ERROR" message is coming from the DRBD resource agent itself, not
> pacemaker. Between that message and the two separate monitor operations,
> it looks like the agent will only run as a master/slave clone.

This message concerning clone-max still appears once per minute in syslog, although both nodes have been in standby for days and the drbd resource is unmanaged too.
With stat I checked that the RA is called once per minute. With strace I found out that it is lrmd which calls the RA with the action "monitor".
But why is it still checking? I thought "standby" for the nodes means the cluster no longer cares about the resources, and "unmanaged" means the same for an individual resource.
So this should be a doubled "don't care anymore about drbd".
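
For anyone who wants to repeat the check, here is a rough sketch of the two steps described above. The RA path is an assumption (the usual OCF location), and the helper names ra_atime and trace_lrmd_execs are made up for illustration. Whether the access time actually moves on every run also depends on mount options such as relatime or noatime.

```shell
#!/bin/sh
# Assumed install location of the DRBD resource agent; adjust for your system.
RA_PATH="${RA_PATH:-/usr/lib/ocf/resource.d/linbit/drbd}"

# Print the last access time of a file (GNU stat format %x).
# Calling this a few minutes apart shows whether something keeps
# reading the script.
ra_atime() {
    stat -c '%x' "$1"
}

# Attach to the running lrmd and log every program that it or one of
# its children executes. Needs root; stop with Ctrl-C.
# This function is only defined here, it is not called automatically.
trace_lrmd_execs() {
    strace -f -e trace=execve -p "$(pidof lrmd)"
}
```

Usage would be something like `ra_atime "$RA_PATH"` repeated over a few minutes, then `trace_lrmd_execs` as root to see which process starts the agent and with which arguments.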

crm(live)# status
Last updated: Mon Aug  7 16:05:13 2017
Last change: Tue Aug  1 18:54:02 2017 by root via cibadmin on ha-idg-2
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
18 Resources configured

Node ha-idg-1: standby
Node ha-idg-2: standby

 prim_drbd_idcc_devel   (ocf::linbit:drbd):     FAILED (unmanaged)[ ha-idg-1 ha-idg-2 ]

What is interesting: although crm says "action monitor not advertised in meta-data, it may not be supported by the RA", the RA does handle it:


case $__OCF_ACTION in


And I can't find the "monitor_Slave" and "monitor_Master" mentioned by "crm ra info ocf:linbit:drbd" anywhere in the RA. Strange.
Doesn't "crm ra info ocf:linbit:drbd" retrieve its information from the RA?

In the RA I just find:
<action name="monitor" depth="0"  timeout="20" interval="20" role="Slave" />
<action name="monitor" depth="0"  timeout="20" interval="10" role="Master" />

This is what "crm ra info ocf:linbit:drbd" says:
Operations' defaults (advisory minimum):

    monitor_Slave timeout=20 interval=20
    monitor_Master timeout=20 interval=10
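
The two listings line up if the tool simply joins the action name and the role with an underscore. That is only a guess at how "crm ra info" formats its output, not something confirmed here. A small sketch of such a mapping, with a hypothetical helper advisory_ops that assumes the attribute order of the metadata lines quoted above:

```shell
#!/bin/sh
# Hypothetical helper: read <action .../> metadata lines on stdin and
# print "name_role timeout=... interval=..." for each line that has a role.
# Assumes the attributes appear in the order name, timeout, interval, role.
advisory_ops() {
    sed -n 's/.*name="\([^"]*\)".*timeout="\([^"]*\)".*interval="\([^"]*\)".*role="\([^"]*\)".*/\1_\4 timeout=\2 interval=\3/p'
}

# Feed it the two action lines found in the RA.
advisory_ops <<'EOF'
<action name="monitor" depth="0"  timeout="20" interval="20" role="Slave" />
<action name="monitor" depth="0"  timeout="20" interval="10" role="Master" />
EOF
```

If that guess is right, it would also fit the earlier warning: a lookup for a plain "monitor" finds nothing because only the role-specific monitor actions are advertised.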


