[ClusterLabs] big trouble with a DRBD resource

Tue Aug 8 09:36:47 EDT 2017

On Fri, Aug 04, 2017 at 06:20:22PM +0200, Lentes, Bernd wrote:
> Hi,
> 
> first: is there a tutorial or something else that helps in understanding what pacemaker logs in syslog and /var/log/cluster/corosync.log?
> I try hard to find out what's going wrong, but the logs are difficult to understand, also because of the amount of information.
> Or should I deal more with "crm history" or hb_report?
> 
> What happened:
> I tried to configure a simple drbd resource following http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html#idm140457860751296
> I used this simple snip from the doc:
> configure primitive WebData ocf:linbit:drbd params drbd_resource=wwwdata \
>     op monitor interval=60s
> 
> I did it on a live cluster, which is currently in testing. I will never do this again. Shadow will be my friend.
> 
> The cluster reacted promptly:
> crm(live)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd params drbd_resource=idcc-devel \
>    > op monitor interval=60
> WARNING: prim_drbd_idcc_devel: default timeout 20s for start is smaller than the advised 240
> WARNING: prim_drbd_idcc_devel: default timeout 20s for stop is smaller than the advised 100
> WARNING: prim_drbd_idcc_devel: action monitor not advertised in meta-data, it may not be supported by the RA

crm shell in "auto-commit"?
never seen that.

Are you sure you did not forget this necessary piece?
ms WebDataClone WebData \
    meta master-max="1" master-node-max="1" clone-max="2" \
    clone-node-max="1" notify="true"
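
Adapted to your resource name, a rough sketch could look like the
following. The start/stop timeouts are the values the shell advised in
the warnings above; the ms name and the two role-specific monitor
intervals are my assumptions, not something taken from your setup:

```
# sketch only: "ms_drbd_idcc_devel" and the monitor intervals are assumed
primitive prim_drbd_idcc_devel ocf:linbit:drbd \
    params drbd_resource=idcc-devel \
    op start timeout=240 \
    op stop timeout=100 \
    op monitor interval=29 role=Master \
    op monitor interval=31 role=Slave
ms ms_drbd_idcc_devel prim_drbd_idcc_devel \
    meta master-max="1" master-node-max="1" clone-max="2" \
    clone-node-max="1" notify="true"
```

The two monitor operations use different intervals so that they are
distinct operations, one per role.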

Anyway: you somehow managed to configure DRBD as a primitive only,
and it does not like that.

If you ever are stuck in a situation like that,
I suggest you put your cluster in "maintenance mode",
then fix up your configuration
(remove the primitive, or add the ms definition),
do cleanups for "everything",
simulate the "maintenance mode off",
and if that looks plausible, commit the maintenance mode off.
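
As a rough sketch, that sequence could look like this in crm shell.
The resource name is a placeholder, and the exact subcommands and
prompts may differ between crmsh versions, so treat it as an outline
rather than a recipe:

```
# 1. tell pacemaker to stop acting on resources
crm configure property maintenance-mode=true

# 2. fix the configuration in your editor
#    (remove the primitive, or add the ms definition)
crm configure edit

# 3. clean up; repeat for every resource ("<resource>" is a placeholder)
crm resource cleanup <resource>

# 4. stage "maintenance mode off", simulate, commit only if plausible
crm configure
crm(live)configure# property maintenance-mode=false
crm(live)configure# simulate nograph actions
crm(live)configure# commit
```

In the interactive configure level, the property change is only staged
until "commit", which is what makes the simulate step possible.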

Also, even though that has nothing to do with your issue there:
just because you *can* do dual-primary DRBD + GFS2 does not mean that it
is a good idea. That "Clusters from Scratch" is a proof of concept,
NOT a "best practices how to set up a web server on pacemaker and DRBD".

If you don't have a *very* good reason to use a cluster file
system, for things like web servers, mail servers, file servers,
...  most services actually, a "classic" file system such as xfs or
ext4 in failover configuration will usually easily outperform a
two-node GFS2 setup, while being less complex at the same time.
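
A minimal sketch of such a failover setup in crm shell, assuming a
single-primary ms resource named ms_drbd_idcc_devel that wraps your
DRBD primitive. That name, the device, the mount point, and the
constraint names are all hypothetical; adjust them to your setup:

```
# sketch only: device, directory and all resource names are assumed
primitive prim_fs_idcc_devel ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/srv/idcc-devel" fstype="xfs"
# mount the file system only where DRBD is Master ...
colocation col_fs_with_drbd_master inf: \
    prim_fs_idcc_devel ms_drbd_idcc_devel:Master
# ... and only after DRBD has been promoted there
order ord_drbd_promote_before_fs inf: \
    ms_drbd_idcc_devel:promote prim_fs_idcc_devel:start
```

Only one node mounts the file system at any time, which is why no
cluster file system is needed here.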

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT