[Pacemaker] Avoid one node from being a target for resources migration

Tue Jan 13 03:17:23 EST 2015

Dmitry Koterov <dmitry.koterov at gmail.com> wrote:
>Hello.
>
>I have a 3-node cluster managed by corosync+pacemaker+crm. Node1 and Node2
>are DRBD master-slave, and they also have a number of other services
>installed (postgresql, nginx, ...). Node3 is just a corosync node (for
>quorum); no DRBD/postgresql/... are installed on it, only
>corosync+pacemaker.

A quorum node can run corosync alone, without pacemaker. It won't show up in crm_mon, but it will still count toward quorum (at least in corosync 2).

>But when I add resources to the cluster, some of them are somehow moved
>to node3 and then fail. Note that I have a "colocation" directive to
>place these resources on the DRBD master only and a "location" with -inf
>for node3, but this does not help. Why? How do I make pacemaker not run
>anything on node3?
>
>All the resources are added in a single transaction: "cat config.txt |
>crm -w -f- configure", where config.txt contains directives and a "commit"
>statement at the end.
>
>Below are the "crm status" (error messages) and "crm configure show"
>outputs.
>
>
>*root@node3:~# crm status*
>Current DC: node2 (1017525950) - partition with quorum
>3 Nodes configured
>6 Resources configured
>Online: [ node1 node2 node3 ]
>Master/Slave Set: ms_drbd [drbd]
>     Masters: [ node1 ]
>     Slaves: [ node2 ]
>Resource Group: server
>     fs (ocf::heartbeat:Filesystem): Started node1
>     postgresql (lsb:postgresql): Started node3 FAILED
>     bind9 (lsb:bind9): Started node3 FAILED
>     nginx (lsb:nginx): Started node3 (unmanaged) FAILED
>Failed actions:
>    drbd_monitor_0 (node=node3, call=744, rc=5, status=complete, last-rc-change=Mon Jan 12 11:16:43 2015, queued=2ms, exec=0ms): not installed
>    postgresql_monitor_0 (node=node3, call=753, rc=1, status=complete, last-rc-change=Mon Jan 12 11:16:43 2015, queued=8ms, exec=0ms): unknown error
>    bind9_monitor_0 (node=node3, call=757, rc=1, status=complete, last-rc-change=Mon Jan 12 11:16:43 2015, queued=11ms, exec=0ms): unknown error
>    nginx_stop_0 (node=node3, call=767, rc=5, status=complete, last-rc-change=Mon Jan 12 11:16:44 2015, queued=1ms, exec=0ms): not installed
>
>
>*root@node3:~# crm configure show | cat*
>node $id="1017525950" node2
>node $id="13071578" node3
>node $id="1760315215" node1
>primitive drbd ocf:linbit:drbd \
>params drbd_resource="vlv" \
>op start interval="0" timeout="240" \
>op stop interval="0" timeout="120"
>primitive fs ocf:heartbeat:Filesystem \
>params device="/dev/drbd0" directory="/var/lib/vlv.drbd/root" options="noatime,nodiratime" fstype="xfs" \
>op start interval="0" timeout="300" \
>op stop interval="0" timeout="300"
>primitive postgresql lsb:postgresql \
>op monitor interval="10" timeout="60" \
>op start interval="0" timeout="60" \
>op stop interval="0" timeout="60"
>primitive bind9 lsb:bind9 \
>op monitor interval="10" timeout="60" \
>op start interval="0" timeout="60" \
>op stop interval="0" timeout="60"
>primitive nginx lsb:nginx \
>op monitor interval="10" timeout="60" \
>op start interval="0" timeout="60" \
>op stop interval="0" timeout="60"
>group server fs postgresql bind9 nginx
>ms ms_drbd drbd meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>location loc_server server rule $id="loc_server-rule" -inf: #uname eq node3
>colocation col_server inf: server ms_drbd:Master
>order ord_server inf: ms_drbd:promote server:start
>property $id="cib-bootstrap-options" \
>stonith-enabled="false" \
>last-lrm-refresh="1421079189" \
>maintenance-mode="false"

It looks like you have a symmetric cluster. In that mode pacemaker checks every node to see whether a resource could be running there, even a node excluded by a -inf location constraint. The failed monitor_0 operations on node3 appear to be exactly those checks.
You want something like an opt-in cluster: http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ch06s02s02.html (or run only corosync on that node).
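
A minimal sketch of that opt-in approach in crm configure syntax, assuming the resource and node names from the configuration quoted above. The constraint ids (loc_drbd_node1 and so on) and the score of 100 are illustrative choices, not something taken from the original setup:

```
# Assumption: switch the cluster to opt-in mode, so a resource may run
# only on nodes where it has an explicit positive location score.
property symmetric-cluster="false"
# Hypothetical constraint ids; the score 100 is arbitrary.
location loc_drbd_node1 ms_drbd 100: node1
location loc_drbd_node2 ms_drbd 100: node2
location loc_server_node1 server 100: node1
location loc_server_node2 server 100: node2
```

Since node3 gets no score here, nothing should be placed on it, and the existing loc_server rule would become redundant. Whether pacemaker still checks node3 for running resources may depend on the pacemaker version, so treat this as a starting point rather than a guaranteed fix.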

