[Pacemaker] Configuration help.

Jake Smith jsmith at argotec.com
Wed Feb 15 12:54:00 EST 2012


----- Original Message -----
> From: "James FLatten" <jflatten at iso-ne.com>
> To: pacemaker at oss.clusterlabs.org
> Sent: Wednesday, February 15, 2012 11:35:21 AM
> Subject: [Pacemaker] Configuration help.
> 
> I hope this mess makes sense!  My current setup looks like this:
> 
>      nodea                 nodeb
> +-----------+         +-----------+
> |      eth0 |---------| eth0      |
> |      eth1 |---------| eth1      |
> |      eth2 |--+  +---| eth2      |
> |           |  |  |   |           |
> |           |  +------| ilo       |
> |       ilo |-----+   |           |
> +-----------+         +-----------+
> 
> (bad ascii art!)
> 
> I have cross-connected the HP iLO3 interfaces and set up stonith on
> each node.  This works well except for the scenario of complete loss
> of a node.  I would like to configure the cluster to behave like so:
> 
> Secondary complete failure (Power completely gone)
> 
> nodea(P) --- nodeb(S)
> nodea(P) --- (gone)
> 
> result: nodea remains primary, services are *not* interrupted
> 
> Primary complete failure (Power completely gone)
> 
> nodea(P) --- nodeb(S)
> (gone)   --- nodeb(S)
> -- administrator intervention --
> (gone)   --- nodeb(P)
> 
> result: Services can be restored after the admin confirms the
> primary is indeed gone.
> 
> Is this possible with my setup?

Not 100% sure but I think this would work... though there is likely a more elegant solution someone could provide!

Pretty sure you would have to have stonith-enabled=true for Node A to stay happy if Node B goes MIA (your posted config currently has stonith-enabled="false").
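For reference, and assuming you are using the crm shell that the rest of your config came from, flipping that property would just be something like:

     crm configure property stonith-enabled=true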

Then, to prevent Node B from going primary without intervention, add location constraints that don't allow the services to start/promote on Node B.  The admin intervention would be the removal of those location constraints, allowing the services to start/promote on Node B.  Example:

location l_no_drbd0_primary_on_nodeb drbd0clone rule $role="Master" -inf: #uname eq nodeb

location l_no_drbd1_primary_on_nodeb drbd1clone rule $role="Master" -inf: #uname eq nodeb

Assuming you would want to swap your cluster behavior to the reverse after the "admin intervention", the admin could just edit these two statements and change nodeb to nodea once a nodea failure has been verified.  Since the rest of your services depend on DRBD being master, nothing will start while DRBD stays secondary on the surviving nodeb.
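To make that concrete, the intervention could look something like this from within "crm configure" (just a sketch reusing the constraint names from the example above, not tested):

     delete l_no_drbd0_primary_on_nodeb
     delete l_no_drbd1_primary_on_nodeb
     commit

and, if you then want the same protection in the other direction while nodea is being rebuilt:

     location l_no_drbd0_primary_on_nodea drbd0clone rule $role="Master" -inf: #uname eq nodea
     location l_no_drbd1_primary_on_nodea drbd1clone rule $role="Master" -inf: #uname eq nodea
     commit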

HTH

Jake

> 
> Here is my current config:
> 
> node nodea \
>      attributes standby="off"
> node nodeb \
>      attributes standby="off"
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>      params ip="192.168.1.3" cidr_netmask="32" \
>      op monitor interval="30s"
> primitive datafs ocf:heartbeat:Filesystem \
>      params device="/dev/drbd0" directory="/data" fstype="ext3" \
>      meta target-role="Started"
> primitive drbd0 ocf:linbit:drbd \
>      params drbd_resource="drbd0" \
>      op monitor interval="31s" role="Slave" \
>      op monitor interval="30s" role="Master"
> primitive drbd1 ocf:linbit:drbd \
>      params drbd_resource="drbd1" \
>      op monitor interval="31s" role="Slave" \
>      op monitor interval="30s" role="Master"
> primitive fence-nodea stonith:fence_ipmilan \
>      params pcmk_host_list="nodeb" ipaddr="xxx.xxx.xxx.xxx" \
>      login="xxxxxxx" passwd="xxxxxxxx" lanplus="1" timeout="4" auth="md5" \
>      op monitor interval="60s"
> primitive fence-nodeb stonith:fence_ipmilan \
>      params pcmk_host_list="nodea" ipaddr="xxx.xxx.xxx.xxx" \
>      login="xxxxxxx" passwd="xxxxxxxx" lanplus="1" timeout="4" auth="md5" \
>      op monitor interval="60s"
> primitive httpd ocf:heartbeat:apache \
>      params configfile="/etc/httpd/conf/httpd.conf" \
>      op monitor interval="1min"
> primitive patchfs ocf:heartbeat:Filesystem \
>      params device="/dev/drbd1" directory="/patch" fstype="ext3" \
>      meta target-role="Started"
> group web datafs patchfs ClusterIP httpd
> ms drbd0clone drbd0 \
>      meta master-max="1" master-node-max="1" clone-max="2" \
>      clone-node-max="1" notify="true" target-role="Master"
> ms drbd1clone drbd1 \
>      meta master-max="1" master-node-max="1" clone-max="2" \
>      clone-node-max="1" notify="true" target-role="Master"
> location fence-on-nodea fence-nodea \
>      rule $id="fence-on-nodea-rule" -inf: #uname ne nodea
> location fence-on-nodeb fence-nodeb \
>      rule $id="fence-on-nodeb-rule" -inf: #uname ne nodeb
> colocation datafs-with-drbd0 inf: web drbd0clone:Master
> colocation patchfs-with-drbd1 inf: web drbd1clone:Master
> order datafs-after-drbd0 inf: drbd0clone:promote web:start
> order patchfs-after-drbd1 inf: drbd1clone:promote web:start
> property $id="cib-bootstrap-options" \
>      dc-version="1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558" \
>      cluster-infrastructure="openais" \
>      expected-quorum-votes="2" \
>      stonith-enabled="false" \
>      no-quorum-policy="ignore" \
>      last-lrm-refresh="1328556424"
> rsc_defaults $id="rsc-options" \
>      resource-stickiness="100"
> 
> Is this possible?
> 
> Thank you,
> Davin
> 