[Pacemaker] Creating a safe cluster-node shutdown script (for when UPS goes OnBattery+LowBattery)

Andrew Beekhof andrew at beekhof.net
Mon Jul 7 20:59:50 EDT 2014


On 4 Jul 2014, at 3:16 pm, Giuseppe Ragusa <giuseppe.ragusa at hotmail.com> wrote:

> Hi all,
> I'm trying to create a script as per subject (on CentOS 6.5, CMAN+Pacemaker, only DRBD+KVM active/passive resources; SNMP-UPS monitored by NUT).
> 
> Ideally I think that each node should stop (disable) all locally-running VirtualDomain resources (doing so cleanly demotes then downs the DRBD resources underneath), then put itself in standby and finally shut down.

Since the end goal is shutdown, why not just run 'pcs cluster stop'?
Possibly with 'pcs cluster standby' first, if you're worried that stopping the resources might take too long.

Pacemaker will stop everything in the required order and stop the node when done... problem solved?
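
A minimal sketch of that simpler approach, in case it helps. The `run` wrapper, the `ups_shutdown` function name and the `DRY_RUN` variable are illustrative assumptions, not part of pcs or NUT; only the three commands themselves come from this thread. It defaults to a dry run that just prints what it would do, so it can be tried safely on a live node.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1 (print the commands, change nothing);
# set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

ups_shutdown() {
    # Optional first step: put the local node in standby.
    # A failure here is ignored on purpose, since the stop below follows anyway.
    run pcs cluster standby || true

    # Let Pacemaker stop the remaining resources in the required order,
    # then stop the cluster stack on this node.
    run pcs cluster stop

    # Power off regardless of how the cluster stop went.
    run /sbin/shutdown -h +0
}

ups_shutdown
```

With the default settings this prints three "would run: ..." lines; something like `DRY_RUN=0 /path/to/script` (path hypothetical) would be what the UPS monitor eventually calls.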

> 
> On the next startup, manual intervention would be required to unstandby all nodes and enable resources (nodes that were already in standby and resources that were already disabled before the blackout would have to be told apart manually).
> 
> Is this strategy conceptually safe?
> 
> Unfortunately, various searches have turned up no "prior art" :)
> 
> This is my tentative script (consider it in the public domain):
> 
> ------------------------------------------------------------------------------------------------------------------------------------
> #!/bin/bash
> 
> # Note: "pcs cluster status" still has a small bug vs. CMAN-controlled Corosync and would always return != 0
> pcs status > /dev/null 2>&1
> STATUS=$?
> 
> # Detect if cluster is running at all on local node
> # TODO: detect node already in standby and bypass this
> if [ "${STATUS}" = 0 ]; then
>     local_node="$(cman_tool status | grep -i 'Node[[:space:]]*name:' | sed -e 's/^.*Node\s*name:\s*\([^[:space:]]*\).*$/\1/i')"
>     for local_resource in $(pcs status 2>/dev/null | grep "ocf::heartbeat:VirtualDomain.*${local_node}\\s*\$" | awk '{print $1}'); do
>         pcs resource disable "${local_resource}"
>     done
>     # TODO: each resource disabling above may return without waiting for complete stop - wait here for "no more resources active"? (but avoid endless loops)
>     pcs cluster standby "${local_node}"
> fi
> 
> # Shut down gracefully anyway at the end
> /sbin/shutdown -h +0
> 
> ------------------------------------------------------------------------------------------------------------------------------------
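
If you do keep the per-resource approach, the TODO about waiting without risking an endless loop could be handled with a bounded polling helper along these lines. This is a rough sketch: `wait_until` and `no_local_vms` are hypothetical names, and the predicate simply reuses the grep pattern from the script above, on the assumption that a stopped resource no longer shows the node name at the end of its line in the `pcs status` output.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Runs COMMAND once per second until it succeeds.
# Returns 0 on success, 1 if TIMEOUT_SECONDS elapse first.
wait_until() {
    timeout="$1"
    shift
    waited=0
    until "$@"; do
        if [ "${waited}" -ge "${timeout}" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Hypothetical predicate: succeeds when no VirtualDomain resource is
# reported on the node given as $1 (same pattern as the script above).
no_local_vms() {
    ! pcs status 2>/dev/null | grep -q "ocf::heartbeat:VirtualDomain.*${1}\\s*\$"
}
```

In the script it could be called right after the disable loop, for example `wait_until 120 no_local_vms "${local_node}" || echo "timed out waiting for VMs to stop" >&2`, where the 120-second limit is an arbitrary placeholder.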
> 
> Comments/suggestions/improvements are more than welcome.
> 
> Many thanks in advance.
> 
> Regards,
> Giuseppe
> 
