[Pacemaker] staggered resource startup

Matthew O'Connor matt at ecsorl.com
Tue Aug 27 10:20:52 EDT 2013


I have a server that operates about 30 virtual machines.  Normally it
handles this load very well, but restart can be a bit dicey.  I have
found that by staggering the vm startups - currently done manually - the
system handles the growing load much more gracefully.  The sequence goes
something like this:
1. node reboots
2. pacemaker (and related) is started
3. immediately, all vm resources are stopped (for X in `crm status
--inactive | grep....`...; do crm resource stop $X...)
4. once pacemaker has brought the node online, all vm resources are
started one at a time (for X...; do crm resource start $X; sleep 45s; done)
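
The manual procedure above could be wrapped in a small shell function
along these lines.  This is only a sketch: staggered_start, CRM_CMD and
SLEEP_CMD are hypothetical names introduced here (the two variables exist
only so the loop can be dry-run without a cluster), and the vm names are
placeholders.  It assumes the crmsh "crm resource start" command, which by
default only asks Pacemaker to start the resource and does not wait for
the vm to finish booting.

```shell
#!/bin/sh
# Sketch: start resources one at a time with a fixed pause in between.
# CRM_CMD and SLEEP_CMD are hypothetical override hooks, not crmsh features.
CRM_CMD="${CRM_CMD:-crm}"
SLEEP_CMD="${SLEEP_CMD:-sleep}"

# staggered_start DELAY RESOURCE...
# A failed start request is reported and skipped, so one bad vm does not
# keep the remaining ones from being started.
staggered_start() {
    delay="$1"
    shift
    failed=0
    first=1
    for res in "$@"; do
        # Pause before every start except the first one.
        [ "$first" -eq 1 ] || $SLEEP_CMD "$delay"
        first=0
        if ! $CRM_CMD resource start "$res"; then
            echo "warning: could not start $res" >&2
            failed=$((failed + 1))
        fi
    done
    # Non-zero exit status if any start request failed.
    [ "$failed" -eq 0 ]
}
```

Usage might look like "staggered_start 45 vm1 vm2 vm3", with 45 being the
delay in seconds used in the manual loop above.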

There are two things I'd like to accomplish, but if I can only get one,
that would be fine too.

First and foremost, I'd like to have Pacemaker stagger the startup of
certain resources according to a time delay.  Although in the example
above the node is rebooted, in a two-or-more-node case a single node
failure might dump a significant number of resources onto the surviving
nodes and, more significantly, thereby dump a huge amount of load
on the SAN that backs the vm host(s).  Having the vm startups or
restarts staggered automatically would help mitigate this.  Staggering
should be relative to other relevant resources.  (Ordering takes care of
delaying the vms from starting till after the SAN stores mount, but each
VM should wait a while before another VM kicks off.  A failure of one VM
to start should not prevent other VMs from starting.)

Second, I think it would be useful to be able to group the resources
together for staggered startup.  For instance, most of my vms are Linux,
and they boot very quickly with little load.  Some are Windows, and they
load the host and SAN very badly on boot.  I would ideally create small
groups of Linux hosts (to be started together) and start the Windows
hosts one at a time (or, another way to think of it, put them in groups
of one each, so that I'm staggering the groups instead of the individual
vms).
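
To make the grouping idea concrete, here is a minimal sketch of the same
loop working on batches instead of single vms.  It is an external script,
not a Pacemaker feature, and the batches have nothing to do with Pacemaker
groups.  The names start_in_batches, CRM_CMD and SLEEP_CMD, as well as the
vm names in the usage line, are hypothetical.

```shell
#!/bin/sh
# Sketch: start resources in batches, pausing between batches only.
# CRM_CMD and SLEEP_CMD are hypothetical override hooks, not crmsh features.
CRM_CMD="${CRM_CMD:-crm}"
SLEEP_CMD="${SLEEP_CMD:-sleep}"

# start_in_batches DELAY BATCH...
# Each BATCH is a single argument holding space-separated resource names.
start_in_batches() {
    delay="$1"
    shift
    first=1
    for batch in "$@"; do
        # Pause before every batch except the first one.
        [ "$first" -eq 1 ] || $SLEEP_CMD "$delay"
        first=0
        # Left unquoted on purpose so the batch splits into resource names.
        for res in $batch; do
            # A failed start request is reported but does not stop the batch.
            $CRM_CMD resource start "$res" ||
                echo "warning: could not start $res" >&2
        done
    done
}
```

For example, "start_in_batches 45 'lin1 lin2 lin3' 'win1' 'win2'" would
request the three Linux vms back to back, then each Windows vm on its own,
with 45 seconds between batches.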

A key to making this work will be specifying the delay between starting
successive vms/groups.  The vm-start command returns from libvirt almost
immediately, but I want to wait a while for the virtual machine to boot -
something I don't know yet how to easily check for in pacemaker.
Although it does seem a little kludgey to put in an arbitrary time delay,
it also appears to be very effective for my situation.
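
One possible alternative to a purely fixed delay, sketched here outside of
Pacemaker itself, is to poll some readiness probe with an upper time limit
and move on as soon as the probe succeeds.  Everything in this sketch is
an assumption: wait_until and SLEEP_CMD are hypothetical names, and which
probe actually indicates "booted" (a ping, a TCP connect, something else)
depends on the guest.

```shell
#!/bin/sh
# Sketch: wait until a probe command succeeds, or give up after a timeout.
# SLEEP_CMD is a hypothetical override hook so the loop can be dry-run.
SLEEP_CMD="${SLEEP_CMD:-sleep}"

# wait_until TIMEOUT INTERVAL COMMAND [ARG...]
# Runs COMMAND every INTERVAL seconds until it succeeds.  Returns 0 on
# success, 1 once at least TIMEOUT seconds of pauses have gone by.
wait_until() {
    timeout="$1"
    interval="$2"
    shift 2
    waited=0
    until "$@"; do
        [ "$waited" -lt "$timeout" ] || return 1
        $SLEEP_CMD "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

Something like "wait_until 300 5 ping -c 1 vm1.example.com" (hostname made
up) could then replace the fixed sleep between starts, falling through
after five minutes so that one slow vm does not hold up the rest.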

NB: the groups I describe above have no relationship to groups in the
classical Pacemaker sense; they don't have to live together, nor is
there necessarily a hard order of startup or shutdown described.  If one
resource in a staggering-group fails or is stopped, it has no effect on
the rest of the group.  There is only the notion that those resources
should be started together, and started after or before some other group
of resources plus a time delay.  In essence, whereas Pacemaker groups
describe what to start, I am looking to describe when to start.  I don't
think stop-staggering has much use here, though I suppose executing
large batches of stops the same way as staggered-start would prevent the
vms from all flushing to the SAN at the same time.

Is there a way to do this with the latest Pacemaker?

(Sorry this got a bit long-winded...)

-- Matthew
