[Pacemaker] notifications for cloned resources

Thu Aug 14 20:51:43 EDT 2014

On 15 Aug 2014, at 5:49 am, Steve Feehan <feehans at ncbi.nlm.nih.gov> wrote:

> On Thu, Aug 14, 2014 at 12:38:00PM +1000, Andrew Beekhof wrote:
>> 
>> On 14 Aug 2014, at 12:33 am, Steve Feehan <feehans at ncbi.nlm.nih.gov> wrote:
>> 
> 
>> Is it a problem that several seconds could go by between the node going offline and the notification arriving?
>> I would usually expect the answer to be yes.
> 
> When a node is offline, all the VMs are down and will need to be
> restarted.

This is, I guess, what I'm not understanding... pacemaker can already start and stop VMs.
With the caveat that everything looks like a nail to people that make hammers, it seems to me that a better interaction would be for ganeti to do whatever it does to create VMs and then hand them over to pacemaker to manage.

create VM in ganeti ==> add VM resource to pacemaker
stop/start VM in ganeti ==> modify target_role in pacemaker
delete VM in ganeti ==> resource stop + delete in pacemaker
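
As a rough sketch of that mapping (nothing here ships with ganeti or pacemaker): the helper names, the event names and the choice of the ocf:heartbeat:VirtualDomain agent are all assumptions for illustration. The code only builds crm_resource/cibadmin command lines and does not run them. I've spelled the meta attribute target-role, as current pacemaker documentation does.

```python
from xml.sax.saxutils import quoteattr


def set_target_role(vm, role):
    # Hypothetical helper: argv that sets the target-role meta attribute.
    return ["crm_resource", "--resource", vm, "--meta",
            "--set-parameter", "target-role", "--parameter-value", role]


def create_vm(vm, config_path):
    # Assumption: the VM is managed by the VirtualDomain agent, which
    # takes the libvirt domain definition via its "config" parameter.
    xml = (
        f'<primitive id={quoteattr(vm)} class="ocf" '
        'provider="heartbeat" type="VirtualDomain">'
        f'<instance_attributes id={quoteattr(vm + "-params")}>'
        f'<nvpair id={quoteattr(vm + "-config")} name="config" '
        f'value={quoteattr(config_path)}/>'
        '</instance_attributes></primitive>'
    )
    return ["cibadmin", "-C", "-o", "resources", "-X", xml]


def delete_vm(vm):
    # Removes the primitive from the resources section of the CIB.
    xml = f'<primitive id={quoteattr(vm)}/>'
    return ["cibadmin", "-D", "-o", "resources", "-X", xml]


def commands_for(event, vm, config_path=None):
    """Translate a (hypothetical) ganeti event into pacemaker commands."""
    if event == "create":
        return [create_vm(vm, config_path)]
    if event == "start":
        return [set_target_role(vm, "Started")]
    if event == "stop":
        return [set_target_role(vm, "Stopped")]
    if event == "delete":
        # Stop first; a real caller would wait for the stop to finish
        # before issuing the delete.
        return [set_target_role(vm, "Stopped"), delete_vm(vm)]
    raise ValueError(f"unknown event: {event}")
```

A ganeti hook could pass each returned argv to something like subprocess.run(); whether that hook point exists in the form needed is something I haven't checked.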

Ganeti does what it does best and likewise so do we.
This whole nested cluster concept seems... fragile.

>  It will take harep several minutes (at least) to get them
> started. The quicker you start the better, but several seconds would
> hardly make a difference.

In the case of cluster filesystems, it's important because you don't want to hand out file locks until you're sure the last owner is really dead.
I'd have thought ganeti would have had similar requirements, but I don't really know what harep is doing underneath.

> 
>> Those that do care (eg. cluster filesystems) usually have a daemon that a) monitors the corosync membership directly and/or b) subscribes to stonithd fencing notifications.
>> They do this because they can't wait for resource based notification.
> 
> Is there an example of method a) or b) that I can use as a starting point?

for a), check out crm_connect_corosync() in crmd/corosync.c.  The heavy lifting is done by the pcmk_cpg_membership() callback in lib/cluster/cpg.c.
for b), check out te_connect_stonith() and its callback tengine_stonith_notify() in crmd/te_utils.c
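
To illustrate why a daemon would want both feeds, here is a minimal sketch of the bookkeeping only. The class and method names are made up, and the wiring to the real corosync membership and stonithd notification callbacks is left out entirely. The idea is that recovery for a node is only allowed once it has left the membership and a successful fencing result has been seen.

```python
class RecoveryGate:
    """Hypothetical tracker: a node is safe to recover only after it
    has dropped out of the membership AND fencing has been confirmed."""

    def __init__(self):
        self._lost = set()    # nodes that have left the membership
        self._fenced = set()  # nodes with a confirmed fencing result

    def on_membership_lost(self, node):
        # Would be driven by a membership callback (method a).
        self._lost.add(node)

    def on_membership_joined(self, node):
        # The node is back, so any earlier loss/fence state is stale.
        self._lost.discard(node)
        self._fenced.discard(node)

    def on_fence_result(self, node, succeeded):
        # Would be driven by a fencing notification (method b).
        if succeeded:
            self._fenced.add(node)

    def safe_to_recover(self, node):
        return node in self._lost and node in self._fenced
```

For example, after on_membership_lost("node2") alone, safe_to_recover("node2") is still False; it only becomes True once on_fence_result("node2", True) has also been called.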

> 
>> What is the use case for nesting a ganeti cluster inside a pacemaker one?
> 
> I'm not really sure.

This bothers me ;-)
How about if I rephrase this as... what does ganeti do that pacemaker doesn't already and vice-versa?

> It would certainly be easier if ganeti handled
> marking the node offline. It's provided hooks for fencing and the harep
> utility for healing the cluster. So it's 90% of the way to a full HA
> solution.
> 
> Maybe the ganeti folks don't want to reinvent the wheel. Or maybe they
> don't want to own the decision of when to fence/offline a node. harep can
> perform potentially dangerous actions. Depending on the configuration,
> it can go as far as reinstalling VMs.
> 
> -- 
> Steve Feehan [Contractor]
