[ClusterLabs] Pacemaker/Corosync good fit for embedded product?
Klaus Wenninger
kwenning at redhat.com
Thu Apr 12 04:08:19 EDT 2018
On 04/12/2018 04:37 AM, David Hunt wrote:
> Thanks Guys,
>
> Ideally I would like to have event-driven (rather than slower polled)
> inputs into pacemaker to quickly trigger the failover. I assume
> adding event-driven inputs to pacemaker isn't straightforward? If it
> were possible to add event inputs to pacemaker, is pacemaker itself fast
> enough? Or is it also going to be relatively slow to switch?
I'm not aware of any systematic delays on top of what we discussed.
The time the rule engine will need to calculate a transition will
of course depend on the complexity of your cluster and the
CPU power you have available.
I've mentioned the delay of the DC re-election, but something you
might want to consider in your calculations as well is fencing.
If you are using physical fencing devices, it depends on how
quickly these react and give feedback. You might be able
to speed that up by e.g. already keeping a control connection
to the fencing device open.
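As a rough illustration, a frequent monitor operation on the fence device at
least exercises the management connection regularly, so problems surface
early. A hedged pcs sketch (device name, address and credentials are
placeholders, and parameter names vary between fence-agent versions):

```shell
# Hypothetical IPMI fence device, monitored every 10s so trouble with the
# management interface is noticed before a fence request is needed.
# All values below are placeholders, not a recommendation.
pcs stonith create fence-node2 fence_ipmilan \
    ip=192.168.1.12 username=admin password=secret \
    pcmk_host_list=node2 \
    op monitor interval=10s
```

Whether the agent actually keeps a session open between invocations depends
on the particular agent's implementation.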
Regards,
Klaus
>
> It would seem based on this discussion it may still work to
> use pacemaker & corosync for the initial setup & to handle services which
> can tolerate a slower switch-over time. For our services that require a
> much faster switch-over time it would appear we need something proprietary.
>
> Regards
> David
>
> On 12 April 2018 at 02:56, Klaus Wenninger <kwenning at redhat.com
> <mailto:kwenning at redhat.com>> wrote:
>
> On 04/11/2018 10:44 AM, Jan Friesse wrote:
> > David,
> >
> >> Hi,
> >>
> >> We are planning on creating a HA product in an active/standby
> >> configuration
> >> whereby the standby unit needs to take over from the active
> unit very
> >> fast
> >> (<50ms including all services restored).
> >>
> >> We are able to do very fast signaling (say 1000Hz) between the two
> >> units to
> >> detect failures so detecting a failure isn't really an issue.
> >>
> >> Pacemaker looks to be a very useful piece of software for managing
> >> resources so rather than roll our own it would make sense to reuse
> >> pacemaker.
> >>
> >> So my initial questions are:
> >>
> >> 1. Do people think pacemaker is the right thing to use?
> Everything I
> >> read seem to be talking about multiple seconds for failure
> >> detection etc.
> >> Feature wise it looks pretty similar to what we would want.
> >> 2. Has anyone done anything similar to this?
> >> 3. Any pointers on where/how to add additional failure
> detection
> >> inputs
> >> to pacemaker?
> >> 4.
> >> 5. For a new design would you go with pacemaker+corosync,
> >> pacemaker+corosync+knet or something different?
> >>
> >
> >
> > I will just share my point of view about Corosync side.
> >
> > Corosync uses its own mechanism for detecting failure, based on
> > token rotation. The default timeout for detecting loss of the token is
> > 1 second, so detecting a failure takes far more than 50ms. It can be
> > lowered, but that is not really tested.
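For reference, the timeout Honza mentions is the token setting in the totem
section of corosync.conf. A minimal sketch, with purely illustrative values
(not tuning advice):

```
totem {
    version: 2
    cluster_name: mycluster
    # Time (in ms) without seeing the token before a failure is declared.
    # 1000 is the default; lowering it speeds up detection, but as noted
    # above, very low values are not well tested.
    token: 1000
}
```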
> >
> > That means it's not currently possible to use a different signaling
> > mechanism without significant Corosync changes.
> >
> > So I don't think Corosync can really be used for the described scenario.
> >
> > Honza
>
> On the other hand, if a failover is triggered by losing a node or
> anything else that is detected by corosync, this is probably already
> the fast path in a pacemaker cluster.
>
> Detection of other types of failures (like a resource failing on
> an otherwise functional node) is probably far slower.
> When a failure is detected by corosync, pacemaker has some kind of
> an event-driven way to react to it.
> We even have to add some delay on top of the mere corosync detection
> time mentioned by Honza, as pacemaker will have to run e.g. an
> election cycle for the designated coordinator (DC) before it can
> make decisions again.
>
> For other failures the basic principle is rather to probe a resource
> at a fixed rate (usually multiple seconds) to detect failures,
> instead of an event-driven mechanism.
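Concretely, that fixed rate is the interval of the resource's recurring
monitor operation. A minimal pcs sketch (resource name and interval are
just placeholders):

```shell
# Hypothetical resource probed every 10 seconds; a failure is only
# noticed at the next monitor run, hence the multi-second latency.
pcs resource create my-service ocf:pacemaker:Dummy \
    op monitor interval=10s timeout=20s
```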
> There might be trickery possible, though, using attributes to achieve
> an event-driven-like reaction to certain failures. But I haven't done
> anything concrete to exploit these possibilities. Others might have
> more info (which I personally would be interested in as well ;-) ).
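One conceivable shape of that trickery, purely as an untested sketch
(attribute and resource names are hypothetical): an external watcher flips
a node attribute the instant it sees a failure, and a location rule keyed
on that attribute pushes the resource away without waiting for a monitor
cycle.

```shell
# Hypothetical: an external watcher clears a transient node attribute
# the moment it detects a failure on the local node...
attrd_updater -n service_healthy -U 0

# ...and a location rule bans the resource from nodes where it is 0
# (run once at configuration time, not per event).
pcs constraint location my-service rule score=-INFINITY service_healthy eq 0
```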
>
> Approaches to realize event-driven mechanisms for resource-failure
> detection are under investigation/development (systemd resources,
> IP resources sitting on interfaces, ...) but afaik nothing is
> available out of the box as of now.
>
> Having said all that, I can add some personal experience from
> having implemented an embedded product based on a
> pacemaker cluster myself in the past:
>
> As reaction times based on pacemaker would be too slow for
> many communication protocols (e.g. things like SIP) or realtime
> streams, it seems advisable to solve these issues at the
> application layer, inside a service (respectively a distributed
> service in the cluster).
> Pacemaker and its decision engine can then be used to bring
> up this distributed service in the cluster in an ordered way.
> Any additional services that are less demanding regarding
> switch-over time can be made available via pacemaker
> directly.
>
> Otherwise, pacemaker configuration is very flexible, so you can
> implement nearly anything. It might be advisable to avoid certain
> approaches that are common where a cluster is operated by somebody
> who can be informed quickly and has to react under certain SLAs.
> For example, fencing a node by switching it off instead of
> rebooting it might not be desirable for an appliance that is
> expected to just sit there and work with hardly any admin
> effort/expense at all.
> But that is of course just an example, and the configuration
> (incl. the configuration concept) has to be tailored to your
> requirements.
>
> Regards,
> Klaus
>
> >
> >>
> >> Thanks
> >>
> >> David
> >>
> >>
> >>
> >> _______________________________________________
> >> Users mailing list: Users at clusterlabs.org
> <mailto:Users at clusterlabs.org>
> >> https://lists.clusterlabs.org/mailman/listinfo/users
> <https://lists.clusterlabs.org/mailman/listinfo/users>
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started:
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> <http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf>
> >> Bugs: http://bugs.clusterlabs.org
> >>
> >
>
>