[ClusterLabs] Problems with master/slave failovers
Ken Gaillot
kgaillot at redhat.com
Mon Jul 1 10:32:38 EDT 2019
On Sat, 2019-06-29 at 03:01 +0000, Harvey Shepherd wrote:
> Thank you so much Ken, your explanation of the crm_simulate output is
> really helpful. Regarding your suggestion of setting a migration-
> threshold of 1 for the king resource, I did in fact have that in
> place as a workaround. But ideally I don't want the failed instance
> to be delayed from restarting by having to time out its failure - I'd
> just like the resource to failover and restart a new slave
> immediately. That's because on my system there is a performance hit
> if the slave instance is not running.
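> (For reference, the workaround was applied with something along the
> lines of:
>
>     crm_resource -r ms_king_resource --meta -p migration-threshold -v 1
>
> i.e. just a meta-attribute on the clone.)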
>
> My suspicion is that Pacemaker is trying to do the right thing, but
> is failing either because the operation is timing out, or because it
> is getting confused in some way due to the colocation and ordering
> constraints placing dependencies between the servant resources and
> the master king resource. Either of those possibilities might explain
> why I see logs like these during the eight or so attempts that
> Pacemaker makes to perform a failover after the king master resource
> fails.
>
> Jun 29 02:33:03 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=0, Pending=1,
> Fired=3, Skipped=0, Incomplete=61,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:03 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=2, Pending=1,
> Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:03 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=3, Pending=1,
> Fired=1, Skipped=0, Incomplete=2,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:03 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=4, Pending=1,
> Fired=1, Skipped=0, Incomplete=9,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=5, Pending=1,
> Fired=3, Skipped=0, Incomplete=29,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=7, Pending=1,
> Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=8, Pending=1,
> Fired=1, Skipped=0, Incomplete=5,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
You can see the "Complete" counter going up in each message above.
These are actions in the transition completing (successfully, otherwise
there would be messages about failures).
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (abort_transition_graph) notice: Transition 10 aborted by deletion
> of nvpair[@id='status-1-master-king_resource']: Transient attribute
> change | cib=0.4.208 source=abort_unless_down:345
> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/in
> stance_attributes[@id='status-1']/nvpair[@id='status-1-master-
> king_resource'] complete=false
Transitions are aborted any time new information comes in; in this case
one of the master attributes changed. It's not a problem in any way:
Pacemaker will simply recalculate what still needs to be done, taking
into account the actions that have already completed.
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=8, Pending=1,
> Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=9, Pending=0,
> Fired=3, Skipped=12, Incomplete=108,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=12, Pending=0,
> Fired=1, Skipped=14, Incomplete=107,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=13, Pending=1,
> Fired=1, Skipped=0, Incomplete=7,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) debug: Transition 10 (Complete=14, Pending=0,
> Fired=2, Skipped=14, Incomplete=104,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (run_graph) notice: Transition 10 (Complete=16, Pending=0,
> Fired=0, Skipped=15, Incomplete=104,
> Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): Stopped
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (te_graph_trigger) debug: Transition 10 is now complete
> Jun 29 02:33:04 ctr_qemu pacemaker-controld [1224]
> (notify_crmd) debug: Transition 10 status: restart - Transient
> attribute change
Even though the transition aborted, you still see actions being
completed -- these are actions that were already initiated when the
transition was aborted, so we were waiting on their results before
proceeding.
> As you can see, it eventually gives up on the transition attempt and
> starts a new one. By that point the failed king resource master has
> had time to come back online, so Pacemaker just promotes it again and
> abandons the failover. I'm not sure if the cluster
> transition actions listed by crm_simulate are in the order in which
> Pacemaker tries to carry out the operations, but if so the order is
The "transition summary" is just a resource-by-resource list, not the
order things will be done. The "executing cluster transition" section
is the order things are being done.
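If you want to replay what the scheduler decided for that transition,
you can feed the saved input back to crm_simulate, e.g. with the file
from your logs:

    crm_simulate -Sx /var/lib/pacemaker/pengine/pe-input-10.bz2

The -S (simulate) output prints the "Executing cluster transition"
section in execution order; add -s if you also want the allocation
scores.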
> wrong. It should be stopping all servant resources on the failed king
> master, then failing over the king resource, then migrating the
> servant resources to the new master node. Instead it seems to be
> trying to migrate all the servant resources over first, with the king
> master failover scheduled near the bottom, which won't work due to
> the colocation constraint with the king master.
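> (The constraints in question are of this general shape - pcs syntax
> used here only as shorthand:
>
>     pcs constraint colocation add servant4 with master ms_king_resource INFINITY
>     pcs constraint order start ms_king_resource then start servant4
>
> and similarly for the other servant resources.)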
>
> Current cluster status:
> Online: [ primary secondary ]
>
> stk_shared_ip (ocf::heartbeat:IPaddr2): Started secondary
> Clone Set: ms_king_resource [king_resource] (promotable)
> king_resource (ocf::aviat:king-resource-ocf): FAILED
> primary
> Slaves: [ secondary ]
> Clone Set: ms_servant1 [servant1]
> Started: [ primary secondary ]
> Clone Set: ms_servant2 [servant2] (promotable)
> Masters: [ primary ]
> Slaves: [ secondary ]
> Clone Set: ms_servant3 [servant3] (promotable)
> Masters: [ primary ]
> Slaves: [ secondary ]
> servant4 (lsb:servant4): Started primary
> servant5 (lsb:servant5): Started primary
> servant6 (lsb:servant6): Started primary
> servant7 (lsb:servant7): Started primary
> servant8 (lsb:servant8): Started primary
> Resource Group: servant9_active_disabled
> servant9_resource1 (lsb:servant9_resource1): Started
> primary
> servant9_resource2 (lsb:servant9_resource2): Started primary
> servant10 (lsb:servant10): Started primary
> servant11 (lsb:servant11): Started primary
> servant12 (lsb:servant12): Started primary
> servant13 (lsb:servant13): Started primary
>
> Transition Summary:
> * Recover king_resource:0 ( Slave primary )
> * Promote king_resource:1 ( Slave -> Master secondary )
> * Demote servant2:0 ( Master -> Slave primary )
> * Promote servant2:1 ( Slave -> Master secondary )
> * Demote servant3:0 ( Master -> Slave primary )
> * Promote servant3:1 ( Slave -> Master secondary )
> * Move servant4 ( primary -> secondary )
> * Move servant5 ( primary -> secondary )
> * Move servant6 ( primary -> secondary )
> * Move servant7 ( primary -> secondary )
> * Move servant8 ( primary -> secondary )
> * Move servant9_resource1 ( primary ->
> secondary )
> * Move servant9_resource2 ( primary -> secondary )
> * Move servant10 ( primary -> secondary )
> * Move servant11 ( primary -> secondary )
> * Move servant12 ( primary -> secondary
> )
> * Move servant13 ( primary -> secondary )
>
> Executing cluster transition:
> * Pseudo action: ms_king_resource_pre_notify_stop_0
> * Pseudo action: ms_servant2_pre_notify_demote_0
> * Resource action: servant3 cancel=10000 on primary
> * Resource action: servant3 cancel=11000 on secondary
> * Pseudo action: ms_servant3_pre_notify_demote_0
> * Resource action: servant4 stop on primary
> * Resource action: servant5 stop on primary
> * Resource action: servant6 stop on primary
> * Resource action: servant7 stop on primary
> * Resource action: servant8 stop on primary
> * Pseudo action: servant9_active_disabled_stop_0
> * Resource action: servant9_resource2 stop on primary
> * Resource action: servant10 stop on primary
> * Resource action: servant11 stop on primary
> * Resource action: servant12 stop on primary
> * Resource action: servant13 stop on primary
> * Resource action: king_resource notify on primary
> * Resource action: king_resource notify on secondary
> * Pseudo action: ms_king_resource_confirmed-pre_notify_stop_0
> * Pseudo action: ms_king_resource_stop_0
> * Resource action: servant2 notify on primary
> * Resource action: servant2 notify on secondary
> * Pseudo action: ms_servant2_confirmed-pre_notify_demote_0
> * Pseudo action: ms_servant2_demote_0
> * Resource action: servant3 notify on primary
> * Resource action: servant3 notify on secondary
> * Pseudo action: ms_servant3_confirmed-pre_notify_demote_0
> * Pseudo action: ms_servant3_demote_0
> * Resource action: servant4 start on secondary
> * Resource action: servant5 start on secondary
> * Resource action: servant6 start on secondary
> * Resource action: servant7 start on secondary
> * Resource action: servant8 start on secondary
> * Resource action: servant9_resource1 stop on primary
> * Resource action: servant10 start on secondary
> * Resource action: servant11 start on secondary
> * Resource action: servant12 start on secondary
> * Resource action: servant13 start on secondary
> * Resource action: king_resource stop on primary
> * Pseudo action: ms_king_resource_stopped_0
> * Resource action: servant2 demote on primary
> * Pseudo action: ms_servant2_demoted_0
> * Resource action: servant3 demote on primary
> * Pseudo action: ms_servant3_demoted_0
> * Resource action: servant4 monitor=10000 on secondary
> * Resource action: servant5 monitor=10000 on secondary
> * Resource action: servant6 monitor=10000 on secondary
> * Resource action: servant7 monitor=10000 on secondary
> * Resource action: servant8 monitor=10000 on secondary
> * Pseudo action: servant9_active_disabled_stopped_0
> * Pseudo action: servant9_active_disabled_start_0
> * Resource action: servant9_resource1 start on secondary
> * Resource action: servant9_resource2 start on secondary
> * Resource action: servant10 monitor=10000 on secondary
> * Resource action: servant11 monitor=10000 on secondary
> * Resource action: servant12 monitor=10000 on secondary
> * Resource action: servant13 monitor=10000 on secondary
> * Pseudo action: ms_king_resource_post_notify_stopped_0
> * Pseudo action: ms_servant2_post_notify_demoted_0
> * Pseudo action: ms_servant3_post_notify_demoted_0
> * Pseudo action: servant9_active_disabled_running_0
> * Resource action: servant9_resource1 monitor=10000 on
> secondary
> * Resource action: servant9_resource2 monitor=10000 on secondary
> * Resource action: king_resource notify on secondary
> * Pseudo action: ms_king_resource_confirmed-post_notify_stopped_0
> * Pseudo action: ms_king_resource_pre_notify_start_0
> * Resource action: servant2 notify on primary
> * Resource action: servant2 notify on secondary
> * Pseudo action: ms_servant2_confirmed-post_notify_demoted_0
> * Pseudo action: ms_servant2_pre_notify_promote_0
> * Resource action: servant3 notify on primary
> * Resource action: servant3 notify on secondary
> * Pseudo action: ms_servant3_confirmed-post_notify_demoted_0
> * Pseudo action: ms_servant3_pre_notify_promote_0
> * Resource action: king_resource notify on secondary
> * Pseudo action: ms_king_resource_confirmed-pre_notify_start_0
> * Pseudo action: ms_king_resource_start_0
> * Resource action: servant2 notify on primary
> * Resource action: servant2 notify on secondary
> * Pseudo action: ms_servant2_confirmed-pre_notify_promote_0
> * Pseudo action: ms_servant2_promote_0
> * Resource action: servant3 notify on primary
> * Resource action: servant3 notify on secondary
> * Pseudo action: ms_servant3_confirmed-pre_notify_promote_0
> * Pseudo action: ms_servant3_promote_0
> * Resource action: king_resource start on primary
> * Pseudo action: ms_king_resource_running_0
> * Resource action: servant2 promote on secondary
> * Pseudo action: ms_servant2_promoted_0
> * Resource action: servant3 promote on secondary
> * Pseudo action: ms_servant3_promoted_0
> * Pseudo action: ms_king_resource_post_notify_running_0
> * Pseudo action: ms_servant2_post_notify_promoted_0
> * Pseudo action: ms_servant3_post_notify_promoted_0
> * Resource action: king_resource notify on primary
> * Resource action: king_resource notify on secondary
> * Pseudo action: ms_king_resource_confirmed-post_notify_running_0
> * Resource action: servant2 notify on primary
> * Resource action: servant2 notify on secondary
> * Pseudo action: ms_servant2_confirmed-post_notify_promoted_0
> * Resource action: servant3 notify on primary
> * Resource action: servant3 notify on secondary
> * Pseudo action: ms_servant3_confirmed-post_notify_promoted_0
> * Pseudo action: ms_king_resource_pre_notify_promote_0
> * Resource action: servant2 monitor=11000 on primary
> * Resource action: servant2 monitor=10000 on secondary
> * Resource action: servant3 monitor=11000 on primary
> * Resource action: servant3 monitor=10000 on secondary
> * Resource action: king_resource notify on primary
> * Resource action: king_resource notify on secondary
> * Pseudo action: ms_king_resource_confirmed-pre_notify_promote_0
> * Pseudo action: ms_king_resource_promote_0
> * Resource action: king_resource promote on secondary
> * Pseudo action: ms_king_resource_promoted_0
> * Pseudo action: ms_king_resource_post_notify_promoted_0
> * Resource action: king_resource notify on primary
> * Resource action: king_resource notify on secondary
> * Pseudo action: ms_king_resource_confirmed-post_notify_promoted_0
> * Resource action: king_resource monitor=11000 on primary
> * Resource action: king_resource monitor=10000 on secondary
> Using the original execution date of: 2019-06-29 02:33:03Z
>
> Revised cluster status:
> Online: [ primary secondary ]
>
> stk_shared_ip (ocf::heartbeat:IPaddr2): Started secondary
> Clone Set: ms_king_resource [king_resource] (promotable)
> Masters: [ secondary ]
> Slaves: [ primary ]
> Clone Set: ms_servant1 [servant1]
> Started: [ primary secondary ]
> Clone Set: ms_servant2 [servant2] (promotable)
> Masters: [ secondary ]
> Slaves: [ primary ]
> Clone Set: ms_servant3 [servant3] (promotable)
> Masters: [ secondary ]
> Slaves: [ primary ]
> servant4 (lsb:servant4): Started secondary
> servant5 (lsb:servant5): Started secondary
> servant6 (lsb:servant6): Started secondary
> servant7 (lsb:servant7): Started secondary
> servant8 (lsb:servant8): Started secondary
> Resource Group: servant9_active_disabled
> servant9_resource1 (lsb:servant9_resource1): Started
> secondary
> servant9_resource2 (lsb:servant9_resource2): Started secondary
> servant10 (lsb:servant10): Started secondary
> servant11 (lsb:servant11): Started secondary
> servant12 (lsb:servant12): Started secondary
> servant13 (lsb:servant13): Started secondary
>
>
> I don't think that there is an issue with the CIB constraints
> configuration, otherwise the resources would not be able to start
> upon bootup, but I'll keep digging and report back if I find any
> cause.
>
> Thanks again,
> Harvey
>
> ________________________________________
> From: Users <users-bounces at clusterlabs.org> on behalf of Ken Gaillot
> <kgaillot at redhat.com>
> Sent: Saturday, 29 June 2019 3:10 a.m.
> To: Cluster Labs - All topics related to open-source clustering
> welcomed
> Subject: EXTERNAL: Re: [ClusterLabs] Problems with master/slave
> failovers
>
> On Fri, 2019-06-28 at 07:36 +0000, Harvey Shepherd wrote:
> > Thanks for your reply Andrei. Whilst I understand what you say
> > about the difficulties of diagnosing issues without all of the info,
> > it's a compromise between making a mailing list post so verbose that
> > nobody wants to read it, and including enough relevant information
> > for someone to be able to help. With 20+ resources involved during a
> > failover there are literally thousands of log messages generated,
> > and it would be pointless to post them all.
> >
> > I've tried to focus on the king resource only to keep things
> > simple, as that is the only resource that can initiate a failover.
> > I provided the real master scores and transition decisions made by
> > Pacemaker at the times that I killed the king master resource by
> > showing the crm_simulate output from both tests, and the CIB config
> > is as described. As I mentioned, migration-threshold is set to zero
> > for all resources, so it shouldn't prevent a second failover.
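> > (For what it's worth, fail counts can be checked with "crm_mon
> > --failcounts" and cleared with something like "crm_resource
> > --cleanup -r king_resource", so that's easy to rule in or out.)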
> >
> > Regarding the resource agent return codes, the failure is detected
> > by the 10s king resource master instance monitor operation, which
> > returns OCF_ERR_GENERIC because the resource is expected to be
> > running and isn't (the OCF resource agent developers guide states
> > that monitor should only return OCF_NOT_RUNNING if there is no error
> > condition that caused the resource to stop).
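> > To illustrate the shape of it (a simplified sketch, not the real
> > agent - the daemon name and helper functions here are made up):
> >
> >     king_monitor() {
> >         if ! pidof king-resourced >/dev/null 2>&1; then
> >             # daemon gone: if it was supposed to be running, report
> >             # a genuine failure rather than a clean stop
> >             king_should_be_running && return $OCF_ERR_GENERIC
> >             return $OCF_NOT_RUNNING
> >         fi
> >         king_is_master && return $OCF_RUNNING_MASTER
> >         return $OCF_SUCCESS
> >     }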
> >
> > What would be really helpful would be if you or someone else could
> > help me decipher the crm_simulate output:
>
> I've been working with Pacemaker for years and still look at those
> scores only after exhausting all other investigation.
>
> It isn't AI, but the complexity is somewhat similar in that it's not
> really possible to boil down the factors that went into a decision in
> a
> few human-readable sentences. We do have a project planned to provide
> some insight in human-readable form.
>
> But if you really want the headache:
>
> > 1. What is the difference between clone_color and native_color?
>
> native_color is scores added by the resource as a primitive resource,
> i.e. the resource being cloned. clone_color is scores added by the
> resource as a clone, i.e. the internal abstraction that allows a
> primitive resource to run in multiple places. All it really means is
> that different C functions added the scores, which is pretty useless
> without staring at the source code of those functions.
>
> > 2. What is the difference between "promotion scores" and
> > "allocation
> > scores" and why does the output show several instances of each?
>
> Allocation is the placement of particular resources (including
> individual clone instances) on particular nodes; promotion is
> selecting an instance to be master.
>
> The multiple occurrences are due to multiple factors going into the
> final score.
>
> > 3. How does pacemaker use those scores to decide whether to
> > failover?
>
> It doesn't -- it uses them to determine where to failover. Whether to
> failover is determined by fail-count and resource operation history
> (and affected by configured policies such as on-fail,
> failure-timeout, and migration-threshold).
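> These are all meta-attributes on the resource or its operations. For
> example, a failure-timeout (so that old failures age out of the fail
> count) could be set with something like this - the value is just
> illustrative:
>
>     crm_resource -r ms_king_resource --meta -p failure-timeout -v 10min
>
> whereas on-fail is configured on the individual operation definitions.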
>
> > 4. Why is there a -INFINITY score on one node?
>
> That sometimes requires trace-level debugging and following the path
> through the source code. Which I don't recommend unless you're
> wanting
> to make this a full-time gig :)
>
> At this level of investigation, I usually start with giving
> crm_simulate -VVVV, which will show up to info-level logs. If that
> doesn't make it clear, add another -V for debug logs, and then
> another
> -V for trace logs, but that stretches the bounds of human
> intelligibility. Somewhat more helpful is PCMK_trace_tags=<resource-
> name> before crm_simulate, which will give some trace-level output
> for
> the given resource without swamping you with infinite detail. For
> clones it's best to use PCMK_trace_tags=<resource-name>,<clone-name>
> and sometimes even <resource-name>:0, etc.
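> For example, against the pe-input file from your logs:
>
>     PCMK_trace_tags=king_resource,ms_king_resource \
>         crm_simulate -Sx /var/lib/pacemaker/pengine/pe-input-10.bz2 -VVVV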
>
> > Thanks again for your help.
> >
> >
> >
> > On 28 Jun 2019 6:46 pm, Andrei Borzenkov <arvidjaar at gmail.com>
> > wrote:
> > > On Fri, Jun 28, 2019 at 7:24 AM Harvey Shepherd
> > > <Harvey.Shepherd at aviatnet.com> wrote:
> > > >
> > > > Hi All,
> > > >
> > > >
> > > > I'm running Pacemaker 2.0.2 on a two node cluster. It runs one
> > >
> > > master/slave resource (I'll refer to it as the king resource) and
> > > about 20 other resources which are a mixture of:
> > > >
> > > >
> > > > - resources that only run on the king resource master node
> > > >   (colocation constraint with a score of INFINITY)
> > > >
> > > > - clone resources that run on both nodes
> > > >
> > > > - two other master/slave resources where the masters run on the
> > > >   same node as the king resource master (colocation constraint
> > > >   with a score of INFINITY)
> > > >
> > > >
> > > > I'll refer to the above set of resources as servant resources.
> > > >
> > > >
> > > > All servant resources have a resource-stickiness of zero and the
> > > > king resource has a resource-stickiness of 100. There is an
> > > > ordering constraint that the king resource must start before all
> > > > servant resources. The king resource is controlled by an OCF script
> > > > that uses crm_master to set the preferred master for the king
> > > > resource (current master has value 100, current slave is 5,
> > > > unassigned role or resource failure is 1) - I've verified that
> > > > these values are being set as expected upon
> > > > promotion/demotion/failure etc. via the logs. That's pretty much
> > > > all of the configuration - there is no configuration around node
> > > > preferences and migration-threshold is zero for everything.
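> > > > (In shell terms the agent's crm_master calls are essentially:
> > > >
> > > >     crm_master -Q -l reboot -v 100   # currently master
> > > >     crm_master -Q -l reboot -v 5     # currently slave
> > > >     crm_master -Q -l reboot -v 1     # unassigned role or failure
> > > >
> > > > give or take the exact call sites; the attribute is transient,
> > > > hence the reboot lifetime.)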
> > > >
> > > >
> > > > What I'm trying to achieve is fairly simple:
> > > >
> > > >
> > > > 1. If any servant resource fails on either node, it is simply
> > >
> > > restarted. These resources should never failover onto the other
> > > node because of colocation with the king resource, and they
> > > should
> > > not contribute in any way to deciding whether the king resource
> > > should failover (which is why they have a resource-stickiness of
> > > zero).
> > > >
> > > > 2. If the slave instance of the king resource fails, it should
> > >
> > > simply be restarted and again no failover should occur.
> > > >
> > > > 3. If the master instance of the king resource fails, then its
> > >
> > > slave instance should immediately be promoted, and the failed
> > > instance should be restarted. Failover of all servant resources
> > > should then occur due to the colocation dependency.
> > > >
> > > >
> > > > It's number 3 above that I'm having trouble with. If I kill the
> > >
> > > master king resource instance it behaves as I expect - everything
> > > fails over and the king resource is restarted on the new slave.
> > > If
> > > I then kill the master instance of the king resource again
> > > however,
> > > instead of failing back over to its original node, it restarts
> > > and
> > > promotes back to master on the same node. This is not what I
> > > want.
> > > >
> > >
> > > migration-threshold is the first thing that comes to mind. Another
> > > possibility is a hard error returned by the resource agent that
> > > forces the resource off the node.
> > >
> > > But please realize that without the actual configuration and logs
> > > from the time the undesired behavior happens, it just becomes a
> > > game of riddles.
> > >
> > > >
> > > > The relevant output from crm_simulate for the two tests is
> > > > shown
> > >
> > > below. Can anyone suggest what might be going wrong? Whilst I
> > > really like the concept of crm_simulate, I can't find a good
> > > description of how to interpret the output and I don't understand
> > > the difference between clone_color and native_color, or the
> > > difference between "promotion scores" and the various instances
> > > of
> > > "allocation scores", nor does it really tell me what is
> > > contributing to the scores. Where does the -INFINITY allocation
> > > score come from for example?
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Harvey
> > > >
> > > >
> > > >
> > > > FIRST KING RESOURCE MASTER FAILURE (CORRECT BEHAVIOUR - MASTER
> > >
> > > NODE FAILOVER OCCURS)
> > > >
> > > >
> > > > Clone Set: ms_king_resource [king_resource] (promotable)
> > > > king_resource (ocf::aviat:king-resource-
> > > > ocf): FAILED
> > >
> > > Master secondary
> > > > clone_color: ms_king_resource allocation score on primary: 0
> > > > clone_color: ms_king_resource allocation score on secondary: 0
> > > > clone_color: king_resource:0 allocation score on primary: 0
> > > > clone_color: king_resource:0 allocation score on secondary: 101
> > > > clone_color: king_resource:1 allocation score on primary: 200
> > > > clone_color: king_resource:1 allocation score on secondary: 0
> > > > native_color: king_resource:1 allocation score on primary: 200
> > > > native_color: king_resource:1 allocation score on secondary: 0
> > > > native_color: king_resource:0 allocation score on primary:
> > >
> > > -INFINITY
> > > > native_color: king_resource:0 allocation score on secondary:
> > > > 101
> > > > king_resource:1 promotion score on primary: 100
> > > > king_resource:0 promotion score on secondary: 1
> > > > * Recover king_resource:0 ( Master -> Slave secondary
> > > > )
> > > > * Promote king_resource:1 ( Slave -> Master primary
> > > > )
> > > > * Resource action: king_resource cancel=10000 on secondary
> > > > * Resource action: king_resource cancel=11000 on primary
> > > > * Pseudo action: ms_king_resource_pre_notify_demote_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > pre_notify_demote_0
> > > > * Pseudo action: ms_king_resource_demote_0
> > > > * Resource action: king_resource demote on secondary
> > > > * Pseudo action: ms_king_resource_demoted_0
> > > > * Pseudo action: ms_king_resource_post_notify_demoted_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > post_notify_demoted_0
> > > > * Pseudo action: ms_king_resource_pre_notify_stop_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > > > pre_notify_stop_0
> > > > * Pseudo action: ms_king_resource_stop_0
> > > > * Resource action: king_resource stop on secondary
> > > > * Pseudo action: ms_king_resource_stopped_0
> > > > * Pseudo action: ms_king_resource_post_notify_stopped_0
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > post_notify_stopped_0
> > > > * Pseudo action: ms_king_resource_pre_notify_start_0
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > > > pre_notify_start_0
> > > > * Pseudo action: ms_king_resource_start_0
> > > > * Resource action: king_resource start on secondary
> > > > * Pseudo action: ms_king_resource_running_0
> > > > * Pseudo action: ms_king_resource_post_notify_running_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > post_notify_running_0
> > > > * Pseudo action: ms_king_resource_pre_notify_promote_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > pre_notify_promote_0
> > > > * Pseudo action: ms_king_resource_promote_0
> > > > * Resource action: king_resource promote on primary
> > > > * Pseudo action: ms_king_resource_promoted_0
> > > > * Pseudo action: ms_king_resource_post_notify_promoted_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > post_notify_promoted_0
> > > > * Resource action: king_resource monitor=11000 on secondary
> > > > * Resource action: king_resource monitor=10000 on primary
> > > > Clone Set: ms_king_resource [king_resource] (promotable)
> > > >
> > > >
> > > > SECOND KING RESOURCE MASTER FAILURE (INCORRECT BEHAVIOUR - SAME
> > >
> > > NODE IS PROMOTED TO MASTER)
> > > >
> > > >
> > > > Clone Set: ms_king_resource [king_resource] (promotable)
> > > > king_resource (ocf::aviat:king-resource-
> > > > ocf): FAILED
> > >
> > > Master primary
> > > > clone_color: ms_king_resource allocation score on primary: 0
> > > > clone_color: ms_king_resource allocation score on secondary: 0
> > > > clone_color: king_resource:0 allocation score on primary: 0
> > > > clone_color: king_resource:0 allocation score on secondary: 200
> > > > clone_color: king_resource:1 allocation score on primary: 101
> > > > clone_color: king_resource:1 allocation score on secondary: 0
> > > > native_color: king_resource:0 allocation score on primary: 0
> > > > native_color: king_resource:0 allocation score on secondary:
> > > > 200
> > > > native_color: king_resource:1 allocation score on primary: 101
> > > > native_color: king_resource:1 allocation score on secondary:
> > >
> > > -INFINITY
> > > > king_resource:1 promotion score on primary: 1
> > > > king_resource:0 promotion score on secondary: 1
> > > > * Recover king_resource:1 ( Master primary )
> > > > * Pseudo action: ms_king_resource_pre_notify_demote_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > pre_notify_demote_0
> > > > * Pseudo action: ms_king_resource_demote_0
> > > > * Resource action: king_resource demote on primary
> > > > * Pseudo action: ms_king_resource_demoted_0
> > > > * Pseudo action: ms_king_resource_post_notify_demoted_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > post_notify_demoted_0
> > > > * Pseudo action: ms_king_resource_pre_notify_stop_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > > > pre_notify_stop_0
> > > > * Pseudo action: ms_king_resource_stop_0
> > > > * Resource action: king_resource stop on primary
> > > > * Pseudo action: ms_king_resource_stopped_0
> > > > * Pseudo action: ms_king_resource_post_notify_stopped_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > post_notify_stopped_0
> > > > * Pseudo action: ms_king_resource_pre_notify_start_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > > > pre_notify_start_0
> > > > * Pseudo action: ms_king_resource_start_0
> > > > * Resource action: king_resource start on primary
> > > > * Pseudo action: ms_king_resource_running_0
> > > > * Pseudo action: ms_king_resource_post_notify_running_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > post_notify_running_0
> > > > * Pseudo action: ms_king_resource_pre_notify_promote_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > pre_notify_promote_0
> > > > * Pseudo action: ms_king_resource_promote_0
> > > > * Resource action: king_resource promote on primary
> > > > * Pseudo action: ms_king_resource_promoted_0
> > > > * Pseudo action: ms_king_resource_post_notify_promoted_0
> > > > * Resource action: king_resource notify on secondary
> > > > * Resource action: king_resource notify on primary
> > > > * Pseudo action: ms_king_resource_confirmed-
> > >
> > > post_notify_promoted_0
> > > > * Resource action: king_resource monitor=10000 on primary
> > > > Clone Set: ms_king_resource [king_resource] (promotable)
>
> --
> Ken Gaillot <kgaillot at redhat.com>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
--
Ken Gaillot <kgaillot at redhat.com>