[ClusterLabs] Problems with master/slave failovers

Harvey Shepherd Harvey.Shepherd at Aviatnet.com
Fri Jun 28 23:01:01 EDT 2019


Thank you so much Ken, your explanation of the crm_simulate output is really helpful. Regarding your suggestion of setting a migration-threshold of 1 for the king resource, I did in fact have that in place as a workaround. But ideally I don't want the failed instance's restart to be delayed while it waits for its failure to time out - I'd just like the resource to fail over and restart as a new slave immediately. That's because on my system there is a performance hit if the slave instance is not running.
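
For reference, the workaround amounted to something like the following (pcs syntax; the failure-timeout value is purely illustrative):

    # Move the king resource away after a single failure, and let the failure
    # record expire so the old node becomes eligible again later.
    pcs resource update ms_king_resource meta migration-threshold=1 failure-timeout=60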

My suspicion is that Pacemaker is trying to do the right thing, but is failing either because the operation is timing out, or because it is getting confused in some way due to the colocation and ordering constraints placing dependencies between the servant resources and the master king resource. Either of those possibilities might explain why I see logs like these during the eight or so attempts that Pacemaker makes to perform a failover after the king master resource fails.

Jun 29 02:33:03 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=0, Pending=1, Fired=3, Skipped=0, Incomplete=61, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:03 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=2, Pending=1, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:03 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=3, Pending=1, Fired=1, Skipped=0, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:03 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=4, Pending=1, Fired=1, Skipped=0, Incomplete=9, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=5, Pending=1, Fired=3, Skipped=0, Incomplete=29, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=7, Pending=1, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=8, Pending=1, Fired=1, Skipped=0, Incomplete=5, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (abort_transition_graph)    notice: Transition 10 aborted by deletion of nvpair[@id='status-1-master-king_resource']: Transient attribute change | cib=0.4.208 source=abort_unless_down:345 path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-master-king_resource'] complete=false
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=8, Pending=1, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=9, Pending=0, Fired=3, Skipped=12, Incomplete=108, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=12, Pending=0, Fired=1, Skipped=14, Incomplete=107, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=13, Pending=1, Fired=1, Skipped=0, Incomplete=7, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         debug: Transition 10 (Complete=14, Pending=0, Fired=2, Skipped=14, Incomplete=104, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): In-progress
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (run_graph)         notice: Transition 10 (Complete=16, Pending=0, Fired=0, Skipped=15, Incomplete=104, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): Stopped
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (te_graph_trigger)  debug: Transition 10 is now complete
Jun 29 02:33:04 ctr_qemu pacemaker-controld  [1224] (notify_crmd)       debug: Transition 10 status: restart - Transient attribute change

As you can see, it eventually gives up on the transition and starts a new one. Once the failed king resource master has had time to come back online, it simply promotes it again on the same node and abandons the failover. I'm not sure whether the cluster transition actions listed by crm_simulate are in the order in which Pacemaker tries to carry out the operations, but if so the order is wrong. It should stop all servant resources on the failed king master node, then fail over the king resource, then migrate the servant resources to the new master node. Instead it seems to be trying to migrate all the servant resources first, with the king master failover scheduled near the bottom, which can't work because of the servants' colocation constraints with the king master.
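
For context, the dependency between each servant and the king master follows this general pattern (pcs syntax, names as in the status output below; shown only as an illustration of the pattern, not the exact configuration):

    # Each servant may only run where the king master runs, and only after promotion
    pcs constraint colocation add servant4 with master ms_king_resource INFINITY
    pcs constraint order promote ms_king_resource then start servant4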

Current cluster status:
Online: [ primary secondary ]

 stk_shared_ip  (ocf::heartbeat:IPaddr2):       Started secondary
 Clone Set: ms_king_resource [king_resource] (promotable)
     king_resource      (ocf::aviat:king-resource-ocf):    FAILED primary
     Slaves: [ secondary ]
 Clone Set: ms_servant1 [servant1]
     Started: [ primary secondary ]
 Clone Set: ms_servant2 [servant2] (promotable)
     Masters: [ primary ]
     Slaves: [ secondary ]
 Clone Set: ms_servant3 [servant3] (promotable)
     Masters: [ primary ]
     Slaves: [ secondary ]
 servant4        (lsb:servant4):  Started primary
 servant5  (lsb:servant5):    Started primary
 servant6      (lsb:servant6):        Started primary
 servant7      (lsb:servant7):      Started primary
 servant8      (lsb:servant8):        Started primary
 Resource Group: servant9_active_disabled
     servant9_resource1      (lsb:servant9_resource1):    Started primary
     servant9_resource2   (lsb:servant9_resource2): Started primary
 servant10 (lsb:servant10):   Started primary
 servant11 (lsb:servant11):      Started primary
 servant12    (lsb:servant12):      Started primary
 servant13        (lsb:servant13):  Started primary

Transition Summary:
 * Recover    king_resource:0     (             Slave primary )  
 * Promote    king_resource:1     ( Slave -> Master secondary )  
 * Demote     servant2:0          (   Master -> Slave primary )  
 * Promote    servant2:1          ( Slave -> Master secondary )  
 * Demote     servant3:0          (   Master -> Slave primary )  
 * Promote    servant3:1          ( Slave -> Master secondary )  
 * Move       servant4             (      primary -> secondary )  
 * Move       servant5               (      primary -> secondary )  
 * Move       servant6           (      primary -> secondary )  
 * Move       servant7           (      primary -> secondary )  
 * Move       servant8           (      primary -> secondary )  
 * Move       servant9_resource1               (      primary -> secondary )  
 * Move       servant9_resource2    (      primary -> secondary )  
 * Move       servant10              (      primary -> secondary )  
 * Move       servant11              (      primary -> secondary )  
 * Move       servant12                 (      primary -> secondary )  
 * Move       servant13             (      primary -> secondary )  

Executing cluster transition:
 * Pseudo action:   ms_king_resource_pre_notify_stop_0
 * Pseudo action:   ms_servant2_pre_notify_demote_0
 * Resource action: servant3        cancel=10000 on primary
 * Resource action: servant3        cancel=11000 on secondary
 * Pseudo action:   ms_servant3_pre_notify_demote_0
 * Resource action: servant4         stop on primary
 * Resource action: servant5           stop on primary
 * Resource action: servant6       stop on primary
 * Resource action: servant7       stop on primary
 * Resource action: servant8       stop on primary
 * Pseudo action:   servant9_active_disabled_stop_0
 * Resource action: servant9_resource2 stop on primary
 * Resource action: servant10          stop on primary
 * Resource action: servant11          stop on primary
 * Resource action: servant12             stop on primary
 * Resource action: servant13         stop on primary
 * Resource action: king_resource   notify on primary
 * Resource action: king_resource   notify on secondary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_stop_0
 * Pseudo action:   ms_king_resource_stop_0
 * Resource action: servant2        notify on primary
 * Resource action: servant2        notify on secondary
 * Pseudo action:   ms_servant2_confirmed-pre_notify_demote_0
 * Pseudo action:   ms_servant2_demote_0
 * Resource action: servant3        notify on primary
 * Resource action: servant3        notify on secondary
 * Pseudo action:   ms_servant3_confirmed-pre_notify_demote_0
 * Pseudo action:   ms_servant3_demote_0
 * Resource action: servant4         start on secondary
 * Resource action: servant5           start on secondary
 * Resource action: servant6       start on secondary
 * Resource action: servant7       start on secondary
 * Resource action: servant8       start on secondary
 * Resource action: servant9_resource1           stop on primary
 * Resource action: servant10          start on secondary
 * Resource action: servant11          start on secondary
 * Resource action: servant12             start on secondary
 * Resource action: servant13         start on secondary
 * Resource action: king_resource   stop on primary
 * Pseudo action:   ms_king_resource_stopped_0
 * Resource action: servant2        demote on primary
 * Pseudo action:   ms_servant2_demoted_0
 * Resource action: servant3        demote on primary
 * Pseudo action:   ms_servant3_demoted_0
 * Resource action: servant4         monitor=10000 on secondary
 * Resource action: servant5           monitor=10000 on secondary
 * Resource action: servant6       monitor=10000 on secondary
 * Resource action: servant7       monitor=10000 on secondary
 * Resource action: servant8       monitor=10000 on secondary
 * Pseudo action:   servant9_active_disabled_stopped_0
 * Pseudo action:   servant9_active_disabled_start_0
 * Resource action: servant9_resource1           start on secondary
 * Resource action: servant9_resource2 start on secondary
 * Resource action: servant10          monitor=10000 on secondary
 * Resource action: servant11          monitor=10000 on secondary
 * Resource action: servant12             monitor=10000 on secondary
 * Resource action: servant13         monitor=10000 on secondary
 * Pseudo action:   ms_king_resource_post_notify_stopped_0
 * Pseudo action:   ms_servant2_post_notify_demoted_0
 * Pseudo action:   ms_servant3_post_notify_demoted_0
 * Pseudo action:   servant9_active_disabled_running_0
 * Resource action: servant9_resource1           monitor=10000 on secondary
 * Resource action: servant9_resource2 monitor=10000 on secondary
 * Resource action: king_resource   notify on secondary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_stopped_0
 * Pseudo action:   ms_king_resource_pre_notify_start_0
 * Resource action: servant2        notify on primary
 * Resource action: servant2        notify on secondary
 * Pseudo action:   ms_servant2_confirmed-post_notify_demoted_0
 * Pseudo action:   ms_servant2_pre_notify_promote_0
 * Resource action: servant3        notify on primary
 * Resource action: servant3        notify on secondary
 * Pseudo action:   ms_servant3_confirmed-post_notify_demoted_0
 * Pseudo action:   ms_servant3_pre_notify_promote_0
 * Resource action: king_resource   notify on secondary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_start_0
 * Pseudo action:   ms_king_resource_start_0
 * Resource action: servant2        notify on primary
 * Resource action: servant2        notify on secondary
 * Pseudo action:   ms_servant2_confirmed-pre_notify_promote_0
 * Pseudo action:   ms_servant2_promote_0
 * Resource action: servant3        notify on primary
 * Resource action: servant3        notify on secondary
 * Pseudo action:   ms_servant3_confirmed-pre_notify_promote_0
 * Pseudo action:   ms_servant3_promote_0
 * Resource action: king_resource   start on primary
 * Pseudo action:   ms_king_resource_running_0
 * Resource action: servant2        promote on secondary
 * Pseudo action:   ms_servant2_promoted_0
 * Resource action: servant3        promote on secondary
 * Pseudo action:   ms_servant3_promoted_0
 * Pseudo action:   ms_king_resource_post_notify_running_0
 * Pseudo action:   ms_servant2_post_notify_promoted_0
 * Pseudo action:   ms_servant3_post_notify_promoted_0
 * Resource action: king_resource   notify on primary
 * Resource action: king_resource   notify on secondary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_running_0
 * Resource action: servant2        notify on primary
 * Resource action: servant2        notify on secondary
 * Pseudo action:   ms_servant2_confirmed-post_notify_promoted_0
 * Resource action: servant3        notify on primary
 * Resource action: servant3        notify on secondary
 * Pseudo action:   ms_servant3_confirmed-post_notify_promoted_0
 * Pseudo action:   ms_king_resource_pre_notify_promote_0
 * Resource action: servant2        monitor=11000 on primary
 * Resource action: servant2        monitor=10000 on secondary
 * Resource action: servant3        monitor=11000 on primary
 * Resource action: servant3        monitor=10000 on secondary
 * Resource action: king_resource   notify on primary
 * Resource action: king_resource   notify on secondary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_promote_0
 * Pseudo action:   ms_king_resource_promote_0
 * Resource action: king_resource   promote on secondary
 * Pseudo action:   ms_king_resource_promoted_0
 * Pseudo action:   ms_king_resource_post_notify_promoted_0
 * Resource action: king_resource   notify on primary
 * Resource action: king_resource   notify on secondary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_promoted_0
 * Resource action: king_resource   monitor=11000 on primary
 * Resource action: king_resource   monitor=10000 on secondary
Using the original execution date of: 2019-06-29 02:33:03Z

Revised cluster status:
Online: [ primary secondary ]

 stk_shared_ip  (ocf::heartbeat:IPaddr2):       Started secondary
 Clone Set: ms_king_resource [king_resource] (promotable)
     Masters: [ secondary ]
     Slaves: [ primary ]
 Clone Set: ms_servant1 [servant1]
     Started: [ primary secondary ]
 Clone Set: ms_servant2 [servant2] (promotable)
     Masters: [ secondary ]
     Slaves: [ primary ]
 Clone Set: ms_servant3 [servant3] (promotable)
     Masters: [ secondary ]
     Slaves: [ primary ]
 servant4        (lsb:servant4):  Started secondary
 servant5  (lsb:servant5):    Started secondary
 servant6      (lsb:servant6):        Started secondary
 servant7      (lsb:servant7):      Started secondary
 servant8      (lsb:servant8):        Started secondary
 Resource Group: servant9_active_disabled
     servant9_resource1      (lsb:servant9_resource1):    Started secondary
     servant9_resource2   (lsb:servant9_resource2): Started secondary
 servant10 (lsb:servant10):   Started secondary
 servant11 (lsb:servant11):      Started secondary
 servant12    (lsb:servant12):      Started secondary
 servant13        (lsb:servant13):  Started secondary


I don't think there is an issue with the CIB constraint configuration; otherwise the resources would not be able to start at boot. I'll keep digging and report back if I find a cause.
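
For what it's worth, a quick sanity check of the live configuration can be run with crm_verify, which reports any errors or warnings in the CIB (illustrative invocation):

    # Validate the running CIB and show warnings as well as errors
    crm_verify --live-check -VV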

Thanks again,
Harvey

________________________________________
From: Users <users-bounces at clusterlabs.org> on behalf of Ken Gaillot <kgaillot at redhat.com>
Sent: Saturday, 29 June 2019 3:10 a.m.
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: EXTERNAL: Re: [ClusterLabs] Problems with master/slave failovers

On Fri, 2019-06-28 at 07:36 +0000, Harvey Shepherd wrote:
> Thanks for your reply Andrei. Whilst I understand what you say about
> the difficulties of diagnosing issues without all of the info, it's a
> compromise between a mailing list posting being very verbose in which
> case nobody wants to read it, and containing enough relevant
> information for someone to be able to help. With 20+ resources
> involved during a failover there are literally thousands of logs
> generated, and it would be pointless to post them all.
>
> I've tried to focus in on the king resource only to keep things
> simple, as that is the only resource that can initiate a failover. I
> provided the real master scores and transition decisions made by
> pacemaker at the times that I killed the king master resource by
> showing the crm_simulator output from both tests, and the CIB config
> is as described. As I mentioned, migration-threshold is set to zero
> for all resources, so it shouldn't prevent a second failover.
>
> Regarding the resource agent return codes, the failure is detected by
> the 10s king resource master instance monitor operation, which
> returns OCF_ERR_GENERIC because the resource is expected to be
> running and isn't (the OCF resource agent developers guide states
> that monitor should only return OCF_NOT_RUNNING if there is no error
> condition that caused the resource to stop).
>
> What would be really helpful would be if you or someone else could
> help me decipher the crm_simulate output:

I've been working with Pacemaker for years and still look at those
scores only after exhausting all other investigation.

It isn't AI, but the complexity is somewhat similar in that it's not
really possible to boil down the factors that went into a decision in a
few human-readable sentences. We do have a project planned to provide
some insight in human-readable form.

But if you really want the headache:

> 1. What is the difference between clone_color and native_color?

native_color is the scores added for the resource as a primitive resource,
i.e. the resource being cloned. clone_color is the scores added for the
resource as a clone, i.e. the internal abstraction that allows a
primitive resource to run in multiple places. All it really means is
that different C functions added the scores, which is pretty useless
without staring at the source code of those functions.

> 2. What is the difference between "promotion scores" and "allocation
> scores" and why does the output show several instances of each?

Allocation is placement of particular resources (including individual
clone instances) to particular nodes; promotion is selecting an
instance to be master.

The multiple occurrences are due to multiple factors going into the
final score.
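
(The promotion scores are the values the resource agent sets via crm_master; in your case that would be roughly along these lines inside the agent, using the values you described - lifetime and exact values illustrative:)

    # on the current master, and on the current slave, respectively
    crm_master -l reboot -v 100
    crm_master -l reboot -v 5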

> 3. How does pacemaker use those scores to decide whether to failover?

It doesn't -- it uses them to determine where to failover. Whether to
failover is determined by fail-count and resource operation history
(and affected by configured policies such as on-fail, failure-timeout,
and migration-threshold).
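
Those can be inspected (and cleared) from the command line as well, e.g. with the names from your status output:

    # show the recorded fail count that feeds the failover decision
    crm_failcount --query --resource=king_resource --node=primary
    # clear recorded failures after investigating
    crm_resource --cleanup --resource=ms_king_resource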

> 4. Why is there a -INFINITY score on one node?

That sometimes requires trace-level debugging and following the path
through the source code. Which I don't recommend unless you're wanting
to make this a full-time gig :)

At this level of investigation, I usually start with giving
crm_simulate -VVVV, which will show up to info-level logs. If that
doesn't make it clear, add another -V for debug logs, and then another
-V for trace logs, but that stretches the bounds of human
intelligibility. Somewhat more helpful is PCMK_trace_tags=<resource-
name> before crm_simulate, which will give some trace-level output for
the given resource without swamping you with infinite detail. For
clones it's best to use PCMK_trace_tags=<resource-name>,<clone-name>
and sometimes even <resource-name>:0, etc.
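
For example, something along these lines against one of the pe-input files from your logs:

    PCMK_trace_tags=king_resource,ms_king_resource \
        crm_simulate --simulate --show-scores -VVVV \
        --xml-file=/var/lib/pacemaker/pengine/pe-input-10.bz2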

> Thanks again for your help.
>
>
>
> On 28 Jun 2019 6:46 pm, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
> > On Fri, Jun 28, 2019 at 7:24 AM Harvey Shepherd
> > <Harvey.Shepherd at aviatnet.com> wrote:
> > >
> > > Hi All,
> > >
> > >
> > > I'm running Pacemaker 2.0.2 on a two node cluster. It runs one
> > master/slave resource (I'll refer to it as the king resource) and
> > about 20 other resources which are a mixture of:
> > >
> > >
> > > - resources that only run on the king resource master node
> > (colocation constraint with a score of INFINITY)
> > >
> > > - clone resources that run on both nodes
> > >
> > > - two other master/slave resources where the masters run on the
> > same node as the king resource master (colocation constraint with a
> > score of INFINITY)
> > >
> > >
> > > I'll refer to the above set of resources as servant resources.
> > >
> > >
> > > All servant resources have a resource-stickiness of zero and the
> > king resource has a resource-stickiness of 100. There is an
> > ordering constraint that the king resource must start before all
> > servant resources. The king resource is controlled by an OCF script
> > that uses crm_master to set the preferred master for the king
> > resource (current master has value 100, current slave is 5,
> > unassigned role or resource failure is 1) - I've verified that
> > these values are being set as expected upon
> > promotion/demotion/failure etc, via the logs. That's pretty much
> > all of the configuration - there is no configuration around node
> > preferences and migration-threshold is zero for everything.
> > >
> > >
> > > What I'm trying to achieve is fairly simple:
> > >
> > >
> > > 1. If any servant resource fails on either node, it is simply
> > restarted. These resources should never failover onto the other
> > node because of colocation with the king resource, and they should
> > not contribute in any way to deciding whether the king resource
> > should failover (which is why they have a resource-stickiness of
> > zero).
> > >
> > > 2. If the slave instance of the king resource fails, it should
> > simply be restarted and again no failover should occur.
> > >
> > > 3. If the master instance of the king resource fails, then its
> > slave instance should immediately be promoted, and the failed
> > instance should be restarted. Failover of all servant resources
> > should then occur due to the colocation dependency.
> > >
> > >
> > > It's number 3 above that I'm having trouble with. If I kill the
> > master king resource instance it behaves as I expect - everything
> > fails over and the king resource is restarted on the new slave. If
> > I then kill the master instance of the king resource again however,
> > instead of failing back over to its original node, it restarts and
> > promotes back to master on the same node. This is not what I want.
> > >
> >
> > migration-threshold is the first thing that comes in mind. Another
> > possibility is hard error returned by resource agent that forces
> > resource off node.
> >
> > But please realize that without the actual configuration and logs at
> > the time the undesired behavior happens, it just becomes a game of
> > riddles.
> >
> > >
> > > The relevant output from crm_simulate for the two tests is shown
> > below. Can anyone suggest what might be going wrong? Whilst I
> > really like the concept of crm_simulate, I can't find a good
> > description of how to interpret the output and I don't understand
> > the difference between clone_color and native_color, or the
> > difference between "promotion scores" and the various instances of
> > "allocation scores", nor does it really tell me what is
> > contributing to the scores. Where does the -INFINITY allocation
> > score come from for example?
> > >
> > >
> > > Thanks,
> > >
> > > Harvey
> > >
> > >
> > >
> > > FIRST KING RESOURCE MASTER FAILURE (CORRECT BEHAVIOUR - MASTER
> > NODE FAILOVER OCCURS)
> > >
> > >
> > >  Clone Set: ms_king_resource [king_resource] (promotable)
> > >      king_resource      (ocf::aviat:king-resource-ocf):    FAILED
> > Master secondary
> > > clone_color: ms_king_resource allocation score on primary: 0
> > > clone_color: ms_king_resource allocation score on secondary: 0
> > > clone_color: king_resource:0 allocation score on primary: 0
> > > clone_color: king_resource:0 allocation score on secondary: 101
> > > clone_color: king_resource:1 allocation score on primary: 200
> > > clone_color: king_resource:1 allocation score on secondary: 0
> > > native_color: king_resource:1 allocation score on primary: 200
> > > native_color: king_resource:1 allocation score on secondary: 0
> > > native_color: king_resource:0 allocation score on primary:
> > -INFINITY
> > > native_color: king_resource:0 allocation score on secondary: 101
> > > king_resource:1 promotion score on primary: 100
> > > king_resource:0 promotion score on secondary: 1
> > >  * Recover    king_resource:0      ( Master -> Slave secondary )
> > >  * Promote    king_resource:1      (   Slave -> Master primary )
> > >  * Resource action: king_resource   cancel=10000 on secondary
> > >  * Resource action: king_resource   cancel=11000 on primary
> > >  * Pseudo action:   ms_king_resource_pre_notify_demote_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > pre_notify_demote_0
> > >  * Pseudo action:   ms_king_resource_demote_0
> > >  * Resource action: king_resource   demote on secondary
> > >  * Pseudo action:   ms_king_resource_demoted_0
> > >  * Pseudo action:   ms_king_resource_post_notify_demoted_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_demoted_0
> > >  * Pseudo action:   ms_king_resource_pre_notify_stop_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-pre_notify_stop_0
> > >  * Pseudo action:   ms_king_resource_stop_0
> > >  * Resource action: king_resource   stop on secondary
> > >  * Pseudo action:   ms_king_resource_stopped_0
> > >  * Pseudo action:   ms_king_resource_post_notify_stopped_0
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_stopped_0
> > >  * Pseudo action:   ms_king_resource_pre_notify_start_0
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-pre_notify_start_0
> > >  * Pseudo action:   ms_king_resource_start_0
> > >  * Resource action: king_resource   start on secondary
> > >  * Pseudo action:   ms_king_resource_running_0
> > >  * Pseudo action:   ms_king_resource_post_notify_running_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_running_0
> > >  * Pseudo action:   ms_king_resource_pre_notify_promote_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > pre_notify_promote_0
> > >  * Pseudo action:   ms_king_resource_promote_0
> > >  * Resource action: king_resource   promote on primary
> > >  * Pseudo action:   ms_king_resource_promoted_0
> > >  * Pseudo action:   ms_king_resource_post_notify_promoted_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_promoted_0
> > >  * Resource action: king_resource   monitor=11000 on secondary
> > >  * Resource action: king_resource   monitor=10000 on primary
> > >  Clone Set: ms_king_resource [king_resource] (promotable)
> > >
> > >
> > > SECOND KING RESOURCE MASTER FAILURE (INCORRECT BEHAVIOUR - SAME
> > NODE IS PROMOTED TO MASTER)
> > >
> > >
> > >  Clone Set: ms_king_resource [king_resource] (promotable)
> > >      king_resource      (ocf::aviat:king-resource-ocf):    FAILED
> > Master primary
> > > clone_color: ms_king_resource allocation score on primary: 0
> > > clone_color: ms_king_resource allocation score on secondary: 0
> > > clone_color: king_resource:0 allocation score on primary: 0
> > > clone_color: king_resource:0 allocation score on secondary: 200
> > > clone_color: king_resource:1 allocation score on primary: 101
> > > clone_color: king_resource:1 allocation score on secondary: 0
> > > native_color: king_resource:0 allocation score on primary: 0
> > > native_color: king_resource:0 allocation score on secondary: 200
> > > native_color: king_resource:1 allocation score on primary: 101
> > > native_color: king_resource:1 allocation score on secondary:
> > -INFINITY
> > > king_resource:1 promotion score on primary: 1
> > > king_resource:0 promotion score on secondary: 1
> > >  * Recover    king_resource:1     ( Master primary )
> > >  * Pseudo action:   ms_king_resource_pre_notify_demote_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > pre_notify_demote_0
> > >  * Pseudo action:   ms_king_resource_demote_0
> > >  * Resource action: king_resource   demote on primary
> > >  * Pseudo action:   ms_king_resource_demoted_0
> > >  * Pseudo action:   ms_king_resource_post_notify_demoted_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_demoted_0
> > >  * Pseudo action:   ms_king_resource_pre_notify_stop_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-pre_notify_stop_0
> > >  * Pseudo action:   ms_king_resource_stop_0
> > >  * Resource action: king_resource   stop on primary
> > >  * Pseudo action:   ms_king_resource_stopped_0
> > >  * Pseudo action:   ms_king_resource_post_notify_stopped_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_stopped_0
> > >  * Pseudo action:   ms_king_resource_pre_notify_start_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Pseudo action:   ms_king_resource_confirmed-pre_notify_start_0
> > >  * Pseudo action:   ms_king_resource_start_0
> > >  * Resource action: king_resource   start on primary
> > >  * Pseudo action:   ms_king_resource_running_0
> > >  * Pseudo action:   ms_king_resource_post_notify_running_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_running_0
> > >  * Pseudo action:   ms_king_resource_pre_notify_promote_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > pre_notify_promote_0
> > >  * Pseudo action:   ms_king_resource_promote_0
> > >  * Resource action: king_resource   promote on primary
> > >  * Pseudo action:   ms_king_resource_promoted_0
> > >  * Pseudo action:   ms_king_resource_post_notify_promoted_0
> > >  * Resource action: king_resource   notify on secondary
> > >  * Resource action: king_resource   notify on primary
> > >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_promoted_0
> > >  * Resource action: king_resource   monitor=10000 on primary
> > >  Clone Set: ms_king_resource [king_resource] (promotable)
--
Ken Gaillot <kgaillot at redhat.com>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
