[ClusterLabs] Antw: Re: Antw: [EXT] Re: What's a "transition", BTW?

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Jan 19 02:53:12 EST 2021


>>> Reid Wahl <nwahl at redhat.com> wrote on 19.01.2021 at 08:22 in message
<CAPiuu9-PjyBFy_0JyOgJj8Cp-4F65hnJrVzUFKSHVHUgXiPaNw at mail.gmail.com>:
> On Mon, Jan 18, 2021 at 11:18 PM Ulrich Windl <
> Ulrich.Windl at rz.uni-regensburg.de> wrote:
> 
>> >>> Ken Gaillot <kgaillot at redhat.com> wrote on 18.01.2021 at 19:29 in
>> message
>> <1047fd943be77f4a6fd4cd4dd19b65d1550512f8.camel at redhat.com>:
>> > On Fri, 2021-01-15 at 11:40 +0100, Ulrich Windl wrote:
>> >> Hi!
>> >>
>> >> With a cluster recheck interval, I see periodic log messages like
>> >> this:
>> >> Jan 15 11:05:50 h19 pacemaker-controld[4804]:  notice: State
>> >> transition S_TRANSITION_ENGINE -> S_IDLE
>> >> Jan 15 11:15:50 h19 pacemaker-controld[4804]:  notice: State
>> >> transition S_IDLE -> S_POLICY_ENGINE
>> >
>> > The "transition" terminology is a little confusing. Note that the above
>> > uses of it are just in the normal sense, i.e. the controller state
>> > changed.
>> >
>> > The controller uses a finite state machine to keep track of what it's
>> > doing now and next. Going from "transition engine" to "idle" means it
>> > finished whatever needed to be done in that transition (in the more
>> > technical Pacemaker sense). Going from "idle" to "policy engine" means
>> > it is ready to re-invoke the scheduler to re-check whether anything
>> > needs to be done.
>> >
>> >> Jan 15 11:15:50 h19 pacemaker-schedulerd[4803]:  notice: Watchdog
>> >> will be used via SBD if fencing is required and
>> >> stonith-watchdog-timeout is nonzero
>> >> Jan 15 11:15:50 h19 pacemaker-schedulerd[4803]:  notice: Calculated
>> >> transition 596, saving inputs in
>> >> /var/lib/pacemaker/pengine/pe-input-41.bz2
>> >> Jan 15 11:15:50 h19 pacemaker-controld[4804]:  notice: Processing
>> >> graph 596 (ref=pe_calc-dc-1610705750-978) derived from
>> >> /var/lib/pacemaker/pengine/pe-input-41.bz2
>> >> Jan 15 11:15:50 h19 pacemaker-controld[4804]:  notice: Transition 596
>> >> (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=0,
>> >> Source=/var/lib/pacemaker/pengine/pe-input-41.bz2): Complete
>> >>
>> >> The "transition" number increases each time, while there is no
>> >> visible action to be performed. So what's in such a "transition"?
>> >> Couldn't the cluster skip those lines if there's nothing to do?
>> >>
>> >> Regards,
>> >> Ulrich
>> >
>> > "Transition" as Pacemaker uses it in a technical sense is what you
>> > called in a different post an "action plan". A transition is all
>> > actions needed to bring the cluster to the desired state (as defined by
>> > the configuration), given everything known about the cluster at the
>> > moment (represented by the complete CIB including configuration and
>> > status).
>> >
>> > The controller starts a new transition whenever something interesting
>> > happens (like a resource monitor failure), when a transition action
>> > returns an unexpected result (like a start failing instead of
>> > succeeding), and periodically (according to cluster-recheck-interval).
>> >
>> > In any case, it's possible there's nothing to do, so the transition has
>> > no actions. It's still a record that the cluster checked whether
>> > anything needed to be done, and decided no. I have considered lowering
>> > the log message to info level in that case, though -- that probably
>> > makes sense.
>>
>> If it's something that is expected to happen frequently under normal
>> conditions, I also think "info" instead of "notice" would be OK, but
>> what about pe-input?
>> Is a new file required even if there's nothing to do?
> 
> 
> Nope. For example, nothing's been happening in my cluster. The transition
> number increments, but the pe-input file stays the same.

You are right; I didn't look carefully enough. Thanks for pointing that out!
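
For anyone who wants to repeat the check on their own logs, here is a minimal Python sketch of the comparison Reid does with grep in the excerpt quoted below. The function names are made up for illustration, and the regular expression assumes the exact "Calculated transition N, saving inputs in FILE" wording shown in this thread; other Pacemaker versions may phrase the message differently.

```python
import re

# Assumption: the scheduler message is worded exactly as in this thread.
CALC_RE = re.compile(
    r"Calculated transition (?P<num>\d+), saving inputs in (?P<path>\S+)"
)


def calculated_transitions(log_text):
    """Return (transition number, input file) pairs in log order."""
    # Long log lines may be wrapped (as in mail archives), so collapse
    # all runs of whitespace into single spaces before matching.
    flat = " ".join(log_text.split())
    return [(int(m["num"]), m["path"]) for m in CALC_RE.finditer(flat)]


def transitions_by_input(pairs):
    """Map each input file to the transition numbers that reported it."""
    by_file = {}
    for num, path in pairs:
        by_file.setdefault(path, []).append(num)
    return by_file


# Two entries copied from the log excerpt in this thread.
SAMPLE = """\
Jan 18 22:12:13 fastvm-rhel-8-0-23 pacemaker-schedulerd[7699]
(pcmk__log_transition_summary at pcmk_sched_allocate.c:2897) notice:
Calculated transition 1003, saving inputs in
/var/lib/pacemaker/pengine/pe-input-376.bz2
Jan 18 22:27:13 fastvm-rhel-8-0-23 pacemaker-schedulerd[7699]
(pcmk__log_transition_summary at pcmk_sched_allocate.c:2897) notice:
Calculated transition 1004, saving inputs in
/var/lib/pacemaker/pengine/pe-input-376.bz2
"""

if __name__ == "__main__":
    for path, nums in transitions_by_input(calculated_transitions(SAMPLE)).items():
        print(path, nums)
```

An input file that maps to more than one transition number is the case discussed here: the transition counter advanced, but no new pe-input file was reported.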

> 
> # grep 'Calculated transition' /var/log/pacemaker/pacemaker.log | tail -n 5
> Jan 18 22:12:13 fastvm-rhel-8-0-23 pacemaker-schedulerd[7699]
> (pcmk__log_transition_summary at pcmk_sched_allocate.c:2897) notice:
> Calculated transition 1003, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-376.bz2
> Jan 18 22:27:13 fastvm-rhel-8-0-23 pacemaker-schedulerd[7699]
> (pcmk__log_transition_summary at pcmk_sched_allocate.c:2897) notice:
> Calculated transition 1004, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-376.bz2
> Jan 18 22:42:13 fastvm-rhel-8-0-23 pacemaker-schedulerd[7699]
> (pcmk__log_transition_summary at pcmk_sched_allocate.c:2897) notice:
> Calculated transition 1005, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-376.bz2
> Jan 18 22:57:13 fastvm-rhel-8-0-23 pacemaker-schedulerd[7699]
> (pcmk__log_transition_summary at pcmk_sched_allocate.c:2897) notice:
> Calculated transition 1006, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-376.bz2
> Jan 18 23:12:13 fastvm-rhel-8-0-23 pacemaker-schedulerd[7699]
> (pcmk__log_transition_summary at pcmk_sched_allocate.c:2897) notice:
> Calculated transition 1007, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-376.bz2
> 
> 
>> I could imagine reusing
>> the last number if the last transition had no actions other than
>> monitor/probe.
>> Of course that would not work if inputs are interleaved (the next begins
>> before the last one has finished).
>>
>> Regards,
>> Ulrich
>>
>>
>> > --
>> > Ken Gaillot <kgaillot at redhat.com>
>> >
>> > _______________________________________________
>> > Manage your subscription:
>> > https://lists.clusterlabs.org/mailman/listinfo/users 
>> >
>> > ClusterLabs home: https://www.clusterlabs.org/ 
>>
>>
> 
> 
> -- 
> Regards,
> 
> Reid Wahl, RHCA
> Senior Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA




