[ClusterLabs] Antw: [EXT] Re: What's a "transition", BTW?
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Tue Jan 19 02:17:46 EST 2021
>>> Ken Gaillot <kgaillot at redhat.com> wrote on 18.01.2021 at 19:29 in
message
<1047fd943be77f4a6fd4cd4dd19b65d1550512f8.camel at redhat.com>:
> On Fri, 2021-01-15 at 11:40 +0100, Ulrich Windl wrote:
>> Hi!
>>
>> With a cluster recheck interval, I see periodic log messages like
>> this:
>> Jan 15 11:05:50 h19 pacemaker-controld[4804]: notice: State
>> transition S_TRANSITION_ENGINE -> S_IDLE
>> Jan 15 11:15:50 h19 pacemaker-controld[4804]: notice: State
>> transition S_IDLE -> S_POLICY_ENGINE
>
> The "transition" terminology is a little confusing. Note that the above
> uses of it are just in the normal sense, i.e. the controller state
> changed.
>
> The controller uses a finite state machine to keep track of what it's
> doing now and next. Going from "transition engine" to "idle" means it
> finished whatever needed to be done in that transition (in the more
> technical Pacemaker sense). Going from "idle" to "policy engine" means
> it is ready to re-invoke the scheduler to re-check whether anything
> needs to be done.
>
>> Jan 15 11:15:50 h19 pacemaker-schedulerd[4803]: notice: Watchdog
>> will be used via SBD if fencing is required and stonith-watchdog-
>> timeout is nonzero
>> Jan 15 11:15:50 h19 pacemaker-schedulerd[4803]: notice: Calculated
>> transition 596, saving inputs in /var/lib/pacemaker/pengine/pe-input-
>> 41.bz2
>> Jan 15 11:15:50 h19 pacemaker-controld[4804]: notice: Processing
>> graph 596 (ref=pe_calc-dc-1610705750-978) derived from
>> /var/lib/pacemaker/pengine/pe-input-41.bz2
>> Jan 15 11:15:50 h19 pacemaker-controld[4804]: notice: Transition 596
>> (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=0,
>> Source=/var/lib/pacemaker/pengine/pe-input-41.bz2): Complete
>>
>> The "transition" number increases each time, while there is visibly
>> no action to be performed. So what's in such a "transition"? Couldn't
>> the cluster skip those lines if there's nothing to do?
>>
>> Regards,
>> Ulrich
>
> "Transition" as Pacemaker uses it in a technical sense is what you
> called in a different post an "action plan". A transition is all
> actions needed to bring the cluster to the desired state (as defined by
> the configuration), given everything known about the cluster at the
> moment (represented by the complete CIB including configuration and
> status).
>
> The controller starts a new transition whenever something interesting
> happens (like a resource monitor failure), when a transition action
> returns an unexpected result (like a start failing instead of
> succeeding), and periodically (according to cluster-recheck-interval).
>
> In any case, it's possible there's nothing to do, so the transition has
> no actions. It's still a record that the cluster checked whether
> anything needed to be done, and decided no. I have considered lowering
> the log message to info level in that case, though -- that probably
> makes sense.
If it's something that is expected to happen frequently under normal
conditions, I agree that "info" instead of "notice" would be fine. But
what about pe-input?
Is a new file required even if there's nothing to do? I could imagine reusing
the last number if the last transition had no actions other than
monitor/probe.
Of course that would not work if inputs are interleaved (the next begins
before the last one has finished).
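To see how often such transitions occur and which pe-input file each one
refers to, the summary lines can be pulled out of the log. Below is a minimal
Python sketch, assuming the message format matches the excerpt quoted above
(the format may differ between Pacemaker versions, and the name
parse_transition_summary is made up for this illustration). It only extracts
the fields; it does not interpret what the counters mean.

```python
import re

# Pattern for the controller's "Transition N (...): Result" summary line,
# modelled on the log excerpt in this thread (an assumption, not a spec).
SUMMARY_RE = re.compile(
    r"Transition (?P<num>\d+) \((?P<fields>[^)]*)\): (?P<result>\w+)"
)


def parse_transition_summary(line):
    """Return the fields of a transition summary line as a dict, or None."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    info = {"transition": int(m.group("num")), "result": m.group("result")}
    for item in m.group("fields").split(", "):
        key, _, value = item.partition("=")
        # Counters become integers; anything else (e.g. Source) stays a string.
        info[key] = int(value) if value.isdigit() else value
    return info


if __name__ == "__main__":
    sample = (
        "Jan 15 11:15:50 h19 pacemaker-controld[4804]: notice: "
        "Transition 596 (Complete=3, Pending=0, Fired=0, Skipped=0, "
        "Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-41.bz2): "
        "Complete"
    )
    print(parse_transition_summary(sample))
```

Running this over a day's log would show how many pe-input files come from
periodic rechecks alone.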
Regards,
Ulrich
> --
> Ken Gaillot <kgaillot at redhat.com>