[ClusterLabs] Antw: [EXT] Re: crm resource stop VirtualDomain ‑ but VirtualDomain shutdown start some minutes later
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Wed Feb 16 07:01:36 EST 2022
>>> "Lentes, Bernd" <bernd.lentes at helmholtz-muenchen.de> wrote on 16.02.2022 at
12:35 in message
<879647182.178210820.1645011316841.JavaMail.zimbra at helmholtz-muenchen.de>:
>
> ----- On Feb 16, 2022, at 12:52 AM, kgaillot kgaillot at redhat.com wrote:
>
>
>>> Any idea ?
>>> What is about that transition 128, which is aborted ?
>>
>> A transition is the set of actions that need to be taken in response to
>> current conditions. A transition is aborted any time conditions change
>> (here, the target-role being changed in the configuration), so that a
>> new set of actions can be calculated.
>>
>> Someone once defined a transition as an "action plan", and I'm tempted
>> to use that instead. Plus maybe replace "aborted" with "interrupted",
>> so then we'd have "Action plan interrupted" which is maybe a little
>> more understandable.
>>
>>>
>>> Transition 128 is finished:
>>> Feb 15 21:04:26 [15370] ha-idg-2 crmd: notice:
>>> run_graph: Transition 128 (Complete=1, Pending=0, Fired=0,
>>> Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-
>>> 3548.bz2): Complete
>>>
>>> And one second later the shutdown starts. Is that normal that there
>>> is such a big time gap ?
>>>
>>
>> No, there should be another transition calculated (with a "saving
>> input" message) immediately after the original transition is aborted.
>> What's the timestamp on that?
>> --
>
> Hi Ken,
>
> this is what i found:
>
> Feb 15 20:54:30 [15369] ha-idg-2 pengine: notice: process_pe_message:
> Calculated transition 128, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-3548.bz2
> Feb 15 20:54:30 [15370] ha-idg-2 crmd: info: do_state_transition:
> State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE |
> input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
> Feb 15 20:54:30 [15370] ha-idg-2 crmd: notice: do_te_invoke:
> Processing graph 128 (ref=pe_calc-dc-1644954870-403) derived from
> /var/lib/pacemaker/pengine/pe-input-3548.bz2
> Feb 15 20:54:30 [15370] ha-idg-2 crmd: notice: te_rsc_command:
> Initiating stop operation vm_pathway_stop_0 locally on ha-idg-2 | action 76
>
> Feb 15 21:04:26 [15369] ha-idg-2 pengine: notice: process_pe_message:
> Calculated transition 129, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-3549.bz2
> Feb 15 21:04:26 [15370] ha-idg-2 crmd: info: do_state_transition:
> State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE |
> input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
> Feb 15 21:04:26 [15370] ha-idg-2 crmd: notice: do_te_invoke:
> Processing graph 129 (ref=pe_calc-dc-1644955466-405) derived from
> /var/lib/pacemaker/pengine/pe-input-3549.bz2
Bernd,
I guess the syslog/journal of the DC has better logs.
As I see it now, the stop of vm_pathway seems to take a few minutes, and no other action is started before that is done.
I think I once said "Clusters are not for the impatient", i.e.: don't start a new action while the previous action has not completed yet.
Maybe more recent versions of pacemaker can "preempt" action plans (transitions), but I don't know...
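To put a number on that gap, here is a minimal sketch that reads the two "Calculated transition" lines quoted above and prints the time between them. The log lines are joined onto one line each (the mail wrapped them), the year 2022 is assumed from the date of this message because the log timestamps carry none, and the name transition_times is just a hypothetical helper, not anything from pacemaker.

```python
import re
from datetime import datetime

# The two pengine lines from the quoted log, unwrapped to one line each.
LOG = """\
Feb 15 20:54:30 [15369] ha-idg-2 pengine: notice: process_pe_message: Calculated transition 128, saving inputs in /var/lib/pacemaker/pengine/pe-input-3548.bz2
Feb 15 21:04:26 [15369] ha-idg-2 pengine: notice: process_pe_message: Calculated transition 129, saving inputs in /var/lib/pacemaker/pengine/pe-input-3549.bz2
"""

# Capture the syslog-style timestamp and the transition number.
PATTERN = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) .*Calculated transition (\d+),", re.M
)


def transition_times(text, year=2022):
    """Map transition number -> time it was calculated.

    The year is an assumption; the log lines do not contain one.
    Month names are parsed with %b, so an English/C locale is assumed.
    """
    times = {}
    for stamp, number in PATTERN.findall(text):
        times[int(number)] = datetime.strptime(
            f"{year} {stamp}", "%Y %b %d %H:%M:%S"
        )
    return times


times = transition_times(LOG)
gap = (times[129] - times[128]).total_seconds()
print(f"gap between transitions 128 and 129: {gap:.0f} s")
```

For the lines above this gives 596 seconds, which matches the difference between the two pe_calc references (1644954870 and 1644955466), so it is closer to ten minutes.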
Regards,
Ulrich
>
> Bernd