[ClusterLabs] Pacemaker resource parameter reload confusion
Ferenc Wágner
wferi at niif.hu
Tue Oct 31 04:33:07 EDT 2017
Ken Gaillot <kgaillot at redhat.com> writes:
> On Fri, 2017-10-20 at 15:52 +0200, Ferenc Wágner wrote:
>
>> Ken Gaillot <kgaillot at redhat.com> writes:
>>
>>> On Fri, 2017-09-22 at 18:30 +0200, Ferenc Wágner wrote:
>>>
>>>> Ken Gaillot <kgaillot at redhat.com> writes:
>>>>
>>>>> Hmm, stop+reload is definitely a bug. Can you attach (or email it
>>>>> to me privately, or file a bz with it attached) the above pe-input
>>>>> file with any sensitive info removed?
>>>>
>>>> I sent you the pe-input file privately. It indeed shows the
>>>> issue:
>>>>
>>>> $ /usr/sbin/crm_simulate -x pe-input-1033.bz2 -RS
>>>> [...]
>>>> Executing cluster transition:
>>>> * Resource action: vm-alder stop on vhbl05
>>>> * Resource action: vm-alder reload on vhbl05
>>>> [...]
>>>>
>>>> Hope you can easily get to the bottom of this.
>>>
>>> This turned out to have the same underlying cause as CLBZ#5309. I
>>> have a fix pending review, which I expect to make it into the
>>> soon-to-be-released 1.1.18.
>>
>> Great!
>>
>>> It is a regression introduced in 1.1.15 by commit 2558d76f. The
>>> logic for reloads was consolidated in one place, but that happened
>>> to be before restarts were scheduled, so it no longer had the right
>>> information about whether a restart was needed. Now, it sets an
>>> ordering flag that is used later to cancel the reload if the restart
>>> becomes required. I've also added a regression test for it.
>>
>> Restarts shouldn't even enter the picture here, so I don't get your
>> explanation. But I also don't know the code, so that doesn't mean a
>> thing. I'll test the next RC to be sure.
>
> :-)
>
> Reloads are done in place of restarts, when circumstances allow. So
> reloads are always related to (potential) restarts.
>
> The problem arose because not all of the relevant circumstances are
> known at the time the reload action is created. We may figure out later
> that a resource the reloading resource depends on must be restarted,
> therefore the reloading resource must be fully restarted instead of
> reloaded. E.g. a database resource might otherwise be able to reload,
> but not if the filesystem it's using is going away.
>
> Previously in those cases, we would end up scheduling both the reload
> and the restart. Now, we schedule only the restart.
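
To make the quoted explanation concrete, here is a toy sketch of the ordering problem in Python. It is not Pacemaker's scheduler code: the Resource class, its fields and the plan() function are all made up for illustration. The only thing it tries to show is that a reload chosen early has to be replaced, not accompanied, by a restart once a dependency turns out to need one.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    # Hypothetical model, not a Pacemaker data structure.
    name: str
    params_changed: bool = False      # some parameter changed
    reloadable_change: bool = False   # only non-unique parameters changed
    depends_on: list = field(default_factory=list)  # names of dependencies


def plan(resources):
    """Return {resource name: action} for a toy cluster transition."""
    actions = {}
    # Pass 1: decide from each resource's own parameter change only.
    for r in resources:
        if r.params_changed:
            actions[r.name] = "reload" if r.reloadable_change else "restart"
    # Pass 2: a restarting dependency forces a full restart of the
    # dependent resource, replacing any reload chosen in pass 1.
    changed = True
    while changed:
        changed = False
        for r in resources:
            needs_restart = any(actions.get(d) == "restart" for d in r.depends_on)
            if needs_restart and actions.get(r.name) != "restart":
                actions[r.name] = "restart"
                changed = True
    return actions


if __name__ == "__main__":
    fs = Resource("fs", params_changed=True)
    db = Resource("db", params_changed=True, reloadable_change=True,
                  depends_on=["fs"])
    print(plan([fs, db]))
```

In this model a database that could reload on its own still ends up with a single restart when its filesystem restarts, and never with both actions at once.
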
Hi Ken,
1.1.18-rc3 indeed schedules a restart, not a reload as 1.1.16 did.
However, this wasn't my problem: I really expect a reload on the change
of a non-unique parameter. The problem was that 1.1.16 also executed a
stop action in parallel with the reload.
Maybe I tested it wrong: I just copied the pe-input file to another system
(which doesn't even know this resource agent) running 1.1.18-rc3 and
gave it to crm_simulate. Does the pe-input file contain all the
information necessary to decide between restart and reload? The
op-force-restart attribute does not contain the name of the changed
parameter, but I can't find any info on what changed at all. Should I
see a clean reload in this test setup at all?
--
Thanks,
Feri
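
For anyone wanting to check what a pe-input file records about an operation, a minimal Python sketch follows. It assumes the file is bzip2-compressed CIB XML (as the .bz2 name suggests) and that operation history entries are lrm_rsc_op elements carrying op-digest, op-restart-digest and op-force-restart attributes. Those element and attribute names are my assumption about the format, and list_op_digests() is a made-up helper, not a Pacemaker tool.

```python
import bz2
import xml.etree.ElementTree as ET

# Attribute names assumed to be present on operation history entries.
WANTED = ("op-digest", "op-restart-digest", "op-force-restart")


def list_op_digests(path):
    """Return [(op id, {attribute: value})] for each recorded operation."""
    # Decompress on the fly and parse the XML document.
    with bz2.open(path, "rb") as fh:
        root = ET.parse(fh).getroot()
    rows = []
    for op in root.iter("lrm_rsc_op"):
        # Keep only the attributes that are actually set on this entry.
        attrs = {k: op.get(k) for k in WANTED if op.get(k) is not None}
        rows.append((op.get("id"), attrs))
    return rows


if __name__ == "__main__":
    import sys
    for op_id, attrs in list_op_digests(sys.argv[1]):
        print(op_id, attrs)
```

Run it with the path to the pe-input file as the only argument. It only lists what is stored and does not decide between reload and restart.
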