[Pacemaker] pre_notify_demote is issued twice

Mon Mar 10 08:26:58 UTC 2014

Hi,

2014-02-24 10:49 GMT+09:00 Andrew Beekhof <andrew at beekhof.net>:
>
> On 21 Feb 2014, at 2:19 pm, Andrew Beekhof <andrew at beekhof.net> wrote:
>
>>
>> On 18 Feb 2014, at 1:23 pm, Andrew Beekhof <andrew at beekhof.net> wrote:
>>
>>>
>>> On 6 Feb 2014, at 7:45 pm, Keisuke MORI <keisuke.mori+ha at gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I observed that pre_notify_demote is issued twice when a master
>>>> resource is migrating.
>>>> I'm wondering if this is the correct behavior.
>>>>
>>>> Steps to reproduce:
>>>>
>>>> - Start up 2 nodes cluster configured for the PostgreSQL streaming
>>>> replication using pgsql RA as  a master/slave resource.
>>>> - kill the postgresql process on the master node to induce a fail-over.
>>>> - The fail-over succeeds as expected, but pre_notify_demote was
>>>> executed twice on each node before demoting on the master resource.
>>>>
>>>> 100% reproducible on my cluster.
>>>>
>>>> Pacemaker version: 1.1.11-rc4 (source build from the repo)
>>>> OS: RHEL6.4
>>>>
>>>> I have never seen this on Pacemaker-1.0.* cluster with the same configuration.
>>>>
>>>> The relevant logs and pe-inputs are attached.
>>>>
>>>>
>>>> Diagnostics:
>>>>
>>>> (1) The first transition caused by the process failure (pe-input-160)
>>>> initiates pre_notify_demote on both nodes and cancelling slave monitor
>>>> on the slave node.
>>>> {{{
>>>> 171 Jan 30 16:08:59 rhel64-1 crmd[8143]:   notice: te_rsc_command:
>>>> Initiating action 9: cancel prmPostgresql_cancel_10000 on rhel64-2
>>>> 172 Jan 30 16:08:59 rhel64-1 crmd[8143]:   notice: te_rsc_command:
>>>> Initiating action 79: notify prmPostgresql_pre_notify_demote_0 on
>>>> rhel64-1 (local)
>>>>
>>>> 175 Jan 30 16:08:59 rhel64-1 crmd[8143]:   notice: te_rsc_command:
>>>> Initiating action 81: notify prmPostgresql_pre_notify_demote_0 on
>>>> rhel64-2
>>>> }}}
>>>>
>>>> (2) When cancelling slave monitor completes, the transition is aborted
>>>> by Resource op removal.
>>>> {{{
>>>> 176 Jan 30 16:08:59 rhel64-1 crmd[8143]:     info: match_graph_event:
>>>> Action prmPostgresql_monitor_10000 (9) confirmed on rhel64-2 (rc=0)
>>>> 177 Jan 30 16:08:59 rhel64-1 cib[8138]:     info: cib_process_request:
>>>> Completed cib_delete operation for section status: OK (rc=0,
>>>> origin=rhel64-2/crmd/21, version=0.37.9)
>>>> 178 Jan 30 16:08:59 rhel64-1 crmd[8143]:     info:
>>>> abort_transition_graph: te_update_diff:258 - Triggered transition
>>>> abort (complete=0, node=rhel64-2, tag=lrm_rsc_op,
>>>> id=prmPostgresql_monitor_10000,
>>>> magic=0:0;26:12:0:acf9a2a3-307c-460b-b786-fc20e6b8aad5, cib=0.37.9) :
>>>> Resource op removal
>>>> }}}
>>>>
>>>> (3) The second transition is calculated by the abort (pe-input-161)
>>>> which results initiating pre_notify_demote again.
>>>
>>> If the demote didn't complete (or wasn't even attempted), then we must send the pre_notify_demote again unfortunately.
>>> The real bug may well be that the transition shouldn't have been aborted.
>>
>> It looks legitimate:
>>
>> Jan 30 16:08:59 rhel64-1 crmd[8143]:     info: abort_transition_graph: te_update_diff:258 - Triggered transition abort (complete=0, node=rhel64-2, tag=lrm_rsc_op, id=prmPostgresql_monitor_10000, magic=0:0;26:12:0:acf9a2a3-307c-460b-b786-fc20e6b8aad5, cib=0.37.9) : Resource op removal
>
> It looks like get_cancel_action() was not functioning correctly:
>
>    https://github.com/beekhof/pacemaker/commit/9d77c99

Thanks for looking into it.

I have confirmed that the issue is now resolved with the recent
revision on your repo. at:
  https://github.com/beekhof/pacemaker/commit/04ff1bd2d144e7defd6f1f67f6bde6fa95c428e1

Thanks!

-- 
Keisuke MORI

>
>>
>> Jan 30 16:08:59 rhel64-1 cib[8138]:     info: cib_process_request: Completed cib_delete operation for section status: OK (rc=0, origin=rhel64-2/crmd/21, version=0.37.9)
>>
>> It looks like part of the node status entry being removed for rhel64-2.
>> Possibly as a result of:
>>
>> Jan 30 16:07:54 rhel64-2 crmd[25070]:     info: erase_status_tag: Deleting xpath: //node_state[@uname='rhel64-2']/transient_attributes
>>
>> The new cib code, being much faster, might help here too :)
>>
>>>
>>>> {{{
>>>> 227 Jan 30 16:09:01 rhel64-1 pengine[8142]:   notice:
>>>> process_pe_message: Calculated Transition 15:
>>>> /var/lib/pacemaker/pengine/pe-input-161.bz2
>>>> 229 Jan 30 16:09:01 rhel64-1 crmd[8143]:   notice: te_rsc_command:
>>>> Initiating action 78: notify prmPostgresql_pre_notify_demote_0 on
>>>> rhel64-1 (local)
>>>> 232 Jan 30 16:09:01 rhel64-1 crmd[8143]:   notice: te_rsc_command:
>>>> Initiating action 80: notify prmPostgresql_pre_notify_demote_0 on
>>>> rhel64-2
>>>> }}}
>>>>
>>>> I think that the transition abort at (2) should not happen.
>>>>
>>>> Regards,
>>>> --
>>>> Keisuke MORI
>>>> <logs-pre-notify-20140206.tar.bz2>_______________________________________________
>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://bugs.clusterlabs.org
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>