[ClusterLabs] Regarding Finalization Timer (I_ELECTION) just popped (1800000ms)

Prasad Nagaraj prasad.nagaraj76 at gmail.com
Sat May 18 08:37:35 EDT 2019


Thank you for the response Ken. Will watch for this to be reproduced again.

On Thu, May 16, 2019 at 4:10 AM Ken Gaillot <kgaillot at redhat.com> wrote:

> Resurrecting a really old thread in case anyone had similar questions.
> This arrived at a crazy busy time and got neglected.
>
> The "finalization timer" is a timeout in the DC election process. The
> value is the join-finalization-timeout cluster property (formerly crmd-
> finalization-timeout), which defaults to 30 minutes.
>
> The whole election process is undocumented and quite arcane. It would
> be nice to document it but that's a bigger project than there is time
> for at the moment.
>
> The controller (crmd) is implemented as a finite state machine, meaning
> various inputs move it from one state to another according to fixed
> rules. The finalization timer is started when the "finalize join" state
> is reached, and stopped whenever that state is left. This state is
> achieved once the (possibly newly elected) DC has received join
> requests from all nodes, and is left once the DC has sync'd the CIB to
> those nodes, ack'd their join requests, and received confirmations back
> from them.
>
> Obviously it's too late to look at this particular incident, but if it
> can be reliably reproduced, I can take a new look.
>
> On Sun, 2018-10-28 at 23:32 +0530, Prasad Nagaraj wrote:
> > Hi :
> >
> > I came across  a strange situation in my cluster few days back. I was
> > trying to replace one of the nodes in my 3 node cluster and I removed
> > and added that node back following due process as per
> >
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_removing_a_corosync_node.html
> > . Updated corosync.conf on all nodes and restarted corosync and
> > pacemaker on all the nodes. The cluster didnt elect a DC and also
> > didnt report any activities for almost 28 mins as seen from below
> > logs. Then I saw this message:
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_timer_popped:    Finalization Timer (I_ELECTION) just popped
> > (1800000ms)
> > after which I could see further activities happening including DC
> > election.
> >
> > I was not able to understand or identify any reasons for this
> > behavior and also there is absolutely no documentation on this
> > Finalization timer and what it means. Appreciate any help in terms of
> > explaining what exactly this timer means and what could be reasons
> > for this behavior. I have pasted a snippet of logs during the time
> > here. I do have more logs and also logs from other nodes that I can
> > share if required.
> >
> > Thanks in advance for the help!
> > Prasad
> >
> >
> > Oct 22 22:07:46 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_modify operation for section
> > nodes: OK (rc=0, origin=vm46890219c5/crm_attribute/4,
> > version=0.100.0)
> > Oct 22 22:07:46 [76412] vm85c4465533        cib:     info:
> > cib_file_write_with_digest:  Reading cluster configuration file
> > /var/lib/pacemaker/cib/cib.QC3Kay (digest:
> > /var/lib/pacemaker/cib/cib.c74mjr)
> > Oct 22 22:07:47 [76412] vm85c4465533        cib:     info:
> > cib_file_backup:     Archived previous version as
> > /var/lib/pacemaker/cib/cib-96.raw
> > Oct 22 22:07:47 [76412] vm85c4465533        cib:     info:
> > cib_file_write_with_digest:  Wrote version 0.100.0 of the CIB to disk
> > (digest: 3ab4bffdc9c372985cfe50ad3100131d)
> > Oct 22 22:07:47 [76412] vm85c4465533        cib:     info:
> > cib_file_write_with_digest:  Reading cluster configuration file
> > /var/lib/pacemaker/cib/cib.7NqOEy (digest:
> > /var/lib/pacemaker/cib/cib.gamHhs)
> > Oct 22 22:07:48 [76412] vm85c4465533        cib:     info:
> > cib_perform_op:      Diff: --- 0.100.0 2
> > Oct 22 22:07:48 [76412] vm85c4465533        cib:     info:
> > cib_perform_op:      Diff: +++ 0.101.0 (null)
> > Oct 22 22:07:48 [76412] vm85c4465533        cib:     info:
> > cib_perform_op:      +  /cib:  @epoch=101
> > Oct 22 22:07:48 [76412] vm85c4465533        cib:     info:
> > cib_perform_op:      ++ /cib/configuration/constraints:
> > <rsc_location id="ms_mysql_member453" rsc="ms_mysql" score="0"
> > node="vm46890219c5"/>
> > Oct 22 22:07:48 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_apply_diff operation for section
> > 'all': OK (rc=0, origin=local/cibadmin/2, version=0.101.0)
> > Oct 22 22:07:48 [76412] vm85c4465533        cib:     info:
> > cib_file_backup:     Archived previous version as
> > /var/lib/pacemaker/cib/cib-97.raw
> > Oct 22 22:07:48 [76412] vm85c4465533        cib:     info:
> > cib_file_write_with_digest:  Wrote version 0.101.0 of the CIB to disk
> > (digest: 7ffa91a8ca752581d4c1df13d287e467)
> > Oct 22 22:07:48 [76412] vm85c4465533        cib:     info:
> > cib_file_write_with_digest:  Reading cluster configuration file
> > /var/lib/pacemaker/cib/cib.JkMV2C (digest:
> > /var/lib/pacemaker/cib/cib.mLy23A)
> > Oct 22 22:07:49 [76412] vm85c4465533        cib:     info:
> > cib_perform_op:      Diff: --- 0.101.0 2
> > Oct 22 22:07:49 [76412] vm85c4465533        cib:     info:
> > cib_perform_op:      Diff: +++ 0.102.0 (null)
> > Oct 22 22:07:49 [76412] vm85c4465533        cib:     info:
> > cib_perform_op:      +  /cib:  @epoch=102
> > Oct 22 22:07:49 [76412] vm85c4465533        cib:     info:
> > cib_perform_op:      +
> > /cib/configuration/resources/master[@id='ms_mysql']/meta_attributes[@
> > id='ms_mysql-meta_attributes']/nvpair[@id='ms_mysql-meta_attributes-
> > maintenance']:  @value=false
> > Oct 22 22:07:49 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_modify operation for section
> > resources: OK (rc=0, origin=local/crm_resource/6, version=0.102.0)
> > Oct 22 22:07:49 [76412] vm85c4465533        cib:     info:
> > cib_file_backup:     Archived previous version as
> > /var/lib/pacemaker/cib/cib-98.raw
> > Oct 22 22:07:49 [76412] vm85c4465533        cib:     info:
> > cib_file_write_with_digest:  Wrote version 0.102.0 of the CIB to disk
> > (digest: dd6263dc226d652721b09ec702e37742)
> > Oct 22 22:07:49 [76412] vm85c4465533        cib:     info:
> > cib_file_write_with_digest:  Reading cluster configuration file
> > /var/lib/pacemaker/cib/cib.DOunDE (digest:
> > /var/lib/pacemaker/cib/cib.MYBrjE)
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_timer_popped:    Finalization Timer (I_ELECTION) just popped
> > (1800000ms)
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > do_state_transition: State transition S_FINALIZE_JOIN -> S_ELECTION [
> > input=I_ELECTION cause=C_TIMER_POPPED origin=crm_timer_popped ]
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > update_dc:   Unset DC. Was vm85c4465533
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > election_complete:   Election election-0 complete
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > election_timeout_popped:     Election failed: Declaring ourselves the
> > winner
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info: do_log:
> >   FSA: Input I_ELECTION_DC from election_timeout_popped() received in
> > state S_ELECTION
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:   notice:
> > do_state_transition: State transition S_ELECTION -> S_INTEGRATION [
> > input=I_ELECTION_DC cause=C_TIMER_POPPED
> > origin=election_timeout_popped ]
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > do_dc_takeover:      Taking over DC status for this partition
> > Oct 22 22:35:14 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_master operation for section
> > 'all': OK (rc=0, origin=local/crmd/86, version=0.102.0)
> > Oct 22 22:35:14 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_modify operation for section cib:
> > OK (rc=0, origin=local/crmd/87, version=0.102.0)
> > Oct 22 22:35:14 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_modify operation for section
> > crm_config: OK (rc=0, origin=local/crmd/89, version=0.102.0)
> > Oct 22 22:35:14 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_modify operation for section
> > crm_config: OK (rc=0, origin=local/crmd/91, version=0.102.0)
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        initialize_join: Node
> > vm46890219c5[218109612] - join-2 phase 4 -> 0
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        initialize_join: Node
> > vm85c4465533[83891884] - join-2 phase 4 -> 0
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        initialize_join: Node
> > vm2ad44008dc[251664044] - join-2 phase 3 -> 0
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > join_make_offer:     join-2: Sending offer to vm46890219c5
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        join_make_offer: Node
> > vm46890219c5[218109612] - join-2 phase 0 -> 1
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > join_make_offer:     join-2: Sending offer to vm85c4465533
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        join_make_offer: Node
> > vm85c4465533[83891884] - join-2 phase 0 -> 1
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > join_make_offer:     join-2: Sending offer to vm2ad44008dc
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        join_make_offer: Node
> > vm2ad44008dc[251664044] - join-2 phase 0 -> 1
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > do_dc_join_offer_all:        join-2: Waiting on 3 outstanding join
> > acks
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:  warning: do_log:
> >   FSA: Input I_ELECTION_DC from do_election_check() received in state
> > S_INTEGRATION
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        initialize_join: Node
> > vm46890219c5[218109612] - join-3 phase 1 -> 0
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        initialize_join: Node
> > vm85c4465533[83891884] - join-3 phase 1 -> 0
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        initialize_join: Node
> > vm2ad44008dc[251664044] - join-3 phase 1 -> 0
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > join_make_offer:     join-3: Sending offer to vm46890219c5
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        join_make_offer: Node
> > vm46890219c5[218109612] - join-3 phase 0 -> 1
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > join_make_offer:     join-3: Sending offer to vm85c4465533
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        join_make_offer: Node
> > vm85c4465533[83891884] - join-3 phase 0 -> 1
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > join_make_offer:     join-3: Sending offer to vm2ad44008dc
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crm_update_peer_join:        join_make_offer: Node
> > vm2ad44008dc[251664044] - join-3 phase 0 -> 1
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > do_dc_join_offer_all:        join-3: Waiting on 3 outstanding join
> > acks
> > Oct 22 22:35:14 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_modify operation for section
> > crm_config: OK (rc=0, origin=local/crmd/93, version=0.102.0)
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > plugin_handle_membership:    Membership 52: quorum retained
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crmd_cs_dispatch:    Setting expected votes to 3
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > plugin_handle_membership:    Membership 52: quorum retained
> > Oct 22 22:35:14 [76412] vm85c4465533        cib:     info:
> > cib_process_request: Completed cib_modify operation for section
> > crm_config: OK (rc=0, origin=local/crmd/96, version=0.102.0)
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > crmd_cs_dispatch:    Setting expected votes to 3
> > Oct 22 22:35:14 [76417] vm85c4465533       crmd:     info:
> > update_dc:   Set DC to vm85c4465533 (3.0.10)
> --
> Ken Gaillot <kgaillot at redhat.com>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20190518/cb69a399/attachment-0001.html>


More information about the Users mailing list