[ClusterLabs] Q (SLES11 SP4): Delay after node came up (info: throttle_send_command: New throttle mode: 0000 (was ffffffff))

Ken Gaillot kgaillot at redhat.com
Thu Feb 9 17:16:47 EST 2017


On 01/16/2017 04:25 AM, Ulrich Windl wrote:
> Hi!
> 
> I have a question: The following happened in our 3-node cluster (n1, n2, n3):
> n3 was DC, n2 was offlined, n2 came online again, n1 rebooted (went offline/online), then n2 rebooted (offline/online).
> 
> I observed a significant delay after all three nodes were online before resources were started. Actually, the start seemed to be triggered by some crm restart action on n3.
> 
> Logs on n3 (DC) look like this:
> cib:     info: cib_process_request:      Completed cib_modify operation for section status: OK (rc=0, origin=local/crmd/359, version=1.99.1)
> crmd:   notice: handle_request:   Current ping state: S_TRANSITION_ENGINE
> (...many more...)
> stonith-ng:     info: plugin_handle_membership:         Membership 3328: quorum retained
> crmd:     info: plugin_handle_membership:         Membership 3328: quorum retained
> [...]
> stonith-ng:     info: plugin_handle_membership:         Membership 3328: quorum retained
> [...]
> cib:     info: cib_process_request:      Completed cib_modify operation for section status: OK (rc=0, origin=local/crmd/365, version=1.99.3)
> crmd:     info: crmd_cs_dispatch:         Setting expected votes to 3
> crmd:     info: plugin_handle_membership:         Membership 3328: quorum retained
> [...]
> crmd:     info: crmd_cs_dispatch:         Setting expected votes to 3
> crmd:     info: do_state_transition:      State transition S_TRANSITION_ENGINE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=peer_update_callback ]
> crmd:     info: do_dc_join_offer_one:     An unknown node joined - (re-)offer to any unconfirmed nodes
> (??? what's that?)

This is normal when a node joins the cluster. The DC's cluster layer
detects any joins, and the DC's crmd responds to that by offering
membership to the new node(s).
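
If you want to follow that join sequence yourself, a rough sketch like the
one below (the log path is an assumption; the message names are the ones
quoted above) pulls the DC-side join messages out of a syslog file:

#!/usr/bin/env python
# Sketch only: pull the DC-side join messages out of a syslog file so the
# offer -> integrate -> finalize -> confirm progression is easy to follow.
# /var/log/messages is an assumption; point it at wherever crmd logs go.
import re
import sys

JOIN_RE = re.compile(r'join_make_offer|crm_update_peer_join|'
                     r'crmd_join_phase_log|do_dc_join_\w+')

logfile = sys.argv[1] if len(sys.argv) > 1 else '/var/log/messages'
with open(logfile) as f:
    for line in f:
        if 'crmd' in line and JOIN_RE.search(line):
            sys.stdout.write(line)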

> crmd:     info: join_make_offer:  Making join offers based on membership 3328
> crmd:     info: join_make_offer:  join-2: Sending offer to n2
> crmd:     info: crm_update_peer_join:     join_make_offer: Node n2[739512325] - join-2 phase 0 -> 1
> crmd:     info: join_make_offer:  Skipping n1: already known 4
> crmd:     info: join_make_offer:  Skipping n3: already known 4

Above we can see that n1 and n3 already have confirmed membership, and
the newly joined n2 gets offered membership.

> crmd:   notice: abort_transition_graph:   Transition aborted: Peer Halt (source=do_te_invoke:168, 0)

This is one of the common log messages that I think can be improved. "Peer
Halt" in this case does not mean the peer halted, but rather that the
transition was halted due to a peer event (in this case the join).
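
If it helps to see how often, and for which reasons, the transition is being
halted, a quick sketch like this one counts the abort reasons in a log
excerpt (the path and the exact message format are assumptions based on the
lines you quoted):

#!/usr/bin/env python
# Sketch: count the reasons given by abort_transition_graph, e.g. how many
# "Peer Halt" aborts versus other aborts show up in a log excerpt.
import collections
import re
import sys

reason_re = re.compile(r'abort_transition_graph:\s*Transition aborted: ([^(]+)')
counts = collections.Counter()

with open(sys.argv[1] if len(sys.argv) > 1 else '/var/log/messages') as f:
    for line in f:
        m = reason_re.search(line)
        if m:
            counts[m.group(1).strip()] += 1

for reason, n in counts.most_common():
    print("%6d  %s" % (n, reason))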

> cib:     info: cib_process_request:      Completed cib_modify operation for section crm_config: OK (rc=0, origin=local/crmd/375, version=1.99.5)
> crmd:     info: crm_update_peer_join:     do_dc_join_filter_offer: Node n2[739512325] - join-2 phase 1 -> 2
> crmd:     info: crm_update_peer_expected:         do_dc_join_filter_offer: Node n2[739512325] - expected state is now member (was down)
> crmd:     info: abort_transition_graph:   Transition aborted: Peer Halt (source=do_te_invoke:168, 0)
> crmd:     info: do_state_transition:      State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
> crmd:     info: crmd_join_phase_log:      join-2: n2=integrated
> crmd:     info: crmd_join_phase_log:      join-2: n1=confirmed
> crmd:     info: crmd_join_phase_log:      join-2: n3=confirmed
> crmd:   notice: do_dc_join_finalize:      join-2: Syncing the CIB from n2 to the rest of the cluster
> [...]
> cib:     info: cib_process_replace:      Replaced 1.99.5 with 1.99.5 from n2
> cib:     info: cib_process_request:      Completed cib_replace operation for section 'all': OK (rc=0, origin=n2/crmd/376, version=1.99.5)
> crmd:     info: crm_update_peer_join:     finalize_join_for: Node n2[739512325] - join-2 phase 2 -> 3
> crmd:     info: do_log:   FSA: Input I_WAIT_FOR_EVENT from do_te_invoke() received in state S_FINALIZE_JOIN
> crmd:     info: abort_transition_graph:   Transition aborted: Peer Halt (source=do_te_invoke:168, 0)
> [...]
> cib:     info: cib_file_write_with_digest:       Wrote version 1.99.0 of the CIB to disk (digest: 6e71ae6f4a1d2619cc64c91d40f55a32)
> (??? We already had 1.99.5)

Only .0's are written to disk -- the .x's contain updates to dynamic
information (like the status section) and are in-memory only.
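
For context, the three numbers are admin_epoch.epoch.num_updates;
configuration changes bump the epoch, while status updates only bump
num_updates in memory, which is why the copy written to disk is always a
.0. If you want to see what is actually on disk, a sketch like this reads
the version fields straight out of the file (the path is an assumption and
varies between Pacemaker versions/distributions):

#!/usr/bin/env python
# Illustrative sketch: print the version fields of the on-disk CIB.  The
# file location below is an assumption; older builds may keep it under
# /var/lib/heartbeat/crm/ instead.
import xml.etree.ElementTree as ET

CIB_FILE = '/var/lib/pacemaker/cib/cib.xml'

root = ET.parse(CIB_FILE).getroot()
print("on-disk CIB version: %s.%s.%s" % (
    root.get('admin_epoch'), root.get('epoch'), root.get('num_updates')))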

> cib:     info: cib_process_request:      Completed cib_modify operation for section status: OK (rc=0, origin=n2/attrd/3, version=1.99.6)
> crmd:     info: crm_update_peer_join:     do_dc_join_ack: Node n2[739512325] - join-2 phase 3 -> 4
> crmd:     info: do_dc_join_ack:   join-2: Updating node state to member for n2
> [...]
> crmd:   notice: handle_request:   Current ping state: S_FINALIZE_JOIN
> crmd:     info: do_log:   FSA: Input I_WAIT_FOR_EVENT from do_te_invoke() received in state S_FINALIZE_JOIN
> [...]
> crmd:     info: do_state_transition:      State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
> crmd:   notice: crm_update_quorum:        Updating quorum status to true (call=385)
> crmd:     info: abort_transition_graph:   Transition aborted: Peer Cancelled (source=do_te_invoke:161, 0)
> [...]
> crmd:   notice: handle_request:   Current ping state: S_POLICY_ENGINE
> [...]
> (the following messages repeat for a long time)
> crmd:   notice: handle_request:   Current ping state: S_POLICY_ENGINE
> crmd:     info: do_log:   FSA: Input I_WAIT_FOR_EVENT from do_te_invoke() received in state S_POLICY_ENGINE
> crmd:     info: abort_transition_graph:   Transition aborted: Peer Halt (source=do_te_invoke:168, 0)
> crmd:     info: do_log:   FSA: Input I_WAIT_FOR_EVENT from do_te_invoke() received in state S_POLICY_ENGINE
> crmd:     info: abort_transition_graph:   Transition aborted: Peer Halt (source=do_te_invoke:168, 0)
> crmd:     info: do_log:   FSA: Input I_WAIT_FOR_EVENT from do_te_invoke() received in state S_POLICY_ENGINE
> crmd:     info: abort_transition_graph:   Transition aborted: Peer Halt (source=do_te_invoke:168, 0)
> [...]
> crmd:     info: abort_transition_graph:   Transition aborted: Peer Halt (source=do_te_invoke:168, 0)
> lrmd:     info: log_finished:     finished - rsc:prm_xen_v09 action:stop call_id:146 pid:6420 exit-code:0 exec-time:331841ms queue-time:0ms
> [...]
> crmd:   notice: handle_request:   Current ping state: S_POLICY_ENGINE
> pengine:     info: determine_online_status_fencing:  Node n1 is active
> pengine:     info: determine_online_status:  Node n1 is online
> pengine:     info: determine_online_status_fencing:  Node n3 is active
> pengine:     info: determine_online_status:  Node n3 is online
> pengine:     info: determine_online_status_fencing:  Node n2 is active
> pengine:     info: determine_online_status:  Node n2 is online
> (now the cluster seems to become active...)
> 
> On node n2:
> [...]
> crmd:   notice: do_state_transition:      State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
> attrd:   notice: attrd_perform_update:     Sent update 3: shutdown=0
> [...]
> crmd:     info: throttle_send_command:    New throttle mode: 0000 (was ffffffff)
> (...here is the big delay, when nothing seemed to happen)
> cib:     info: cib_perform_op:   Diff: --- 1.99.9 2
> 
> On n1 the delay was between these two lines:
> cib:     info: cib_process_ping:         Reporting our current digest to n3: 497fb3b47c84a24f0cc7e127b219fdae for 1.99.9 (0xc7cbb0 0)
> cib:     info: cib_perform_op:   Diff: --- 1.99.9 2
> 
> What may be causing this? The delay was almost three minutes.

Unfortunately, I don't have any ideas here.
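
If you want to pin the gap down more precisely, one option is to look for
the largest timestamp gaps in each node's log around that window -- a rough
sketch, assuming standard syslog "Mon DD HH:MM:SS" timestamps at the start
of each line:

#!/usr/bin/env python
# Rough sketch: report the largest gaps between consecutive log lines, to
# see exactly where the dead time is.  Assumes syslog-style "Feb  9 17:16:47"
# timestamps (no year) at the start of each line, as on a default setup.
import sys
from datetime import datetime

prev = None
gaps = []

with open(sys.argv[1] if len(sys.argv) > 1 else '/var/log/messages') as f:
    for line in f:
        try:
            ts = datetime.strptime(line[:15], '%b %d %H:%M:%S')
        except ValueError:
            continue
        if prev is not None and ts > prev:
            gaps.append((ts - prev, line.rstrip()))
        prev = ts

for gap, line in sorted(gaps, reverse=True)[:10]:
    print(gap, line)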

> Software is the latest version for SLES11 SP4 (pacemaker-1.1.12-18.1).
> 
> Regards,
> Ulrich



