[ClusterLabs] both nodes OFFLINE
石井 俊直
i_j_e_x_a at yahoo.co.jp
Sat May 13 02:36:25 EDT 2017
Hi.
We occasionally have a problem in our two-node cluster on CentOS 7. Call the nodes node-2
and node-3. When the problem occurs, node-3 reports both nodes as OFFLINE, while node-2
reports only node-3 as OFFLINE.
When that happens, the log messages quoted at the end of this mail are written repeatedly on
node-2, and the log file (/var/log/cluster/corosync.log) grows to hundreds of megabytes in a
short time. The log messages on node-3 are different.
The erroneous state is temporarily resolved if the OS of node-2 is restarted. Restarting the
OS of node-3, on the other hand, leaves the cluster in the same state.
I searched the ML archive and found a post (Mon Oct 1 01:27:39 CEST 2012) about the
"Discarding update with feature set" problem. According to that post, our problem
may be solved by removing /var/lib/pacemaker/crm/cib.* on node-2.
What I want to know is whether it is safe to remove the above files on just one of the nodes.
If there is another way to solve the problem, I would like to hear about it.
Thanks.
—— from corosync.log ————————————————————————————————
cib: error: cib_perform_op: Discarding update with feature set '3.0.11' greater than our own '3.0.10'
cib: error: cib_process_request: Completed cib_replace operation for section 'all': Protocol not supported (rc=-93, origin=node-3/crmd/12708, version=0.83.30)
crmd: error: finalize_sync_callback: Sync from node-3 failed: Protocol not supported
crmd: info: register_fsa_error_adv: Resetting the current action list
crmd: warning: do_log: Input I_ELECTION_DC received in state S_FINALIZE_JOIN from finalize_sync_callback
crmd: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=finalize_sync_callback
crmd: info: crm_update_peer_join: initialize_join: Node node-2[1] - join-6329 phase 2 -> 0
crmd: info: crm_update_peer_join: initialize_join: Node node-3[2] - join-6329 phase 2 -> 0
crmd: info: update_dc: Unset DC. Was node-2
crmd: info: join_make_offer: join-6329: Sending offer to node-2
crmd: info: crm_update_peer_join: join_make_offer: Node node-2[1] - join-6329 phase 0 -> 1
crmd: info: join_make_offer: join-6329: Sending offer to node-3
crmd: info: crm_update_peer_join: join_make_offer: Node node-3[2] - join-6329 phase 0 -> 1
crmd: info: do_dc_join_offer_all: join-6329: Waiting on 2 outstanding join acks
crmd: info: update_dc: Set DC to node-2 (3.0.10)
crmd: info: crm_update_peer_join: do_dc_join_filter_offer: Node node-2[1] - join-6329 phase 1 -> 2
crmd: info: crm_update_peer_join: do_dc_join_filter_offer: Node node-3[2] - join-6329 phase 1 -> 2
crmd: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN | input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state
crmd: info: crmd_join_phase_log: join-6329: node-2=integrated
crmd: info: crmd_join_phase_log: join-6329: node-3=integrated
crmd: notice: do_dc_join_finalize: Syncing the Cluster Information Base from node-3 to rest of cluster | join-6329
crmd: notice: do_dc_join_finalize: Requested version <generation_tuple crm_feature_set="3.0.11" validate-with="pacemaker-2.5" epoch="84" num_updates="1" admin_epoch="0" cib-last-written="Thu May 11 08:05:45 2017" update-origin="node-2" update-client="crm_resource" update-user="root" have-quorum="1"/>
cib: info: cib_process_request: Forwarding cib_sync operation for section 'all' to node-3 (origin=local/crmd/12710)
cib: info: cib_process_replace: Digest matched on replace from node-3: 85a19c7927c54ccb15794f2720e07ce1
cib: info: cib_process_replace: Replaced 0.83.30 with 0.84.1 from node-3
cib: info: __xml_diff_object: Moved node_state at crmd (3 -> 2)
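
The excerpt above shows node-2 rejecting a CIB with feature set '3.0.11' because its own is '3.0.10'. As a rough way to confirm that this one message accounts for the log growth, here is a minimal Python sketch that counts those lines and compares the two versions numerically. The names MISMATCH_RE, parse_version and find_mismatches are hypothetical helpers written for this illustration, not part of Pacemaker, and the pattern assumes the message wording is exactly as quoted above.

```python
import re

# Assumption: the message text matches the cib_perform_op line quoted above.
MISMATCH_RE = re.compile(
    r"Discarding update with feature set '([\d.]+)' greater than our own '([\d.]+)'"
)


def parse_version(text):
    """Turn '3.0.11' into (3, 0, 11) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_mismatches(lines):
    """Map each (offered, own) feature-set pair to the number of matching lines."""
    counts = {}
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            key = (match.group(1), match.group(2))
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Passing an open handle on /var/log/cluster/corosync.log to find_mismatches() would give the count per version pair. parse_version() is there because comparing the strings directly would order '3.0.9' after '3.0.10'.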