[ClusterLabs] DC leaving the cluster: some odd messages
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Fri Feb 24 02:48:54 EST 2017
Hi!
I cleanly stopped OpenAIS on a SLES11 SP4 node. On another node I saw some strange messages:
Feb 24 08:38:58 h01 corosync[2822]: [pcmk ] info: update_member: Node h10 now has process list: 00000000000000000000000000000002 (2)
Feb 24 08:38:58 h01 corosync[2822]: [pcmk ] info: send_member_notification: Sending membership update 3360 to 5 children
Feb 24 08:38:58 h01 crmd[2832]: notice: peer_update_callback: Our peer on the DC (h10) is dead
Feb 24 08:38:58 h01 stonith-ng[2828]: notice: crm_update_peer_state: st_peer_update_callback: Node h10[739512330] - state is now lost (was member)
Feb 24 08:38:58 h01 cib[2827]: notice: crm_update_peer_state: cib_peer_update_callback: Node h10[739512330] - state is now lost (was member)
Feb 24 08:38:58 h01 cib[2827]: notice: crm_update_peer_state: plugin_handle_membership: Node h10[739512330] - state is now member (was lost)
Feb 24 08:38:58 h01 crmd[2832]: warning: reap_dead_nodes: Our DC node (h10) left the cluster
Feb 24 08:38:58 h01 stonith-ng[2828]: notice: crm_update_peer_state: plugin_handle_membership: Node h10[739512330] - state is now member (was lost)
So it looks like a down-up-down-up transition of node h10. Maybe this message contributes to the confusion:
Feb 24 08:38:58 h01 cib[2827]: warning: cib_server_process_diff: Something went wrong in compatibility mode, requesting full refresh
Feb 24 08:38:58 h01 corosync[2822]: [pcmk ] info: ais_mark_unseen_peer_dead: Node h10 was not seen in the previous transition
Feb 24 08:38:58 h01 corosync[2822]: [pcmk ] info: update_member: Node 739512330/h10 is now: lost
Feb 24 08:38:58 h01 crmd[2832]: notice: crm_update_peer_state: plugin_handle_membership: Node h10[739512330] - state is now lost (was member)
Feb 24 08:38:58 h01 crmd[2832]: warning: match_down_event: No match for shutdown action on h10
Feb 24 08:38:58 h01 crmd[2832]: notice: peer_update_callback: Stonith/shutdown of h10 not matched
Feb 24 08:38:58 h01 crmd[2832]: notice: crm_update_quorum: Updating quorum status to true (call=162)
What worries me a bit is "No match for shutdown action on h10": Shouldn't it be obvious from the CIB that h10 was intended to leave?
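One way to check whether this really is a down-up-down-up sequence is to group the crm_update_peer_state lines by the daemon that logged them. The following is only a minimal Python sketch, assuming the syslog format shown above; the regular expression and the helper name transitions_by_daemon are made up for illustration and are not part of Pacemaker:

```python
import re
from collections import defaultdict

# Assumed line format (taken from the excerpt above), for example:
# "... h01 cib[2827]: notice: crm_update_peer_state: <caller>: Node h10[<id>] - state is now lost (was member)"
PEER_STATE = re.compile(
    r"(?P<daemon>[\w-]+)\[\d+\]:\s+notice: crm_update_peer_state: "
    r"(?P<caller>\w+): Node (?P<node>[^\s\[]+)\[\d+\] - "
    r"state is now (?P<new>\w+) \(was (?P<old>\w+)\)"
)


def transitions_by_daemon(lines, node):
    """Return {daemon: [(caller, old_state, new_state), ...]} for one node,
    preserving the order in which the lines appear."""
    result = defaultdict(list)
    for line in lines:
        m = PEER_STATE.search(line)
        if m and m.group("node") == node:
            result[m.group("daemon")].append(
                (m.group("caller"), m.group("old"), m.group("new"))
            )
    return dict(result)


# The crm_update_peer_state lines from the excerpt above, in logged order.
LOG = [
    "Feb 24 08:38:58 h01 stonith-ng[2828]: notice: crm_update_peer_state: st_peer_update_callback: Node h10[739512330] - state is now lost (was member)",
    "Feb 24 08:38:58 h01 cib[2827]: notice: crm_update_peer_state: cib_peer_update_callback: Node h10[739512330] - state is now lost (was member)",
    "Feb 24 08:38:58 h01 cib[2827]: notice: crm_update_peer_state: plugin_handle_membership: Node h10[739512330] - state is now member (was lost)",
    "Feb 24 08:38:58 h01 stonith-ng[2828]: notice: crm_update_peer_state: plugin_handle_membership: Node h10[739512330] - state is now member (was lost)",
    "Feb 24 08:38:58 h01 crmd[2832]: notice: crm_update_peer_state: plugin_handle_membership: Node h10[739512330] - state is now lost (was member)",
]

by_daemon = transitions_by_daemon(LOG, "h10")
for daemon, steps in by_daemon.items():
    for caller, old, new in steps:
        print(f"{daemon:<11} {caller:<26} {old} -> {new}")
```

On this excerpt, cib and stonith-ng each log "lost" from their own peer callback followed by "member" from plugin_handle_membership, while crmd logs a single transition to "lost".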
Regards,
Ulrich