[ClusterLabs] Previous DC fenced prior to integration

Nate Clark nate at neworld.us
Mon Aug 1 17:01:20 EDT 2016


On Fri, Jul 29, 2016 at 4:09 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
> On 07/28/2016 01:48 PM, Nate Clark wrote:
>> On Mon, Jul 25, 2016 at 2:48 PM, Nate Clark <nate at neworld.us> wrote:
>>> On Mon, Jul 25, 2016 at 11:20 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
>>>> On 07/23/2016 10:14 PM, Nate Clark wrote:
>>>>> On Sat, Jul 23, 2016 at 1:06 AM, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
>>>>>> 23.07.2016 01:37, Nate Clark пишет:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am running pacemaker 1.1.13 with corosync and think I may have
>>>>>>> encountered a start up timing issue on a two node cluster. I didn't
>>>>>>> notice anything in the changelog for 14 or 15 that looked similar to
>>>>>>> this or open bugs.
>>>>>>>
>>>>>>> The rough out line of what happened:
>>>>>>>
>>>>>>> Module 1 and 2 running
>>>>>>> Module 1 is DC
>>>>>>> Module 2 shuts down
>>>>>>> Module 1 updates node attributes used by resources
>>>>>>> Module 1 shuts down
>>>>>>> Module 2 starts up
>>>>>>> Module 2 votes itself as DC
>>>>>>> Module 1 starts up
>>>>>>> Module 2 sees module 1 in corosync and notices it has quorum
>>>>>>> Module 2 enters policy engine state.
>>>>>>> Module 2 policy engine decides to fence 1
>>>>>>> Module 2 then continues and starts resource on itself based upon the old state
>>>>>>>
>>>>>>> For some reason the integration never occurred and module 2 starts to
>>>>>>> perform actions based on stale state.
>>>>>>>
>>>>>>> Here is the full logs
>>>>>>> Jul 20 16:29:06.376805 module-2 crmd[21969]:   notice: Connecting to
>>>>>>> cluster infrastructure: corosync
>>>>>>> Jul 20 16:29:06.386853 module-2 crmd[21969]:   notice: Could not
>>>>>>> obtain a node name for corosync nodeid 2
>>>>>>> Jul 20 16:29:06.392795 module-2 crmd[21969]:   notice: Defaulting to
>>>>>>> uname -n for the local corosync node name
>>>>>>> Jul 20 16:29:06.403611 module-2 crmd[21969]:   notice: Quorum lost
>>>>>>> Jul 20 16:29:06.409237 module-2 stonith-ng[21965]:   notice: Watching
>>>>>>> for stonith topology changes
>>>>>>> Jul 20 16:29:06.409474 module-2 stonith-ng[21965]:   notice: Added
>>>>>>> 'watchdog' to the device list (1 active devices)
>>>>>>> Jul 20 16:29:06.413589 module-2 stonith-ng[21965]:   notice: Relying
>>>>>>> on watchdog integration for fencing
>>>>>>> Jul 20 16:29:06.416905 module-2 cib[21964]:   notice: Defaulting to
>>>>>>> uname -n for the local corosync node name
>>>>>>> Jul 20 16:29:06.417044 module-2 crmd[21969]:   notice:
>>>>>>> pcmk_quorum_notification: Node module-2[2] - state is now member (was
>>>>>>> (null))
>>>>>>> Jul 20 16:29:06.421821 module-2 crmd[21969]:   notice: Defaulting to
>>>>>>> uname -n for the local corosync node name
>>>>>>> Jul 20 16:29:06.422121 module-2 crmd[21969]:   notice: Notifications disabled
>>>>>>> Jul 20 16:29:06.422149 module-2 crmd[21969]:   notice: Watchdog
>>>>>>> enabled but stonith-watchdog-timeout is disabled
>>>>>>> Jul 20 16:29:06.422286 module-2 crmd[21969]:   notice: The local CRM
>>>>>>> is operational
>>>>>>> Jul 20 16:29:06.422312 module-2 crmd[21969]:   notice: State
>>>>>>> transition S_STARTING -> S_PENDING [ input=I_PENDING
>>>>>>> cause=C_FSA_INTERNAL origin=do_started ]
>>>>>>> Jul 20 16:29:07.416871 module-2 stonith-ng[21965]:   notice: Added
>>>>>>> 'fence_sbd' to the device list (2 active devices)
>>>>>>> Jul 20 16:29:08.418567 module-2 stonith-ng[21965]:   notice: Added
>>>>>>> 'ipmi-1' to the device list (3 active devices)
>>>>>>> Jul 20 16:29:27.423578 module-2 crmd[21969]:  warning: FSA: Input
>>>>>>> I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
>>>>>>> Jul 20 16:29:27.424298 module-2 crmd[21969]:   notice: State
>>>>>>> transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC
>>>>>>> cause=C_TIMER_POPPED origin=election_timeout_popped ]
>>>>>>> Jul 20 16:29:27.460834 module-2 crmd[21969]:  warning: FSA: Input
>>>>>>> I_ELECTION_DC from do_election_check() received in state S_INTEGRATION
>>>>>>> Jul 20 16:29:27.463794 module-2 crmd[21969]:   notice: Notifications disabled
>>>>>>> Jul 20 16:29:27.463824 module-2 crmd[21969]:   notice: Watchdog
>>>>>>> enabled but stonith-watchdog-timeout is disabled
>>>>>>> Jul 20 16:29:27.473285 module-2 attrd[21967]:   notice: Defaulting to
>>>>>>> uname -n for the local corosync node name
>>>>>>> Jul 20 16:29:27.498464 module-2 pengine[21968]:   notice: Relying on
>>>>>>> watchdog integration for fencing
>>>>>>> Jul 20 16:29:27.498536 module-2 pengine[21968]:   notice: We do not
>>>>>>> have quorum - fencing and resource management disabled
>>>>>>> Jul 20 16:29:27.502272 module-2 pengine[21968]:  warning: Node
>>>>>>> module-1 is unclean!
>>>>>>> Jul 20 16:29:27.502287 module-2 pengine[21968]:   notice: Cannot fence
>>>>>>> unclean nodes until quorum is attained (or no-quorum-policy is set to
>>>>>>> ignore)
>>>>
>>>> The above two messages indicate that module-2 cannot see module-1 at
>>>> startup, therefore it must assume it is potentially misbehaving and must
>>>> be shot. However, since it does not have quorum with only one out of two
>>>> nodes, it must wait until module-1 joins until it can shoot it!
>>>>
>>>> This is a special problem with quorum in a two-node cluster. There are a
>>>> variety of ways to deal with it, but the simplest is to set "two_node:
>>>> 1" in corosync.conf (with corosync 2 or later). This will make each node
>>>> wait for the other at startup, meaning both nodes must be started before
>>>> the cluster can start working, but from that point on, it will assume it
>>>> has quorum, and use fencing to ensure any lost node is really down.
>>>
>>> two_node is set to 1 for this system. I understand what you are saying
>>> but what usually happens is S_INTEGRATION occurs after quorum as
>>> achieved and the current DC acknowledges the other node which just
>>> started and then accepts into the cluster. However it looks like
>>> S_POLICY_ENGINE occurred first.
>>
>> I saw a similar situation occur on another two node system. Based on
>> Ken's previous comment it sounds like this is unexpected behavior for
>> when two_node is enabled, or did I misinterpret his comment?
>>
>> Thanks
>> -nate
>
> I didn't think it through properly ... two_node will only affect quorum,
> so the above sequence makes sense once the cluster decides fencing is
> necessary.
>
> I'm not sure why it sometimes goes into S_INTEGRATION and sometimes
> S_POLICY_ENGINE. In the above logs, it goes into S_INTEGRATION because
> the DC election timed out. How are the logs in the successful case
> different? Maybe the other node happens to join before the DC election
> times out?

In both cases the DC election has occurred prior to the second node
coming online.

Here are more logs from the more successful startup. DC election
occurs and the first node elects and integrates itself and then the
second node becomes available and is integrated into the cluster.

Jul 22 18:42:10 d25-22-left crmd[17714]:   notice: State transition
S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL
origin=do_started ]
Jul 22 18:42:11 d25-22-left stonith-ng[17710]:   notice: Added
'fence_sbd' to the device list (2 active devices)
Jul 22 18:42:12 d25-22-left stonith-ng[17710]:   notice: Added
'ipmi-73' to the device list (3 active devices)
Jul 22 18:42:31 d25-22-left crmd[17714]:  warning: FSA: Input
I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Jul 22 18:42:31 d25-22-left crmd[17714]:   notice: State transition
S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_TIMER_POPPED
origin=election_timeout_popped ]
Jul 22 18:42:31 d25-22-left crmd[17714]:  warning: FSA: Input
I_ELECTION_DC from do_election_check() received in state S_INTEGRATION
Jul 22 18:42:31 d25-22-left crmd[17714]:   notice: Notifications disabled
Jul 22 18:42:31 d25-22-left crmd[17714]:   notice: Watchdog enabled
but stonith-watchdog-timeout is disabled
Jul 22 18:42:31 d25-22-left attrd[17712]:   notice: Defaulting to
uname -n for the local corosync node name
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Relying on
watchdog integration for fencing
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: We do not have
quorum - fencing and resource management disabled
Jul 22 18:42:32 d25-22-left pengine[17713]:  warning: Node
d25-22-right is unclean!
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Cannot fence
unclean nodes until quorum is attained (or no-quorum-policy is set to
ignore)
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Start
fence_sbd        (d25-22-left - blocked)
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Start   ipmi-73
      (d25-22-left - blocked)
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Start   SlaveIP
      (d25-22-left - blocked)
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Start
postgres:0        (d25-22-left - blocked)
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Start
ethmonitor:0        (d25-22-left - blocked)
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Start
tomcat-instance:0        (d25-22-left - blocked)
Jul 22 18:42:32 d25-22-left pengine[17713]:   notice: Start
ClusterMonitor:0        (d25-22-left - blocked)
Jul 22 18:42:32 d25-22-left pengine[17713]:  warning: Calculated
Transition 0: /var/lib/pacemaker/pengine/pe-warn-0.bz2
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
4: monitor fence_sbd_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
5: monitor ipmi-72_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
6: monitor ipmi-73_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
7: monitor MasterIP_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
8: monitor SlaveIP_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
9: monitor postgres:0_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
10: monitor ethmonitor:0_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
11: monitor tomcat-instance:0_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
12: monitor ClusterMonitor:0_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
13: monitor N1F1_monitor_0 on d25-22-left (local)
Jul 22 18:42:32 d25-22-left crmd[17714]:   notice: Initiating action
14: monitor N2F1_monitor_0 on d25-22-left (local)
Jul 22 18:42:33 d25-22-left crmd[17714]:   notice: Operation
fence_sbd_monitor_0: not running (node=d25-22-left, call=5, rc=7,
cib-update=29, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
ipmi-72_monitor_0: not running (node=d25-22-left, call=9, rc=7,
cib-update=30, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
ipmi-73_monitor_0: not running (node=d25-22-left, call=13, rc=7,
cib-update=31, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Transition aborted
by status-1-master-postgres, master-postgres=-INFINITY: Transient
attribute change (create cib=1.94.6, source=abort_unless_down:319,
path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1'],
0)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
ClusterMonitor_monitor_0: not running (node=d25-22-left, call=41,
rc=7, cib-update=32, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
ethmonitor_monitor_0: not running (node=d25-22-left, call=31, rc=7,
cib-update=33, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:    error: pcmkRegisterNode:
Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
tomcat-instance_monitor_0: not running (node=d25-22-left, call=36,
rc=7, cib-update=34, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
MasterIP_monitor_0: not running (node=d25-22-left, call=17, rc=7,
cib-update=35, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
SlaveIP_monitor_0: not running (node=d25-22-left, call=21, rc=7,
cib-update=36, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
N2F1_monitor_0: not running (node=d25-22-left, call=49, rc=7,
cib-update=37, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
N1F1_monitor_0: not running (node=d25-22-left, call=45, rc=7,
cib-update=38, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Operation
postgres_monitor_0: not running (node=d25-22-left, call=26, rc=7,
cib-update=39, confirmed=true)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Initiating action
3: probe_complete probe_complete-d25-22-left on d25-22-left (local) -
no waiting
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Transition 0
(Complete=13, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-0.bz2): Complete
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Relying on
watchdog integration for fencing
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: We do not have
quorum - fencing and resource management disabled
Jul 22 18:42:34 d25-22-left pengine[17713]:  warning: Node
d25-22-right is unclean!
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Cannot fence
unclean nodes until quorum is attained (or no-quorum-policy is set to
ignore)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
fence_sbd        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start   ipmi-73
      (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start   SlaveIP
      (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
postgres:0        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
ethmonitor:0        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
tomcat-instance:0        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
ClusterMonitor:0        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:  warning: Calculated
Transition 1: /var/lib/pacemaker/pengine/pe-warn-1.bz2
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Relying on
watchdog integration for fencing
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: We do not have
quorum - fencing and resource management disabled
Jul 22 18:42:34 d25-22-left pengine[17713]:  warning: Node
d25-22-right is unclean!
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Cannot fence
unclean nodes until quorum is attained (or no-quorum-policy is set to
ignore)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
fence_sbd        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start   ipmi-73
      (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start   SlaveIP
      (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
postgres:0        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
ethmonitor:0        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
tomcat-instance:0        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left pengine[17713]:   notice: Start
ClusterMonitor:0        (d25-22-left - blocked)
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: Transition 2
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-2.bz2): Complete
Jul 22 18:42:34 d25-22-left crmd[17714]:   notice: State transition
S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
Jul 22 18:42:34 d25-22-left pengine[17713]:  warning: Calculated
Transition 2: /var/lib/pacemaker/pengine/pe-warn-2.bz2
Jul 22 18:42:40 d25-22-left crmd[17714]:   notice: State transition
S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
origin=abort_transition_graph ]
Jul 22 18:42:41 d25-22-left crmd[17714]:   notice: Notifications disabled
Jul 22 18:42:41 d25-22-left crmd[17714]:   notice: Watchdog enabled
but stonith-watchdog-timeout is disabled
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Relying on
watchdog integration for fencing
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: We do not have
quorum - fencing and resource management disabled
Jul 22 18:42:41 d25-22-left pengine[17713]:  warning: Node
d25-22-right is unclean!
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Cannot fence
unclean nodes until quorum is attained (or no-quorum-policy is set to
ignore)
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Start
fence_sbd        (d25-22-left - blocked)
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Start   ipmi-73
      (d25-22-left - blocked)
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Start   SlaveIP
      (d25-22-left - blocked)
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Start
postgres:0        (d25-22-left - blocked)
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Start
ethmonitor:0        (d25-22-left - blocked)
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Start
tomcat-instance:0        (d25-22-left - blocked)
Jul 22 18:42:41 d25-22-left pengine[17713]:   notice: Start
ClusterMonitor:0        (d25-22-left - blocked)
Jul 22 18:42:41 d25-22-left pengine[17713]:  warning: Calculated
Transition 3: /var/lib/pacemaker/pengine/pe-warn-3.bz2
Jul 22 18:42:41 d25-22-left crmd[17714]:   notice: Transition 3
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-3.bz2): Complete
Jul 22 18:42:41 d25-22-left crmd[17714]:   notice: State transition
S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
Jul 22 18:42:48 d25-22-left corosync[17646]:   [TOTEM ] A new
membership (172.20.71.72:84) was formed. Members joined: 2
Jul 22 18:42:48 d25-22-left corosync[17646]:   [VOTEQ ] Waiting for
all cluster members. Current votes: 1 expected_votes: 2
Jul 22 18:42:48 d25-22-left corosync[17646]:   [QUORUM] This node is
within the primary component and will provide service.
Jul 22 18:42:48 d25-22-left corosync[17646]:   [QUORUM] Members[2]: 1 2
Jul 22 18:42:48 d25-22-left corosync[17646]:   [MAIN  ] Completed
service synchronization, ready to provide service.
Jul 22 18:42:48 d25-22-left crmd[17714]:   notice: Membership 84:
quorum acquired (2)
Jul 22 18:42:48 d25-22-left pacemakerd[17687]:   notice: Membership
84: quorum acquired (2)
Jul 22 18:42:48 d25-22-left crmd[17714]:   notice: Could not obtain a
node name for corosync nodeid 2
Jul 22 18:42:48 d25-22-left pacemakerd[17687]:   notice: Could not
obtain a node name for corosync nodeid 2
Jul 22 18:42:48 d25-22-left crmd[17714]:   notice: Could not obtain a
node name for corosync nodeid 2
Jul 22 18:42:48 d25-22-left crmd[17714]:   notice:
pcmk_quorum_notification: Node (null)[2] - state is now member (was
(null))
Jul 22 18:42:48 d25-22-left pacemakerd[17687]:   notice: Could not
obtain a node name for corosync nodeid 2
Jul 22 18:42:48 d25-22-left pacemakerd[17687]:   notice:
pcmk_quorum_notification: Node (null)[2] - state is now member (was
(null))
Jul 22 18:42:49 d25-22-left pacemakerd[17687]:   notice: Could not
obtain a node name for corosync nodeid 2
Jul 22 18:42:49 d25-22-left attrd[17712]:   notice: Could not obtain a
node name for corosync nodeid 2
Jul 22 18:42:49 d25-22-left attrd[17712]:   notice:
crm_update_peer_proc: Node (null)[2] - state is now member (was
(null))
Jul 22 18:42:49 d25-22-left stonith-ng[17710]:   notice: Could not
obtain a node name for corosync nodeid 2
Jul 22 18:42:49 d25-22-left stonith-ng[17710]:   notice:
crm_update_peer_proc: Node (null)[2] - state is now member (was
(null))
Jul 22 18:42:49 d25-22-left stonith-ng[17710]:   notice: Defaulting to
uname -n for the local corosync node name
Jul 22 18:42:49 d25-22-left cib[17709]:   notice: Could not obtain a
node name for corosync nodeid 2
Jul 22 18:42:49 d25-22-left cib[17709]:   notice:
crm_update_peer_proc: Node (null)[2] - state is now member (was
(null))
Jul 22 18:42:50 d25-22-left crmd[17714]:   notice: Could not obtain a
node name for corosync nodeid 2
Jul 22 18:42:51 d25-22-left crmd[17714]:   notice: State transition
S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_HA_MESSAGE
origin=route_message ]

It seems like it could be dangerous for a node in a two node system to
fence the other node as soon as quorum is reached since the node doing
the fencing could be the one out of date.

-nate




More information about the Users mailing list