[Pacemaker] When stonith is enabled, resources won't start until after stonith, even though requires="nothing" and prereq="nothing" on RHEL 7 with pacemaker-1.1.11 compiled from source.

Paul E Cain pecain at us.ibm.com
Tue Jul 1 20:28:25 UTC 2014


Hi Andrew,

Thanks for fixing that.

I downloaded the build from this patch
(pacemaker-2a5bbf93cb1bfee4ff57425d4460122d0fba57ab.zip) and compiled it
from source. This time ha3_fabric_ping tried to start as expected and then
failed as expected. However, the STONITH still occurred even though
fencing_route_to_ha4 wasn't running. Looking at the logs, it appears that
STONITH can still fence even when fencing_route_to_ha4 is not running.
Unless you have any other suggestions, I'm thinking it would be best for me
to add a few lines to the fence agent so that it pings 10.10.0.1 before
fencing; if it cannot reach 10.10.0.1, it would skip the fencing and return
a failure code.
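
To make the idea concrete, here is a minimal sketch of what I have in mind
for the fence agent. This is illustrative only: the function names are mine,
and I'm assuming the usual stonith convention that a non-zero exit code marks
the fencing action as failed.

```python
#!/usr/bin/env python3
"""Hypothetical pre-fence guard: refuse to fence unless the fabric
gateway (10.10.0.1 in my setup) answers a ping."""
import subprocess
import sys

GATEWAY = "10.10.0.1"  # same address ha3_fabric_ping monitors


def gateway_reachable(host=GATEWAY, attempts=3, run=subprocess.call):
    """Return True if `host` answers at least one ICMP echo.

    `run` is injectable so the check can be tested without a network;
    by default it invokes the system ping binary.
    """
    for _ in range(attempts):
        # -c 1: send a single packet; -W 2: wait at most 2s (iputils ping)
        if run(["ping", "-c", "1", "-W", "2", host]) == 0:
            return True
    return False


def main():
    if not gateway_reachable():
        # Returning non-zero here means stonith-ng records a fencing
        # failure instead of a false "node fenced" success.
        sys.exit(1)
    # ... otherwise fall through to the real fencing action ...
    sys.exit(0)
```

The point of the guard is that a node which cannot reach 10.10.0.1 itself
should never report a successful fence of its peer.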

[root at ha3 crmsh-2.0.0]# crm_mon -1
Last updated: Tue Jul  1 14:42:52 2014
Last change: Tue Jul  1 14:33:06 2014
Stack: corosync
Current DC: ha3 (1) - partition WITHOUT quorum
Version: 1.1.11-2a5bbf9
2 Nodes configured
4 Resources configured


Node ha4 (2): UNCLEAN (offline)
Online: [ ha3 ]


Failed actions:
    ha3_fabric_ping_start_0 on ha3 'unknown error' (1): call=18,
status=complete, last-rc-change='Tue Jul  1 14:36:53 2014', queued=0ms,
exec=20027ms


<cib crm_feature_set="3.0.9" validate-with="pacemaker-2.0" epoch="10"
num_updates="15" admin_epoch="0" cib-last-written="Tue Jul  1 14:33:06
2014" have-quorum="0" dc-uuid="1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair name="symmetric-cluster" value="true"
id="cib-bootstrap-options-symmetric-cluster"/>
        <nvpair name="stonith-enabled" value="true"
id="cib-bootstrap-options-stonith-enabled"/>
        <nvpair name="stonith-action" value="reboot"
id="cib-bootstrap-options-stonith-action"/>
        <nvpair name="no-quorum-policy" value="ignore"
id="cib-bootstrap-options-no-quorum-policy"/>
        <nvpair name="stop-orphan-resources" value="true"
id="cib-bootstrap-options-stop-orphan-resources"/>
        <nvpair name="stop-orphan-actions" value="true"
id="cib-bootstrap-options-stop-orphan-actions"/>
        <nvpair name="default-action-timeout" value="20s"
id="cib-bootstrap-options-default-action-timeout"/>
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
value="1.1.11-2a5bbf9"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure"
name="cluster-infrastructure" value="corosync"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh"
name="last-lrm-refresh" value="1404242903"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1" uname="ha3"/>
      <node id="2" uname="ha4"/>
    </nodes>
    <resources>
      <primitive id="ha3_fabric_ping" class="ocf" provider="pacemaker"
type="ping">
        <instance_attributes id="ha3_fabric_ping-instance_attributes">
          <nvpair name="host_list" value="10.10.0.1"
id="ha3_fabric_ping-instance_attributes-host_list"/>
          <nvpair name="failure_score" value="1"
id="ha3_fabric_ping-instance_attributes-failure_score"/>
        </instance_attributes>
        <operations>
          <op name="start" timeout="60s" requires="nothing" interval="0"
id="ha3_fabric_ping-start-0">
            <instance_attributes
id="ha3_fabric_ping-start-0-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="ha3_fabric_ping-start-0-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
          <op name="monitor" interval="15s" requires="nothing"
timeout="15s" id="ha3_fabric_ping-monitor-15s">
            <instance_attributes
id="ha3_fabric_ping-monitor-15s-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="ha3_fabric_ping-monitor-15s-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
          <op name="stop" on-fail="fence" requires="nothing" interval="0"
id="ha3_fabric_ping-stop-0">
            <instance_attributes
id="ha3_fabric_ping-stop-0-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="ha3_fabric_ping-stop-0-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
        </operations>
      </primitive>
      <primitive id="ha4_fabric_ping" class="ocf" provider="pacemaker"
type="ping">
        <instance_attributes id="ha4_fabric_ping-instance_attributes">
          <nvpair name="host_list" value="10.10.0.1"
id="ha4_fabric_ping-instance_attributes-host_list"/>
          <nvpair name="failure_score" value="1"
id="ha4_fabric_ping-instance_attributes-failure_score"/>
        </instance_attributes>
        <operations>
          <op name="start" timeout="60s" requires="nothing" interval="0"
id="ha4_fabric_ping-start-0">
            <instance_attributes
id="ha4_fabric_ping-start-0-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="ha4_fabric_ping-start-0-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
          <op name="monitor" interval="15s" requires="nothing"
timeout="15s" id="ha4_fabric_ping-monitor-15s">
            <instance_attributes
id="ha4_fabric_ping-monitor-15s-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="ha4_fabric_ping-monitor-15s-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
          <op name="stop" on-fail="fence" requires="nothing" interval="0"
id="ha4_fabric_ping-stop-0">
            <instance_attributes
id="ha4_fabric_ping-stop-0-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="ha4_fabric_ping-stop-0-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
        </operations>
      </primitive>
      <primitive id="fencing_route_to_ha3" class="stonith" type="meatware">
        <instance_attributes id="fencing_route_to_ha3-instance_attributes">
          <nvpair name="hostlist" value="ha3"
id="fencing_route_to_ha3-instance_attributes-hostlist"/>
        </instance_attributes>
        <operations>
          <op name="start" requires="nothing" interval="0"
id="fencing_route_to_ha3-start-0">
            <instance_attributes
id="fencing_route_to_ha3-start-0-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="fencing_route_to_ha3-start-0-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
          <op name="monitor" requires="nothing" interval="0"
id="fencing_route_to_ha3-monitor-0">
            <instance_attributes
id="fencing_route_to_ha3-monitor-0-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="fencing_route_to_ha3-monitor-0-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
        </operations>
      </primitive>
      <primitive id="fencing_route_to_ha4" class="stonith" type="meatware">
        <instance_attributes id="fencing_route_to_ha4-instance_attributes">
          <nvpair name="hostlist" value="ha4"
id="fencing_route_to_ha4-instance_attributes-hostlist"/>
        </instance_attributes>
        <operations>
          <op name="start" requires="nothing" interval="0"
id="fencing_route_to_ha4-start-0">
            <instance_attributes
id="fencing_route_to_ha4-start-0-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="fencing_route_to_ha4-start-0-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
          <op name="monitor" requires="nothing" interval="0"
id="fencing_route_to_ha4-monitor-0">
            <instance_attributes
id="fencing_route_to_ha4-monitor-0-instance_attributes">
              <nvpair name="prereq" value="nothing"
id="fencing_route_to_ha4-monitor-0-instance_attributes-prereq"/>
            </instance_attributes>
          </op>
        </operations>
      </primitive>
    </resources>
    <constraints>
      <rsc_location id="ha3_fabric_ping_location" rsc="ha3_fabric_ping"
score="INFINITY" node="ha3"/>
      <rsc_location id="ha3_fabric_ping_not_location" rsc="ha3_fabric_ping"
score="-INFINITY" node="ha4"/>
      <rsc_location id="ha4_fabric_ping_location" rsc="ha4_fabric_ping"
score="INFINITY" node="ha4"/>
      <rsc_location id="ha4_fabric_ping_not_location" rsc="ha4_fabric_ping"
score="-INFINITY" node="ha3"/>
      <rsc_location id="fencing_route_to_ha4_location"
rsc="fencing_route_to_ha4" score="INFINITY" node="ha3"/>
      <rsc_location id="fencing_route_to_ha4_not_location"
rsc="fencing_route_to_ha4" score="-INFINITY" node="ha4"/>
      <rsc_location id="fencing_route_to_ha3_location"
rsc="fencing_route_to_ha3" score="INFINITY" node="ha4"/>
      <rsc_location id="fencing_route_to_ha3_not_location"
rsc="fencing_route_to_ha3" score="-INFINITY" node="ha3"/>
      <rsc_order id="ha3_fabric_ping_before_fencing_route_to_ha4"
score="INFINITY" first="ha3_fabric_ping" first-action="start"
then="fencing_route_to_ha4" then-action="start"/>
      <rsc_order id="ha4_fabric_ping_before_fencing_route_to_ha3"
score="INFINITY" first="ha4_fabric_ping" first-action="start"
then="fencing_route_to_ha3" then-action="start"/>
    </constraints>
  </configuration>
  <status>
    <node_state id="1" uname="ha3" in_ccm="true" crmd="online"
crm-debug-origin="do_update_resource" join="member" expected="member">
      <lrm id="1">
        <lrm_resources>
          <lrm_resource id="ha3_fabric_ping" type="ping" class="ocf"
provider="pacemaker">
            <lrm_rsc_op id="ha3_fabric_ping_last_0"
operation_key="ha3_fabric_ping_stop_0" operation="stop"
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9"
transition-key="1:1:0:81a6b215-3955-42b9-871b-9d127ef97e40"
transition-magic="0:0;1:1:0:81a6b215-3955-42b9-871b-9d127ef97e40"
call-id="19" rc-code="0" op-status="0" interval="0" last-run="1404243473"
last-rc-change="1404243473" exec-time="63" queue-time="0"
op-digest="91b00b3fe95f23582466d18e42c4fd58" on_node="ha3"/>
            <lrm_rsc_op id="ha3_fabric_ping_last_failure_0"
operation_key="ha3_fabric_ping_start_0" operation="start"
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9"
transition-key="8:0:0:81a6b215-3955-42b9-871b-9d127ef97e40"
transition-magic="0:1;8:0:0:81a6b215-3955-42b9-871b-9d127ef97e40"
call-id="18" rc-code="1" op-status="0" interval="0" last-run="1404243413"
last-rc-change="1404243413" exec-time="20027" queue-time="0"
op-digest="ddf4bee6852a62c7efcf52cf7471d629"/>
          </lrm_resource>
          <lrm_resource id="ha4_fabric_ping" type="ping" class="ocf"
provider="pacemaker">
            <lrm_rsc_op id="ha4_fabric_ping_last_0"
operation_key="ha4_fabric_ping_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9"
transition-key="5:0:7:81a6b215-3955-42b9-871b-9d127ef97e40"
transition-magic="0:7;5:0:7:81a6b215-3955-42b9-871b-9d127ef97e40"
call-id="9" rc-code="7" op-status="0" interval="0" last-run="1404243413"
last-rc-change="1404243413" exec-time="8" queue-time="0"
op-digest="91b00b3fe95f23582466d18e42c4fd58" on_node="ha3"/>
          </lrm_resource>
          <lrm_resource id="fencing_route_to_ha3" type="meatware"
class="stonith">
            <lrm_rsc_op id="fencing_route_to_ha3_last_0"
operation_key="fencing_route_to_ha3_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9"
transition-key="6:0:7:81a6b215-3955-42b9-871b-9d127ef97e40"
transition-magic="0:7;6:0:7:81a6b215-3955-42b9-871b-9d127ef97e40"
call-id="13" rc-code="7" op-status="0" interval="0" last-run="1404243413"
last-rc-change="1404243413" exec-time="1" queue-time="0"
op-digest="502fbd7a2366c2be772d7fbecc9e0351" on_node="ha3"/>
          </lrm_resource>
          <lrm_resource id="fencing_route_to_ha4" type="meatware"
class="stonith">
            <lrm_rsc_op id="fencing_route_to_ha4_last_0"
operation_key="fencing_route_to_ha4_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9"
transition-key="7:0:7:81a6b215-3955-42b9-871b-9d127ef97e40"
transition-magic="0:7;7:0:7:81a6b215-3955-42b9-871b-9d127ef97e40"
call-id="17" rc-code="7" op-status="0" interval="0" last-run="1404243413"
last-rc-change="1404243413" exec-time="0" queue-time="0"
op-digest="5be26fbcfd648e3d545d0115645dde76" on_node="ha3"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
      <transient_attributes id="1">
        <instance_attributes id="status-1">
          <nvpair id="status-1-shutdown" name="shutdown" value="0"/>
          <nvpair id="status-1-probe_complete" name="probe_complete"
value="true"/>
          <nvpair id="status-1-fail-count-ha3_fabric_ping"
name="fail-count-ha3_fabric_ping" value="INFINITY"/>
          <nvpair id="status-1-last-failure-ha3_fabric_ping"
name="last-failure-ha3_fabric_ping" value="1404243433"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>


(PS: I added extra logging to your patch, which can be seen in the log file
below.)
    if (action->needs == rsc_req_nothing) {
        crm_notice("%s needs nothing", action->uuid);
    } else if (action->needs == rsc_req_stonith) {
        crm_notice("%s needs stonith", action->uuid);
        order_actions(stonith_done, action, pe_order_optional);
    }
/var/log/messages
Jul  1 14:34:56 ha3 corosync[4638]: [QB    ] withdrawing server sockets
Jul  1 14:34:56 ha3 corosync[4638]: [SERV  ] Service engine unloaded:
corosync cluster quorum service v0.1
Jul  1 14:34:56 ha3 corosync[4638]: [SERV  ] Service engine unloaded:
corosync profile loading service
Jul  1 14:34:56 ha3 corosync[4638]: [MAIN  ] Corosync Cluster Engine
exiting normally
Jul  1 14:34:57 ha3 corosync: Waiting for corosync services to
unload:.[  OK  ]
Jul  1 14:34:57 ha3 systemd: Stopped LSB: Starts and stops Corosync Cluster
Engine..
Jul  1 14:35:01 ha3 systemd-logind: Removed session 11.
Jul  1 14:35:37 ha3 systemd: Starting Session 12 of user root.
Jul  1 14:35:37 ha3 systemd: Started Session 12 of user root.
Jul  1 14:35:37 ha3 systemd-logind: New session 12 of user root.
Jul  1 14:36:24 ha3 systemd: Starting LSB: Starts and stops Corosync
Cluster Engine....
Jul  1 14:36:24 ha3 corosync[4924]: [MAIN  ] Corosync Cluster Engine
('2.3.3'): started and ready to provide service.
Jul  1 14:36:24 ha3 corosync[4924]: [MAIN  ] Corosync built-in features:
pie relro bindnow
Jul  1 14:36:24 ha3 corosync[4925]: [TOTEM ] Initializing transport (UDP/IP
Unicast).
Jul  1 14:36:24 ha3 corosync[4925]: [TOTEM ] Initializing transmit/receive
security (NSS) crypto: none hash: none
Jul  1 14:36:25 ha3 corosync[4925]: [TOTEM ] The network interface
[10.10.0.14] is now up.
Jul  1 14:36:25 ha3 corosync[4925]: [SERV  ] Service engine loaded:
corosync configuration map access [0]
Jul  1 14:36:25 ha3 corosync[4925]: [QB    ] server name: cmap
Jul  1 14:36:25 ha3 corosync[4925]: [SERV  ] Service engine loaded:
corosync configuration service [1]
Jul  1 14:36:25 ha3 corosync[4925]: [QB    ] server name: cfg
Jul  1 14:36:25 ha3 corosync[4925]: [SERV  ] Service engine loaded:
corosync cluster closed process group service v1.01 [2]
Jul  1 14:36:25 ha3 corosync[4925]: [QB    ] server name: cpg
Jul  1 14:36:25 ha3 corosync[4925]: [SERV  ] Service engine loaded:
corosync profile loading service [4]
Jul  1 14:36:25 ha3 corosync[4925]: [QUORUM] Using quorum provider
corosync_votequorum
Jul  1 14:36:25 ha3 corosync[4925]: [SERV  ] Service engine loaded:
corosync vote quorum service v1.0 [5]
Jul  1 14:36:25 ha3 corosync[4925]: [QB    ] server name: votequorum
Jul  1 14:36:25 ha3 corosync[4925]: [SERV  ] Service engine loaded:
corosync cluster quorum service v0.1 [3]
Jul  1 14:36:25 ha3 corosync[4925]: [QB    ] server name: quorum
Jul  1 14:36:25 ha3 corosync[4925]: [TOTEM ] adding new UDPU member
{10.10.0.14}
Jul  1 14:36:25 ha3 corosync[4925]: [TOTEM ] adding new UDPU member
{10.10.0.15}
Jul  1 14:36:25 ha3 corosync[4925]: [TOTEM ] A new membership
(10.10.0.14:2420) was formed. Members joined: 1
Jul  1 14:36:25 ha3 corosync[4925]: [QUORUM] Members[1]: 1
Jul  1 14:36:25 ha3 corosync[4925]: [MAIN  ] Completed service
synchronization, ready to provide service.
Jul  1 14:36:25 ha3 corosync: Starting Corosync Cluster Engine (corosync):
[  OK  ]
Jul  1 14:36:25 ha3 systemd: Started LSB: Starts and stops Corosync Cluster
Engine..
Jul  1 14:36:30 ha3 systemd: Starting LSB: Starts and stops Pacemaker
Cluster Manager....
Jul  1 14:36:30 ha3 pacemaker: Starting Pacemaker Cluster Manager
Jul  1 14:36:30 ha3 pacemakerd[4953]: notice: crm_add_logfile: Additional
logging available in /var/log/pacemaker.log
Jul  1 14:36:30 ha3 pacemakerd[4953]: notice: mcp_read_config: Configured
corosync to accept connections from group 1000: OK (1)
Jul  1 14:36:30 ha3 pacemakerd[4953]: notice: main: Starting Pacemaker
1.1.11 (Build: 2a5bbf9):  agent-manpages ncurses libqb-logging libqb-ipc
lha-fencing nagios  corosync-native libesmtp acls
Jul  1 14:36:30 ha3 pacemakerd[4953]: notice: cluster_connect_quorum:
Quorum lost
Jul  1 14:36:30 ha3 pacemakerd[4953]: notice: crm_update_peer_state:
pcmk_quorum_notification: Node ha3[1] - state is now member (was (null))
Jul  1 14:36:30 ha3 crmd[4960]: notice: crm_add_logfile: Additional logging
available in /var/log/pacemaker.log
Jul  1 14:36:30 ha3 crmd[4960]: notice: main: CRM Git Version: 2a5bbf9
Jul  1 14:36:30 ha3 crmd[4960]: warning:
crm_is_writable: /var/lib/pacemaker/pengine should be owned and r/w by
group haclient
Jul  1 14:36:30 ha3 crmd[4960]: warning:
crm_is_writable: /var/lib/pacemaker/cib should be owned and r/w by group
haclient
Jul  1 14:36:30 ha3 stonith-ng[4956]: notice: crm_add_logfile: Additional
logging available in /var/log/pacemaker.log
Jul  1 14:36:30 ha3 stonith-ng[4956]: notice: crm_cluster_connect:
Connecting to cluster infrastructure: corosync
Jul  1 14:36:30 ha3 lrmd[4957]: notice: crm_add_logfile: Additional logging
available in /var/log/pacemaker.log
Jul  1 14:36:30 ha3 attrd[4958]: notice: crm_add_logfile: Additional
logging available in /var/log/pacemaker.log
Jul  1 14:36:30 ha3 attrd[4958]: notice: crm_cluster_connect: Connecting to
cluster infrastructure: corosync
Jul  1 14:36:30 ha3 cib[4955]: notice: crm_add_logfile: Additional logging
available in /var/log/pacemaker.log
Jul  1 14:36:30 ha3 cib[4955]: warning:
crm_is_writable: /var/lib/pacemaker/cib should be owned and r/w by group
haclient
Jul  1 14:36:30 ha3 cib[4955]: notice: crm_cluster_connect: Connecting to
cluster infrastructure: corosync
Jul  1 14:36:30 ha3 pengine[4959]: notice: crm_add_logfile: Additional
logging available in /var/log/pacemaker.log
Jul  1 14:36:30 ha3 pengine[4959]: warning:
crm_is_writable: /var/lib/pacemaker/pengine should be owned and r/w by
group haclient
Jul  1 14:36:30 ha3 attrd[4958]: notice: crm_update_peer_state:
attrd_peer_change_cb: Node ha3[1] - state is now member (was (null))
Jul  1 14:36:31 ha3 crmd[4960]: notice: crm_cluster_connect: Connecting to
cluster infrastructure: corosync
Jul  1 14:36:31 ha3 crmd[4960]: notice: cluster_connect_quorum: Quorum lost
Jul  1 14:36:31 ha3 crmd[4960]: notice: crm_update_peer_state:
pcmk_quorum_notification: Node ha3[1] - state is now member (was (null))
Jul  1 14:36:31 ha3 crmd[4960]: notice: do_started: The local CRM is
operational
Jul  1 14:36:31 ha3 crmd[4960]: notice: do_state_transition: State
transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL
origin=do_started ]
Jul  1 14:36:31 ha3 stonith-ng[4956]: notice: setup_cib: Watching for
stonith topology changes
Jul  1 14:36:31 ha3 stonith-ng[4956]: notice: unpack_config: On loss of CCM
Quorum: Ignore
Jul  1 14:36:32 ha3 stonith-ng[4956]: notice: stonith_device_register:
Added 'fencing_route_to_ha4' to the device list (1 active devices)
Jul  1 14:36:35 ha3 pacemaker: Starting Pacemaker Cluster Manager[  OK  ]
Jul  1 14:36:35 ha3 systemd: Started LSB: Starts and stops Pacemaker
Cluster Manager..
Jul  1 14:36:52 ha3 crmd[4960]: warning: do_log: FSA: Input I_DC_TIMEOUT
from crm_timer_popped() received in state S_PENDING
Jul  1 14:36:52 ha3 crmd[4960]: notice: do_state_transition: State
transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC
cause=C_TIMER_POPPED origin=election_timeout_popped ]
Jul  1 14:36:52 ha3 crmd[4960]: warning: do_log: FSA: Input I_ELECTION_DC
from do_election_check() received in state S_INTEGRATION
Jul  1 14:36:53 ha3 pengine[4959]: notice: unpack_config: On loss of CCM
Quorum: Ignore
Jul  1 14:36:53 ha3 pengine[4959]: warning: stage6: Scheduling Node ha4 for
STONITH
Jul  1 14:36:53 ha3 pengine[4959]: notice: native_start_constraints:
ha3_fabric_ping_monitor_15000 needs nothing
Jul  1 14:36:53 ha3 pengine[4959]: notice: native_start_constraints:
ha3_fabric_ping_start_0 needs nothing
Jul  1 14:36:53 ha3 pengine[4959]: notice: native_start_constraints:
ha3_fabric_ping_monitor_0 needs nothing
Jul  1 14:36:53 ha3 pengine[4959]: notice: native_start_constraints:
ha4_fabric_ping_monitor_0 needs nothing
Jul  1 14:36:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha3_monitor_0 needs nothing
Jul  1 14:36:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha4_start_0 needs nothing
Jul  1 14:36:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha4_monitor_0 needs nothing
Jul  1 14:36:53 ha3 pengine[4959]: notice: LogActions: Start
ha3_fabric_ping	(ha3)
Jul  1 14:36:53 ha3 pengine[4959]: notice: LogActions: Start
fencing_route_to_ha4	(ha3)
Jul  1 14:36:53 ha3 crmd[4960]: notice: te_rsc_command: Initiating action
4: monitor ha3_fabric_ping_monitor_0 on ha3 (local)
Jul  1 14:36:53 ha3 crmd[4960]: notice: te_fence_node: Executing reboot
fencing operation (12) on ha4 (timeout=60000)
Jul  1 14:36:53 ha3 stonith-ng[4956]: notice: handle_request: Client
crmd.4960.55d3ab19 wants to fence (reboot) 'ha4' with device '(any)'
Jul  1 14:36:53 ha3 stonith-ng[4956]: notice: initiate_remote_stonith_op:
Initiating remote operation reboot for ha4:
3eb51036-6c4a-40ee-a0dc-bc1838cf13df (0)
Jul  1 14:36:53 ha3 pengine[4959]: warning: process_pe_message: Calculated
Transition 0: /var/lib/pacemaker/pengine/pe-warn-202.bz2
Jul  1 14:36:53 ha3 stonith: [4972]: info: parse config info info=ha4
Jul  1 14:36:53 ha3 stonith: [4972]: info: meatware device OK.
Jul  1 14:36:53 ha3 stonith: [4977]: info: parse config info info=ha4
Jul  1 14:36:53 ha3 stonith: [4977]: info: meatware device OK.
Jul  1 14:36:53 ha3 stonith: [4983]: info: parse config info info=ha4
Jul  1 14:36:53 ha3 stonith: [4983]: CRIT: OPERATOR INTERVENTION REQUIRED
to reset ha4.
Jul  1 14:36:53 ha3 stonith: [4983]: CRIT: Run "meatclient -c ha4" AFTER
power-cycling the machine.
Jul  1 14:36:53 ha3 crmd[4960]: notice: process_lrm_event: Operation
ha3_fabric_ping_monitor_0: not running (node=ha3, call=5, rc=7,
cib-update=25, confirmed=true)
Jul  1 14:36:53 ha3 crmd[4960]: notice: te_rsc_command: Initiating action
5: monitor ha4_fabric_ping_monitor_0 on ha3 (local)
Jul  1 14:36:53 ha3 crmd[4960]: notice: process_lrm_event: Operation
ha4_fabric_ping_monitor_0: not running (node=ha3, call=9, rc=7,
cib-update=26, confirmed=true)
Jul  1 14:36:53 ha3 crmd[4960]: notice: te_rsc_command: Initiating action
6: monitor fencing_route_to_ha3_monitor_0 on ha3 (local)
Jul  1 14:36:53 ha3 crmd[4960]: notice: process_lrm_event: Operation
fencing_route_to_ha3_monitor_0: not running (node=ha3, call=13, rc=7,
cib-update=27, confirmed=true)
Jul  1 14:36:53 ha3 crmd[4960]: notice: te_rsc_command: Initiating action
7: monitor fencing_route_to_ha4_monitor_0 on ha3 (local)
Jul  1 14:36:53 ha3 crmd[4960]: notice: process_lrm_event: Operation
fencing_route_to_ha4_monitor_0: not running (node=ha3, call=17, rc=7,
cib-update=28, confirmed=true)
Jul  1 14:36:53 ha3 crmd[4960]: notice: te_rsc_command: Initiating action
3: probe_complete probe_complete-ha3 on ha3 (local) - no waiting
Jul  1 14:36:53 ha3 crmd[4960]: notice: te_rsc_command: Initiating action
8: start ha3_fabric_ping_start_0 on ha3 (local)
Jul  1 14:36:53 ha3 crmd[4960]: notice: abort_transition_graph: Transition
aborted by status-1-probe_complete, probe_complete=true: Transient
attribute change (create cib=0.10.9, source=te_update_diff:391,
path=/cib/status/node_state[@id='1']/transient_attributes
[@id='1']/instance_attributes[@id='status-1'], 0)
Jul  1 14:37:13 ha3 ping(ha3_fabric_ping)[5003]: WARNING: pingd is less
than failure_score(1)
Jul  1 14:37:13 ha3 crmd[4960]: notice: process_lrm_event: Operation
ha3_fabric_ping_start_0: unknown error (node=ha3, call=18, rc=1,
cib-update=29, confirmed=true)
Jul  1 14:37:13 ha3 crmd[4960]: warning: status_from_rc: Action 8
(ha3_fabric_ping_start_0) on ha3 failed (target: 0 vs. rc: 1): Error
Jul  1 14:37:13 ha3 crmd[4960]: warning: update_failcount: Updating
failcount for ha3_fabric_ping on ha3 after failed start: rc=1
(update=INFINITY, time=1404243433)
Jul  1 14:37:13 ha3 crmd[4960]: warning: update_failcount: Updating
failcount for ha3_fabric_ping on ha3 after failed start: rc=1
(update=INFINITY, time=1404243433)
Jul  1 14:37:13 ha3 crmd[4960]: warning: status_from_rc: Action 8
(ha3_fabric_ping_start_0) on ha3 failed (target: 0 vs. rc: 1): Error
Jul  1 14:37:13 ha3 crmd[4960]: warning: update_failcount: Updating
failcount for ha3_fabric_ping on ha3 after failed start: rc=1
(update=INFINITY, time=1404243433)
Jul  1 14:37:13 ha3 crmd[4960]: warning: update_failcount: Updating
failcount for ha3_fabric_ping on ha3 after failed start: rc=1
(update=INFINITY, time=1404243433)
Jul  1 14:37:53 ha3 stonith-ng[4956]: notice: stonith_action_async_done:
Child process 4979 performing action 'reboot' timed out with signal 15
Jul  1 14:37:53 ha3 stonith-ng[4956]: error: log_operation: Operation
'reboot' [4979] (call 2 from crmd.4960) for host 'ha4' with device
'fencing_route_to_ha4' returned: -62 (Timer expired)
Jul  1 14:37:53 ha3 stonith-ng[4956]: warning: log_operation:
fencing_route_to_ha4:4979 [ Performing: stonith -t meatware -T reset ha4 ]
Jul  1 14:37:53 ha3 stonith-ng[4956]: warning: get_xpath_object: No match
for //@st_delegate in /st-reply
Jul  1 14:37:53 ha3 stonith-ng[4956]: error: remote_op_done: Operation
reboot of ha4 by ha3 for crmd.4960 at ha3.3eb51036: Timer expired
Jul  1 14:37:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 2/12:0:0:81a6b215-3955-42b9-871b-9d127ef97e40: Timer expired
(-62)
Jul  1 14:37:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 2 for ha4 failed (Timer expired): aborting transition.
Jul  1 14:37:53 ha3 crmd[4960]: notice: tengine_stonith_notify: Peer ha4
was not terminated (reboot) by ha3 for ha3: Timer expired
(ref=3eb51036-6c4a-40ee-a0dc-bc1838cf13df) by client crmd.4960
Jul  1 14:37:53 ha3 crmd[4960]: notice: run_graph: Transition 0
(Complete=8, Pending=0, Fired=0, Skipped=4, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-202.bz2): Stopped
Jul  1 14:37:53 ha3 pengine[4959]: notice: unpack_config: On loss of CCM
Quorum: Ignore
Jul  1 14:37:53 ha3 pengine[4959]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
Jul  1 14:37:53 ha3 pengine[4959]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
Jul  1 14:37:53 ha3 pengine[4959]: warning: common_apply_stickiness:
Forcing ha3_fabric_ping away from ha3 after 1000000 failures (max=1000000)
Jul  1 14:37:53 ha3 pengine[4959]: warning: stage6: Scheduling Node ha4 for
STONITH
Jul  1 14:37:53 ha3 pengine[4959]: notice: native_start_constraints:
ha3_fabric_ping_stop_0 needs nothing
Jul  1 14:37:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha4_start_0 needs nothing
Jul  1 14:37:53 ha3 pengine[4959]: notice: LogActions: Stop
ha3_fabric_ping	(ha3)
Jul  1 14:37:53 ha3 pengine[4959]: notice: LogActions: Start
fencing_route_to_ha4	(ha3 - blocked)
Jul  1 14:37:53 ha3 pengine[4959]: warning: process_pe_message: Calculated
Transition 1: /var/lib/pacemaker/pengine/pe-warn-203.bz2
Jul  1 14:37:53 ha3 crmd[4960]: notice: te_rsc_command: Initiating action
1: stop ha3_fabric_ping_stop_0 on ha3 (local)
Jul  1 14:37:53 ha3 crmd[4960]: notice: te_fence_node: Executing reboot
fencing operation (7) on ha4 (timeout=60000)
Jul  1 14:37:53 ha3 stonith-ng[4956]: notice: handle_request: Client
crmd.4960.55d3ab19 wants to fence (reboot) 'ha4' with device '(any)'
Jul  1 14:37:53 ha3 stonith-ng[4956]: notice: initiate_remote_stonith_op:
Initiating remote operation reboot for ha4:
a52da012-efbc-448b-843a-9f85d828b9af (0)
Jul  1 14:37:53 ha3 stonith: [5040]: info: parse config info info=ha4
Jul  1 14:37:53 ha3 stonith: [5040]: info: meatware device OK.
Jul  1 14:37:53 ha3 stonith: [5045]: info: parse config info info=ha4
Jul  1 14:37:53 ha3 stonith: [5045]: info: meatware device OK.
Jul  1 14:37:53 ha3 stonith: [5051]: info: parse config info info=ha4
Jul  1 14:37:53 ha3 stonith: [5051]: CRIT: OPERATOR INTERVENTION REQUIRED
to reset ha4.
Jul  1 14:37:53 ha3 stonith: [5051]: CRIT: Run "meatclient -c ha4" AFTER
power-cycling the machine.
Jul  1 14:37:53 ha3 crmd[4960]: notice: process_lrm_event: Operation
ha3_fabric_ping_stop_0: ok (node=ha3, call=19, rc=0, cib-update=31,
confirmed=true)
Jul  1 14:37:58 ha3 crmd[4960]: notice: abort_transition_graph: Transition
aborted by deletion of nvpair[@id='status-1-pingd']: Transient attribute
change (cib=0.10.15, source=te_update_diff:391, path=/cib/status/node_state
[@id='1']/transient_attributes[@id='1']/instance_attributes
[@id='status-1']/nvpair[@id='status-1-pingd'], 0)
Jul  1 14:38:53 ha3 stonith-ng[4956]: notice: stonith_action_async_done:
Child process 5047 performing action 'reboot' timed out with signal 15
Jul  1 14:38:53 ha3 stonith-ng[4956]: error: log_operation: Operation
'reboot' [5047] (call 3 from crmd.4960) for host 'ha4' with device
'fencing_route_to_ha4' returned: -62 (Timer expired)
Jul  1 14:38:53 ha3 stonith-ng[4956]: warning: log_operation:
fencing_route_to_ha4:5047 [ Performing: stonith -t meatware -T reset ha4 ]
Jul  1 14:38:53 ha3 stonith-ng[4956]: warning: get_xpath_object: No match
for //@st_delegate in /st-reply
Jul  1 14:38:53 ha3 stonith-ng[4956]: error: remote_op_done: Operation
reboot of ha4 by ha3 for crmd.4960 at ha3.a52da012: Timer expired
Jul  1 14:38:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 3/7:1:0:81a6b215-3955-42b9-871b-9d127ef97e40: Timer expired (-62)
Jul  1 14:38:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 3 for ha4 failed (Timer expired): aborting transition.
Jul  1 14:38:53 ha3 crmd[4960]: notice: tengine_stonith_notify: Peer ha4
was not terminated (reboot) by ha3 for ha3: Timer expired
(ref=a52da012-efbc-448b-843a-9f85d828b9af) by client crmd.4960
Jul  1 14:38:53 ha3 crmd[4960]: notice: run_graph: Transition 1
(Complete=2, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-203.bz2): Stopped
Jul  1 14:38:53 ha3 pengine[4959]: notice: unpack_config: On loss of CCM
Quorum: Ignore
Jul  1 14:38:53 ha3 pengine[4959]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
Jul  1 14:38:53 ha3 pengine[4959]: warning: common_apply_stickiness:
Forcing ha3_fabric_ping away from ha3 after 1000000 failures (max=1000000)
Jul  1 14:38:53 ha3 pengine[4959]: warning: stage6: Scheduling Node ha4 for
STONITH
Jul  1 14:38:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha4_start_0 needs nothing
Jul  1 14:38:53 ha3 pengine[4959]: notice: LogActions: Start
fencing_route_to_ha4	(ha3 - blocked)
Jul  1 14:38:53 ha3 crmd[4960]: notice: te_fence_node: Executing reboot
fencing operation (6) on ha4 (timeout=60000)
Jul  1 14:38:53 ha3 stonith-ng[4956]: notice: handle_request: Client
crmd.4960.55d3ab19 wants to fence (reboot) 'ha4' with device '(any)'
Jul  1 14:38:53 ha3 stonith-ng[4956]: notice: initiate_remote_stonith_op:
Initiating remote operation reboot for ha4:
7d74f7e7-354c-4aed-805e-376d78a268d6 (0)
Jul  1 14:38:53 ha3 pengine[4959]: warning: process_pe_message: Calculated
Transition 2: /var/lib/pacemaker/pengine/pe-warn-204.bz2
Jul  1 14:38:53 ha3 stonith: [5056]: info: parse config info info=ha4
Jul  1 14:38:53 ha3 stonith: [5056]: info: meatware device OK.
Jul  1 14:38:53 ha3 stonith: [5058]: info: parse config info info=ha4
Jul  1 14:38:53 ha3 stonith: [5058]: info: meatware device OK.
Jul  1 14:38:53 ha3 stonith: [5060]: info: parse config info info=ha4
Jul  1 14:38:53 ha3 stonith: [5060]: CRIT: OPERATOR INTERVENTION REQUIRED
to reset ha4.
Jul  1 14:38:53 ha3 stonith: [5060]: CRIT: Run "meatclient -c ha4" AFTER
power-cycling the machine.
Jul  1 14:39:53 ha3 stonith-ng[4956]: notice: stonith_action_async_done:
Child process 5059 performing action 'reboot' timed out with signal 15
Jul  1 14:39:53 ha3 stonith-ng[4956]: error: log_operation: Operation
'reboot' [5059] (call 4 from crmd.4960) for host 'ha4' with device
'fencing_route_to_ha4' returned: -62 (Timer expired)
Jul  1 14:39:53 ha3 stonith-ng[4956]: warning: log_operation:
fencing_route_to_ha4:5059 [ Performing: stonith -t meatware -T reset ha4 ]
Jul  1 14:39:53 ha3 stonith-ng[4956]: warning: get_xpath_object: No match
for //@st_delegate in /st-reply
Jul  1 14:39:53 ha3 stonith-ng[4956]: error: remote_op_done: Operation
reboot of ha4 by ha3 for crmd.4960 at ha3.7d74f7e7: Timer expired
Jul  1 14:39:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 4/6:2:0:81a6b215-3955-42b9-871b-9d127ef97e40: Timer expired (-62)
Jul  1 14:39:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 4 for ha4 failed (Timer expired): aborting transition.
Jul  1 14:39:53 ha3 crmd[4960]: notice: abort_transition_graph: Transition
aborted: Stonith failed (source=tengine_stonith_callback:697, 0)
Jul  1 14:39:53 ha3 crmd[4960]: notice: tengine_stonith_notify: Peer ha4
was not terminated (reboot) by ha3 for ha3: Timer expired
(ref=7d74f7e7-354c-4aed-805e-376d78a268d6) by client crmd.4960
Jul  1 14:39:53 ha3 crmd[4960]: notice: run_graph: Transition 2
(Complete=1, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-204.bz2): Stopped
Jul  1 14:39:53 ha3 pengine[4959]: notice: unpack_config: On loss of CCM
Quorum: Ignore
Jul  1 14:39:53 ha3 pengine[4959]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
Jul  1 14:39:53 ha3 pengine[4959]: warning: common_apply_stickiness:
Forcing ha3_fabric_ping away from ha3 after 1000000 failures (max=1000000)
Jul  1 14:39:53 ha3 pengine[4959]: warning: stage6: Scheduling Node ha4 for
STONITH
Jul  1 14:39:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha4_start_0 needs nothing
Jul  1 14:39:53 ha3 pengine[4959]: notice: LogActions: Start
fencing_route_to_ha4	(ha3 - blocked)
Jul  1 14:39:53 ha3 pengine[4959]: warning: process_pe_message: Calculated
Transition 3: /var/lib/pacemaker/pengine/pe-warn-204.bz2
Jul  1 14:39:53 ha3 crmd[4960]: notice: te_fence_node: Executing reboot
fencing operation (6) on ha4 (timeout=60000)
Jul  1 14:39:53 ha3 stonith-ng[4956]: notice: handle_request: Client
crmd.4960.55d3ab19 wants to fence (reboot) 'ha4' with device '(any)'
Jul  1 14:39:53 ha3 stonith-ng[4956]: notice: initiate_remote_stonith_op:
Initiating remote operation reboot for ha4:
2788d6bb-ac17-450c-beba-10944495a476 (0)
Jul  1 14:39:53 ha3 stonith: [5062]: info: parse config info info=ha4
Jul  1 14:39:53 ha3 stonith: [5062]: info: meatware device OK.
Jul  1 14:39:53 ha3 stonith: [5064]: info: parse config info info=ha4
Jul  1 14:39:53 ha3 stonith: [5064]: info: meatware device OK.
Jul  1 14:39:53 ha3 stonith: [5066]: info: parse config info info=ha4
Jul  1 14:39:53 ha3 stonith: [5066]: CRIT: OPERATOR INTERVENTION REQUIRED
to reset ha4.
Jul  1 14:39:53 ha3 stonith: [5066]: CRIT: Run "meatclient -c ha4" AFTER
power-cycling the machine.
Jul  1 14:40:53 ha3 stonith-ng[4956]: notice: stonith_action_async_done:
Child process 5065 performing action 'reboot' timed out with signal 15
Jul  1 14:40:53 ha3 stonith-ng[4956]: error: log_operation: Operation
'reboot' [5065] (call 5 from crmd.4960) for host 'ha4' with device
'fencing_route_to_ha4' returned: -62 (Timer expired)
Jul  1 14:40:53 ha3 stonith-ng[4956]: warning: log_operation:
fencing_route_to_ha4:5065 [ Performing: stonith -t meatware -T reset ha4 ]
Jul  1 14:40:53 ha3 stonith-ng[4956]: warning: get_xpath_object: No match
for //@st_delegate in /st-reply
Jul  1 14:40:53 ha3 stonith-ng[4956]: error: remote_op_done: Operation
reboot of ha4 by ha3 for crmd.4960 at ha3.2788d6bb: Timer expired
Jul  1 14:40:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 5/6:3:0:81a6b215-3955-42b9-871b-9d127ef97e40: Timer expired (-62)
Jul  1 14:40:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 5 for ha4 failed (Timer expired): aborting transition.
Jul  1 14:40:53 ha3 crmd[4960]: notice: abort_transition_graph: Transition
aborted: Stonith failed (source=tengine_stonith_callback:697, 0)
Jul  1 14:40:53 ha3 crmd[4960]: notice: tengine_stonith_notify: Peer ha4
was not terminated (reboot) by ha3 for ha3: Timer expired
(ref=2788d6bb-ac17-450c-beba-10944495a476) by client crmd.4960
Jul  1 14:40:53 ha3 crmd[4960]: notice: run_graph: Transition 3
(Complete=1, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-204.bz2): Stopped
Jul  1 14:40:53 ha3 pengine[4959]: notice: unpack_config: On loss of CCM
Quorum: Ignore
Jul  1 14:40:53 ha3 pengine[4959]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
Jul  1 14:40:53 ha3 pengine[4959]: warning: common_apply_stickiness:
Forcing ha3_fabric_ping away from ha3 after 1000000 failures (max=1000000)
Jul  1 14:40:53 ha3 pengine[4959]: warning: stage6: Scheduling Node ha4 for
STONITH
Jul  1 14:40:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha4_start_0 needs nothing
Jul  1 14:40:53 ha3 pengine[4959]: notice: LogActions: Start
fencing_route_to_ha4	(ha3 - blocked)
Jul  1 14:40:53 ha3 pengine[4959]: warning: process_pe_message: Calculated
Transition 4: /var/lib/pacemaker/pengine/pe-warn-204.bz2
Jul  1 14:40:53 ha3 crmd[4960]: notice: te_fence_node: Executing reboot
fencing operation (6) on ha4 (timeout=60000)
Jul  1 14:40:53 ha3 stonith-ng[4956]: notice: handle_request: Client
crmd.4960.55d3ab19 wants to fence (reboot) 'ha4' with device '(any)'
Jul  1 14:40:53 ha3 stonith-ng[4956]: notice: initiate_remote_stonith_op:
Initiating remote operation reboot for ha4:
f854d478-2620-4662-bd78-068921d554c2 (0)
Jul  1 14:40:53 ha3 stonith: [5068]: info: parse config info info=ha4
Jul  1 14:40:53 ha3 stonith: [5068]: info: meatware device OK.
Jul  1 14:40:53 ha3 stonith: [5070]: info: parse config info info=ha4
Jul  1 14:40:53 ha3 stonith: [5070]: info: meatware device OK.
Jul  1 14:40:53 ha3 stonith: [5072]: info: parse config info info=ha4
Jul  1 14:40:53 ha3 stonith: [5072]: CRIT: OPERATOR INTERVENTION REQUIRED
to reset ha4.
Jul  1 14:40:53 ha3 stonith: [5072]: CRIT: Run "meatclient -c ha4" AFTER
power-cycling the machine.
Jul  1 14:41:53 ha3 stonith-ng[4956]: notice: stonith_action_async_done:
Child process 5071 performing action 'reboot' timed out with signal 15
Jul  1 14:41:53 ha3 stonith-ng[4956]: error: log_operation: Operation
'reboot' [5071] (call 6 from crmd.4960) for host 'ha4' with device
'fencing_route_to_ha4' returned: -62 (Timer expired)
Jul  1 14:41:53 ha3 stonith-ng[4956]: warning: log_operation:
fencing_route_to_ha4:5071 [ Performing: stonith -t meatware -T reset ha4 ]
Jul  1 14:41:53 ha3 stonith-ng[4956]: warning: get_xpath_object: No match
for //@st_delegate in /st-reply
Jul  1 14:41:53 ha3 stonith-ng[4956]: error: remote_op_done: Operation
reboot of ha4 by ha3 for crmd.4960 at ha3.f854d478: Timer expired
Jul  1 14:41:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 6/6:4:0:81a6b215-3955-42b9-871b-9d127ef97e40: Timer expired (-62)
Jul  1 14:41:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 6 for ha4 failed (Timer expired): aborting transition.
Jul  1 14:41:53 ha3 crmd[4960]: notice: abort_transition_graph: Transition
aborted: Stonith failed (source=tengine_stonith_callback:697, 0)
Jul  1 14:41:53 ha3 crmd[4960]: notice: tengine_stonith_notify: Peer ha4
was not terminated (reboot) by ha3 for ha3: Timer expired
(ref=f854d478-2620-4662-bd78-068921d554c2) by client crmd.4960
Jul  1 14:41:53 ha3 crmd[4960]: notice: run_graph: Transition 4
(Complete=1, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-204.bz2): Stopped
Jul  1 14:41:53 ha3 pengine[4959]: notice: unpack_config: On loss of CCM
Quorum: Ignore
Jul  1 14:41:53 ha3 pengine[4959]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
Jul  1 14:41:53 ha3 pengine[4959]: warning: common_apply_stickiness:
Forcing ha3_fabric_ping away from ha3 after 1000000 failures (max=1000000)
Jul  1 14:41:53 ha3 pengine[4959]: warning: stage6: Scheduling Node ha4 for
STONITH
Jul  1 14:41:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha4_start_0 needs nothing
Jul  1 14:41:53 ha3 pengine[4959]: notice: LogActions: Start
fencing_route_to_ha4	(ha3 - blocked)
Jul  1 14:41:53 ha3 pengine[4959]: warning: process_pe_message: Calculated
Transition 5: /var/lib/pacemaker/pengine/pe-warn-204.bz2
Jul  1 14:41:53 ha3 crmd[4960]: notice: te_fence_node: Executing reboot
fencing operation (6) on ha4 (timeout=60000)
Jul  1 14:41:53 ha3 stonith-ng[4956]: notice: handle_request: Client
crmd.4960.55d3ab19 wants to fence (reboot) 'ha4' with device '(any)'
Jul  1 14:41:53 ha3 stonith-ng[4956]: notice: initiate_remote_stonith_op:
Initiating remote operation reboot for ha4:
4670e736-4d12-4ebf-a3f4-3c267384bbec (0)
Jul  1 14:41:53 ha3 stonith: [5075]: info: parse config info info=ha4
Jul  1 14:41:53 ha3 stonith: [5075]: info: meatware device OK.
Jul  1 14:41:53 ha3 stonith: [5077]: info: parse config info info=ha4
Jul  1 14:41:53 ha3 stonith: [5077]: info: meatware device OK.
Jul  1 14:41:53 ha3 stonith: [5079]: info: parse config info info=ha4
Jul  1 14:41:53 ha3 stonith: [5079]: CRIT: OPERATOR INTERVENTION REQUIRED
to reset ha4.
Jul  1 14:41:53 ha3 stonith: [5079]: CRIT: Run "meatclient -c ha4" AFTER
power-cycling the machine.
Jul  1 14:42:53 ha3 stonith-ng[4956]: notice: stonith_action_async_done:
Child process 5078 performing action 'reboot' timed out with signal 15
Jul  1 14:42:53 ha3 stonith-ng[4956]: error: log_operation: Operation
'reboot' [5078] (call 7 from crmd.4960) for host 'ha4' with device
'fencing_route_to_ha4' returned: -62 (Timer expired)
Jul  1 14:42:53 ha3 stonith-ng[4956]: warning: log_operation:
fencing_route_to_ha4:5078 [ Performing: stonith -t meatware -T reset ha4 ]
Jul  1 14:42:53 ha3 stonith-ng[4956]: warning: get_xpath_object: No match
for //@st_delegate in /st-reply
Jul  1 14:42:53 ha3 stonith-ng[4956]: error: remote_op_done: Operation
reboot of ha4 by ha3 for crmd.4960 at ha3.4670e736: Timer expired
Jul  1 14:42:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 7/6:5:0:81a6b215-3955-42b9-871b-9d127ef97e40: Timer expired (-62)
Jul  1 14:42:53 ha3 crmd[4960]: notice: tengine_stonith_callback: Stonith
operation 7 for ha4 failed (Timer expired): aborting transition.
Jul  1 14:42:53 ha3 crmd[4960]: notice: abort_transition_graph: Transition
aborted: Stonith failed (source=tengine_stonith_callback:697, 0)
Jul  1 14:42:53 ha3 crmd[4960]: notice: tengine_stonith_notify: Peer ha4
was not terminated (reboot) by ha3 for ha3: Timer expired
(ref=4670e736-4d12-4ebf-a3f4-3c267384bbec) by client crmd.4960
Jul  1 14:42:53 ha3 crmd[4960]: notice: run_graph: Transition 5
(Complete=1, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-204.bz2): Stopped
Jul  1 14:42:53 ha3 pengine[4959]: notice: unpack_config: On loss of CCM
Quorum: Ignore
Jul  1 14:42:53 ha3 pengine[4959]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
Jul  1 14:42:53 ha3 pengine[4959]: warning: common_apply_stickiness:
Forcing ha3_fabric_ping away from ha3 after 1000000 failures (max=1000000)
Jul  1 14:42:53 ha3 pengine[4959]: warning: stage6: Scheduling Node ha4 for
STONITH
Jul  1 14:42:53 ha3 pengine[4959]: notice: native_start_constraints:
fencing_route_to_ha4_start_0 needs nothing
Jul  1 14:42:53 ha3 pengine[4959]: notice: LogActions: Start
fencing_route_to_ha4	(ha3 - blocked)
Jul  1 14:42:53 ha3 pengine[4959]: warning: process_pe_message: Calculated
Transition 6: /var/lib/pacemaker/pengine/pe-warn-204.bz2
Jul  1 14:42:53 ha3 crmd[4960]: notice: te_fence_node: Executing reboot
fencing operation (6) on ha4 (timeout=60000)
Jul  1 14:42:53 ha3 stonith-ng[4956]: notice: handle_request: Client
crmd.4960.55d3ab19 wants to fence (reboot) 'ha4' with device '(any)'
Jul  1 14:42:53 ha3 stonith-ng[4956]: notice: initiate_remote_stonith_op:
Initiating remote operation reboot for ha4:
dff125ee-e64f-4c23-80d2-ea38f1bc3437 (0)
Jul  1 14:42:53 ha3 stonith: [5083]: info: parse config info info=ha4
Jul  1 14:42:53 ha3 stonith: [5083]: info: meatware device OK.
Jul  1 14:42:53 ha3 stonith: [5085]: info: parse config info info=ha4
Jul  1 14:42:53 ha3 stonith: [5085]: info: meatware device OK.
Jul  1 14:42:53 ha3 stonith: [5087]: info: parse config info info=ha4
Jul  1 14:42:53 ha3 stonith: [5087]: CRIT: OPERATOR INTERVENTION REQUIRED
to reset ha4.
Jul  1 14:42:53 ha3 stonith: [5087]: CRIT: Run "meatclient -c ha4" AFTER
power-cycling the machine.


Paul Cain



From:	Andrew Beekhof <andrew at beekhof.net>
To:	The Pacemaker cluster resource manager
            <pacemaker at oss.clusterlabs.org>
Date:	06/27/2014 05:13 AM
Subject:	Re: [Pacemaker] When stonith is enabled, resources won't start
            until after stonith, even though requires="nothing" and
            prereq="nothing" on RHEL 7 with pacemaker-1.1.11
            compiled from source.




On 14 Jun 2014, at 7:37 am, Paul E Cain <pecain at us.ibm.com> wrote:

> Hi Andrew,
>
> Thank you for your quick response. This time, I completely shut down ha4
and then started corosync and pacemaker on ha3. However, the problem still
persisted. It's my understanding that using requires="nothing" or
prereq="nothing" should allow the cluster to start resources without
needing to fence. Is this not correct?

Apparently not without this patch:
  https://github.com/ClusterLabs/pacemaker/commit/2a5bbf9
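[Editor's note] For readers following along: the behaviour under discussion is set per operation in the CIB. A minimal sketch of the intent (IDs and timeout are illustrative, not taken from the original CIB) might look like:

```xml
<!-- Sketch only: a primitive whose start operation is marked
     requires="nothing", i.e. it should be startable before fencing
     completes. Without the patch above, this was not honoured. -->
<primitive id="ha3_fabric_ping" class="ocf" provider="pacemaker" type="ping">
  <operations>
    <op id="ha3_fabric_ping-start" name="start" interval="0"
        timeout="60s" requires="nothing"/>
  </operations>
</primitive>
```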
> > > Jun 11 12:59:03 ha3 stonith-ng[5010]: notice: unpack_config: On loss
of CCM Quorum: Ignore
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: corosync_node_name: Unable to
get node name for nodeid 168427534
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: get_node_name: Defaulting to
uname -n for the local corosync node name
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: cluster_connect_quorum:
Quorum acquired
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: crm_update_peer_state:
pcmk_quorum_notification: Node ha3[168427534] - state is now member (was
(null))
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: corosync_node_name: Unable to
get node name for nodeid 168427535
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: get_node_name: Could not
obtain a node name for corosync nodeid 168427535
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: corosync_node_name: Unable to
get node name for nodeid 168427535
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: corosync_node_name: Unable to
get node name for nodeid 168427535
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: get_node_name: Could not
obtain a node name for corosync nodeid 168427535
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: crm_update_peer_state:
pcmk_quorum_notification: Node (null)[168427535] - state is now member (was
(null))
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: corosync_node_name: Unable to
get node name for nodeid 168427534
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: get_node_name: Defaulting to
uname -n for the local corosync node name
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: do_started: The local CRM is
operational
> > > Jun 11 12:59:03 ha3 crmd[5014]: notice: do_state_transition: State
transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL
origin=do_started ]
> > > Jun 11 12:59:04 ha3 stonith-ng[5010]: notice:
stonith_device_register: Added 'fencing_route_to_ha4' to the device list (1
active devices)
> > > Jun 11 12:59:06 ha3 pacemaker: Starting Pacemaker Cluster Manager
[  OK  ]
> > > Jun 11 12:59:06 ha3 systemd: Started LSB: Starts and stops Pacemaker
Cluster Manager..
> > > Jun 11 12:59:24 ha3 crmd[5014]: warning: do_log: FSA: Input
I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> > > Jun 11 12:59:24 ha3 crmd[5014]: notice: do_state_transition: State
transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC
cause=C_TIMER_POPPED origin=election_timeout_popped ]
> > > Jun 11 12:59:24 ha3 crmd[5014]: warning: do_log: FSA: Input
I_ELECTION_DC from do_election_check() received in state S_INTEGRATION
> > > Jun 11 12:59:24 ha3 cib[5009]: notice: corosync_node_name: Unable to
get node name for nodeid 168427534
> > > Jun 11 12:59:24 ha3 cib[5009]: notice: get_node_name: Defaulting to
uname -n for the local corosync node name
> > > Jun 11 12:59:24 ha3 attrd[5012]: notice: corosync_node_name: Unable
to get node name for nodeid 168427534
> > > Jun 11 12:59:24 ha3 attrd[5012]: notice: get_node_name: Defaulting to
uname -n for the local corosync node name
> > > Jun 11 12:59:24 ha3 attrd[5012]: notice: write_attribute: Sent update
2 with 1 changes for terminate, id=<n/a>, set=(null)
> > > Jun 11 12:59:24 ha3 attrd[5012]: notice: write_attribute: Sent update
3 with 1 changes for shutdown, id=<n/a>, set=(null)
> > > Jun 11 12:59:24 ha3 attrd[5012]: notice: attrd_cib_callback: Update 2
for terminate[ha3]=(null): OK (0)
> > > Jun 11 12:59:24 ha3 attrd[5012]: notice: attrd_cib_callback: Update 3
for shutdown[ha3]=0: OK (0)
> > > Jun 11 12:59:25 ha3 pengine[5013]: notice: unpack_config: On loss of
CCM Quorum: Ignore
> > > Jun 11 12:59:25 ha3 pengine[5013]: warning: stage6: Scheduling Node
ha4 for STONITH
> > > Jun 11 12:59:25 ha3 pengine[5013]: notice: LogActions: Start
ha3_fabric_ping		   (ha3)
> > > Jun 11 12:59:25 ha3 pengine[5013]: notice: LogActions: Start
fencing_route_to_ha4		   (ha3)
> > > Jun 11 12:59:25 ha3 pengine[5013]: warning: process_pe_message:
Calculated Transition 0: /var/lib/pacemaker/pengine/pe-warn-80.bz2
> > > Jun 11 12:59:25 ha3 crmd[5014]: notice: te_rsc_command: Initiating
action 4: monitor ha3_fabric_ping_monitor_0 on ha3 (local)
> > > Jun 11 12:59:25 ha3 crmd[5014]: notice: te_fence_node: Executing
reboot fencing operation (12) on ha4 (timeout=60000)
> > > Jun 11 12:59:25 ha3 stonith-ng[5010]: notice: handle_request: Client
crmd.5014.dbbbf194 wants to fence (reboot) 'ha4' with device '(any)'
> > > Jun 11 12:59:25 ha3 stonith-ng[5010]: notice:
initiate_remote_stonith_op: Initiating remote operation reboot for ha4:
b3ab6141-9612-4024-82b2-350e74bbb33d (0)
> > > Jun 11 12:59:25 ha3 stonith-ng[5010]: notice: corosync_node_name:
Unable to get node name for nodeid 168427534
> > > Jun 11 12:59:25 ha3 stonith-ng[5010]: notice: get_node_name:
Defaulting to uname -n for the local corosync node name
> > > Jun 11 12:59:25 ha3 stonith: [5027]: info: parse config info info=ha4
> > > Jun 11 12:59:25 ha3 stonith-ng[5010]: notice:
can_fence_host_with_device: fencing_route_to_ha4 can fence ha4:
dynamic-list
> > > Jun 11 12:59:25 ha3 stonith: [5031]: info: parse config info info=ha4
> > > Jun 11 12:59:25 ha3 stonith: [5031]: CRIT: OPERATOR INTERVENTION
REQUIRED to reset ha4.
> > > Jun 11 12:59:25 ha3 stonith: [5031]: CRIT: Run "meatclient -c ha4"
AFTER power-cycling the machine.
> > > Jun 11 12:59:25 ha3 crmd[5014]: notice: process_lrm_event: LRM
operation ha3_fabric_ping_monitor_0 (call=5, rc=7, cib-update=25,
confirmed=true) not running
> > > Jun 11 12:59:25 ha3 crmd[5014]: notice: te_rsc_command: Initiating
action 5: monitor ha4_fabric_ping_monitor_0 on ha3 (local)
> > > Jun 11 12:59:25 ha3 crmd[5014]: notice: process_lrm_event: LRM
operation ha4_fabric_ping_monitor_0 (call=9, rc=7, cib-update=26,
confirmed=true) not running
> > > Jun 11 12:59:25 ha3 crmd[5014]: notice: te_rsc_command: Initiating
action 6: monitor fencing_route_to_ha3_monitor_0 on ha3 (local)
> > > Jun 11 12:59:25 ha3 crmd[5014]: notice: te_rsc_command: Initiating
action 7: monitor fencing_route_to_ha4_monitor_0 on ha3 (local)
> > > Jun 11 12:59:25 ha3 crmd[5014]: notice: te_rsc_command: Initiating
action 3: probe_complete probe_complete on ha3 (local) - no waiting
> > > Jun 11 12:59:25 ha3 attrd[5012]: notice: write_attribute: Sent update
4 with 1 changes for probe_complete, id=<n/a>, set=(null)
> > > Jun 11 12:59:25 ha3 attrd[5012]: notice: attrd_cib_callback: Update 4
for probe_complete[ha3]=true: OK (0)
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: notice:
stonith_action_async_done: Child process 5030 performing action 'reboot'
timed out with signal 15
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: error: log_operation: Operation
'reboot' [5030] (call 2 from crmd.5014) for host 'ha4' with device
'fencing_route_to_ha4' returned: -62 (Timer expired)
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: warning: log_operation:
fencing_route_to_ha4:5030 [ Performing: stonith -t meatware -T reset ha4 ]
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: notice: stonith_choose_peer:
Couldn't find anyone to fence ha4 with <any>
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: error: remote_op_done:
Operation reboot of ha4 by ha3 for crmd.5014 at ha3.b3ab6141: No route to host
> > > Jun 11 13:00:25 ha3 crmd[5014]: notice: tengine_stonith_callback:
Stonith operation 2/12:0:0:0ebf14dc-cfcf-425a-a507-65ed0ee060aa: No route
to host (-113)
> > > Jun 11 13:00:25 ha3 crmd[5014]: notice: tengine_stonith_callback:
Stonith operation 2 for ha4 failed (No route to host): aborting transition.
> > > Jun 11 13:00:25 ha3 crmd[5014]: notice: tengine_stonith_notify: Peer
ha4 was not terminated (reboot) by ha3 for ha3: No route to host
(ref=b3ab6141-9612-4024-82b2-350e74bbb33d) by client crmd.5014
> > > Jun 11 13:00:25 ha3 crmd[5014]: notice: run_graph: Transition 0
(Complete=7, Pending=0, Fired=0, Skipped=5, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-80.bz2): Stopped
> > > Jun 11 13:00:25 ha3 pengine[5013]: notice: unpack_config: On loss of
CCM Quorum: Ignore
> > > Jun 11 13:00:25 ha3 pengine[5013]: warning: stage6: Scheduling Node
ha4 for STONITH
> > > Jun 11 13:00:25 ha3 pengine[5013]: notice: LogActions: Start
ha3_fabric_ping		   (ha3)
> > > Jun 11 13:00:25 ha3 pengine[5013]: notice: LogActions: Start
fencing_route_to_ha4		   (ha3)
> > > Jun 11 13:00:25 ha3 pengine[5013]: warning: process_pe_message:
Calculated Transition 1: /var/lib/pacemaker/pengine/pe-warn-81.bz2
> > > Jun 11 13:00:25 ha3 crmd[5014]: notice: te_fence_node: Executing
reboot fencing operation (8) on ha4 (timeout=60000)
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: notice: handle_request: Client
crmd.5014.dbbbf194 wants to fence (reboot) 'ha4' with device '(any)'
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: notice:
initiate_remote_stonith_op: Initiating remote operation reboot for ha4:
eae78d4c-8d80-47fe-93e9-1a9261ec38a4 (0)
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: notice:
can_fence_host_with_device: fencing_route_to_ha4 can fence ha4:
dynamic-list
> > > Jun 11 13:00:25 ha3 stonith-ng[5010]: notice:
can_fence_host_with_device: fencing_route_to_ha4 can fence ha4:
dynamic-list
> > > Jun 11 13:00:25 ha3 stonith: [5057]: info: parse config info info=ha4
> > > Jun 11 13:00:25 ha3 stonith: [5057]: CRIT: OPERATOR INTERVENTION
REQUIRED to reset ha4.
> > > Jun 11 13:00:25 ha3 stonith: [5057]: CRIT: Run "meatclient -c ha4"
AFTER power-cycling the machine.
> > > Jun 11 13:00:41 ha3 stonith: [5057]: info: node Meatware-reset: ha4
> > > Jun 11 13:00:41 ha3 stonith-ng[5010]: notice: log_operation:
Operation 'reboot' [5056] (call 3 from crmd.5014) for host 'ha4' with
device 'fencing_route_to_ha4' returned: 0 (OK)
> > > Jun 11 13:00:41 ha3 stonith-ng[5010]: notice: remote_op_done:
Operation reboot of ha4 by ha3 for crmd.5014 at ha3.eae78d4c: OK
> > > Jun 11 13:00:41 ha3 crmd[5014]: notice: tengine_stonith_callback:
Stonith operation 3/8:1:0:0ebf14dc-cfcf-425a-a507-65ed0ee060aa: OK (0)
> > > Jun 11 13:00:41 ha3 crmd[5014]: notice: crm_update_peer_state:
send_stonith_update: Node ha4[0] - state is now lost (was (null))
> > > Jun 11 13:00:41 ha3 crmd[5014]: notice: tengine_stonith_notify: Peer
ha4 was terminated (reboot) by ha3 for ha3: OK
(ref=eae78d4c-8d80-47fe-93e9-1a9261ec38a4) by client crmd.5014
> > > Jun 11 13:00:41 ha3 crmd[5014]: notice: te_rsc_command: Initiating
action 4: start ha3_fabric_ping_start_0 on ha3 (local)
> > > Jun 11 13:01:01 ha3 systemd: Starting Session 22 of user root.
> > > Jun 11 13:01:01 ha3 systemd: Started Session 22 of user root.
> > > Jun 11 13:01:01 ha3 attrd[5012]: notice: write_attribute: Sent update
5 with 1 changes for pingd, id=<n/a>, set=(null)
> > > Jun 11 13:01:01 ha3 attrd[5012]: notice: attrd_cib_callback: Update 5
for pingd[ha3]=0: OK (0)
> > > Jun 11 13:01:01 ha3 ping(ha3_fabric_ping)[5060]: WARNING: pingd is
less than failure_score(1)
> > > Jun 11 13:01:01 ha3 crmd[5014]: notice: process_lrm_event: LRM
operation ha3_fabric_ping_start_0 (call=18, rc=1, cib-update=37,
confirmed=true) unknown error
> > > Jun 11 13:01:01 ha3 crmd[5014]: warning: status_from_rc: Action 4
(ha3_fabric_ping_start_0) on ha3 failed (target: 0 vs. rc: 1): Error
> > > Jun 11 13:01:01 ha3 crmd[5014]: warning: update_failcount: Updating
failcount for ha3_fabric_ping on ha3 after failed start: rc=1
(update=INFINITY, time=1402509661)
> > > Jun 11 13:01:01 ha3 crmd[5014]: warning: update_failcount: Updating
failcount for ha3_fabric_ping on ha3 after failed start: rc=1
(update=INFINITY, time=1402509661)
> > > Jun 11 13:01:01 ha3 crmd[5014]: notice: run_graph: Transition 1
(Complete=4, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-warn-81.bz2): Stopped
> > > Jun 11 13:01:01 ha3 attrd[5012]: notice: write_attribute: Sent update
6 with 1 changes for fail-count-ha3_fabric_ping, id=<n/a>, set=(null)
> > > Jun 11 13:01:01 ha3 attrd[5012]: notice: write_attribute: Sent update
7 with 1 changes for last-failure-ha3_fabric_ping, id=<n/a>, set=(null)
> > > Jun 11 13:01:01 ha3 pengine[5013]: notice: unpack_config: On loss of
CCM Quorum: Ignore
> > > Jun 11 13:01:01 ha3 pengine[5013]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
> > > Jun 11 13:01:01 ha3 pengine[5013]: notice: LogActions: Stop
ha3_fabric_ping		   (ha3)
> > > Jun 11 13:01:01 ha3 pengine[5013]: notice: process_pe_message:
Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-304.bz2
> > > Jun 11 13:01:01 ha3 attrd[5012]: notice: attrd_cib_callback: Update 6
for fail-count-ha3_fabric_ping[ha3]=INFINITY: OK (0)
> > > Jun 11 13:01:01 ha3 attrd[5012]: notice: attrd_cib_callback: Update 7
for last-failure-ha3_fabric_ping[ha3]=1402509661: OK (0)
> > > Jun 11 13:01:01 ha3 pengine[5013]: notice: unpack_config: On loss of
CCM Quorum: Ignore
> > > Jun 11 13:01:01 ha3 pengine[5013]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
> > > Jun 11 13:01:01 ha3 pengine[5013]: notice: LogActions: Stop
ha3_fabric_ping		   (ha3)
> > > Jun 11 13:01:01 ha3 pengine[5013]: notice: process_pe_message:
Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-305.bz2
> > > Jun 11 13:01:01 ha3 crmd[5014]: notice: te_rsc_command: Initiating
action 4: stop ha3_fabric_ping_stop_0 on ha3 (local)
> > > Jun 11 13:01:01 ha3 crmd[5014]: notice: process_lrm_event: LRM
operation ha3_fabric_ping_stop_0 (call=19, rc=0, cib-update=41,
confirmed=true) ok
> > > Jun 11 13:01:01 ha3 crmd[5014]: notice: run_graph: Transition 3
(Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-305.bz2): Complete
> > > Jun 11 13:01:01 ha3 crmd[5014]: notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
> > > Jun 11 13:01:06 ha3 attrd[5012]: notice: write_attribute: Sent update
8 with 1 changes for pingd, id=<n/a>, set=(null)
> > > Jun 11 13:01:06 ha3 crmd[5014]: notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
origin=abort_transition_graph ]
> > > Jun 11 13:01:06 ha3 pengine[5013]: notice: unpack_config: On loss of
CCM Quorum: Ignore
> > > Jun 11 13:01:06 ha3 pengine[5013]: warning: unpack_rsc_op_failure:
Processing failed op start for ha3_fabric_ping on ha3: unknown error (1)
> > > Jun 11 13:01:06 ha3 pengine[5013]: notice: process_pe_message:
Calculated Transition 4: /var/lib/pacemaker/pengine/pe-input-306.bz2
> > > Jun 11 13:01:06 ha3 crmd[5014]: notice: run_graph: Transition 4
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-306.bz2): Complete
> > > Jun 11 13:01:06 ha3 crmd[5014]: notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
> > > Jun 11 13:01:06 ha3 attrd[5012]: notice: attrd_cib_callback: Update 8
for pingd[ha3]=(null): OK (0)
> > >
> > > /etc/corosync/corosync.conf
> > > # Please read the corosync.conf.5 manual page
> > > totem {
> > > version: 2
> > >
> > > crypto_cipher: none
> > > crypto_hash: none
> > >
> > > interface {
> > > ringnumber: 0
> > > bindnetaddr: 10.10.0.0
> > > mcastport: 5405
> > > ttl: 1
> > > }
> > > transport: udpu
> > > }
> > >
> > > logging {
> > > fileline: off
> > > to_logfile: no
> > > to_syslog: yes
> > > #logfile: /var/log/cluster/corosync.log
> > > debug: off
> > > timestamp: on
> > > logger_subsys {
> > > subsys: QUORUM
> > > debug: off
> > > }
> > > }
> > >
> > > nodelist {
> > > node {
> > > ring0_addr: 10.10.0.14
> > > }
> > >
> > > node {
> > > ring0_addr: 10.10.0.15
> > > }
> > > }
> > >
> > > quorum {
> > > # Enable and configure quorum subsystem (default: off)
> > > # see also corosync.conf.5 and votequorum.5
> > > provider: corosync_votequorum
> > > expected_votes: 2
> > > }
> > > [root at ha3 ~]#
> > >
> > > Paul Cain
> > >
> > > _______________________________________________
> > > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > >
> > > Project Home: http://www.clusterlabs.org
> > > Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org
> >
> >
>
>


