[ClusterLabs] Fencing questions.

Arjun Pandey apandepublic at gmail.com
Mon Oct 19 15:26:04 UTC 2015


Hi Digimer,

Please find my responses inline.

On Mon, Oct 19, 2015 at 8:21 PM, Digimer <lists at alteeve.ca> wrote:

> On 19/10/15 06:53 AM, Arjun Pandey wrote:
> > Hi
> >
> > I am running a 2-node cluster with this config on CentOS 6.5/6.6, where
>
> It's important to keep both nodes on the same minor version,
> particularly in this case. Please either upgrade centos 6.5 to 6.6 or
> both to 6.7.
>
[Arjun] My bad. Both nodes are on CentOS 6.6 now. We used to support this on
6.5 earlier.


> > I have a multi-state resource foo running in master/slave mode and a
> > bunch of floating IP addresses configured. Additionally, I have
> > a colocation constraint so that each IP address is colocated with the
> > master.
> >
> > Please find the following files attached
> > cluster.conf
> > CIB
>
> It's preferable on a mailing list to copy the text into the body of the
> message. Easier to read.
>
[Arjun] Adding them now.

cluster.conf
<cluster config_version="9" name="ucc">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="orana" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="orana"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="kamet" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="kamet"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
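
As a quick sanity check on the cluster.conf above, here is a minimal Python sketch that confirms every node hands fencing over to Pacemaker through fence_pcmk. The function name and the trimmed copy of the config are only for illustration; on a real node the text would presumably be read from /etc/cluster/cluster.conf.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf shown above (fencing-related parts only).
CLUSTER_CONF = """<cluster config_version="9" name="ucc">
  <clusternodes>
    <clusternode name="orana" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="orana"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="kamet" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="kamet"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
</cluster>"""


def nodes_redirected_to_pacemaker(conf_xml):
    """Return {node name: True/False}.

    A node counts as redirected when it references a fence device whose
    agent is fence_pcmk and whose port attribute equals the node name.
    """
    root = ET.fromstring(conf_xml)
    pcmk_devices = {
        dev.get("name")
        for dev in root.iter("fencedevice")
        if dev.get("agent") == "fence_pcmk"
    }
    result = {}
    for node in root.iter("clusternode"):
        name = node.get("name")
        result[name] = any(
            dev.get("name") in pcmk_devices and dev.get("port") == name
            for dev in node.iter("device")
        )
    return result


print(nodes_redirected_to_pacemaker(CLUSTER_CONF))
```

A node that maps to False would be one that cman cannot pass on to Pacemaker for fencing.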

CIB

<cib admin_epoch="0" cib-last-written="Fri Oct 16 01:16:42 2015"
crm_feature_set="3.0.9" epoch="44" num_updates="5"
validate-with="pacemaker-2.0" have-quorum="1" dc-uuid="kamet">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
value="1.1.11-97629de"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure"
name="cluster-infrastructure" value="cman"/>
        <nvpair id="cib-bootstrap-options-no-quorum-policy"
name="no-quorum-policy" value="ignore"/>
        <nvpair id="cib-bootstrap-options-cluster-recheck-interval"
name="cluster-recheck-interval" value="30s"/>
        <nvpair id="cib-bootstrap-options-stonith-enabled"
name="stonith-enabled" value="true"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="orana" uname="orana"/>
      <node id="kamet" uname="kamet"/>
    </nodes>
    <resources>
      <master id="foo-master">
        <primitive class="ocf" id="foo" provider="heartbeat" type="uc">
          <instance_attributes id="foo-instance_attributes">
            <nvpair id="foo-instance_attributes-state" name="state"
value="/var/run/uc/role"/>
          </instance_attributes>
          <operations>
            <op id="foo-start-interval-0s" interval="0s" name="start"
on-fail="restart" timeout="100s"/>
            <op id="foo-monitor-interval-10s-role-Master" interval="10s"
name="monitor" on-fail="restart" role="Master" timeout="100s"/>
            <op id="foo-monitor-interval-11s-role-Slave" interval="11s"
name="monitor" on-fail="restart" role="Slave" timeout="100s"/>
            <op id="foo-promote-interval-0s" interval="0s" name="promote"
on-fail="restart" timeout="100s"/>
            <op id="foo-demote-interval-0s" interval="0s" name="demote"
on-fail="restart" timeout="100s"/>
            <op id="foo-stop-interval-0s" interval="0s" name="stop"
on-fail="restart" timeout="100s"/>
          </operations>
        </primitive>
        <meta_attributes id="foo-master-meta_attributes">
          <nvpair id="foo-master-meta_attributes-master-max"
name="master-max" value="1"/>
          <nvpair id="foo-master-meta_attributes-master-node-max"
name="master-node-max" value="1"/>
          <nvpair id="foo-master-meta_attributes-clone-max"
name="clone-max" value="2"/>
          <nvpair id="foo-master-meta_attributes-clone-node-max"
name="clone-node-max" value="1"/>
          <nvpair id="foo-master-meta_attributes-notify" name="notify"
value="true"/>
          <nvpair id="foo-master-meta_attributes-ordered" name="ordered"
value="true"/>
        </meta_attributes>
      </master>
      <primitive class="stonith" id="fence-uc-orana" type="fence_ilo4">
        <instance_attributes id="fence-uc-orana-instance_attributes">
          <nvpair id="fence-uc-orana-instance_attributes-login"
name="login" value="root"/>
          <nvpair id="fence-uc-orana-instance_attributes-passwd"
name="passwd" value="paswword"/>
          <nvpair id="fence-uc-orana-instance_attributes-ipaddr"
name="ipaddr" value="10.11.10.30"/>
          <nvpair id="fence-uc-orana-instance_attributes-pcmk_host_list"
name="pcmk_host_list" value="orana"/>
          <nvpair id="fence-uc-orana-instance_attributes-action"
name="action" value="reboot"/>
          <nvpair id="fence-uc-orana-instance_attributes-lanplus"
name="lanplus" value="1"/>
          <nvpair id="fence-uc-orana-instance_attributes-delay"
name="delay" value="0"/>
        </instance_attributes>
        <operations>
          <op id="fence-uc-orana-monitor-interval-60s" interval="60s"
name="monitor"/>
        </operations>
      </primitive>
      <primitive class="stonith" id="fence-uc-kamet" type="fence_ilo4">
        <instance_attributes id="fence-uc-kamet-instance_attributes">
          <nvpair id="fence-uc-kamet-instance_attributes-login"
name="login" value="root"/>
          <nvpair id="fence-uc-kamet-instance_attributes-passwd"
name="passwd" value="paswword"/>
          <nvpair id="fence-uc-kamet-instance_attributes-ipaddr"
name="ipaddr" value="10.11.10.21"/>
          <nvpair id="fence-uc-kamet-instance_attributes-pcmk_host_list"
name="pcmk_host_list" value="kamet"/>
          <nvpair id="fence-uc-kamet-instance_attributes-action"
name="action" value="reboot"/>
          <nvpair id="fence-uc-kamet-instance_attributes-lanplus"
name="lanplus" value="1"/>
          <nvpair id="fence-uc-kamet-instance_attributes-delay"
name="delay" value="10"/>
        </instance_attributes>
        <operations>
          <op id="fence-uc-kamet-monitor-interval-60s" interval="60s"
name="monitor"/>
        </operations>
      </primitive>
      <primitive class="ocf" id="CWS-FLT" provider="heartbeat"
type="IPaddr2">
        <instance_attributes id="CWS-FLT-instance_attributes">
          <nvpair id="CWS-FLT-instance_attributes-ip" name="ip"
value="10.41.0.108"/>
          <nvpair id="CWS-FLT-instance_attributes-cidr_netmask"
name="cidr_netmask" value="32"/>
          <nvpair id="CWS-FLT-instance_attributes-iflabel" name="iflabel"
value="CWS-FLT"/>
        </instance_attributes>
        <operations>
          <op id="CWS-FLT-start-timeout-20s" interval="0s" name="start"
timeout="20s"/>
          <op id="CWS-FLT-stop-timeout-20s" interval="0s" name="stop"
timeout="20s"/>
          <op id="CWS-FLT-monitor-interval-200ms" interval="200ms"
name="monitor"/>
        </operations>
        <meta_attributes id="CWS-FLT-meta_attributes">
          <nvpair id="CWS-FLT-meta_attributes-failure-timeout"
name="failure-timeout" value="3s"/>
          <nvpair id="CWS-FLT-meta_attributes-migration-threshold"
name="migration-threshold" value="2"/>
        </meta_attributes>
      </primitive>
      <primitive class="ocf" id="MGMT-FLT" provider="heartbeat"
type="IPaddr2">
        <instance_attributes id="MGMT-FLT-instance_attributes">
          <nvpair id="MGMT-FLT-instance_attributes-ip" name="ip"
value="10.61.0.227"/>
          <nvpair id="MGMT-FLT-instance_attributes-cidr_netmask"
name="cidr_netmask" value="32"/>
          <nvpair id="MGMT-FLT-instance_attributes-iflabel" name="iflabel"
value="MGMT-FLT"/>
        </instance_attributes>
        <operations>
          <op id="MGMT-FLT-start-timeout-20s" interval="0s" name="start"
timeout="20s"/>
          <op id="MGMT-FLT-stop-timeout-20s" interval="0s" name="stop"
timeout="20s"/>
          <op id="MGMT-FLT-monitor-interval-200ms" interval="200ms"
name="monitor"/>
        </operations>
        <meta_attributes id="MGMT-FLT-meta_attributes">
          <nvpair id="MGMT-FLT-meta_attributes-failure-timeout"
name="failure-timeout" value="3s"/>
          <nvpair id="MGMT-FLT-meta_attributes-migration-threshold"
name="migration-threshold" value="2"/>
        </meta_attributes>
      </primitive>
      <primitive class="ocf" id="MME-FLT" provider="heartbeat"
type="IPaddr2">
        <instance_attributes id="MME-FLT-instance_attributes">
          <nvpair id="MME-FLT-instance_attributes-ip" name="ip"
value="10.21.0.108"/>
          <nvpair id="MME-FLT-instance_attributes-cidr_netmask"
name="cidr_netmask" value="32"/>
          <nvpair id="MME-FLT-instance_attributes-iflabel" name="iflabel"
value="MME-FLT"/>
        </instance_attributes>
        <operations>
          <op id="MME-FLT-start-timeout-20s" interval="0s" name="start"
timeout="20s"/>
          <op id="MME-FLT-stop-timeout-20s" interval="0s" name="stop"
timeout="20s"/>
          <op id="MME-FLT-monitor-interval-200ms" interval="200ms"
name="monitor"/>
        </operations>
        <meta_attributes id="MME-FLT-meta_attributes">
          <nvpair id="MME-FLT-meta_attributes-failure-timeout"
name="failure-timeout" value="3s"/>
          <nvpair id="MME-FLT-meta_attributes-migration-threshold"
name="migration-threshold" value="2"/>
        </meta_attributes>
      </primitive>
      <primitive class="ocf" id="SGW-FLT" provider="heartbeat"
type="IPaddr2">
        <instance_attributes id="SGW-FLT-instance_attributes">
          <nvpair id="SGW-FLT-instance_attributes-ip" name="ip"
value="10.31.0.108"/>
          <nvpair id="SGW-FLT-instance_attributes-cidr_netmask"
name="cidr_netmask" value="32"/>
          <nvpair id="SGW-FLT-instance_attributes-iflabel" name="iflabel"
value="SGW-FLT"/>
        </instance_attributes>
        <operations>
          <op id="SGW-FLT-start-timeout-20s" interval="0s" name="start"
timeout="20s"/>
          <op id="SGW-FLT-stop-timeout-20s" interval="0s" name="stop"
timeout="20s"/>
          <op id="SGW-FLT-monitor-interval-200ms" interval="200ms"
name="monitor"/>
        </operations>
        <meta_attributes id="SGW-FLT-meta_attributes">
          <nvpair id="SGW-FLT-meta_attributes-failure-timeout"
name="failure-timeout" value="3s"/>
          <nvpair id="SGW-FLT-meta_attributes-migration-threshold"
name="migration-threshold" value="2"/>
        </meta_attributes>
      </primitive>
    </resources>
    <constraints>
      <rsc_colocation id="colocation-CWS-FLT-foo-master-INFINITY"
rsc="CWS-FLT" rsc-role="Started" score="INFINITY" with-rsc="foo-master"
with-rsc-role="Master"/>
      <rsc_order first="CWS-FLT" first-action="start"
id="order-CWS-FLT-foo-master-mandatory" then="foo-master"
then-action="promote"/>
      <rsc_colocation id="colocation-MGMT-FLT-foo-master-INFINITY"
rsc="MGMT-FLT" rsc-role="Started" score="INFINITY" with-rsc="foo-master"
with-rsc-role="Master"/>
      <rsc_order first="MGMT-FLT" first-action="start"
id="order-MGMT-FLT-foo-master-mandatory" then="foo-master"
then-action="promote"/>
      <rsc_colocation id="colocation-MME-FLT-foo-master-INFINITY"
rsc="MME-FLT" rsc-role="Started" score="INFINITY" with-rsc="foo-master"
with-rsc-role="Master"/>
      <rsc_order first="MME-FLT" first-action="start"
id="order-MME-FLT-foo-master-mandatory" then="foo-master"
then-action="promote"/>
      <rsc_colocation id="colocation-SGW-FLT-foo-master-INFINITY"
rsc="SGW-FLT" rsc-role="Started" score="INFINITY" with-rsc="foo-master"
with-rsc-role="Master"/>
      <rsc_order first="SGW-FLT" first-action="start"
id="order-SGW-FLT-foo-master-mandatory" then="foo-master"
then-action="promote"/>
    </constraints>
  </configuration>
  <status>
    <node_state id="orana" uname="orana" in_ccm="true" crmd="online"
crm-debug-origin="do_state_transition" join="member" expected="member">
      <transient_attributes id="orana">
        <instance_attributes id="status-orana">
          <nvpair id="status-orana-shutdown" name="shutdown" value="0"/>
          <nvpair id="status-orana-probe_complete" name="probe_complete"
value="true"/>
          <nvpair id="status-orana-master-foo" name="master-foo" value="5"/>
        </instance_attributes>
      </transient_attributes>
      <lrm id="orana">
        <lrm_resources>
          <lrm_resource id="foo" type="uc" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="foo_last_0" operation_key="foo_start_0"
operation="start" crm-debug-origin="build_active_RAs"
crm_feature_set="3.0.9"
transition-key="16:12:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;16:12:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="32" rc-code="0" op-status="0" interval="0" last-run="1444976038"
last-rc-change="1444976038" exec-time="121" queue-time="0"
op-digest="96d68f528fd950fa93acae8f44e75df5" on_node="orana"/>
            <lrm_rsc_op id="foo_monitor_11000"
operation_key="foo_monitor_11000" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="14:13:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;14:13:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="35" rc-code="0" op-status="0" interval="11000"
last-rc-change="1444976038" exec-time="40" queue-time="0"
op-digest="f4e31338d1a8837389f1948c9c05d8a8" on_node="orana"/>
          </lrm_resource>
          <lrm_resource id="fence-uc-orana" type="fence_ilo4"
class="stonith">
            <lrm_rsc_op id="fence-uc-orana_last_0"
operation_key="fence-uc-orana_monitor_0" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="13:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:7;13:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="10" rc-code="7" op-status="0" interval="0" last-run="1444976037"
last-rc-change="1444976037" exec-time="2" queue-time="0"
op-digest="3e7fbc0a0806462458d260013dc65c63" on_node="orana"/>
          </lrm_resource>
          <lrm_resource id="CWS-FLT" type="IPaddr2" class="ocf"
provider="heartbeat">
            <lrm_rsc_op id="CWS-FLT_last_0"
operation_key="CWS-FLT_monitor_0" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="15:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:7;15:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="18" rc-code="7" op-status="0" interval="0" last-run="1444976038"
last-rc-change="1444976038" exec-time="95" queue-time="0"
op-digest="bd17ddd816711afe0f94e3e486dd8fa3" on_node="orana"/>
          </lrm_resource>
          <lrm_resource id="MGMT-FLT" type="IPaddr2" class="ocf"
provider="heartbeat">
            <lrm_rsc_op id="MGMT-FLT_last_0"
operation_key="MGMT-FLT_monitor_0" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="16:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:7;16:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="22" rc-code="7" op-status="0" interval="0" last-run="1444976038"
last-rc-change="1444976038" exec-time="93" queue-time="1"
op-digest="5db4cd09e5d23feb415974ce45457356" on_node="orana"/>
          </lrm_resource>
          <lrm_resource id="fence-uc-kamet" type="fence_ilo4"
class="stonith">
            <lrm_rsc_op id="fence-uc-kamet_last_0"
operation_key="fence-uc-kamet_start_0" operation="start"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="45:12:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;45:12:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="31" rc-code="0" op-status="0" interval="0" last-run="1444976038"
last-rc-change="1444976038" exec-time="112" queue-time="0"
op-digest="bbef8a89ec92b4cc275e37cbb6323d4a" on_node="orana"/>
            <lrm_rsc_op id="fence-uc-kamet_monitor_60000"
operation_key="fence-uc-kamet_monitor_60000" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="46:12:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;46:12:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="33" rc-code="0" op-status="0" interval="60000"
last-rc-change="1444976038" exec-time="96" queue-time="0"
op-digest="472b1773192077d14fdb311c4f15c455" on_node="orana"/>
          </lrm_resource>
          <lrm_resource id="SGW-FLT" type="IPaddr2" class="ocf"
provider="heartbeat">
            <lrm_rsc_op id="SGW-FLT_last_0"
operation_key="SGW-FLT_monitor_0" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="18:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:7;18:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="30" rc-code="7" op-status="0" interval="0" last-run="1444976038"
last-rc-change="1444976038" exec-time="91" queue-time="0"
op-digest="e6567f75483c28bef41238ae28239b83" on_node="orana"/>
          </lrm_resource>
          <lrm_resource id="MME-FLT" type="IPaddr2" class="ocf"
provider="heartbeat">
            <lrm_rsc_op id="MME-FLT_last_0"
operation_key="MME-FLT_monitor_0" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="17:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:7;17:10:7:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="26" rc-code="7" op-status="0" interval="0" last-run="1444976038"
last-rc-change="1444976038" exec-time="92" queue-time="0"
op-digest="432035408e1b5acf3f42708b492fb7ab" on_node="orana"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
    </node_state>
    <node_state id="kamet" uname="kamet" crmd="online"
crm-debug-origin="do_state_transition" in_ccm="true" join="member"
expected="member">
      <transient_attributes id="kamet">
        <instance_attributes id="status-kamet">
          <nvpair id="status-kamet-shutdown" name="shutdown" value="0"/>
          <nvpair id="status-kamet-master-foo" name="master-foo"
value="10"/>
          <nvpair id="status-kamet-probe_complete" name="probe_complete"
value="true"/>
        </instance_attributes>
      </transient_attributes>
      <lrm id="kamet">
        <lrm_resources>
          <lrm_resource id="foo" type="uc" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="foo_last_0" operation_key="foo_promote_0"
operation="promote" crm-debug-origin="build_active_RAs"
crm_feature_set="3.0.9"
transition-key="7:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;7:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="48" rc-code="0" op-status="0" interval="0" last-run="1444975971"
last-rc-change="1444975971" exec-time="1040" queue-time="0"
op-digest="96d68f528fd950fa93acae8f44e75df5" on_node="kamet"/>
            <lrm_rsc_op id="foo_monitor_10000"
operation_key="foo_monitor_10000" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="14:2:8:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:8;14:2:8:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="51" rc-code="8" op-status="0" interval="10000"
last-rc-change="1444975972" exec-time="15" queue-time="0"
op-digest="f4e31338d1a8837389f1948c9c05d8a8" on_node="kamet"/>
          </lrm_resource>
          <lrm_resource id="fence-uc-orana" type="fence_ilo4"
class="stonith">
            <lrm_rsc_op id="fence-uc-orana_last_0"
operation_key="fence-uc-orana_start_0" operation="start"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="33:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;33:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="38" rc-code="0" op-status="0" interval="0" last-run="1444975970"
last-rc-change="1444975970" exec-time="81" queue-time="0"
op-digest="3e7fbc0a0806462458d260013dc65c63" on_node="kamet"/>
            <lrm_rsc_op id="fence-uc-orana_monitor_60000"
operation_key="fence-uc-orana_monitor_60000" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="34:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;34:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="49" rc-code="0" op-status="0" interval="60000"
last-rc-change="1444975971" exec-time="94" queue-time="0"
op-digest="da753514aceb43a1ec31fe7370a5e10a" on_node="kamet"/>
          </lrm_resource>
          <lrm_resource id="CWS-FLT" type="IPaddr2" class="ocf"
provider="heartbeat">
            <lrm_rsc_op id="CWS-FLT_last_0" operation_key="CWS-FLT_start_0"
operation="start" crm-debug-origin="build_active_RAs"
crm_feature_set="3.0.9"
transition-key="37:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;37:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="39" rc-code="0" op-status="0" interval="0" last-run="1444975970"
last-rc-change="1444975970" exec-time="39" queue-time="0"
op-digest="bd17ddd816711afe0f94e3e486dd8fa3" on_node="kamet"/>
            <lrm_rsc_op id="CWS-FLT_monitor_200"
operation_key="CWS-FLT_monitor_200" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="38:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;38:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="44" rc-code="0" op-status="0" interval="200"
last-rc-change="1444975971" exec-time="47" queue-time="0"
op-digest="dd48e25aee40f046ed8929789ede6cbd" on_node="kamet"/>
          </lrm_resource>
          <lrm_resource id="MME-FLT" type="IPaddr2" class="ocf"
provider="heartbeat">
            <lrm_rsc_op id="MME-FLT_last_0" operation_key="MME-FLT_start_0"
operation="start" crm-debug-origin="build_active_RAs"
crm_feature_set="3.0.9"
transition-key="41:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;41:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="41" rc-code="0" op-status="0" interval="0" last-run="1444975970"
last-rc-change="1444975970" exec-time="41" queue-time="0"
op-digest="432035408e1b5acf3f42708b492fb7ab" on_node="kamet"/>
            <lrm_rsc_op id="MME-FLT_monitor_200"
operation_key="MME-FLT_monitor_200" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="42:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;42:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="47" rc-code="0" op-status="0" interval="200"
last-rc-change="1444975971" exec-time="46" queue-time="0"
op-digest="b1863a933764285520f905a8fb72bc4b" on_node="kamet"/>
          </lrm_resource>
          <lrm_resource id="fence-uc-kamet" type="fence_ilo4"
class="stonith">
            <lrm_rsc_op id="fence-uc-kamet_last_0"
operation_key="fence-uc-kamet_stop_0" operation="stop"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="44:12:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;44:12:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="54" rc-code="0" op-status="0" interval="0" last-run="1444976192"
last-rc-change="1444976192" exec-time="1" queue-time="0"
op-digest="bbef8a89ec92b4cc275e37cbb6323d4a" on_node="kamet"/>
          </lrm_resource>
          <lrm_resource id="SGW-FLT" type="IPaddr2" class="ocf"
provider="heartbeat">
            <lrm_rsc_op id="SGW-FLT_last_0" operation_key="SGW-FLT_start_0"
operation="start" crm-debug-origin="build_active_RAs"
crm_feature_set="3.0.9"
transition-key="43:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;43:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="42" rc-code="0" op-status="0" interval="0" last-run="1444975970"
last-rc-change="1444975970" exec-time="39" queue-time="0"
op-digest="e6567f75483c28bef41238ae28239b83" on_node="kamet"/>
            <lrm_rsc_op id="SGW-FLT_monitor_200"
operation_key="SGW-FLT_monitor_200" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="44:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;44:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="45" rc-code="0" op-status="0" interval="200"
last-rc-change="1444975971" exec-time="46" queue-time="1"
op-digest="1b39e28afccb0a6e8718c07837509e14" on_node="kamet"/>
          </lrm_resource>
          <lrm_resource id="MGMT-FLT" type="IPaddr2" class="ocf"
provider="heartbeat">
            <lrm_rsc_op id="MGMT-FLT_last_0"
operation_key="MGMT-FLT_start_0" operation="start"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="39:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;39:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="40" rc-code="0" op-status="0" interval="0" last-run="1444975970"
last-rc-change="1444975970" exec-time="41" queue-time="0"
op-digest="5db4cd09e5d23feb415974ce45457356" on_node="kamet"/>
            <lrm_rsc_op id="MGMT-FLT_monitor_200"
operation_key="MGMT-FLT_monitor_200" operation="monitor"
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9"
transition-key="40:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
transition-magic="0:0;40:1:0:0d04ebc5-af15-4b9a-b908-81505ef8ca62"
call-id="46" rc-code="0" op-status="0" interval="200"
last-rc-change="1444975971" exec-time="46" queue-time="0"
op-digest="98d947dc43c9271e135cd6da8f56744b" on_node="kamet"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
    </node_state>
  </status>
</cib>
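
To make the fencing part of the CIB above easier to read, here is a minimal Python sketch that pulls out each stonith device's target and delay. The function name is hypothetical and the fragment keeps only the attributes it reads. As I understand it, the delay on fence-uc-kamet means a request to fence kamet waits 10 seconds, so kamet should be the survivor if both nodes try to fence each other at once.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the two stonith primitives from the CIB above
# (only pcmk_host_list and delay are kept).
CIB_FRAGMENT = """<resources>
  <primitive class="stonith" id="fence-uc-orana" type="fence_ilo4">
    <instance_attributes id="fence-uc-orana-instance_attributes">
      <nvpair id="fence-uc-orana-instance_attributes-pcmk_host_list"
              name="pcmk_host_list" value="orana"/>
      <nvpair id="fence-uc-orana-instance_attributes-delay"
              name="delay" value="0"/>
    </instance_attributes>
  </primitive>
  <primitive class="stonith" id="fence-uc-kamet" type="fence_ilo4">
    <instance_attributes id="fence-uc-kamet-instance_attributes">
      <nvpair id="fence-uc-kamet-instance_attributes-pcmk_host_list"
              name="pcmk_host_list" value="kamet"/>
      <nvpair id="fence-uc-kamet-instance_attributes-delay"
              name="delay" value="10"/>
    </instance_attributes>
  </primitive>
</resources>"""


def fence_delays(cib_xml):
    """Return {target node: delay in seconds} for every stonith primitive.

    Assumes pcmk_host_list is whitespace-separated and delay is a plain
    integer, as in the CIB above.
    """
    root = ET.fromstring(cib_xml)
    delays = {}
    for prim in root.iter("primitive"):
        if prim.get("class") != "stonith":
            continue
        attrs = {nv.get("name"): nv.get("value") for nv in prim.iter("nvpair")}
        for target in attrs.get("pcmk_host_list", "").split():
            delays[target] = int(attrs.get("delay", "0"))
    return delays


delays = fence_delays(CIB_FRAGMENT)
print(delays)
# The node whose fencing is delayed the longest gets the head start.
print("expected survivor of a fence race:", max(delays, key=delays.get))
```

The same function should also work on a full CIB dump, since it only looks for primitive elements of class stonith.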

> > Issues that I have:
> > 1. Daemons required for fencing
> > Earlier we were invoking cman start quorum from pacemaker script which
> > ensured that fenced / gfs and other daemons are not started. This was ok
> > since fencing wasn't being handled earlier.
>
> The cman fencing is simply a pass-through to pacemaker. When pacemaker
> tells cman that fencing succeeded, it informs DLM and begins cleanup.
>
> > For fencing purposes, do we only need fenced to be started? We don't
> > have any gfs partitions that we want to monitor via pacemaker. My
> > concern here is that if i use the unmodified script then pacemaker start
> > time increases significantly. I see a difference of 60 sec from the
> > earlier startup before service pacemaker status shows up as started.
>
> Don't start fenced manually, just start pacemaker and let it handle
> everything. Ideally, use the pcs command (and pcsd daemon on the nodes)
> to start/stop the cluster, but you'll need to upgrade to 6.7.

[Arjun]
I am not starting fencing manually; it gets started by the pacemaker setup
itself. However, if I look at the start routine of the cman init script,
there is a case for "cman start quorum" in which fenced, gfs
and the other daemons are not started.
Source code link:
https://git.fedorahosted.org/cgit/cluster.git/tree/cman/init.d/cman.in?h=RHEL6#n795

This is what gets called from our pacemaker init script.

I was wondering whether I should add a new case to this script, because we
don't really use gfs/dlm. This leads to a substantial increase in
pacemaker startup time.

> > 2. Fencing test cases.
> > Based on what I could find online, apart from unplugging
> > the dedicated cable, the only other case suggested is killing the corosync
> > process on one of the nodes.
> > Are there any other basic cases that I should look at?
> > What about bringing the interface down manually? I understand that this is
> > an unlikely scenario, but I am just looking for more ways to test this
> > out.
>
> echo c > /proc/sysrq-trigger == kernel panic. It's my preferred test.
> Also, killing the power to the node will cause IPMI to fail and will
> test your backup fence method, if you have it, or ensure the cluster
> locks up if you don't (better to hang than to risk corruption).
>
> > 3. Testing whether fencing is working or not.
> > Previously i have been using fence_ilo4 from the shell to test whether
> > the command is working. I was assuming that similar invocation would be
> > done by stonith when actual fencing needs to be done.
> >
> > However based on other threads i could find people also use fence_tool
> > <node-name> to try this out. According to me this tests out whether
> > fencing when invoked by fenced for a particular node succeeds or not. Is
> > that valid ?
>
> Fence tool is just a command to control the cluster's fencing. The
> fence_X agents do the actual work.
>
> > Since we are configuring fence_pcmk as the fence device the flow of
> > things is
> > fenced -> fence_pcmk -> stonith -> fence agent.
>
> Basically correct.
>
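
To make the layers in that flow concrete, here is a minimal Python sketch that only builds (and prints) one test command per layer; it does not run anything. The helper name and the PASSWORD placeholder are made up for illustration. The short flags (-a, -l, -p, -P, -o) are the ones I believe fence_ipmilan-style agents accept, so please check them against the man page of the installed agent before relying on them.

```python
import shlex


def fencing_test_commands(node, ilo_ip, login, password):
    """Return one argv list per layer of the fencing path.

    Nothing is executed here. Running the second or third command on a
    live cluster would actually reboot the target node.
    """
    # Lowest layer: ask the fence agent for power status only (harmless).
    agent_status = ["fence_ilo4", "-a", ilo_ip, "-l", login,
                    "-p", password, "-P", "-o", "status"]
    # Middle layer: ask Pacemaker's stonith daemon to reboot the node.
    pacemaker_reboot = ["stonith_admin", "--reboot", node]
    # Top layer: go through cman, which hands over to fence_pcmk.
    cman_fence = ["fence_node", node]
    return [agent_status, pacemaker_reboot, cman_fence]


for argv in fencing_test_commands("orana", "10.11.10.30", "root", "PASSWORD"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Working from the bottom layer upwards should show which hop fails if the full path does not fence the node.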
> > 4. Fencing agent to be used (fence_ipmilan vs fence_ilo4)
> > Also for ILO fencing i see fence_ilo4 and fence_ipmilan both available.
> > I had been using fence_ilo4 till now.
>
> Whichever works is fine. I believe a lot of the fence_X out-of-band
> agents are actually just links to fence_ipmilan, but I might be wrong.
>
> > I think this mail has multiple questions and i will probably send out
> > another mail for a few issues i see after fencing takes place.
> >
> > Thanks in advance
> > Arjun
> >
> >
> > _______________________________________________
> > Users mailing list: Users at clusterlabs.org
> > http://clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> >
>
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
>