[ClusterLabs] EXTERNAL: Re: Pacemaker not reacting as I would expect when two resources fail at the same time
Andrei Borzenkov
arvidjaar at gmail.com
Sat Jun 8 03:00:00 EDT 2019
On 08.06.2019 05:12, Harvey Shepherd wrote:
> Thank you for your advice Ken. Sorry for the delayed reply - I was trying out a few things and trying to capture extra info. The changes you suggested make sense, and I have incorporated them into my config. However, the original issue remains: Pacemaker does not attempt to restart the failed m_main_system instance. I tried setting the migration-threshold of that resource to 1, hoping to force Pacemaker to promote it on the other node, but this had no effect - the master instance remains "failed" and the slave instance remains "running" but is never promoted.
As far as I understand, for a clone instance to be promoted on a node,
that node must have an explicit master score or a location constraint
for the clone. The master score is normally set by the resource agent.
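
For example, an OCF promotable agent typically does something along these
lines from its monitor/notify actions to tell Pacemaker which node is
eligible for promotion (illustrative values, not taken from your agent):

    # on a node whose instance is healthy and may be promoted
    crm_master -l reboot -v 100

    # on a node whose instance should not be promoted
    crm_master -l reboot -D

If your agent never sets a master score for the instance running on
"primary", Pacemaker has nothing to base a promotion decision on there,
no matter what migration-threshold is set to.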
> Snipped output from crm_mon:
>
> Current DC: primary (version unknown) - partition with quorum
> Last updated: Sat Jun 8 02:04:05 2019
> Last change: Sat Jun 8 01:51:25 2019 by hacluster via crmd on primary
>
> 2 nodes configured
> 26 resources configured
>
> Online: [ primary secondary ]
>
> Active resources:
>
> Clone Set: m_main_system [main_system] (promotable)
> main_system (ocf::main_system-ocf): FAILED secondary
> Slaves: [ primary ]
>
> Migration Summary:
> * Node secondary:
> main_system: migration-threshold=1 fail-count=1 last-failure='Sat Jun 8 01:52:08 2019'
>
> Failed Resource Actions:
> * main_system_monitor_10000 on secondary 'unknown error' (1): call=214, status=complete, exitreason='',
> last-rc-change='Sat Jun 8 01:52:08 2019', queued=0ms, exec=0ms
>
>
> From the logs I see:
>
> 2019 Jun 8 01:52:09.574 daemon.warning VIRTUAL pacemaker-schedulerd 1131 warning: Processing failed monitor of main_system:1 on secondary: unknown error
> 2019 Jun 8 01:52:09.586 daemon.warning VIRTUAL pacemaker-schedulerd 1131 warning: Forcing m_main_system away from secondary after 1 failures (max=1)
> 2019 Jun 8 01:52:09.586 daemon.warning VIRTUAL pacemaker-schedulerd 1131 warning: Forcing m_main_system away from secondary after 1 failures (max=1)
> 2019 Jun 8 01:52:10.692 daemon.warning VIRTUAL pacemaker-controld 1132 warning: Transition 35 (Complete=33, Pending=0, Fired=0, Skipped=0, Incomplete=67, Source=/var/lib/pacemaker/pengine/pe-input-47.bz2): Terminated
Making this file (pe-input-47.bz2) available may help determine why the
scheduler decided not to promote the resource.
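
You can also replay it locally with crm_simulate; something like the
following should show the allocation and promotion scores the scheduler
computed for that transition (path taken from the log line above):

    crm_simulate --simulate --show-scores \
        --xml-file /var/lib/pacemaker/pengine/pe-input-47.bz2
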
> 2019 Jun 8 01:52:10.692 daemon.warning VIRTUAL pacemaker-controld 1132 warning: Transition failed: terminated
>
>
> Do you have any further suggestions? For your information I've upgraded Pacemaker to 2.0.2, but the behaviour is the same.
>
> Thanks,
> Harvey
> ________________________________________
> From: Users <users-bounces at clusterlabs.org> on behalf of Ken Gaillot <kgaillot at redhat.com>
> Sent: Saturday, 1 June 2019 5:40 a.m.
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: EXTERNAL: Re: [ClusterLabs] Pacemaker not reacting as I would expect when two resources fail at the same time
>
> On Thu, 2019-05-30 at 23:39 +0000, Harvey Shepherd wrote:
>> Hi All,
>>
>> I'm running Pacemaker 2.0.1 on a cluster containing two nodes; one
>> master and one slave. I have a main master/slave resource
>> (m_main_system), a group of resources that run in active-active mode
>> (active_active - i.e. run on both nodes), and a group that runs in
>> active-disabled mode (snmp_active_disabled - resources only run on
>> the current promoted master). The snmp_active_disabled group is
>> configured to be co-located with the master of m_main_system, so only
>> a failure of the master m_main_system resource can trigger a
>> failover. The constraints specify that m_main_system must be started
>> before snmp_active_disabled.
>>
>> The problem I'm having is this: if a resource in the
>> snmp_active_disabled group fails and gets into a constant cycle of
>> Pacemaker trying to restart it, and I then kill m_main_system on the
>> master, Pacemaker keeps trying to restart the failed
>> snmp_active_disabled resource and ignores the more important
>> m_main_system failure, which should be triggering a failover. Only
>> once I stabilise the snmp_active_disabled resource does Pacemaker
>> finally act on the m_main_system failure. I hope I've described this
>> well enough, but I've included a cut-down form of my CIB config below
>> in case it helps.
>>
>> Is this a bug or an error in my config? Perhaps the order in which
>> the groups are defined in the CIB matters despite the constraints?
>> Any help would be gratefully received.
>>
>> Thanks,
>> Harvey
>>
>> <configuration>
>>   <crm_config>
>>     <cluster_property_set id="cib-bootstrap-options">
>>       <nvpair name="stonith-enabled" value="false" id="cib-bootstrap-options-stonith-enabled"/>
>>       <nvpair name="no-quorum-policy" value="ignore" id="cib-bootstrap-options-no-quorum-policy"/>
>>       <nvpair name="have-watchdog" value="false" id="cib-bootstrap-options-have-watchdog"/>
>>       <nvpair name="cluster-name" value="lbcluster" id="cib-bootstrap-options-cluster-name"/>
>>       <nvpair name="start-failure-is-fatal" value="false" id="cib-bootstrap-options-start-failure-is-fatal"/>
>>       <nvpair name="cluster-recheck-interval" value="0s" id="cib-bootstrap-options-cluster-recheck-interval"/>
>>     </cluster_property_set>
>>   </crm_config>
>>   <nodes>
>>     <node id="1" uname="primary"/>
>>     <node id="2" uname="secondary"/>
>>   </nodes>
>>   <resources>
>>     <group id="snmp_active_disabled">
>>       <primitive id="snmpd" class="lsb" type="snmpd">
>>         <operations>
>>           <op name="monitor" interval="10s" id="snmpd-monitor-10s"/>
>>           <op name="start" interval="0" timeout="30s" id="snmpd-start-30s"/>
>>           <op name="stop" interval="0" timeout="30s" id="snmpd-stop-30s"/>
>>         </operations>
>>       </primitive>
>>       <primitive id="snmp-auxiliaries" class="lsb" type="snmp-auxiliaries">
>>         <operations>
>>           <op name="monitor" interval="10s" id="snmp-auxiliaries-monitor-10s"/>
>>           <op name="start" interval="0" timeout="30s" id="snmp-auxiliaries-start-30s"/>
>>           <op name="stop" interval="0" timeout="30s" id="snmp-auxiliaries-stop-30s"/>
>>         </operations>
>>       </primitive>
>>     </group>
>>     <clone id="clone_active_active">
>>       <meta_attributes id="clone_active_active_meta_attributes">
>>         <nvpair id="group-unique" name="globally-unique" value="false"/>
>>       </meta_attributes>
>>       <group id="active_active">
>>         <primitive id="logd" class="lsb" type="logd">
>>           <operations>
>>             <op name="monitor" interval="10s" id="logd-monitor-10s"/>
>>             <op name="start" interval="0" timeout="30s" id="logd-start-30s"/>
>>             <op name="stop" interval="0" timeout="30s" id="logd-stop-30s"/>
>>           </operations>
>>         </primitive>
>>         <primitive id="serviced" class="lsb" type="serviced">
>>           <operations>
>>             <op name="monitor" interval="10s" id="serviced-monitor-10s"/>
>>             <op name="start" interval="0" timeout="30s" id="serviced-start-30s"/>
>>             <op name="stop" interval="0" timeout="30s" id="serviced-stop-30s"/>
>>           </operations>
>>         </primitive>
>>       </group>
>>     </clone>
>>     <master id="m_main_system">
>>       <meta_attributes id="m_main_system-meta_attributes">
>>         <nvpair name="notify" value="true" id="m_main_system-meta_attributes-notify"/>
>>         <nvpair name="clone-max" value="2" id="m_main_system-meta_attributes-clone-max"/>
>>         <nvpair name="promoted-max" value="1" id="m_main_system-meta_attributes-promoted-max"/>
>>         <nvpair name="promoted-node-max" value="1" id="m_main_system-meta_attributes-promoted-node-max"/>
>>       </meta_attributes>
>>       <primitive id="main_system" class="ocf" provider="acme" type="main-system-ocf">
>>         <operations>
>>           <op name="start" interval="0" timeout="120s" id="main_system-start-0"/>
>>           <op name="stop" interval="0" timeout="120s" id="main_system-stop-0"/>
>>           <op name="promote" interval="0" timeout="120s" id="main_system-promote-0"/>
>>           <op name="demote" interval="0" timeout="120s" id="main_system-demote-0"/>
>>           <op name="monitor" interval="10s" timeout="10s" role="Master" id="main_system-monitor-10s"/>
>>           <op name="monitor" interval="11s" timeout="10s" role="Slave" id="main_system-monitor-11s"/>
>>           <op name="notify" interval="0" timeout="60s" id="main_system-notify-0"/>
>>         </operations>
>>       </primitive>
>>     </master>
>>   </resources>
>>   <constraints>
>>     <rsc_colocation id="master_only_snmp_rscs_with_main_system" score="INFINITY" rsc="snmp_active_disabled" with-rsc="m_main_system" with-rsc-role="Master"/>
>>     <rsc_order id="snmp_active_disabled_after_main_system" kind="Mandatory" first="m_main_system" then="snmp_active_disabled"/>
>
> You want first-action="promote" in the above constraint, otherwise the
> slave being started (or the master being started but not yet promoted)
> is sufficient to start snmp_active_disabled (even though the colocation
> ensures it will only be started on the same node where the master will
> be).
>
> I'm not sure if that's related to your issue, but it's worth trying
> first.
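
For reference, the amended constraint would look roughly like this
(reusing the existing id; untested sketch):

    <rsc_order id="snmp_active_disabled_after_main_system" kind="Mandatory"
               first="m_main_system" first-action="promote"
               then="snmp_active_disabled" then-action="start"/>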
>
>>     <rsc_order id="active_active_after_main_system" kind="Mandatory" first="m_main_system" then="clone_active_active"/>
>
> You may also want to set interleave to true on clone_active_active, if
> you want it to depend only on the local instance of m_main_system, and
> not both instances.
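
In the existing CIB that would just be one more nvpair in the clone's
meta_attributes, roughly like this (the nvpair id is only illustrative):

    <clone id="clone_active_active">
      <meta_attributes id="clone_active_active_meta_attributes">
        <nvpair id="group-unique" name="globally-unique" value="false"/>
        <nvpair id="clone-interleave" name="interleave" value="true"/>
      </meta_attributes>
      ...
    </clone>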
>
>>   </constraints>
>>   <rsc_defaults>
>>     <meta_attributes id="rsc-options">
>>       <nvpair name="resource-stickiness" value="1" id="rsc-options-resource-stickiness"/>
>>       <nvpair name="migration-threshold" value="0" id="rsc-options-migration-threshold"/>
>>       <nvpair name="requires" value="nothing" id="rsc-options-requires"/>
>>     </meta_attributes>
>>   </rsc_defaults>
>> </configuration>
> --
> Ken Gaillot <kgaillot at redhat.com>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/