[ClusterLabs] EXTERNAL: Re: Pacemaker not reacting as I would expect when two resources fail at the same time
Andrei Borzenkov
arvidjaar at gmail.com
Sat Jun 8 03:00:00 EDT 2019
On 08.06.2019 05:12, Harvey Shepherd wrote:
> Thank you for your advice Ken. Sorry for the delayed reply - I was trying out a few things and trying to capture extra info. The changes you suggested make sense, and I have incorporated them into my config. However, the original issue remains: Pacemaker does not attempt to restart the failed m_main_system instance. I tried setting the migration-threshold of that resource to 1, hoping to force Pacemaker to promote it on the other node, but this had no effect - the master instance remains "failed" and the slave instance remains "running" but is never promoted.
As far as I understand, for a clone instance to be promoted on a node,
that node must have an explicit master score or a location constraint
for the clone. The master score is normally set by the resource agent.
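
For example, an OCF promotable agent typically does something along these
lines from its monitor/notify actions to tell Pacemaker which node is
eligible for promotion (illustrative values, not taken from your agent):

    # on a node whose instance is healthy and may be promoted
    crm_master -l reboot -v 100

    # on a node whose instance should not be promoted
    crm_master -l reboot -D

If your agent never sets a master score for the instance running on
"primary", Pacemaker has nothing to base a promotion decision on there,
no matter what migration-threshold is set to.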
> Snipped output from crm_mon:
>
> Current DC: primary (version unknown) - partition with quorum
> Last updated: Sat Jun 8 02:04:05 2019
> Last change: Sat Jun 8 01:51:25 2019 by hacluster via crmd on primary
>
> 2 nodes configured
> 26 resources configured
>
> Online: [ primary secondary ]
>
> Active resources:
>
> Clone Set: m_main_system [main_system] (promotable)
> main_system (ocf::main_system-ocf): FAILED secondary
> Slaves: [ primary ]
>
> Migration Summary:
> * Node secondary:
> main_system: migration-threshold=1 fail-count=1 last-failure='Sat Jun 8 01:52:08 2019'
>
> Failed Resource Actions:
> * main_system_monitor_10000 on secondary 'unknown error' (1): call=214, status=complete, exitreason='',
> last-rc-change='Sat Jun 8 01:52:08 2019', queued=0ms, exec=0ms
>
>
> From the logs I see:
>
> 2019 Jun 8 01:52:09.574 daemon.warning VIRTUAL pacemaker-schedulerd 1131 warning: Processing failed monitor of main_system:1 on secondary: unknown error
> 2019 Jun 8 01:52:09.586 daemon.warning VIRTUAL pacemaker-schedulerd 1131 warning: Forcing m_main_system away from secondary after 1 failures (max=1)
> 2019 Jun 8 01:52:09.586 daemon.warning VIRTUAL pacemaker-schedulerd 1131 warning: Forcing m_main_system away from secondary after 1 failures (max=1)
> 2019 Jun 8 01:52:10.692 daemon.warning VIRTUAL pacemaker-controld 1132 warning: Transition 35 (Complete=33, Pending=0, Fired=0, Skipped=0, Incomplete=67, Source=/var/lib/pacemaker/pengine/pe-input-47.bz2): Terminated
Making this file (pe-input-47.bz2) available may help determine why the
scheduler decided not to promote the resource.
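
You can also replay it locally with crm_simulate; something like the
following should show the allocation and promotion scores the scheduler
computed for that transition (path taken from the log line above):

    crm_simulate --simulate --show-scores \
        --xml-file /var/lib/pacemaker/pengine/pe-input-47.bz2
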
> 2019 Jun 8 01:52:10.692 daemon.warning VIRTUAL pacemaker-controld 1132 warning: Transition failed: terminated
>
>
> Do you have any further suggestions? For your information I've upgraded Pacemaker to 2.0.2, but the behaviour is the same.
>
> Thanks,
> Harvey
> ________________________________________
> From: Users <users-bounces at clusterlabs.org> on behalf of Ken Gaillot <kgaillot at redhat.com>
> Sent: Saturday, 1 June 2019 5:40 a.m.
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: EXTERNAL: Re: [ClusterLabs] Pacemaker not reacting as I would expect when two resources fail at the same time
>
> On Thu, 2019-05-30 at 23:39 +0000, Harvey Shepherd wrote:
>> Hi All,
>>
>> I'm running Pacemaker 2.0.1 on a cluster containing two nodes; one
>> master and one slave. I have a main master/slave resource
>> (m_main_system), a group of resources that run in active-active mode
>> (active_active - i.e. run on both nodes), and a group that runs in
>> active-disabled mode (snmp_active_disabled - resources only run on
>> the current promoted master). The snmp_active_disabled group is
>> configured to be co-located with the master of m_main_system, so only
>> a failure of the master m_main_system resource can trigger a
>> failover. The constraints specify that m_main_system must be started
>> before snmp_active_disabled.
>>
>> The problem I'm having is this: if a resource in the
>> snmp_active_disabled group fails and gets into a constant cycle of
>> Pacemaker trying to restart it, and I then kill m_main_system on the
>> master, Pacemaker keeps trying to restart the failed
>> snmp_active_disabled resource and ignores the more important
>> m_main_system failure, which should be triggering a failover. Only
>> once I stabilise the snmp_active_disabled resource does Pacemaker
>> finally act on the m_main_system failure. I hope I've described this
>> well enough, but I've included a cut-down form of my CIB config below
>> in case it helps.
>>
>> Is this a bug or an error in my config? Perhaps the order in which
>> the groups are defined in the CIB matters despite the constraints?
>> Any help would be gratefully received.
>>
>> Thanks,
>> Harvey
>>
>> <configuration>
>>   <crm_config>
>>     <cluster_property_set id="cib-bootstrap-options">
>>       <nvpair name="stonith-enabled" value="false" id="cib-bootstrap-options-stonith-enabled"/>
>>       <nvpair name="no-quorum-policy" value="ignore" id="cib-bootstrap-options-no-quorum-policy"/>
>>       <nvpair name="have-watchdog" value="false" id="cib-bootstrap-options-have-watchdog"/>
>>       <nvpair name="cluster-name" value="lbcluster" id="cib-bootstrap-options-cluster-name"/>
>>       <nvpair name="start-failure-is-fatal" value="false" id="cib-bootstrap-options-start-failure-is-fatal"/>
>>       <nvpair name="cluster-recheck-interval" value="0s" id="cib-bootstrap-options-cluster-recheck-interval"/>
>>     </cluster_property_set>
>>   </crm_config>
>>   <nodes>
>>     <node id="1" uname="primary"/>
>>     <node id="2" uname="secondary"/>
>>   </nodes>
>>   <resources>
>>     <group id="snmp_active_disabled">
>>       <primitive id="snmpd" class="lsb" type="snmpd">
>>         <operations>
>>           <op name="monitor" interval="10s" id="snmpd-monitor-10s"/>
>>           <op name="start" interval="0" timeout="30s" id="snmpd-start-30s"/>
>>           <op name="stop" interval="0" timeout="30s" id="snmpd-stop-30s"/>
>>         </operations>
>>       </primitive>
>>       <primitive id="snmp-auxiliaries" class="lsb" type="snmp-auxiliaries">
>>         <operations>
>>           <op name="monitor" interval="10s" id="snmp-auxiliaries-monitor-10s"/>
>>           <op name="start" interval="0" timeout="30s" id="snmp-auxiliaries-start-30s"/>
>>           <op name="stop" interval="0" timeout="30s" id="snmp-auxiliaries-stop-30s"/>
>>         </operations>
>>       </primitive>
>>     </group>
>>     <clone id="clone_active_active">
>>       <meta_attributes id="clone_active_active_meta_attributes">
>>         <nvpair id="group-unique" name="globally-unique" value="false"/>
>>       </meta_attributes>
>>       <group id="active_active">
>>         <primitive id="logd" class="lsb" type="logd">
>>           <operations>
>>             <op name="monitor" interval="10s" id="logd-monitor-10s"/>
>>             <op name="start" interval="0" timeout="30s" id="logd-start-30s"/>
>>             <op name="stop" interval="0" timeout="30s" id="logd-stop-30s"/>
>>           </operations>
>>         </primitive>
>>         <primitive id="serviced" class="lsb" type="serviced">
>>           <operations>
>>             <op name="monitor" interval="10s" id="serviced-monitor-10s"/>
>>             <op name="start" interval="0" timeout="30s" id="serviced-start-30s"/>
>>             <op name="stop" interval="0" timeout="30s" id="serviced-stop-30s"/>
>>           </operations>
>>         </primitive>
>>       </group>
>>     </clone>
>>     <master id="m_main_system">
>>       <meta_attributes id="m_main_system-meta_attributes">
>>         <nvpair name="notify" value="true" id="m_main_system-meta_attributes-notify"/>
>>         <nvpair name="clone-max" value="2" id="m_main_system-meta_attributes-clone-max"/>
>>         <nvpair name="promoted-max" value="1" id="m_main_system-meta_attributes-promoted-max"/>
>>         <nvpair name="promoted-node-max" value="1" id="m_main_system-meta_attributes-promoted-node-max"/>
>>       </meta_attributes>
>>       <primitive id="main_system" class="ocf" provider="acme" type="main-system-ocf">
>>         <operations>
>>           <op name="start" interval="0" timeout="120s" id="main_system-start-0"/>
>>           <op name="stop" interval="0" timeout="120s" id="main_system-stop-0"/>
>>           <op name="promote" interval="0" timeout="120s" id="main_system-promote-0"/>
>>           <op name="demote" interval="0" timeout="120s" id="main_system-demote-0"/>
>>           <op name="monitor" interval="10s" timeout="10s" role="Master" id="main_system-monitor-10s"/>
>>           <op name="monitor" interval="11s" timeout="10s" role="Slave" id="main_system-monitor-11s"/>
>>           <op name="notify" interval="0" timeout="60s" id="main_system-notify-0"/>
>>         </operations>
>>       </primitive>
>>     </master>
>>   </resources>
>>   <constraints>
>>     <rsc_colocation id="master_only_snmp_rscs_with_main_system" score="INFINITY" rsc="snmp_active_disabled" with-rsc="m_main_system" with-rsc-role="Master"/>
>>     <rsc_order id="snmp_active_disabled_after_main_system" kind="Mandatory" first="m_main_system" then="snmp_active_disabled"/>
>
> You want first-action="promote" in the above constraint, otherwise the
> slave being started (or the master being started but not yet promoted)
> is sufficient to start snmp_active_disabled (even though the colocation
> ensures it will only be started on the same node where the master will
> be).
>
> I'm not sure if that's related to your issue, but it's worth trying
> first.
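
For reference, the amended constraint would look roughly like this
(reusing the existing id; untested sketch):

    <rsc_order id="snmp_active_disabled_after_main_system" kind="Mandatory"
               first="m_main_system" first-action="promote"
               then="snmp_active_disabled" then-action="start"/>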
>
>>     <rsc_order id="active_active_after_main_system" kind="Mandatory" first="m_main_system" then="clone_active_active"/>
>
> You may also want to set interleave to true on clone_active_active, if
> you want it to depend only on the local instance of m_main_system, and
> not both instances.
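
In the existing CIB that would just be one more nvpair in the clone's
meta_attributes, roughly like this (the nvpair id is only illustrative):

    <clone id="clone_active_active">
      <meta_attributes id="clone_active_active_meta_attributes">
        <nvpair id="group-unique" name="globally-unique" value="false"/>
        <nvpair id="clone-interleave" name="interleave" value="true"/>
      </meta_attributes>
      ...
    </clone>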
>
>>   </constraints>
>>   <rsc_defaults>
>>     <meta_attributes id="rsc-options">
>>       <nvpair name="resource-stickiness" value="1" id="rsc-options-resource-stickiness"/>
>>       <nvpair name="migration-threshold" value="0" id="rsc-options-migration-threshold"/>
>>       <nvpair name="requires" value="nothing" id="rsc-options-requires"/>
>>     </meta_attributes>
>>   </rsc_defaults>
>> </configuration>
> --
> Ken Gaillot <kgaillot at redhat.com>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/