[ClusterLabs] Cloned resource is restarted on all nodes if one node fails

Andrei Borzenkov arvidjaar at gmail.com
Mon Aug 9 09:19:29 EDT 2021


On 09.08.2021 16:00, Andreas Janning wrote:
> Hi,
> 
> yes, by "service" I meant the apache-clone resource.
> 
> Maybe I can give a more stripped down and detailed example:
> 
> *Given the following configuration:*
> [root@pacemaker-test-1 cluster]# pcs cluster cib --config
> <configuration>
>   <crm_config>
>     <cluster_property_set id="cib-bootstrap-options">
>       <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog"
> value="false"/>
>       <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
> value="1.1.23-1.el7_9.1-9acf116022"/>
>       <nvpair id="cib-bootstrap-options-cluster-infrastructure"
> name="cluster-infrastructure" value="corosync"/>
>       <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name"
> value="pacemaker-test"/>
>       <nvpair id="cib-bootstrap-options-stonith-enabled"
> name="stonith-enabled" value="false"/>
>       <nvpair id="cib-bootstrap-options-symmetric-cluster"
> name="symmetric-cluster" value="false"/>
>       <nvpair id="cib-bootstrap-options-last-lrm-refresh"
> name="last-lrm-refresh" value="1628511747"/>
>     </cluster_property_set>
>   </crm_config>
>   <nodes>
>     <node id="1" uname="pacemaker-test-1"/>
>     <node id="2" uname="pacemaker-test-2"/>
>   </nodes>
>   <resources>
>     <clone id="apache-clone">
>       <primitive class="ocf" id="apache" provider="heartbeat" type="apache">
>         <instance_attributes id="apache-instance_attributes">
>           <nvpair id="apache-instance_attributes-port" name="port"
> value="80"/>
>           <nvpair id="apache-instance_attributes-statusurl"
> name="statusurl" value="http://localhost/server-status"/>
>         </instance_attributes>
>         <operations>
>           <op id="apache-monitor-interval-10s" interval="10s"
> name="monitor" timeout="20s"/>
>           <op id="apache-start-interval-0s" interval="0s" name="start"
> timeout="40s"/>
>           <op id="apache-stop-interval-0s" interval="0s" name="stop"
> timeout="60s"/>
>         </operations>
>       </primitive>
>       <meta_attributes id="apache-meta_attributes">
>         <nvpair id="apache-clone-meta_attributes-clone-max"
> name="clone-max" value="2"/>
>         <nvpair id="apache-clone-meta_attributes-clone-node-max"
> name="clone-node-max" value="1"/>
>         <nvpair id="apache-clone-meta_attributes-interleave"
> name="interleave" value="true"/>
>       </meta_attributes>
>     </clone>
>   </resources>
>   <constraints>
>     <rsc_location id="location-apache-clone-pacemaker-test-1-100"
> node="pacemaker-test-1" rsc="apache-clone" score="100"
> resource-discovery="exclusive"/>
>     <rsc_location id="location-apache-clone-pacemaker-test-2-0"
> node="pacemaker-test-2" rsc="apache-clone" score="0"
> resource-discovery="exclusive"/>
>   </constraints>
>   <rsc_defaults>
>     <meta_attributes id="rsc_defaults-options">
>       <nvpair id="rsc_defaults-options-resource-stickiness"
> name="resource-stickiness" value="50"/>
>     </meta_attributes>
>   </rsc_defaults>
> </configuration>
> 
> 
> *With the cluster in a running state:*
> 
> [root@pacemaker-test-1 cluster]# pcs status
> Cluster name: pacemaker-test
> Stack: corosync
> Current DC: pacemaker-test-2 (version 1.1.23-1.el7_9.1-9acf116022) -
> partition with quorum
> Last updated: Mon Aug  9 14:45:38 2021
> Last change: Mon Aug  9 14:43:14 2021 by hacluster via crmd on
> pacemaker-test-1
> 
> 2 nodes configured
> 2 resource instances configured
> 
> Online: [ pacemaker-test-1 pacemaker-test-2 ]
> 
> Full list of resources:
> 
>  Clone Set: apache-clone [apache]
>      Started: [ pacemaker-test-1 pacemaker-test-2 ]
> 
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled
> 
> *When simulating an error by killing the apache-resource on
> pacemaker-test-1:*
> 
> [root@pacemaker-test-1 ~]# killall httpd
> 
> *After a few seconds, the cluster notices that the apache-resource is down
> on pacemaker-test-1 and restarts it on pacemaker-test-1 (this is fine):*
> 
> [root@pacemaker-test-1 cluster]# cat corosync.log | grep crmd:

Never ever filter logs that you show unless you know what you are doing.

You skipped the most interesting part, which is the intended actions:

Aug 09 15:59:37.889 ha1 pacemaker-schedulerd[3783] (LogAction)  notice:
 * Recover    apache:0     ( ha1 -> ha2 )
Aug 09 15:59:37.889 ha1 pacemaker-schedulerd[3783] (LogAction)  notice:
 * Move       apache:1     ( ha2 -> ha1 )

So pacemaker decides to "swap" the nodes on which the current instances
are running.

Looking at scores

Using the original execution date of: 2021-08-09 12:59:37Z

Current cluster status:
Online: [ ha1 ha2 ]

 vip	(ocf::pacemaker:Dummy):	 Started ha1
 Clone Set: apache-clone [apache]
     apache	(ocf::pacemaker:Dummy):	 FAILED ha1
     Started: [ ha2 ]

Allocation scores:
pcmk__clone_allocate: apache-clone allocation score on ha1: 200
pcmk__clone_allocate: apache-clone allocation score on ha2: 0
pcmk__clone_allocate: apache:0 allocation score on ha1: 101
pcmk__clone_allocate: apache:0 allocation score on ha2: 0
pcmk__clone_allocate: apache:1 allocation score on ha1: 100
pcmk__clone_allocate: apache:1 allocation score on ha2: 1
pcmk__native_allocate: apache:1 allocation score on ha1: 100
pcmk__native_allocate: apache:1 allocation score on ha2: 1
pcmk__native_allocate: apache:1 allocation score on ha1: 100
pcmk__native_allocate: apache:1 allocation score on ha2: 1
pcmk__native_allocate: apache:0 allocation score on ha1: -INFINITY
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pcmk__native_allocate: apache:0 allocation score on ha2: 0
pcmk__native_allocate: vip allocation score on ha1: 100
pcmk__native_allocate: vip allocation score on ha2: 0

Transition Summary:
 * Recover    apache:0     ( ha1 -> ha2 )
 * Move       apache:1     ( ha2 -> ha1 )
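
(If you want to reproduce output like the above yourself: pacemaker saves
the scheduler input for every transition, and it can be replayed with
crm_simulate. The file name below is only a placeholder, use the pe-input
file referenced in your own logs:

  crm_simulate --simulate --show-scores \
      --xml-file /var/lib/pacemaker/pengine/pe-input-123.bz2

Running "crm_simulate -sL" instead checks the live cluster and prints the
current allocation scores.)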


No, I do not have an explanation for why pacemaker decides that apache:0
cannot run on ha1 in this case and so decides to move it to another
node. It most certainly has something to do with the asymmetric cluster
and the location scores. If you set the same location score for
apache-clone on both nodes, pacemaker will recover the failed instance
in place and won't attempt to move it. For example:

location location-apache-clone-ha1-100 apache-clone
resource-discovery=exclusive 100: ha1
location location-apache-clone-ha2-100 apache-clone
resource-discovery=exclusive 100: ha2
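
Since your configuration was shown with pcs, a roughly equivalent change
done through pcs could look like this (the new constraint ID is only
illustrative; remove the old score-0 constraint first):

  pcs constraint remove location-apache-clone-pacemaker-test-2-0
  pcs constraint location add location-apache-clone-pacemaker-test-2-100 \
      apache-clone pacemaker-test-2 100 resource-discovery=exclusive

Afterwards "pcs resource cleanup apache-clone" clears the recorded failure
so that the next scheduler run recalculates placement.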


