[ClusterLabs] Cloned resource is restarted on all nodes if one node fails

Reid Wahl nwahl at redhat.com
Mon Aug 9 15:57:05 EDT 2021


On Mon, Aug 9, 2021 at 6:19 AM Andrei Borzenkov <arvidjaar at gmail.com> wrote:

> On 09.08.2021 16:00, Andreas Janning wrote:
> > Hi,
> >
> > yes, by "service" I meant the apache-clone resource.
> >
> > Maybe I can give a more stripped down and detailed example:
> >
> > *Given the following configuration:*
> > [root at pacemaker-test-1 cluster]# pcs cluster cib --config
> > <configuration>
> >   <crm_config>
> >     <cluster_property_set id="cib-bootstrap-options">
> >       <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/>
> >       <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.23-1.el7_9.1-9acf116022"/>
> >       <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
> >       <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="pacemaker-test"/>
> >       <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
> >       <nvpair id="cib-bootstrap-options-symmetric-cluster" name="symmetric-cluster" value="false"/>
> >       <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1628511747"/>
> >     </cluster_property_set>
> >   </crm_config>
> >   <nodes>
> >     <node id="1" uname="pacemaker-test-1"/>
> >     <node id="2" uname="pacemaker-test-2"/>
> >   </nodes>
> >   <resources>
> >     <clone id="apache-clone">
> >       <primitive class="ocf" id="apache" provider="heartbeat" type="apache">
> >         <instance_attributes id="apache-instance_attributes">
> >           <nvpair id="apache-instance_attributes-port" name="port" value="80"/>
> >           <nvpair id="apache-instance_attributes-statusurl" name="statusurl" value="http://localhost/server-status"/>
> >         </instance_attributes>
> >         <operations>
> >           <op id="apache-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
> >           <op id="apache-start-interval-0s" interval="0s" name="start" timeout="40s"/>
> >           <op id="apache-stop-interval-0s" interval="0s" name="stop" timeout="60s"/>
> >         </operations>
> >       </primitive>
> >       <meta_attributes id="apache-meta_attributes">
> >         <nvpair id="apache-clone-meta_attributes-clone-max" name="clone-max" value="2"/>
> >         <nvpair id="apache-clone-meta_attributes-clone-node-max" name="clone-node-max" value="1"/>
> >         <nvpair id="apache-clone-meta_attributes-interleave" name="interleave" value="true"/>
> >       </meta_attributes>
> >     </clone>
> >   </resources>
> >   <constraints>
> >     <rsc_location id="location-apache-clone-pacemaker-test-1-100" node="pacemaker-test-1" rsc="apache-clone" score="100" resource-discovery="exclusive"/>
> >     <rsc_location id="location-apache-clone-pacemaker-test-2-0" node="pacemaker-test-2" rsc="apache-clone" score="0" resource-discovery="exclusive"/>
> >   </constraints>
> >   <rsc_defaults>
> >     <meta_attributes id="rsc_defaults-options">
> >       <nvpair id="rsc_defaults-options-resource-stickiness" name="resource-stickiness" value="50"/>
> >     </meta_attributes>
> >   </rsc_defaults>
> > </configuration>
> >
> >
> > *With the cluster in a running state:*
> >
> > [root at pacemaker-test-1 cluster]# pcs status
> > Cluster name: pacemaker-test
> > Stack: corosync
> > Current DC: pacemaker-test-2 (version 1.1.23-1.el7_9.1-9acf116022) -
> > partition with quorum
> > Last updated: Mon Aug  9 14:45:38 2021
> > Last change: Mon Aug  9 14:43:14 2021 by hacluster via crmd on
> > pacemaker-test-1
> >
> > 2 nodes configured
> > 2 resource instances configured
> >
> > Online: [ pacemaker-test-1 pacemaker-test-2 ]
> >
> > Full list of resources:
> >
> >  Clone Set: apache-clone [apache]
> >      Started: [ pacemaker-test-1 pacemaker-test-2 ]
> >
> > Daemon Status:
> >   corosync: active/disabled
> >   pacemaker: active/disabled
> >   pcsd: active/enabled
> >
> > *When simulating an error by killing the apache-resource on
> > pacemaker-test-1:*
> >
> > [root at pacemaker-test-1 ~]# killall httpd
> >
> > *After a few seconds, the cluster notices that the apache-resource is down
> > on pacemaker-test-1 and restarts it on pacemaker-test-1 (this is fine):*
> >
> > [root at pacemaker-test-1 cluster]# cat corosync.log | grep crmd:
>
> Never ever filter logs that you show unless you know what you are doing.
>
> You skipped the most interesting part, which is the intended actions.
> They are:
>
> Aug 09 15:59:37.889 ha1 pacemaker-schedulerd[3783] (LogAction)  notice:
>  * Recover    apache:0     ( ha1 -> ha2 )
> Aug 09 15:59:37.889 ha1 pacemaker-schedulerd[3783] (LogAction)  notice:
>  * Move       apache:1     ( ha2 -> ha1 )
>
> So pacemaker decides to "swap" the nodes on which the current instances are running.
>

Correct. I've only skimmed this thread but it looks like:

https://github.com/ClusterLabs/pacemaker/pull/2313
https://bugzilla.redhat.com/show_bug.cgi?id=1931023

I've had some personal things get in the way of following up on the PR for
a while. In my experience, configuring resource-stickiness has worked
around the issue.
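
For example (a sketch only, using pcs syntax; the configuration above already
sets a cluster-wide default of 50, so a value high enough to outweigh the
difference in location scores may be needed), the stickiness can be raised
either globally:

pcs resource defaults resource-stickiness=200

or only for the clone:

pcs resource meta apache-clone resource-stickiness=200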


> Looking at scores
>
> Using the original execution date of: 2021-08-09 12:59:37Z
>
> Current cluster status:
> Online: [ ha1 ha2 ]
>
>  vip    (ocf::pacemaker:Dummy):  Started ha1
>  Clone Set: apache-clone [apache]
>      apache     (ocf::pacemaker:Dummy):  FAILED ha1
>      Started: [ ha2 ]
>
> Allocation scores:
> pcmk__clone_allocate: apache-clone allocation score on ha1: 200
> pcmk__clone_allocate: apache-clone allocation score on ha2: 0
> pcmk__clone_allocate: apache:0 allocation score on ha1: 101
> pcmk__clone_allocate: apache:0 allocation score on ha2: 0
> pcmk__clone_allocate: apache:1 allocation score on ha1: 100
> pcmk__clone_allocate: apache:1 allocation score on ha2: 1
> pcmk__native_allocate: apache:1 allocation score on ha1: 100
> pcmk__native_allocate: apache:1 allocation score on ha2: 1
> pcmk__native_allocate: apache:1 allocation score on ha1: 100
> pcmk__native_allocate: apache:1 allocation score on ha2: 1
> pcmk__native_allocate: apache:0 allocation score on ha1: -INFINITY
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> pcmk__native_allocate: apache:0 allocation score on ha2: 0
> pcmk__native_allocate: vip allocation score on ha1: 100
> pcmk__native_allocate: vip allocation score on ha2: 0
>
> Transition Summary:
>  * Recover    apache:0     ( ha1 -> ha2 )
>  * Move       apache:1     ( ha2 -> ha1 )
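
(Output like the scores and transition summary above can be reproduced with
crm_simulate, e.g. "crm_simulate --live-check --show-scores --simulate" on a
live node, or with "--xml-file <pe-input>" against a saved scheduler input;
a rough sketch, exact options can vary by Pacemaker version.)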
>
>
> No, I do not have an explanation for why pacemaker decides that apache:0
> cannot run on ha1 in this case and so decides to move it to another node.
> It most certainly has something to do with the asymmetric cluster and the
> location scores. If you set the same location score for apache-clone on
> both nodes, pacemaker will recover the failed instance and won't attempt
> to move it, for example:
>
> location location-apache-clone-ha1-100 apache-clone resource-discovery=exclusive 100: ha1
> location location-apache-clone-ha2-100 apache-clone resource-discovery=exclusive 100: ha2
>
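For anyone using pcs rather than crmsh, the equivalent change (a sketch only,
reusing the node and constraint names from the original configuration) would
be to drop the 0-score constraint and re-add it with the same score as the
other node:

pcs constraint remove location-apache-clone-pacemaker-test-2-0
pcs constraint location add location-apache-clone-pacemaker-test-2-100 \
    apache-clone pacemaker-test-2 100 resource-discovery=exclusive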

-- 
Regards,

Reid Wahl, RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA