[ClusterLabs] Antw: [EXT] Re: Disable all resources in a group if one or more of them fail and are unable to reactivate

damiano giuliani damianogiuliani87 at gmail.com
Thu Jan 28 11:42:44 EST 2021


Hi Ulrich, thanks for the answer.
As Ken explained to me, there isn't any way to prevent earlier members of a
group from running if a later member has no available node.
If no node is available for the failed member, it will just remain stopped,
and the earlier members will stay active where they are.
I really hoped there was a solution or workaround for this, but as Ken
clarified, Pacemaker can't handle this exception.
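
For anyone finding this thread later: one possible external workaround, which is my own assumption and not something Ken or Ulrich proposed, is a small watchdog script that reads the status output and disables the whole group when only some members are started. The sketch below only parses status text shaped like the output quoted further down and builds the `pcs resource disable` command; it does not run it. The function names are hypothetical, and the parsing assumes the "Resource Group:" layout shown in this thread.

```python
import re

# Matches "<name> (<agent>): <State>" entries such as
# "VIP (ocf::heartbeat:IPaddr2): Started acspcmk-01".
MEMBER_RE = re.compile(r"(\S+)\s+\(([^)\s]+)\):\s+(\w+)")

# Excerpt of the status output quoted in this thread.
SAMPLE_STATUS = """\
 Resource Group: LTA_SINGLE_RESOURCES
     VIP        (ocf::heartbeat:IPaddr2):       Started acspcmk-01
     lta-subscription-backend-ope-s1    (systemd:lta-subscription-backend-ope-s1):      Started acspcmk-01
     lta-subscription-backend-ope-s3    (systemd:lta-subscription-backend-ope-s3):      Stopped
     s1ltaquotaservice  (systemd:s1ltaquotaservice):    Stopped

Failed Resource Actions:
"""


def group_member_states(status_text, group_name):
    """Return {member_name: state} for one resource group (hypothetical helper)."""
    header = re.search(
        r"^[ \t]*Resource Group:[ \t]*" + re.escape(group_name) + r"[ \t]*$",
        status_text,
        re.MULTILINE,
    )
    if header is None:
        return {}
    # Assumption: the member list ends at the first blank line after the header.
    body = re.split(r"\n[ \t]*\n", status_text[header.end():], maxsplit=1)[0]
    return {name: state for name, _agent, state in MEMBER_RE.findall(body)}


def is_partially_started(states):
    """True when some, but not all, members of the group are Started."""
    started = sum(1 for state in states.values() if state == "Started")
    return 0 < started < len(states)


def disable_command(group_name):
    """Build (but do not run) the pcs command that sets target-role=Stopped."""
    return ["pcs", "resource", "disable", group_name]


if __name__ == "__main__":
    states = group_member_states(SAMPLE_STATUS, "LTA_SINGLE_RESOURCES")
    if is_partially_started(states):
        print("would run:", " ".join(disable_command("LTA_SINGLE_RESOURCES")))
```

A script like this would have to be run periodically outside the cluster, and the group would stay disabled until someone re-enables it by hand, so treat it as a starting point only.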

Many thanks for your quick and effective support.

Have a good evening!

Damiano


On Thu, 28 Jan 2021 at 11:15, Ulrich Windl <
Ulrich.Windl at rz.uni-regensburg.de> wrote:

> >>> damiano giuliani <damianogiuliani87 at gmail.com> wrote on 27.01.2021
> at 19:25 in message
> <CAG=zYNOx-R=wKbhtm=4N7qaoYKE=ofORVQ7jA0jr17oYjgqOhQ at mail.gmail.com>:
> > Hi Andrei, thanks for your help.
> > If one of the resources in the group fails, or the primary node goes down
> > (acspcmk-02 in my case), the probe notices it and Pacemaker tries to
> > restart the whole resource group on the second node.
> > If the second node can't run one of my grouped resources, it tries to stop
> > them.
>
> And what exactly do you want? The behavior described is how the cluster
> handles it normally.
>
> >
> >
> > I attached my cluster status. My primary node (acspcmk-02) failed and the
> > resource group tried to restart on acspcmk-01. I kept the resource
> > "lta-subscription-backend-ope-s3" broken on purpose and, as you can see,
> > some grouped resources are still started.
> > I would like to know how to achieve a condition where every resource in
> > the group must start properly, and if not, the whole group is stopped
> > with no services left up and running.
> >
> >
> > 2 nodes configured
> > 28 resources configured
> >
> > Online: [ acspcmk-01 ]
> > OFFLINE: [ acspcmk-02 ]
> >
> > Full list of resources:
> >
> >  Clone Set: lta-odata-frontend-ope-s1-clone [lta-odata-frontend-ope-s1]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: lta-odata-frontend-ope-s2-clone [lta-odata-frontend-ope-s2]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: lta-odata-frontend-ope-s3-clone [lta-odata-frontend-ope-s3]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: s1ltaestimationtime-clone [s1ltaestimationtime]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: s2ltaestimationtime-clone [s2ltaestimationtime]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: s3ltaestimationtime-clone [s3ltaestimationtime]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: openresty-clone [openresty]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Resource Group: LTA_SINGLE_RESOURCES
> >      VIP        (ocf::heartbeat:IPaddr2):       Started acspcmk-01
> >      lta-subscription-backend-ope-s1    (systemd:lta-subscription-backend-ope-s1):      Started acspcmk-01
> >      lta-subscription-backend-ope-s2    (systemd:lta-subscription-backend-ope-s2):      Started acspcmk-01
> >      lta-subscription-backend-ope-s3    (systemd:lta-subscription-backend-ope-s3):      Stopped
> >      s1ltaquotaservice  (systemd:s1ltaquotaservice):    Stopped
> >      s2ltaquotaservice  (systemd:s2ltaquotaservice):    Stopped
> >      s3ltaquotaservice  (systemd:s3ltaquotaservice):    Stopped
> >      s1ltarolling       (systemd:s1ltarolling): Stopped
> >      s2ltarolling       (systemd:s2ltarolling): Stopped
> >      s3ltarolling       (systemd:s3ltarolling): Stopped
> >      s1srvnotificationdispatcher        (systemd:s1srvnotificationdispatcher):  Stopped
> >      s2srvnotificationdispatcher        (systemd:s2srvnotificationdispatcher):  Stopped
> >      s3srvnotificationdispatcher        (systemd:s3srvnotificationdispatcher):  Stopped
> >
> > Failed Resource Actions:
> > * lta-subscription-backend-ope-s3_start_0 on acspcmk-01 'unknown error' (1): call=466, status=complete, exitreason='',
> >     last-rc-change='Wed Jan 27 13:00:21 2021', queued=0ms, exec=2128ms
> >
> > Daemon Status:
> >   corosync: active/disabled
> >   pacemaker: active/disabled
> >   pcsd: active/enabled
> >   sbd: active/enabled
> >
> >
> >   I hope I explained my problem as best I can.
> >
> > Thanks for your time and help.
> >
> > Good Evening
> >
> > Damiano
> >
> > On Wed, 27 Jan 2021 at 19:03, Andrei Borzenkov <
> > arvidjaar at gmail.com> wrote:
> >
> >> 27.01.2021 19:06, damiano giuliani wrote:
> >> > Hi all, I'm pretty new to clusters and I'm struggling to configure a
> >> > bunch of resources and test how they fail over. My need is to start and
> >> > manage a group of resources as one (to achieve this, a resource group
> >> > has been created). If one of them can't run and keeps failing, the
> >> > cluster will try to restart the resource group on the secondary node;
> >> > if it can't run all the resources together there, it should disable
> >> > the whole resource group.
> >> > I would like to know if there is a way to set the cluster to disable all
> >> > the resources of the group (or the group itself) if it can't run all
> >> > the resources somewhere.
> >> >
> >>
> >> That's what pacemaker group does. I am not sure what you mean with
> >> "disable all resources". If resource fail count on a node exceeds
> >> threshold, this node is banned from running resource. If resource failed
> >> on every node, no node can run it until you clear fail count.
> >>
> >> "Disable resource" in pacemaker would mean setting its target-role to
> >> stopped. That does not happen automatically (at least I am not aware of
> >> it).
> >> _______________________________________________
> >> Manage your subscription:
> >> https://lists.clusterlabs.org/mailman/listinfo/users
> >>
> >> ClusterLabs home: https://www.clusterlabs.org/
> >>
>
>
>