[ClusterLabs] Disabled resources after parallel removing of group
Александр Руденко
a.rudikk at gmail.com
Mon May 20 04:43:30 EDT 2024
Alexey, thank you!
Now it's clear to me.
Sat, May 18, 2024 at 02:11, <alexey at pavlyuts.ru>:
> Hi Alexander,
>
>
>
> AFAIK, Pacemaker itself deals only with an XML-based configuration
> database (the CIB), shared across the whole cluster. Each time you call pcs or
> any other tool, it takes the XML (or part of it) from Pacemaker, tweaks it, and
> then pushes it back. Each time the XML is pushed, Pacemaker completely
> re-evaluates the new config, looks at the current state, and schedules the
> transition from the current state to the target state. I can't point you to the
> exact place in the docs where this is described, but it comes from the
> Pacemaker docs.
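>
> A quick illustration (just a sketch of mine, nothing official): the root
> <cib> element carries an "epoch" attribute that Pacemaker bumps on every
> configuration change, so you can watch each separate pcs call land as a
> separate push.
>
> # Illustration only: print the root <cib> element of the live CIB and
> # note how its epoch attribute grows after each pcs command.
> pcs cluster cib | grep -m 1 '<cib '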
>
>
>
> Therefore, each pcs command triggers this process immediately, and it seems
> that some async-driven side effects can result from that. However, you can make
> ANY number of changes in one *stroke if Pacemaker receives the new config
> with all of those changes at once*. So you need to make the management tools
> FIRST prepare all the changes and THEN push them all at once. Then there is no
> need to run separate changes in the background, because the preparation is very
> fast, and the final application will happen at the maximum possible speed too.
>
>
>
> Miroslav showed how to handle a bulk delete, but this is the *common way
> to manage any massive change*. Any operation can be done this way! You dump
> the Pacemaker CIB to a file, make all the changes against that file instead of
> writing each one to the live CIB, and then push the result back; Pacemaker will
> then schedule all the changes together.
>
>
>
> You may mix ANY commands: add, change, delete, but use the -f
> <filename> option so the changes are applied against the file. You may keep the
> original to push a diff (as in Miroslav's example), or you may just push the
> whole changed config; AFAIK, there is no difference.
>
>
>
> ###########################
>
> # Make a copy of the CIB into a local file
>
> pcs cluster cib config.xml
>
>
>
> # Do the changes against the file
>
> pcs -f config.xml resource create <blah-blah-blah>
>
>
>
> pcs -f config.xml constraint <blah-blah-blah>
>
>
>
> pcs -f config.xml resource disable <id>
>
>
>
> pcs -f config.xml resource remove <id>
>
>
>
> # And finally push the whole 'configuration' scope back (note: no diff
> # here, only the config scope is pushed)
>
> pcs cluster cib-push config.xml --config
>
>
>
> ############################
>
>
>
> And Pacemaker applies all the changes at once.
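>
> Not from the original mail, just a sketch: if you want to preview what
> Pacemaker would do with the edited file before pushing it, crm_simulate can
> read the saved CIB:
>
> # Optional: simulate the transition Pacemaker would compute for
> # config.xml without touching the live cluster (assumes the file saved above).
> crm_simulate -x config.xml -S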
>
>
>
> Miroslav's example is taken from the pcs man page
> <https://manpages.ubuntu.com/manpages/jammy/man8/pcs.8.html> for the
> 'cluster cib-push' command. My example works too.
>
>
>
> Have a good failover! Means no failover at all )))
>
>
>
> Alex
>
>
>
>
>
> *From:* Users <users-bounces at clusterlabs.org> *On Behalf Of *Александр
> Руденко
> *Sent:* Friday, May 17, 2024 6:46 PM
> *To:* Cluster Labs - All topics related to open-source clustering
> welcomed <users at clusterlabs.org>
> *Subject:* Re: [ClusterLabs] Disabled resources after parallel removing
> of group
>
>
>
> Miroslav, thank you!
>
> It helps me understand that it's not a configuration issue.
>
> BTW, is it okay to create new resources in parallel?
>
> On a timeline it looks like this:
>
> pcs resource create resA1 .... --group groupA
>
> pcs resource create resB1 .... --group groupB
> resA1 Started
> pcs resource create resA2 .... --group groupA
>
> resB1 Started
> pcs resource create resB2 .... --group groupB
>
> resA2 Started
>
> resB2 Started
>
>
>
> For now, it works okay)
>
> In our case, cluster events like 'create' and 'remove' are generated by
> users, and for now we don't have any queue for operations. But now I
> realize that we need a queue for 'remove' operations. Maybe we need a
> queue for 'create' operations too?
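>
> (A sketch only, reusing the resource and group names from the timeline
> above with a placeholder Dummy agent: the same file-based batching works
> for creates, so all of them land in the cluster in a single push.)
>
> # Illustrative only: batch the four creates into one configuration push.
> pcs cluster cib create.xml
> pcs -f create.xml resource create resA1 ocf:heartbeat:Dummy --group groupA
> pcs -f create.xml resource create resB1 ocf:heartbeat:Dummy --group groupB
> pcs -f create.xml resource create resA2 ocf:heartbeat:Dummy --group groupA
> pcs -f create.xml resource create resB2 ocf:heartbeat:Dummy --group groupB
> pcs cluster cib-push create.xml --config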
>
>
>
> Fri, May 17, 2024 at 17:49, Miroslav Lisik <mlisik at redhat.com>:
>
> Hi Aleksandr!
>
> It is not safe to run the `pcs resource remove` command in parallel, because
> you run into exactly the issues you have described. The processes run by the
> remove command are not synchronized.
>
> Unfortunately, the remove command does not support more than one resource
> yet.
>
> If you really need to remove several resources at once, you can use this method:
> 1. get the current cib configuration:
> pcs cluster cib > original.xml
>
> 2. create a new copy of the file:
> cp original.xml new.xml
>
> 3. disable all resources to be removed, using the -f option and the new
> configuration file:
> pcs -f new.xml resource disable <resource id>...
>
> 4. remove the resources, using the -f option and the new configuration file:
> pcs -f new.xml resource remove <resource id>
> ...
>
> 5. push new cib configuration to the cluster
> pcs cluster cib-push new.xml diff-against=original.xml
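>
> (Not part of Miroslav's mail, just his steps put together as one script for
> the four groups from the original message below:)
>
> # Sketch: remove all four groups in a single transition (diff-based push).
> pcs cluster cib > original.xml
> cp original.xml new.xml
> pcs -f new.xml resource disable group-1 group-2 group-3 group-4
> pcs -f new.xml resource remove group-1
> pcs -f new.xml resource remove group-2
> pcs -f new.xml resource remove group-3
> pcs -f new.xml resource remove group-4
> pcs cluster cib-push new.xml diff-against=original.xml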
>
>
> On 5/17/24 13:47, Александр Руденко wrote:
> > Hi!
> >
> > I am new to the Pacemaker world and, unfortunately, I have problems
> > with simple actions like group removal. Please help me understand where
> > I'm wrong.
> >
> > For simplicity I will use standard resources like IPaddr2 (but we have
> > this problem with any of our custom resource types as well).
> >
> > I have 5 groups like this:
> >
> > Full List of Resources:
> > * Resource Group: group-1:
> > * ip-11 (ocf::heartbeat:IPaddr2): Started vdc16
> > * ip-12 (ocf::heartbeat:IPaddr2): Started vdc16
> > * Resource Group: group-2:
> > * ip-21 (ocf::heartbeat:IPaddr2): Started vdc17
> > * ip-22 (ocf::heartbeat:IPaddr2): Started vdc17
> > * Resource Group: group-3:
> > * ip-31 (ocf::heartbeat:IPaddr2): Started vdc18
> > * ip-32 (ocf::heartbeat:IPaddr2): Started vdc18
> > * Resource Group: group-4:
> > * ip-41 (ocf::heartbeat:IPaddr2): Started vdc16
> > * ip-42 (ocf::heartbeat:IPaddr2): Started vdc16
> >
> > The groups were created with the following simple script:
> > cat groups.sh
> > pcs resource create ip-11 ocf:heartbeat:IPaddr2 ip=10.7.1.11
> > cidr_netmask=24 nic=lo op monitor interval=10s --group group-1
> > pcs resource create ip-12 ocf:heartbeat:IPaddr2 ip=10.7.1.12
> > cidr_netmask=24 nic=lo op monitor interval=10s --group group-1
> >
> > pcs resource create ip-21 ocf:heartbeat:IPaddr2 ip=10.7.1.21
> > cidr_netmask=24 nic=lo op monitor interval=10s --group group-2
> > pcs resource create ip-22 ocf:heartbeat:IPaddr2 ip=10.7.1.22
> > cidr_netmask=24 nic=lo op monitor interval=10s --group group-2
> >
> > pcs resource create ip-31 ocf:heartbeat:IPaddr2 ip=10.7.1.31
> > cidr_netmask=24 nic=lo op monitor interval=10s --group group-3
> > pcs resource create ip-32 ocf:heartbeat:IPaddr2 ip=10.7.1.32
> > cidr_netmask=24 nic=lo op monitor interval=10s --group group-3
> >
> > pcs resource create ip-41 ocf:heartbeat:IPaddr2 ip=10.7.1.41
> > cidr_netmask=24 nic=lo op monitor interval=10s --group group-4
> > pcs resource create ip-42 ocf:heartbeat:IPaddr2 ip=10.7.1.42
> > cidr_netmask=24 nic=lo op monitor interval=10s --group group-4
> >
> > Next, I try to remove all of these groups in 'parallel':
> > cat remove.sh
> > pcs resource remove group-1 &
> > sleep 0.2
> > pcs resource remove group-2 &
> > sleep 0.2
> > pcs resource remove group-3 &
> > sleep 0.2
> > pcs resource remove group-4 &
> >
> > After this, every time there are a few resources in some groups that were
> > not removed. It looks like this:
> >
> > Full List of Resources:
> > * Resource Group: group-2 (disabled):
> > * ip-21 (ocf::heartbeat:IPaddr2): Stopped (disabled)
> > * Resource Group: group-4 (disabled):
> > * ip-41 (ocf::heartbeat:IPaddr2): Stopped (disabled)
> >
> > In the logs I can see all resources being stopped successfully, but after
> > stopping some resources it looks like Pacemaker just 'forgot' about the
> > deletion and never performed it.
> >
> > Cluster name: pacemaker1
> > Cluster Summary:
> > * Stack: corosync
> > * Current DC: vdc16 (version 2.1.0-8.el8-7c3f660707) - partition with
> > quorum
> > * Last updated: Fri May 17 14:30:14 2024
> > * Last change: Fri May 17 14:30:05 2024 by root via cibadmin on vdc16
> > * 3 nodes configured
> > * 2 resource instances configured (2 DISABLED)
> >
> > Node List:
> > * Online: [ vdc16 vdc17 vdc18 ]
> >
> > The host OS is CentOS 8.4, and the cluster uses default settings. vdc16,
> > vdc17 and vdc18 are VMs with 4 vCPUs.
> >
> >
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>