[ClusterLabs] Resource stop sequence with massive CIB update

Ken Gaillot kgaillot at redhat.com
Mon Aug 12 21:35:36 UTC 2024


On Mon, 2024-08-12 at 22:47 +0300, alexey at pavlyuts.ru wrote:
> Hi All,
>  
> We use Pacemaker in a specific scenario where a complex network
> environment, including VLANs, IPs and routes, is managed by an
> external system and integrated by glue code that:
> 1. Loads the CIB configuration section with cibadmin --query
>    --scope=configuration
> 2. Adds/deletes primitives and constraints
> 3. Applies the new config with cibadmin --replace
>    --scope=configuration --xml-pipe --sync-call
> The CIB is taken from stdout and the new CIB is loaded via stdin,
> all done by Python code.
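
For reference, a minimal sketch of that kind of glue flow (hypothetical
function names; the actual add/delete editing is left out) might look
like:

    import subprocess

    def read_config():
        # Fetch only the <configuration> section of the CIB as XML text.
        result = subprocess.run(
            ["cibadmin", "--query", "--scope=configuration"],
            check=True, capture_output=True, text=True)
        return result.stdout

    def write_config(xml_text):
        # Push the edited section back; --xml-pipe reads the XML from
        # stdin, --sync-call waits for the CIB update to be committed.
        subprocess.run(
            ["cibadmin", "--replace", "--scope=configuration",
             "--xml-pipe", "--sync-call"],
            input=xml_text, check=True, text=True)
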
>  
> All resource types are handled with the standard ocf:heartbeat
> resource agents.
>  
> VLANs are defined as clones to ensure they are up on all nodes.
> Then, order constraints are given to start each IP after the VLAN
> clone (to ensure the VLAN exists), and each route after its IP.
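
For illustration, with made-up resource IDs, that ordering corresponds
to CIB constraints roughly like:

    <rsc_order id="order-ip-after-vlan" first="vlan100-clone"
               then="ip-100-1" kind="Mandatory"/>
    <rsc_order id="order-route-after-ip" first="ip-100-1"
               then="route-100-1" kind="Mandatory"/>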
>  
> This works very well on mass-create, but we ran into some problems
> on mass-delete.
>  
> My understanding of Pacemaker's architecture and behavior is: when
> it gets a new config, it recalculates resource allocation, builds
> the target placement with respect to [co]location constraints, and
> then schedules the changes with respect to order constraints. So if
> we delete the VLANs, IPs and routes at once, we also have to delete
> their constraints. Then the scheduling of the resource stops will
> NOT take the order constraints from the OLD config into
> consideration, and all the stops for VLANs, IPs and routes will
> start in arbitrary order. However:

Correct

> If a VLAN is deleted (stopped), that also removes all IPs bound to
> the interface, and all routes. Then an IP resource tries to remove
> an IP address that has already been deleted, and fails. Moreover,
> as the agents run in parallel, one may see the IP active when it
> checks, but the IP is already gone when it tries to delete it. Once
> it has failed, it is left as an orphan (stopped, blocked) and can
> only be cleared with a cleanup command, and this also ruins future
> CIB updates.
> The same logic applies between IPs and routes.
>  
> After realizing this, I changed the logic to use two stages:
> 1. Read the CIB
> 2. Disable all resources to be deleted (by setting
>    target-role=Stopped) and send it with cibadmin
> 3. Delete all the resources from the CIB and send it with cibadmin

Good plan
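
For illustration, the "disable" stage amounts to adding a meta
attribute like this (IDs made up) inside each resource that is about
to be deleted, before pushing the config:

    <meta_attributes id="ip-100-1-meta">
      <nvpair id="ip-100-1-meta-target-role"
              name="target-role" value="Stopped"/>
    </meta_attributes>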

> My idea was that Pacemaker would plan and perform the resource
> shutdown at step 2 with respect to the order constraints that are
> still in the config, and that it would then be safe to delete the
> already-stopped resources.
>  
> But I see the same trouble with this two-stage approach. Sometimes
> some resources fail to stop because the entity they reference has
> already been deleted.
>  
> It seems like one of two things:
> 1. Pacemaker does not respect order constraints when we replace the
>    config section directly, or
> 2. I misunderstand cibadmin's --sync-call option: it does not wait
>    for the new config to really be applied but returns immediately,
>    so the deletes start before all the stops complete. I did not
>    find any explanation; my guess was that it should wait for the
>    changes to be applied by Pacemaker, but I am not sure.

The second one: --sync-call only waits for the change to be committed
to the CIB, not for the cluster to respond. For that, call crm_resource
--wait afterward.
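
For example, the two-stage flow could insert that wait between the two
pushes. A sketch only; push_config() and the two XML variables are
hypothetical stand-ins for the glue code described above:

    import subprocess

    def push_config(xml_text):
        # --sync-call only waits for the CIB commit, not for the
        # resulting resource actions.
        subprocess.run(
            ["cibadmin", "--replace", "--scope=configuration",
             "--xml-pipe", "--sync-call"],
            input=xml_text, check=True, text=True)

    def settle():
        # Block until the cluster has finished reacting to the change.
        subprocess.run(["crm_resource", "--wait"], check=True)

    # stopped_xml: the config with target-role=Stopped on the doomed
    # resources; deleted_xml: the config with them (and their
    # constraints) removed.
    #
    #     push_config(stopped_xml)   # stops happen in constraint order
    #     settle()                   # wait for the stops to finish
    #     push_config(deleted_xml)   # now the deletes are safe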

>  
> I need advice about this situation and more information about the
> --sync-call option. Is this the right approach, or do I need an
> extra delay? Or should I wait for everything to stop by polling the
> state repeatedly?
>  
> I will be very grateful for any ideas or information!
>  
> Sincerely,
>  
> Alex
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot <kgaillot at redhat.com>


