[ClusterLabs] Resource stop sequence with massive CIB update
alexey at pavlyuts.ru
Mon Aug 12 19:47:36 UTC 2024
Hi All,
We use Pacemaker in a specific scenario: a complex network environment,
including VLANs, IPs and routes, is managed by an external system and
integrated through glue code that:
1. Loads the CIB configuration section with cibadmin --query
--scope=configuration
2. Adds/deletes primitives and constraints
3. Applies the new config with cibadmin --replace --scope=configuration
--xml-pipe --sync-call
The CIB is taken from stdout and the new CIB is fed back via stdin; all of this
is done by Python code.
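For illustration, the glue code boils down to something like this simplified
Python sketch (the helper names are made up; the cibadmin invocations are the
ones listed above):

import subprocess

def read_config():
    # Dump only the <configuration> section of the CIB
    result = subprocess.run(
        ["cibadmin", "--query", "--scope=configuration"],
        check=True, capture_output=True, text=True)
    return result.stdout

def write_config(xml):
    # Push the modified configuration back, feeding the XML on stdin
    subprocess.run(
        ["cibadmin", "--replace", "--scope=configuration",
         "--xml-pipe", "--sync-call"],
        input=xml, check=True, text=True)

config = read_config()
# ... add/delete primitives and constraints in the XML here ...
write_config(config)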
All resource types are handled by the standard ocf:heartbeat resource agents.
VLANs are defined as clones to ensure they are up on all nodes. Order
constraints are then added to start each IP after the VLAN clone (to ensure the
VLAN exists), and each route after the corresponding IP.
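For illustration, the resulting constraints look along these lines (the
resource IDs here are made up):

<rsc_order id="order-ip-after-vlan100" first="vlan100-clone"
           then="ip-198-51-100-10" kind="Mandatory"/>
<rsc_order id="order-route-after-ip" first="ip-198-51-100-10"
           then="route-customer-a" kind="Mandatory"/>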
This works very well for mass-create, but we ran into problems with
mass-delete.
As I understand Pacemaker's architecture and behavior: when it receives a new
config, it recalculates resource allocation, builds the target placement with
respect to (co)location constraints, and then schedules the changes with
respect to order constraints. So if we delete VLANs, IPs and routes all at
once, we also have to delete their constraints, and the scheduling of the
resource stops will then NOT take the order constraints from the OLD config
into consideration. All the stops for VLANs, IPs and routes therefore run in
arbitrary order. As a result:
1. When a VLAN interface is deleted (stopped), the kernel also removes all
IPs bound to that interface, and all routes through it.
2. The IP resource then tries to remove an IP address that is already gone,
and fails. Moreover, as the agents run in parallel, the agent may still see the
IP as active when it checks, but the IP is already gone when it tries to delete
it. Because the stop failed, the resource is left as an orphan (stopped,
blocked) and can only be cleared with a cleanup command. This also ruins
subsequent CIB updates.
3. Roughly the same logic applies between IPs and routes.
After realizing this, I changed the logic to use two stages (two separate
cibadmin pushes):
1. Read the CIB
2. Disable all resources to be deleted by setting target-role=Stopped, and
push that with cibadmin (see the sketch after this list)
3. Delete all the resources from the CIB and push that with cibadmin
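The disable step looks roughly like this (a simplified sketch; the helper and
the ID naming are made up, and for cloned resources the meta attribute may need
to go on the clone element instead of the primitive):

import xml.etree.ElementTree as ET

def disable_resources(config_xml, resource_ids):
    # Add target-role=Stopped to each primitive we are about to delete
    root = ET.fromstring(config_xml)
    for prim in root.iter("primitive"):
        rid = prim.get("id")
        if rid not in resource_ids:
            continue
        meta = prim.find("meta_attributes")
        if meta is None:
            meta = ET.SubElement(prim, "meta_attributes",
                                 id=rid + "-meta_attributes")
        ET.SubElement(meta, "nvpair",
                      id=rid + "-meta-target-role",
                      name="target-role", value="Stopped")
    return ET.tostring(root, encoding="unicode")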
My idea was that at step 2 Pacemaker would plan and perform the resource
shutdown with respect to the order constraints that are still in the config,
and that it would then be safe to delete the already-stopped resources.
But I see the same trouble with this two-stage approach: sometimes resources
still fail to stop because the entity they reference has already been deleted.
It seems like one of two things:
1. Pacemaker does not respect order constraints when we replace the config
section directly, or
2. I misunderstand the --sync-call cibadmin option: it does not actually wait
until the new config has been applied and returns immediately, so the delete
starts before all the stops have completed. I did not find any explanation; my
guess was that it should wait for the changes to be applied by Pacemaker, but I
am not sure.
I need advice on this situation and more information about the --sync-call
option. Is this the right approach, or do I need an extra delay? Or should I
poll the resource state until everything has stopped before sending the delete?
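To make the last question concrete, by polling I mean something along these
lines (a rough sketch; the text matching on the crm_resource --locate output is
only illustrative and may differ between Pacemaker versions):

import subprocess
import time

def is_stopped(rid):
    # Ask Pacemaker where the resource is running: --locate reports the
    # node(s) the resource runs on, or that it is not running.
    # NOTE: matching on the output text is an assumption, for illustration.
    out = subprocess.run(
        ["crm_resource", "--resource", rid, "--locate"],
        capture_output=True, text=True)
    return "is running on" not in (out.stdout + out.stderr)

def wait_for_stop(resource_ids, timeout=60):
    # Poll until every resource reports as not running, or give up.
    deadline = time.time() + timeout
    while time.time() < deadline:
        if all(is_stopped(r) for r in resource_ids):
            return True
        time.sleep(2)
    return False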
I will be very grateful for any ideas or information!
Sincerely,
Alex