[ClusterLabs] Resource stop sequence with massive CIB update

Mon Aug 12 22:15:59 UTC 2024

Hi Ken, 

Thank you very much for your prompt help!

> The second one: --sync-call only waits for the change to be committed to the CIB,
> not for the cluster to respond. For that, call crm_resource --wait afterward.

It seems that crm_resource --wait is a very reliable and proven approach, but a veeeeery slow one. I found that even on a stable (idle) cluster it takes around 2-2.4 seconds for this command to return. When I add it to my script flow, after the resource disable and before the deletion, it takes about the same time, 2-2.2 sec:

2024-08-12 21:58:14,105 - DEBUG Start applying resource stop changes
2024-08-12 21:58:14,126 - DEBUG Start crm_resource --wailt
2024-08-12 21:58:16,152 - DEBUG Start applying final changes
2024-08-12 21:58:16,173 - INFO     Send request answer

Here it waits about 2.03 s (21:58:14,126 to 21:58:16,152), the same as a call on a stable cluster, so I suppose the delay has little to do with the cluster state.

But I know that the actual stop completes within about 100-200 ms unless some special condition occurs. The problem is that my code works as an API server for an external system, and this is only one small part of the job, so 2 s of waiting is not good for me (((

Is there any way to get a list of unfinished start/stop tasks? Sure, I can build the list of resources to be stopped and poll crm_mon for their status until all of them are stopped or an error occurs. I can also catch resource events and watch until every planned stop has finished. But it looks simpler and more reliable to get the list of active stop tasks, if that is available.
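
To make the crm_mon polling variant concrete, here is a minimal sketch of what I have in mind. It is not tested against a real cluster. It assumes a Pacemaker version whose crm_mon supports --output-as=xml and whose XML output carries <resource> elements with "id" and "role" attributes. The helper names (unstopped, fetch_status, wait_stopped) and the timeout and interval values are mine and purely illustrative. Handling of failed stops is left out.

```python
import subprocess
import time
import xml.etree.ElementTree as ET


def unstopped(xml_text, wanted):
    """Return the subset of `wanted` resource IDs not reported as Stopped."""
    root = ET.fromstring(xml_text)
    roles = {}
    # A clone may list several instances under one ID, so collect every role.
    for res in root.iter("resource"):
        roles.setdefault(res.get("id"), set()).add(res.get("role"))
    # Assumption: an ID missing from the output is treated as still pending.
    return {rid for rid in wanted if roles.get(rid, {"?"}) != {"Stopped"}}


def fetch_status():
    """One-shot crm_mon run; --inactive so stopped resources are listed too."""
    result = subprocess.run(
        ["crm_mon", "--one-shot", "--inactive", "--output-as=xml"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def wait_stopped(wanted, fetch=fetch_status, timeout=5.0, interval=0.05):
    """Poll until every resource in `wanted` is Stopped; False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if not unstopped(fetch(), wanted):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Usage would be something like wait_stopped({"my_rsc_1", "my_rsc_2"}) between the disable and the delete steps (the resource IDs here are placeholders). Each poll spawns a crm_mon process, so I am not sure how much of the 2 s this would actually save.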

Thank you once again!

Sincerely,

Aalex