[ClusterLabs] node orderly shutdown

Tue Nov 21 14:31:13 EST 2023

On Tue, Nov 21, 2023 at 3:09 AM lejeczek via Users
<users at clusterlabs.org> wrote:
>
> Hi guys.
>
> A node has a couple of _promoted_ resources. When that node is shut down at the OS level in an orderly manner, the cluster seems to take a while.
> By "a while" I mean longer than I'd expect a relatively simple 3-node cluster to take to move/promote a few _promoted_ resources:
> redis, postgresql, IP
> onto another node.
>
> Is there somewhere one can look, tweak, or measure & troubleshoot in order to "fix" this, if that's possible at all?
>
> From watching such a "promoted" node, I see that as systemd stops all services on the way to the power-down target, it's _pacemaker_ that, as the last one systemd stops, takes a bit longer before shutdown completes.
> Or perhaps you have another approach/technique (or more than one) for shutting down a node with promoted resources, a better one?

Assuming that you do want another node to be able to host those
resources, Pacemaker needs to demote/stop them on the node that's
shutting down. So you'd need to check the logs (for example, the
system logs and /var/log/pacemaker/pacemaker.log) to see what's taking
so long. You might also add `trace_ra=1` to the resource options, to
get a shell trace of everything it's doing and see where it's taking a
long time. The trace output will go to /var/lib/heartbeat/trace_ra.
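
As a rough sketch of the tracing step, assuming the cluster is managed
with pcs (the thread doesn't say which tool is in use), the option
could be switched on and off with small helpers like these. The
function names are made up for illustration, and the resource ID you
pass in is whatever your own primitive is called.

```shell
# Sketch only: assumes pcs is the management tool in use.
# Function names are illustrative, not part of pcs or Pacemaker.

# Turn on shell tracing for the resource agent of the given resource ID.
trace_on() {
    pcs resource update "$1" trace_ra=1
}

# Remove the option again once done (an empty value unsets it in pcs).
trace_off() {
    pcs resource update "$1" trace_ra=
}

# List whatever trace files exist under the directory mentioned above.
list_traces() {
    find /var/lib/heartbeat/trace_ra -type f
}
```

For example, `trace_on my-rsc` (with `my-rsc` standing in for a real
resource ID), then shut the node down, then run `list_traces` after it
comes back up and read the most recent files for the demote and stop
actions.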

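If you don't want another node to run the resources, then setting the
shutdown-lock cluster property to true might help. With it set,
resources that were active on a node when it was cleanly shut down stay
stopped until that node rejoins, instead of being recovered elsewhere.
Note that this setting applies to **all** resources.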
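
A minimal sketch of setting that property, again assuming pcs is the
tool in use (the helper names are made up for illustration):

```shell
# Sketch only: assumes pcs; shutdown-lock is a cluster-wide property.
# Function names are illustrative, not part of pcs or Pacemaker.

# Keep resources of a cleanly shut-down node bound to that node.
lock_on() {
    pcs property set shutdown-lock=true
}

# Go back to the default, where resources are recovered elsewhere.
lock_off() {
    pcs property set shutdown-lock=false
}
```

Since the property is cluster-wide, it only needs to be set once, from
any node, rather than per resource.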

>
> many thanks,

-- 
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker