[ClusterLabs] Pacemaker Shutdown

Reid Wahl nwahl at redhat.com
Wed Jul 22 16:05:47 EDT 2020


On Tue, Jul 21, 2020 at 11:42 PM Harvey Shepherd <
Harvey.Shepherd at aviatnet.com> wrote:

> Hi All,
>
> I'm running Pacemaker 2.0.3 on a two-node cluster, controlling 40+
> resources which are a mixture of clones and other resources that are
> colocated with the master instance of certain clones. I've noticed that if
> I terminate pacemaker on the node that is hosting the master instances of
> the clones, Pacemaker focuses on stopping resources on that node BEFORE
> failing over to the other node, leading to a longer outage than necessary.
> Is there a way to change this behaviour?
>

Hi, Harvey.

As you likely know, a given active/passive resource has to stop on one node
before it can start on another node, and likewise a promoted clone instance
has to demote on one node before it can promote on another. There are
exceptions for clone instances and for promotable clones with promoted-max
> 1 ("allow more than one master instance"). A resource that's configured
to run on only one node at a time should not try to run on two nodes during
failover.
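As a sketch of that promoted-max exception (resource names here are
hypothetical, and the exact pcs syntax may vary by version):

```shell
# Create a promotable clone of a hypothetical resource "my-rsc"
# and allow two instances to be promoted at once:
pcs resource promotable my-rsc promoted-max=2

# Or, set it on an existing promotable clone's meta-attributes:
pcs resource meta my-rsc-clone promoted-max=2
```

With promoted-max=2 on a two-node cluster, the surviving node can already
hold a promoted instance, so no demote/promote cycle is needed at failover.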

With that in mind, what exactly are you wanting to happen? Is the problem
that all resources are stopping on node 1 before *any* of them start on
node 2? Or that you want Pacemaker shutdown to kill the processes on node 1
instead of cleanly shutting them down? Or something different?

> These are the actions and logs I saw during the test:
>

Ack. This seems like it's just telling us that Pacemaker is going through a
graceful shutdown. The info more relevant to the resource stop/start order
would be in /var/log/pacemaker/pacemaker.log (or less detailed in
/var/log/messages) on the DC.
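To find the right node to look at, something like this should work (a
sketch; the log path can differ by distribution):

```shell
# crm_mon's one-shot output includes a "Current DC:" line:
crm_mon -1 | grep "Current DC"

# Then, on that node, look for the scheduler's decisions about
# stop/start/demote/promote ordering:
grep -E "pacemaker-schedulerd|Transition" /var/log/pacemaker/pacemaker.log
```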

> # /etc/init.d/pacemaker stop
> Signaling Pacemaker Cluster Manager to terminate
>
> Waiting for cluster services to
> unload..............................................................sending
> signal 9 to procs
>
>
> 2020 Jul 22 06:16:50.581 Chassis2 daemon.notice CTR8740 pacemaker.
> Signaling Pacemaker Cluster Manager to terminate
> 2020 Jul 22 06:16:50.599 Chassis2 daemon.notice CTR8740 pacemaker. Waiting
> for cluster services to unload
> 2020 Jul 22 06:18:01.794 Chassis2 daemon.warning CTR8740
> pacemaker-based.6140  warning: new_event_notification (6140-6141-9): Broken
> pipe (32)
> 2020 Jul 22 06:18:01.794 Chassis2 daemon.warning CTR8740
> pacemaker-based.6140  warning: Notification of client
> stonithd/665bde82-cb28-40f7-9132-8321dc2f1992 failed
> 2020 Jul 22 06:18:01.794 Chassis2 daemon.warning CTR8740
> pacemaker-based.6140  warning: new_event_notification (6140-6143-8): Broken
> pipe (32)
> 2020 Jul 22 06:18:01.794 Chassis2 daemon.warning CTR8740
> pacemaker-based.6140  warning: Notification of client
> attrd/a26ca273-3422-4ebe-8cb7-95849b8ff130 failed
> 2020 Jul 22 06:18:03.320 Chassis1 daemon.warning CTR8740
> pacemaker-schedulerd.6240  warning: Blind faith: not fencing unseen nodes
> 2020 Jul 22 06:18:58.941 Chassis2 user.crit CTR8740 supervisor. pacemaker
> is inactive (3).
>
> Regards,
> Harvey
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>


-- 
Regards,

Reid Wahl, RHCA
Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA