[ClusterLabs] [EXTERNAL] Re: "node is unclean" leads to gratuitous reboot

Chris Walker cwalker at cray.com
Thu Jul 11 11:37:48 EDT 2019


On 7/11/19 6:52 AM, Users wrote:

On Thu, Jul 11, 2019 at 12:58 PM Lars Ellenberg
<lars.ellenberg at linbit.com><mailto:lars.ellenberg at linbit.com> wrote:



On Wed, Jul 10, 2019 at 06:15:56PM +0000, Michael Powell wrote:


Thanks to you and Andrei for your responses.  In our particular
situation, we want to be able to operate with either node in
stand-alone mode, or with both nodes protected by HA.  I did not
mention this, but I am working on upgrading our product
from a version which used Pacemaker version 1.0.13 and Heartbeat
to run under CentOS 7.6 (later 8.0).
The older version did not exhibit this behavior, hence my concern.



Heartbeat by default has much less aggressive timeout settings,
and clearly distinguishes between "deadtime" and "initdead"; the
latter is basically a "wait_for_all" with a timeout: how long to wait
for other nodes during startup before declaring them dead and
proceeding with the startup sequence, ultimately fencing unseen nodes
anyway.
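
(For reference, a minimal Heartbeat ha.cf excerpt showing those two
settings; the values below are purely illustrative, not a recommendation:

# declare a silent peer dead after 30 seconds during normal operation
deadtime 30
# but allow up to 120 seconds for peers to show up at initial startup
initdead 120
)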

Pacemaker itself has "dc-deadtime", documented as
"How long to wait for a response from other nodes during startup.",



The documentation is incomplete: it is the timeout before starting a
DC (re-)election, so it also applies to failure of the current DC and
will delay recovery.

At least that is how I understand it :)


Along these same lines, a drawback to extending dc-deadtime is that the cluster always waits for dc-deadtime to expire before starting resources, even when all nodes have already joined, so a long dc-deadtime delays every startup by at least that long.
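
(If it helps, the current value can be queried with crm_attribute; just a
sketch -- the command reports an error if the property was never set
explicitly, in which case the built-in 20s default applies:

crm_attribute --type crm_config --name dc-deadtime --query
)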

I mentioned this in a previous post, but we dealt with this by synchronizing the startup of Corosync and Pacemaker on the two nodes with a simple systemd ExecStartPre script:


# cat /etc/systemd/system/corosync.service.d/ha_wait.conf
[Service]
ExecStartPre=/sbin/ha_wait.sh
TimeoutStartSec=11min


where ha_wait.sh has something like:

#!/bin/bash
#
# Block Corosync startup until the HA peer starts Corosync too,
# or until ${timeout} seconds have passed.

timeout=600

peer=<hostname of HA peer>

echo "Waiting for ${peer}"

# Return 0 if corosync.service is active on the peer (checked remotely
# via systemctl -H), 1 otherwise.
peerup() {
  systemctl -H ${peer} is-active --quiet corosync.service 2> /dev/null && return 0
  return 1
}

start=${SECONDS}
while ! peerup && [ $((SECONDS-start)) -lt ${timeout} ]; do
  echo -n .
  sleep 5
done

peerup && echo "${peer} is up, starting HA" || echo "${peer} not up after ${timeout}s, starting HA alone"
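
(For completeness, the drop-in only takes effect after a daemon-reload,
so the setup is roughly:

mkdir -p /etc/systemd/system/corosync.service.d
# copy ha_wait.conf there and install /sbin/ha_wait.sh, then:
chmod +x /sbin/ha_wait.sh
systemctl daemon-reload
)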




This will cause Corosync startup to block while waiting for the partner node to begin starting Corosync. Once the partner begins, both nodes start Corosync/Pacemaker at nearly the same time. If one node never comes up, the partner will wait 10 minutes before starting on its own, after which the absent node will be fenced (startup fencing and subsequent resource startup will only occur if no-quorum-policy is set to ignore).
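
(To check what those properties are currently set to on a given cluster,
something like this works; crm_attribute reports an error for a property
that has never been set explicitly:

crm_attribute --type crm_config --name no-quorum-policy --query
crm_attribute --type crm_config --name startup-fencing --query
)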

Thanks,
Chris




but the 20s default in current Pacemaker is most likely much
shorter than what you had as initdead in your "old" setup.

So maybe if you set dc-deadtime to two minutes or something,
that would give you the "expected" behavior?
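
(As a sketch, assuming crm_attribute is available, that would be
something like:

# raise dc-deadtime from its 20s default to two minutes
crm_attribute --type crm_config --name dc-deadtime --update 2min
)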




If you call two isolated single-node clusters running the same
applications, likely using the same shared resources, "expected",
then just set startup-fencing=false, but do not complain about data
corruption afterwards.

