[ClusterLabs] controlling cluster behavior on startup
Ken Gaillot
kgaillot at redhat.com
Mon Jan 29 13:51:08 EST 2024
On Mon, 2024-01-29 at 18:05 +0000, Faaland, Olaf P. via Users wrote:
> Hi,
>
> I have configured clusters of node pairs, so each cluster has 2
> nodes. The cluster members are statically defined in corosync.conf
> before corosync or pacemaker is started, and quorum {two_node: 1} is
> set.
>
> When both nodes are powered off and I power them on, they do not
> start pacemaker at exactly the same time. The time difference may be
> a few minutes depending on other factors outside the nodes.
>
> My goals are (I call the first node to start pacemaker "node1"):
> 1) I want to control how long pacemaker on node1 waits before fencing
> node2 if node2 does not start pacemaker.
> 2) If node1 is part-way through that waiting period, and node2 starts
> pacemaker so they detect each other, I would like them to proceed
> immediately to probing resource state and starting resources which
> are down, not wait until the end of that "grace period".
>
> It looks from the documentation like dc-deadtime is how #1 is
> controlled, and #2 is expected normal behavior. However, I'm seeing
> fence actions before dc-deadtime has passed.
>
> Am I misunderstanding Pacemaker's expected behavior and/or how dc-
> deadtime should be used?
You have everything right. The problem is that you're starting with an
empty configuration every time, so the default dc-deadtime is being
used for the first election (before you can set the desired value).
I can't think of anything you can do to get around that, since the
controller starts the timer as soon as it starts up. Would it be
possible to bake an initial configuration into the PXE image?
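One sketch of that approach, assuming the usual CIB location and hacluster/haclient ownership (verify both against your distribution before relying on this):

```shell
# Build time: capture a CIB with the desired properties once, on any
# node with the pacemaker tools installed, and copy it into the PXE
# image (the path inside the image is an example):
pcs cluster cib > image-root/etc/pacemaker-seed/cib.xml

# Boot time: install the seed CIB before pacemaker's first start, so
# the very first DC election already runs with your dc-deadtime:
install -d -o hacluster -g haclient -m 0750 /var/lib/pacemaker/cib
install -o hacluster -g haclient -m 0600 \
    /etc/pacemaker-seed/cib.xml /var/lib/pacemaker/cib/cib.xml
systemctl start pacemaker
```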
When the timer value changes, we could stop the existing timer and
restart it. There's a risk that some external automation could make
repeated changes to the timeout, thus never letting it expire, but that
seems preferable to your problem. I've created an issue for that:
https://projects.clusterlabs.org/T764
BTW there's also election-timeout. I'm not sure offhand how that
interacts; it might be necessary to raise that one as well.
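If you do pre-seed the configuration, raising both timers would look like this (the values mirror your 300-second target; whether election-timeout actually needs to match is the open question above):

```shell
# dc-deadtime: how long to wait for a peer before the first election.
pcs property set dc-deadtime=300
# election-timeout: declares an election failed if it doesn't
# complete in this time; may need raising alongside dc-deadtime.
pcs property set election-timeout=300
```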
>
> One possibly unusual aspect of this cluster is that these two nodes
> are stateless - they PXE boot from an image on another server - and I
> build the cluster configuration at boot time with a series of pcs
> commands, because the nodes have no local storage for this
> purpose. The commands are:
>
> ['pcs', 'cluster', 'start']
> ['pcs', 'property', 'set', 'stonith-action=off']
> ['pcs', 'property', 'set', 'cluster-recheck-interval=60']
> ['pcs', 'property', 'set', 'start-failure-is-fatal=false']
> ['pcs', 'property', 'set', 'dc-deadtime=300']
> ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman',
> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> 'pcmk_host_list=gopher11,gopher12']
> ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman',
> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> 'pcmk_host_list=gopher11,gopher12']
> ['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool',
> 'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11', 'op',
> 'start', 'timeout=805']
> ...
> ['pcs', 'property', 'set', 'no-quorum-policy=ignore']
BTW you don't need to change no-quorum-policy when you're using
two_node with Corosync: with two_node: 1, votequorum lets the
surviving node retain quorum when its partner is lost, so
no-quorum-policy=ignore is redundant.
>
> I could, instead, generate a CIB so that when Pacemaker is started,
> it has a full config. Is that better?
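The offline approach described above can be sketched with pcs's -f mode: build up a local CIB file, then push the whole configuration in one step (the sequence is illustrative, mirroring the commands listed earlier):

```shell
pcs cluster cib > /tmp/seed.xml                    # dump current CIB
pcs -f /tmp/seed.xml property set dc-deadtime=300
pcs -f /tmp/seed.xml property set stonith-action=off
pcs -f /tmp/seed.xml stonith create fence_gopher11 fence_powerman \
    ip=192.168.64.65 pcmk_host_check=static-list \
    pcmk_host_list=gopher11,gopher12
# ... remaining stonith devices and resources ...
pcs cluster cib-push /tmp/seed.xml --config        # apply atomically
```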
>
> thanks,
> Olaf
>
> === corosync.conf:
> totem {
>     version: 2
>     cluster_name: gopher11
>     secauth: off
>     transport: udpu
> }
> nodelist {
>     node {
>         ring0_addr: gopher11
>         name: gopher11
>         nodeid: 1
>     }
>     node {
>         ring0_addr: gopher12
>         name: gopher12
>         nodeid: 2
>     }
> }
> quorum {
>     provider: corosync_votequorum
>     two_node: 1
> }
>
> === Log excerpt
>
> Here's an excerpt from the Pacemaker logs that reflects what I'm
> seeing. These are from gopher12, the node that came up first. The
> other node, which is not yet up, is gopher11.
>
> Jan 25 17:55:38 gopher12 pacemakerd [116033] (main) notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features:agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages monotonic nagios ncurses remote systemd
> Jan 25 17:55:39 gopher12 pacemaker-controld [116040] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state)
> Jan 25 17:55:43 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']: <nvpair id="cib-bootstrap-options-dc-deadtime" name="dc-deadtime" value="300"/>
> Jan 25 17:56:00 gopher12 pacemaker-controld [116040] (crm_timer_popped) info: Election Trigger just popped | input=I_DC_TIMEOUT time=300000ms
> Jan 25 17:56:01 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']: <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
> Jan 25 17:56:01 gopher12 pacemaker-controld [116040] (abort_transition_graph) info: Transition 0 aborted by cib-bootstrap-options-no-quorum-policy doing create no-quorum-policy=ignore: Configuration change | cib=0.26.0 source=te_update_diff_v2:464 path=/cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options'] complete=true
> Jan 25 17:56:01 gopher12 pacemaker-controld [116040] (controld_execute_fence_action) notice: Requesting fencing (off) targeting node gopher11 | action=11 timeout=60
>
>
--
Ken Gaillot <kgaillot at redhat.com>