[ClusterLabs] controlling cluster behavior on startup
Ken Gaillot
kgaillot at redhat.com
Mon Jan 29 18:52:33 EST 2024
On Mon, 2024-01-29 at 14:35 -0800, Reid Wahl wrote:
>
>
> On Monday, January 29, 2024, Ken Gaillot <kgaillot at redhat.com> wrote:
> > On Mon, 2024-01-29 at 18:05 +0000, Faaland, Olaf P. via Users
> > wrote:
> >> Hi,
> >>
> >> I have configured clusters of node pairs, so each cluster has 2
> >> nodes. The cluster members are statically defined in
> >> corosync.conf before corosync or pacemaker is started, and quorum
> >> {two_node: 1} is set.
> >>
> >> When both nodes are powered off and I power them on, they do not
> >> start pacemaker at exactly the same time. The time difference
> >> may be a few minutes depending on other factors outside the
> >> nodes.
> >>
> >> My goals are (I call the first node to start pacemaker "node1"):
> >> 1) I want to control how long pacemaker on node1 waits before
> >> fencing node2 if node2 does not start pacemaker.
> >> 2) If node1 is part-way through that waiting period, and node2
> >> starts pacemaker so they detect each other, I would like them to
> >> proceed immediately to probing resource state and starting
> >> resources which are down, not wait until the end of that "grace
> >> period".
> >>
> >> It looks from the documentation like dc-deadtime is how #1 is
> >> controlled, and #2 is expected normal behavior. However, I'm
> >> seeing fence actions before dc-deadtime has passed.
> >>
> >> Am I misunderstanding Pacemaker's expected behavior and/or how
> >> dc-deadtime should be used?
> >
> > You have everything right. The problem is that you're starting
> > with an empty configuration every time, so the default dc-deadtime
> > is being used for the first election (before you can set the
> > desired value).
>
> Why would there be fence actions before dc-deadtime expires though?
There isn't -- after the (default) dc-deadtime pops, the node elects
itself DC and runs the scheduler, which considers the other node unseen
and in need of startup fencing. The dc-deadtime has been raised in the
meantime, but that no longer matters.
>
> >
> > I can't think of anything you can do to get around that, since the
> > controller starts the timer as soon as it starts up. Would it be
> > possible to bake an initial configuration into the PXE image?
> >
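To expand on that a bit: one way to "bake it in" -- untested, and the
image path below is made up -- would be to configure a cluster by hand
once, save the on-disk CIB from /var/lib/pacemaker/cib/, ship it in the
PXE image, and restore it before pacemaker starts, so dc-deadtime is
already 300s when the controller arms its first election timer. In the
same style as your boot-time script, something like:

# Untested sketch: restore a pre-built CIB before starting the cluster,
# so dc-deadtime is already set for the controller's first election.
# /etc/cluster-image/ is a made-up location for files shipped in the
# PXE image; they were copied from /var/lib/pacemaker/cib/ on a node
# that was configured once by hand.
import shutil
import subprocess

for name in ('cib.xml', 'cib.xml.sig'):
    shutil.copy('/etc/cluster-image/' + name,
                '/var/lib/pacemaker/cib/' + name)
subprocess.run(['chown', 'hacluster:haclient',
                '/var/lib/pacemaker/cib/cib.xml',
                '/var/lib/pacemaker/cib/cib.xml.sig'], check=True)
subprocess.run(['pcs', 'cluster', 'start'], check=True)

The .sig file is the digest pacemaker-based keeps next to cib.xml;
copying the pair together should avoid a checksum complaint at startup.
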
> > When the timer value changes, we could stop the existing timer and
> > restart it. There's a risk that some external automation could make
> > repeated changes to the timeout, thus never letting it expire,
> > but that seems preferable to your problem. I've created an issue
> > for that:
> >
> > https://projects.clusterlabs.org/T764
> >
> > BTW there's also election-timeout. I'm not sure offhand how that
> > interacts; it might be necessary to raise that one as well.
> >
> >>
> >> One possibly unusual aspect of this cluster is that these two
> >> nodes are stateless - they PXE boot from an image on another
> >> server - and I build the cluster configuration at boot time with a
> >> series of pcs commands, because the nodes have no local storage
> >> for this purpose. The commands are:
> >>
> >> ['pcs', 'cluster', 'start']
> >> ['pcs', 'property', 'set', 'stonith-action=off']
> >> ['pcs', 'property', 'set', 'cluster-recheck-interval=60']
> >> ['pcs', 'property', 'set', 'start-failure-is-fatal=false']
> >> ['pcs', 'property', 'set', 'dc-deadtime=300']
> >> ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman',
> >> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> >> 'pcmk_host_list=gopher11,gopher12']
> >> ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman',
> >> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> >> 'pcmk_host_list=gopher11,gopher12']
> >> ['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool',
> >> 'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11',
> >> 'op', 'start', 'timeout=805']
> >> ...
> >> ['pcs', 'property', 'set', 'no-quorum-policy=ignore']
> >
> > BTW you don't need to change no-quorum-policy when you're using
> > two_node with Corosync.
> >
> >>
> >> I could, instead, generate a CIB so that when Pacemaker is
> >> started, it has a full config. Is that better?
> >>
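Generating the CIB offline and shipping it in the image should work,
yes -- that's essentially the "bake it in" idea above. pcs can apply
the same commands to a file instead of the live CIB with -f, so
something along these lines might do it (untested; baseline.xml is a
made-up name, seeded once from a running cluster with
"pcs cluster cib > baseline.xml"):

# Untested sketch: build the cluster configuration into a file with
# "pcs -f" so the finished CIB can be shipped in the PXE image.
import subprocess

CIB_FILE = 'baseline.xml'  # made-up name; start from a dump of a real CIB

def pcs(*args):
    subprocess.run(['pcs', '-f', CIB_FILE, *args], check=True)

pcs('property', 'set', 'stonith-action=off')
pcs('property', 'set', 'cluster-recheck-interval=60')
pcs('property', 'set', 'start-failure-is-fatal=false')
pcs('property', 'set', 'dc-deadtime=300')
pcs('stonith', 'create', 'fence_gopher11', 'fence_powerman',
    'ip=192.168.64.65', 'pcmk_host_check=static-list',
    'pcmk_host_list=gopher11,gopher12')
# ...remaining stonith and resource commands as in the list above...

Either way, the important part is that dc-deadtime is already in the
on-disk CIB before pacemaker starts; note that pacemaker-based checks
cib.xml against its cib.xml.sig digest, so installing an offline-edited
file is something you'd want to test.
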
> >> thanks,
> >> Olaf
> >>
> >> === corosync.conf:
> >> totem {
> >>     version: 2
> >>     cluster_name: gopher11
> >>     secauth: off
> >>     transport: udpu
> >> }
> >> nodelist {
> >>     node {
> >>         ring0_addr: gopher11
> >>         name: gopher11
> >>         nodeid: 1
> >>     }
> >>     node {
> >>         ring0_addr: gopher12
> >>         name: gopher12
> >>         nodeid: 2
> >>     }
> >> }
> >> quorum {
> >>     provider: corosync_votequorum
> >>     two_node: 1
> >> }
> >>
> >> === Log excerpt
> >>
> >> Here's an excerpt from the Pacemaker logs that reflects what I'm
> >> seeing. These are from gopher12, the node that came up first. The
> >> other node, which is not yet up, is gopher11.
> >>
> >> Jan 25 17:55:38 gopher12 pacemakerd [116033]
> >> (main) notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7
> >> features:agent-manpages ascii-docs compat-2.0 corosync-ge-2
> >> default-concurrent-fencing generated-manpages monotonic nagios
> >> ncurses remote systemd
> >> Jan 25 17:55:39 gopher12 pacemaker-controld [116040]
> >> (peer_update_callback) info: Cluster node gopher12 is now member
> >> (was in unknown state)
> >> Jan 25 17:55:43 gopher12 pacemaker-based [116035]
> >> (cib_perform_op) info: ++
> >> /cib/configuration/crm_config/cluster_property_set[@id='cib-
> >> bootstrap-options']: <nvpair
> >> id="cib-bootstrap-options-dc-deadtime" name="dc-deadtime"
> >> value="300"/>
> >> Jan 25 17:56:00 gopher12 pacemaker-controld [116040]
> >> (crm_timer_popped) info: Election Trigger just popped |
> >> input=I_DC_TIMEOUT time=300000ms
> >> Jan 25 17:56:01 gopher12 pacemaker-based [116035]
> >> (cib_perform_op) info: ++
> >> /cib/configuration/crm_config/cluster_property_set[@id='cib-
> >> bootstrap-options']: <nvpair id="cib-bootstrap-options-no-quorum-
> >> policy" name="no-quorum-policy" value="ignore"/>
> >> Jan 25 17:56:01 gopher12 pacemaker-controld [116040]
> >> (abort_transition_graph) info: Transition 0 aborted by cib-
> >> bootstrap-options-no-quorum-policy doing create no-quorum-
> >> policy=ignore: Configuration change | cib=0.26.0
> >> source=te_update_diff_v2:464
> >> path=/cib/configuration/crm_config/cluster_property_set[@id='cib-
> >> bootstrap-options'] complete=true
> >> Jan 25 17:56:01 gopher12 pacemaker-controld [116040]
> >> (controld_execute_fence_action) notice: Requesting fencing (off)
> >> targeting node gopher11 | action=11 timeout=60
> >>
> >>
> > --
> > Ken Gaillot <kgaillot at redhat.com>
> >
>
--
Ken Gaillot <kgaillot at redhat.com>