[ClusterLabs] question about dc-deadtime

Andrew Beekhof abeekhof at redhat.com
Mon Jan 9 18:52:56 EST 2017


On Fri, Dec 16, 2016 at 7:26 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
> On 12/15/2016 02:00 PM, Chris Walker wrote:
>> Hello,
>>
>> I have a quick question about dc-deadtime.  I believe that Digimer and
>> others on this list might have already addressed this, but I want to
>> make sure I'm not missing something.
>>
>> If my understanding is correct, dc-deadtime sets the amount of time that
>> must elapse before a cluster is formed (DC is elected, etc), regardless
>> of which nodes have joined the cluster.  In other words, even if all
>> nodes that are explicitly enumerated in the nodelist section have
>> started Pacemaker, they will still wait dc-deadtime before forming a
>> cluster.
>>
>> In my case, I have a two-node cluster on which I'd like to allow a
>> pretty long time (~5 minutes) for both nodes to join before giving up on
>> them.  However, if they both join quickly, I'd like to proceed to form a
>> cluster immediately; I don't want to wait for the full five minutes to
>> elapse before forming a cluster.  Further, if a node doesn't respond
>> within five minutes, I want to fence it and start resources on the node
>> that is up.
>
> Pacemaker+corosync behaves as you describe by default.
>
> dc-deadtime is how long to wait for an election to finish, but if the
> election finishes sooner than that (i.e. a DC is elected), it stops
> waiting. It doesn't even wait for all nodes, just a quorum.

You're confusing dc_deadtime with election_timeout:

./crmd/control.c:899: { XML_CONFIG_ATTR_DC_DEADTIME, "dc_deadtime", "time", NULL, "20s", &check_time,
./crmd/control.c-900-          "How long to wait for a response from other nodes during startup.",
./crmd/control.c-901-          "The \"correct\" value will depend on the speed/load of your network and the type of switches used."
./crmd/control.c-902-        },

./crmd/control.c:934: { XML_CONFIG_ATTR_ELECTION_FAIL, "election_timeout", "time", NULL, "2min", &check_timer,
./crmd/control.c-935-          "*** Advanced Use Only ***.", "If need to adjust this value, it probably indicates the presence of a bug."
./crmd/control.c-936-        },

The "during startup" wording is incomplete, though: we also start that
timer after partition changes, in case the DC was one of the nodes lost.
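
To make the distinction easier to see side by side, here is a rough,
self-contained sketch that restates the two entries quoted above as a
small lookup table. It is not the crmd code: the struct option_sketch,
the options array and the default_for() helper are hypothetical names
made up for illustration, and the real table carries more fields (such
as the validator). Only the option names, types, defaults and
description strings come from the grep output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one row of the option table
 * quoted above. The real entries have additional fields. */
struct option_sketch {
    const char *name;
    const char *type;
    const char *default_value;
    const char *description;
};

/* The two options under discussion, with the defaults from the grep output. */
const struct option_sketch options[] = {
    { "dc_deadtime", "time", "20s",
      "How long to wait for a response from other nodes during startup." },
    { "election_timeout", "time", "2min",
      "*** Advanced Use Only ***." },
};

/* Return the default value for a named option,
 * or NULL if the name is not in this sketch table. */
const char *default_for(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(options) / sizeof(options[0]); i++) {
        if (strcmp(options[i].name, name) == 0) {
            return options[i].default_value;
        }
    }
    return NULL;
}
```

In this sketch, default_for("dc_deadtime") returns "20s" and
default_for("election_timeout") returns "2min": two separate timers
with separate defaults, which is the point being made above.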

>
> Also, with startup-fencing=true (the default), any unseen nodes will be
> fenced, and the remaining nodes will proceed to host resources. Of
> course, it needs quorum for this, too.
>
> With two nodes, quorum is handled specially, but that's a different topic.
>
>> With Pacemaker/Heartbeat, the initdead parameter did exactly what I
>> want, but I don't see any way to do this with Pacemaker/Corosync.  From
>> reading other posts, it looks like people use an external agent to start
>> HA daemons once nodes are up ... is this a correct understanding?
>>
>> Thanks very much,
>> Chris
>
