[ClusterLabs] Antw: [EXT] True time periods/CLOCK_MONOTONIC node vs. cluster wide (Was: Coming in Pacemaker 2.0.4: dependency on monotonic clock for systemd resources)

Jan Pokorný jpokorny at redhat.com
Thu Mar 12 11:30:13 EDT 2020

On 12/03/20 08:22 +0100, Ulrich Windl wrote:
> Sorry for top-posting, but if you have NTP-synced your nodes,
> CLOCK_MONOTONIC will not have much advantage over CLOCK_REALTIME
> as the clocks will be rather the same,

I guess you mean they will be rather the same amongst the nodes
relative to each other (not MONOTONIC vs. REALTIME, since those
will trivially differ at some point), regardless of which of the
two you use.

How is synchronizing at exactly the same point in time going to be
achieved?  Without that, you'll suffer "clock not the same" eventually.

How is a failure to NTP-sync going to be detected, and what
consequences will it impose on the cluster?  If you happily continue
regardless, you'll suffer "clock not the same" eventually.

NTP is currently highly recommended in upstream guidance, but
it can easily be out of scope for HA projects ... just consider
the autonomous units Digimer was talking about.  Still, they
could at least in theory keep some cluster-private time measurement
sync'd without any exact external calendar anchoring
(i.e., "stop-watching" rather than "calendar-watching").
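
To make the "stop-watching" idea a bit more concrete, here is a minimal
sketch of how two nodes could relate their monotonic clocks without any
calendar anchor, using a single request/response round trip (the
classic midpoint estimate).  This is purely illustrative: the function
name and the exchange are assumptions, not something Pacemaker or
corosync is known to implement.

```python
def estimate_offset(t_send, t_peer, t_recv):
    """Estimate (peer clock - local clock) from one round trip.

    t_send, t_recv: local monotonic readings taken just before sending
                    the request and just after receiving the reply
    t_peer:         the peer's monotonic reading carried in the reply

    Returns (offset, error_bound), both in seconds.  Hypothetical helper.
    """
    round_trip = t_recv - t_send
    # Assume the peer sampled its clock halfway through the round trip.
    midpoint = t_send + round_trip / 2.0
    # The true offset cannot be further off than half the round trip.
    return t_peer - midpoint, round_trip / 2.0
```

Two such estimates taken some time apart would also expose relative
drift, i.e. whether the clocks really tick "rather synchronously".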

Also, at the bare-bones level, when REALTIME would never be used
for anything in the cluster (except perhaps logging, for which
"calendar-watching" [see above] may be of practical value), the
cluster would _not_ be interested in calendar-correct time (which is
what NTP is primarily about), but only in measuring periods, hence
in the respective clocks amongst the nodes being reasonably
comparable (ticking rather synchronously).

> and they won't "jump".
> IMHO the latter is the main reason for using CLOCK_MONOTONIC


> (if the admin

or said NTP client, a conventional change (see "leap second",
for instance), a HW fault (the battery, for instance) or anything else

> decides to adjust the real-time clock).  So far the theory. In
> practice the clock jumps, even with NTP, especially if the node had
> been running for a long time, is not updating the RTC, and then
> is fenced. The clock may be off by minutes after boot then, and NTP
> is quite conservative when adjusting the time (first it won't
> believe that the clock is off that far, then after minutes the clock
> will actually jump. That's why some fast pre-ntpd adjustment is
> generally used.) The point is (if you can sync your clocks to
> real-time at all): How long do you want to wait for all your nodes
> to agree on some common time?

Specifically with NTP and systemd in the picture, I think you can
order the startup of some units after an "NTP synchronized" target
is reached.
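
Besides unit ordering (systemd has time-sync.target for this purpose),
the synchronization state can also be polled.  A minimal sketch,
assuming a systemd host where `timedatectl show` prints a
`NTPSynchronized=yes|no` property line; the helper names are
hypothetical:

```python
import subprocess


def parse_ntp_synchronized(show_output):
    """Return True/False from `timedatectl show` output, or None when
    the NTPSynchronized property is not present at all."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "NTPSynchronized":
            return value.strip() == "yes"
    return None


def ntp_synchronized():
    """Ask timedatectl; assumes the binary is installed and in PATH."""
    result = subprocess.run(["timedatectl", "show"],
                            capture_output=True, text=True, check=True)
    return parse_ntp_synchronized(result.stdout)
```

A startup wrapper could loop on `ntp_synchronized()` before launching
anything time-sensitive, but how long to wait is exactly the open
question raised in the quoted text above.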

> Maybe CLOCK_MONOTONIC could help here...

NTP in the calendar-correct sense can be skipped altogether
when the cluster is happy relying on just that (and MONOTONIC-like
synchronization exists across nodes for good measure).

> The other useful application is any sort of timeout or repeat thing
> that should not be affected by adjusting the real-time clock.

That's exactly the point of using it for node-local measurements,
and why I propose that users would need to willingly opt in for
cluster stack components to compile at all where MONOTONIC is not
present, provided that the components are timeout/interval sensitive.
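
For illustration, a node-local timeout that is immune to real-time
clock adjustments might look like the sketch below.  It uses Python's
`time.monotonic()`; the `wait_until` helper and its parameters are
made up for this example and do not mirror any Pacemaker code.

```python
import time


def wait_until(predicate, timeout, interval=0.01,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns true or `timeout` seconds
    elapse on the monotonic clock.  Returns True on success, False on
    timeout.  `clock` and `sleep` are injectable for testing."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

Had the deadline been computed from the real-time clock instead, an
NTP step or a manual adjustment in the middle of the wait could expire
the timeout early or stretch it arbitrarily.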

Jan (Poki)