[ClusterLabs] Pacemaker log showing time mismatch after

Ken Gaillot kgaillot at redhat.com
Mon Feb 11 16:03:58 EST 2019


On Fri, 2019-02-01 at 08:10 +0100, Jan Pokorný wrote:
> On 28/01/19 09:47 -0600, Ken Gaillot wrote:
> > On Mon, 2019-01-28 at 18:04 +0530, Dileep V Nair wrote:
> > Pacemaker can handle the clock jumping forward, but not backward.
> 
> I am rather surprised, are we not using monotonic time only, then?
> If so, why?

The scheduler runs on a single node (the DC) but must take as input the
resource history (including timestamps) on all nodes. We need wall
clock time to compare against time-based rules. Also, if we get two
resource history entries from a node, we don't know if it rebooted in
between, so a monotonic timestamp alone wouldn't be sufficient.

However, it might be possible to store both time representations in the
history (and possibly maintain some sort of cluster knowledge about
monotonic clocks to compare them within and across nodes), and use one
or the other depending on the context. I haven't tried to determine how
feasible that would be, but it would be a major project.
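
To make the idea concrete, here is a rough sketch of what a dual-timestamp
history entry might look like. It is not Pacemaker code. The HistoryStamp
type, the boot_id field, and the is_newer() helper are all hypothetical names
invented for illustration, and the sketch assumes each node can supply some
identifier that changes on every reboot.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoryStamp:
    node: str       # node that recorded the event
    boot_id: str    # hypothetical identifier that changes on every reboot
    wall: float     # wall-clock seconds since the epoch
    mono: float     # monotonic seconds; only meaningful within one boot


def stamp(node: str, boot_id: str) -> HistoryStamp:
    """Capture both time representations for an event recorded right now."""
    return HistoryStamp(node, boot_id, time.time(), time.monotonic())


def is_newer(a: HistoryStamp, b: HistoryStamp) -> bool:
    """Return True if a should be treated as newer than b.

    Same node and same boot: the monotonic clock is trustworthy and is
    immune to wall-clock jumps. Otherwise (different node, or a reboot in
    between) monotonic values are not comparable, so this sketch falls
    back to wall-clock time.
    """
    if a.node == b.node and a.boot_id == b.boot_id:
        return a.mono > b.mono
    return a.wall > b.wall
```

With this, two entries from the same boot of the same node order correctly
even if the wall clock jumped backward between them. Entries from different
nodes or different boots still depend on wall-clock time, which is the part
that would need the extra cluster-wide knowledge mentioned above.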

> We shall not need any explicit time synchronization across the nodes
> since we are already backed by extended virtual synchrony from
> corosync, even though it could introduce oddities when
> time-based rules kick in.

Pacemaker determines the state of a resource by replaying its resource
history in the CIB. A history entry can be replaced only by a newer
event. Thus if there's a start event in the history, and a stop result
comes in, we have to know which one is newer to determine whether the
resource is started or stopped.
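
As a simplified illustration of why a backward clock jump is a problem for
that rule, consider the sketch below. It is not how Pacemaker actually stores
or compares history. The Event type and the record() and is_started() helpers
are hypothetical, and the sketch assumes a single wall-clock timestamp per
entry and a strict "newer than" comparison.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Event:
    operation: str    # "start" or "stop"
    wall_time: float  # wall-clock seconds when the result was recorded


def record(history: Dict[str, Event], resource: str, event: Event) -> bool:
    """Store event for resource only if it is newer than the current entry.

    Returns True if the history was updated, False if the event was
    rejected as stale.
    """
    current = history.get(resource)
    if current is not None and event.wall_time <= current.wall_time:
        return False
    history[resource] = event
    return True


def is_started(history: Dict[str, Event], resource: str) -> bool:
    """Derive the resource state from its latest history entry."""
    entry = history.get(resource)
    return entry is not None and entry.operation == "start"


history: Dict[str, Event] = {}
record(history, "db", Event("start", wall_time=1000.0))
# The node's clock jumps back 10 seconds, then the resource is stopped.
accepted = record(history, "db", Event("stop", wall_time=990.0))
# accepted is False, so is_started(history, "db") still reports True
# even though the stop really happened after the start.
```

A forward jump does not cause this, since the later event still carries the
larger timestamp and replaces the older entry as expected.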

Something along those lines is likely the cause of:

https://bugs.clusterlabs.org/show_bug.cgi?id=5246
-- 
Ken Gaillot <kgaillot at redhat.com>



