[ClusterLabs] Updated attribute is not displayed in crm_mon

Ken Gaillot kgaillot at redhat.com
Mon Aug 14 17:41:36 UTC 2017


On Mon, 2017-08-14 at 12:33 -0500, Ken Gaillot wrote:
> On Wed, 2017-08-02 at 09:59 +0000, 井上 和徳 wrote:
> > Hi,
> > 
> > In Pacemaker 1.1.17, an attribute updated while pacemaker is starting is not displayed in crm_mon.
> > In Pacemaker 1.1.16, it is displayed, so the results differ.
> > 
> > https://github.com/ClusterLabs/pacemaker/commit/fe44f400a3116a158ab331a92a49a4ad8937170d
> > This commit is the cause, but is the following result (3.) the expected behavior?
> 
> This turned out to be an odd one. The sequence of events is:
> 
> 1. When the node leaves the cluster, the DC (correctly) wipes all its
> transient attributes from attrd and the CIB.
> 
> 2. Pacemaker is newly started on the node, and a transient attribute is
> set before the node joins the cluster.
> 
> 3. The node joins the cluster, and its transient attributes (including
> the new value) are sync'ed with the rest of the cluster, in both attrd
> and the CIB. So far, so good.
> 
> 4. Because this is the node's first join since its crmd started, its
> crmd wipes all of its transient attributes again. The idea is that the
> node may have restarted so quickly that the DC hasn't yet done it (step
> 1 here), so clear them now to avoid any problems with old values.
> However, the crmd wipes only the CIB -- not attrd (arguably a bug).

Whoops, clarification: the node may have restarted so quickly that
corosync didn't notice it left, so the DC would never have gotten the
"peer lost" message that triggers wiping its transient attributes.

I suspect the crmd wipes only the CIB in this case because we assumed
attrd would be empty at this point -- missing exactly this case where a
value was set between start-up and first join.

> 5. With the older pacemaker version, both the joining node and the DC
> would request a full write-out of all values from attrd. Because step 4
> only wiped the CIB, this ends up restoring the new value. With the newer
> pacemaker version, this step is no longer done, so the value stays in
> attrd but is missing from the CIB (until the next write-out naturally
> occurs).
> 
> I don't have a solution yet, but step 4 is clearly the problem (rather
> than the new code that skips step 5, which is still a good idea
> performance-wise). I'll keep working on it.
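
To make the sequence above easier to follow, here is a rough toy model of
it in Python. It is only a sketch under simplifying assumptions: attrd and
the CIB are modeled as plain dictionaries, and every name in it (Cluster,
dc_wipe, join, restart_scenario, full_writeout_on_join) is hypothetical and
does not correspond to any real Pacemaker API or code path.

```python
# Toy model of the two places a transient node attribute lives.
# All names are hypothetical; this is not how Pacemaker is implemented.

class Cluster:
    def __init__(self, full_writeout_on_join):
        self.attrd = {}  # (node, name) -> value, as held by attrd
        self.cib = {}    # (node, name) -> value, as recorded in the CIB
        # True models the older behavior (step 5 happens), False the newer.
        self.full_writeout_on_join = full_writeout_on_join

    def set_attr(self, node, name, value, write_to_cib=True):
        self.attrd[(node, name)] = value
        if write_to_cib:
            self.cib[(node, name)] = value

    def dc_wipe(self, node):
        # Step 1: the DC wipes the node's attributes from attrd and the CIB.
        for store in (self.attrd, self.cib):
            for key in [k for k in store if k[0] == node]:
                del store[key]

    def join(self, node):
        # Step 3: the node's attrd values are synced into the CIB.
        for key, value in self.attrd.items():
            if key[0] == node:
                self.cib[key] = value
        # Step 4: the joining node's crmd wipes the CIB only, not attrd.
        for key in [k for k in self.cib if k[0] == node]:
            del self.cib[key]
        # Step 5 (older behavior only): full write-out of attrd to the CIB.
        if self.full_writeout_on_join:
            self.cib.update(self.attrd)


def restart_scenario(full_writeout_on_join):
    """Return (attrd value, CIB value) of node1's KEY after the restart."""
    c = Cluster(full_writeout_on_join)
    c.set_attr("node1", "KEY", "V-1")
    c.set_attr("node3", "KEY", "V-3")
    c.dc_wipe("node1")                                      # step 1
    c.set_attr("node1", "KEY", "V-10", write_to_cib=False)  # step 2
    c.join("node1")                                         # steps 3-5
    return c.attrd.get(("node1", "KEY")), c.cib.get(("node1", "KEY"))


if __name__ == "__main__":
    print("with full write-out on join:   ", restart_scenario(True))
    print("without full write-out on join:", restart_scenario(False))
```

In this model the only difference between the two runs is whether step 5
happens, which is why the CIB-only wipe in step 4 went unnoticed before.
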
> 
> > [test case]
> > 1. Start pacemaker on two nodes at the same time and update the attribute during startup.
> >    In this case, the attribute is displayed in crm_mon.
> > 
> >    [root@node1 ~]# ssh -f node1 'systemctl start pacemaker ; attrd_updater -n KEY -U V-1' ; \
> >                    ssh -f node3 'systemctl start pacemaker ; attrd_updater -n KEY -U V-3'
> >    [root@node1 ~]# crm_mon -QA1
> >    Stack: corosync
> >    Current DC: node3 (version 1.1.17-1.el7-b36b869) - partition with quorum
> > 
> >    2 nodes configured
> >    0 resources configured
> > 
> >    Online: [ node1 node3 ]
> > 
> >    No active resources
> > 
> > 
> >    Node Attributes:
> >    * Node node1:
> >        + KEY                               : V-1
> >    * Node node3:
> >        + KEY                               : V-3
> > 
> > 
> > 2. Restart pacemaker on node1, and update the attribute during startup.
> > 
> >    [root@node1 ~]# systemctl stop pacemaker
> >    [root@node1 ~]# systemctl start pacemaker ; attrd_updater -n KEY -U V-10
> > 
> > 
> > 3. The attribute is registered in attrd but not in the CIB,
> >    so the updated attribute is not displayed in crm_mon.
> > 
> >    [root@node1 ~]# attrd_updater -Q -n KEY -A
> >    name="KEY" host="node3" value="V-3"
> >    name="KEY" host="node1" value="V-10"
> > 
> >    [root@node1 ~]# crm_mon -QA1
> >    Stack: corosync
> >    Current DC: node3 (version 1.1.17-1.el7-b36b869) - partition with quorum
> > 
> >    2 nodes configured
> >    0 resources configured
> > 
> >    Online: [ node1 node3 ]
> > 
> >    No active resources
> > 
> > 
> >    Node Attributes:
> >    * Node node1:
> >    * Node node3:
> >        + KEY                               : V-3
> > 
> > 
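
For anyone who wants to check for this mismatch mechanically, a small
Python sketch along these lines might help. It only parses text in the two
formats shown in the test case above (the attrd_updater -Q -A lines and the
"Node Attributes:" section of crm_mon -A); the function names and the
sample constants are my own, and other Pacemaker versions may format the
output differently.

```python
import re

def parse_attrd(text):
    """Parse lines like name="KEY" host="node3" value="V-3"."""
    pattern = re.compile(r'name="([^"]*)"\s+host="([^"]*)"\s+value="([^"]*)"')
    return {(host, name): value for name, host, value in pattern.findall(text)}

def parse_crm_mon(text):
    """Parse the 'Node Attributes:' section into {(node, name): value}."""
    attrs, node, in_section = {}, None, False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Node Attributes:":
            in_section = True
        elif in_section and stripped.startswith("* Node "):
            node = stripped[len("* Node "):].rstrip(":")
        elif in_section and stripped.startswith("+ ") and node is not None:
            # Split on the first colon: attribute name on the left, value on the right.
            name, _, value = stripped[2:].partition(":")
            attrs[(node, name.strip())] = value.strip()
    return attrs

def missing_from_cib(attrd_text, crm_mon_text):
    """Return attrd entries that crm_mon does not show with the same value."""
    attrd, shown = parse_attrd(attrd_text), parse_crm_mon(crm_mon_text)
    return {key: value for key, value in attrd.items() if shown.get(key) != value}

# Sample text copied from step 3 of the test case.
ATTRD_OUT = (
    'name="KEY" host="node3" value="V-3"\n'
    'name="KEY" host="node1" value="V-10"\n'
)
CRM_MON_OUT = """\
Node Attributes:
* Node node1:
* Node node3:
    + KEY                               : V-3
"""

if __name__ == "__main__":
    print(missing_from_cib(ATTRD_OUT, CRM_MON_OUT))
```

Fed the output from step 3, it reports node1's KEY as the only value that
attrd holds but crm_mon does not display.
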
> > Best Regards
> > 
> 

-- 
Ken Gaillot <kgaillot at redhat.com>