[ClusterLabs] Updated attribute is not displayed in crm_mon

Ken Gaillot kgaillot at redhat.com
Mon Aug 14 17:33:45 UTC 2017


On Wed, 2017-08-02 at 09:59 +0000, 井上 和徳 wrote:
> Hi,
> 
> In Pacemaker-1.1.17, an attribute updated while pacemaker is starting is not displayed in crm_mon.
> In Pacemaker-1.1.16, it is displayed, so the results differ.
> 
> https://github.com/ClusterLabs/pacemaker/commit/fe44f400a3116a158ab331a92a49a4ad8937170d
> This commit is the cause, but is the following result (3.) the expected behavior?

This turned out to be an odd one. The sequence of events is:

1. When the node leaves the cluster, the DC (correctly) wipes all its
transient attributes from attrd and the CIB.

2. Pacemaker is newly started on the node, and a transient attribute is
set before the node joins the cluster.

3. The node joins the cluster, and its transient attributes (including
the new value) are sync'ed with the rest of the cluster, in both attrd
and the CIB. So far, so good.

4. Because this is the node's first join since its crmd started, its
crmd wipes all of its transient attributes again. The idea is that the
node may have restarted so quickly that the DC hasn't yet done it (step
1 here), so clear them now to avoid any problems with old values.
However, the crmd wipes only the CIB -- not attrd (arguably a bug).

5. With the older pacemaker version, both the joining node and the DC
would request a full write-out of all values from attrd. Because step 4
only wiped the CIB, this ends up restoring the new value. With the newer
pacemaker version, this step is no longer done, so the value winds up
staying in attrd but not in the CIB (until the next write-out naturally
occurs).
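
To make the five steps above easier to follow, here is a toy model of them
in Python. It is only a sketch and not Pacemaker code: attrd and the CIB are
modeled as two plain dictionaries, and every name in it (ToyCluster,
full_writeout_on_join, run_scenario) is made up for illustration. The flag
full_writeout_on_join stands in for step 5, which the older version performs
and the newer one skips.

```python
class ToyCluster:
    """Toy model: attrd and the CIB are separate stores keyed by (node, name)."""

    def __init__(self, full_writeout_on_join):
        self.attrd = {}
        self.cib = {}
        # Hypothetical switch: True mimics the older behavior (step 5 happens),
        # False mimics the newer behavior (step 5 is skipped).
        self.full_writeout_on_join = full_writeout_on_join

    def _wipe(self, store, node):
        for key in [k for k in store if k[0] == node]:
            del store[key]

    def node_leaves(self, node):
        # Step 1: the DC wipes the node's transient attributes everywhere.
        self._wipe(self.attrd, node)
        self._wipe(self.cib, node)

    def set_attribute(self, node, name, value):
        # Step 2: a value set before the node joins lives only in attrd.
        self.attrd[(node, name)] = value

    def node_joins(self, node):
        # Step 3: the node's attrd values are synced into the CIB.
        for key, value in self.attrd.items():
            if key[0] == node:
                self.cib[key] = value
        # Step 4: the joining node's crmd wipes the CIB copy only, not attrd.
        self._wipe(self.cib, node)
        # Step 5 (older version only): full write-out of attrd to the CIB.
        if self.full_writeout_on_join:
            self.cib.update(self.attrd)


def run_scenario(full_writeout_on_join):
    cluster = ToyCluster(full_writeout_on_join)
    # Starting point: both nodes already have KEY in attrd and the CIB.
    for node, value in (("node1", "V-1"), ("node3", "V-3")):
        cluster.attrd[(node, "KEY")] = value
        cluster.cib[(node, "KEY")] = value
    cluster.node_leaves("node1")
    cluster.set_attribute("node1", "KEY", "V-10")
    cluster.node_joins("node1")
    return cluster.attrd, cluster.cib


if __name__ == "__main__":
    for flag in (True, False):
        attrd, cib = run_scenario(flag)
        print(flag, "attrd:", attrd, "cib:", cib)
```

With the flag set to False, node1's new value remains in the attrd
dictionary but is missing from the CIB dictionary, which corresponds to the
symptom in the report.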

I don't have a solution yet, but step 4 is clearly the problem (rather
than the new code that skips step 5, which is still a good idea
performance-wise). I'll keep working on it.
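
For anyone who wants to check whether they are hitting this, the mismatch
in step 3 of the quoted test case can be spotted by comparing the output of
"attrd_updater -Q -n KEY -A" with the "Node Attributes" section of
"crm_mon -QA1". The Python sketch below only parses text in the shape shown
in that test case. The function names are hypothetical, and the crm_mon
layout is meant for humans and may differ between versions, so treat the
regular expressions as assumptions.

```python
import re

# Line shape taken from the attrd_updater query output in the test case.
ATTRD_LINE = re.compile(r'name="([^"]*)"\s+host="([^"]*)"\s+value="([^"]*)"')
# Line shapes taken from the "Node Attributes" section of crm_mon -QA1.
NODE_LINE = re.compile(r'^\s*\*\s+Node\s+(\S+):')
ATTR_LINE = re.compile(r'^\s*\+\s+(\S+)\s*:\s*(.*)$')


def parse_attrd(text):
    """Return {(host, name): value} from attrd_updater query output."""
    return {(host, name): value for name, host, value in ATTRD_LINE.findall(text)}


def parse_crm_mon(text):
    """Return {(node, name): value} from the Node Attributes section."""
    shown = {}
    node = None
    in_section = False
    for line in text.splitlines():
        if line.strip() == "Node Attributes:":
            in_section = True
            continue
        if not in_section:
            continue
        match = NODE_LINE.match(line)
        if match:
            node = match.group(1)
            continue
        match = ATTR_LINE.match(line)
        if match and node is not None:
            shown[(node, match.group(1))] = match.group(2).strip()
    return shown


def missing_from_cib(attrd_text, crm_mon_text):
    """Values attrd knows about that crm_mon does not show with the same value."""
    shown = parse_crm_mon(crm_mon_text)
    return {k: v for k, v in parse_attrd(attrd_text).items() if shown.get(k) != v}


# Sample text copied from step 3 of the test case (mail quoting removed).
ATTRD_SAMPLE = '''name="KEY" host="node3" value="V-3"
name="KEY" host="node1" value="V-10"
'''
CRM_MON_SAMPLE = '''Online: [ node1 node3 ]

No active resources

Node Attributes:
* Node node1:
* Node node3:
    + KEY                               : V-3
'''

if __name__ == "__main__":
    print(missing_from_cib(ATTRD_SAMPLE, CRM_MON_SAMPLE))
```

On the sample text from the report, this flags node1's KEY as present in
attrd but not shown by crm_mon.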

> [test case]
> 1. Start pacemaker on two nodes at the same time and update the attribute during startup.
>    In this case, the attribute is displayed in crm_mon.
> 
>    [root@node1 ~]# ssh -f node1 'systemctl start pacemaker ; attrd_updater -n KEY -U V-1' ; \
>                    ssh -f node3 'systemctl start pacemaker ; attrd_updater -n KEY -U V-3'
>    [root@node1 ~]# crm_mon -QA1
>    Stack: corosync
>    Current DC: node3 (version 1.1.17-1.el7-b36b869) - partition with quorum
> 
>    2 nodes configured
>    0 resources configured
> 
>    Online: [ node1 node3 ]
> 
>    No active resources
> 
> 
>    Node Attributes:
>    * Node node1:
>        + KEY                               : V-1
>    * Node node3:
>        + KEY                               : V-3
> 
> 
> 2. Restart pacemaker on node1, and update the attribute during startup.
> 
>    [root@node1 ~]# systemctl stop pacemaker
>    [root@node1 ~]# systemctl start pacemaker ; attrd_updater -n KEY -U V-10
> 
> 
> 3. The attribute is registered in attrd but not in the CIB,
>    so the updated attribute is not displayed in crm_mon.
> 
>    [root@node1 ~]# attrd_updater -Q -n KEY -A
>    name="KEY" host="node3" value="V-3"
>    name="KEY" host="node1" value="V-10"
> 
>    [root@node1 ~]# crm_mon -QA1
>    Stack: corosync
>    Current DC: node3 (version 1.1.17-1.el7-b36b869) - partition with quorum
> 
>    2 nodes configured
>    0 resources configured
> 
>    Online: [ node1 node3 ]
> 
>    No active resources
> 
> 
>    Node Attributes:
>    * Node node1:
>    * Node node3:
>        + KEY                               : V-3
> 
> 
> Best Regards
> 

-- 
Ken Gaillot <kgaillot at redhat.com>