[ClusterLabs] Antw: Q: native_color scores for clones

Tue Sep 4 05:22:47 EDT 2018

>>> In reply to my message of 30.08.2018 at 12:23, message <5B87C5A0.A46 : 161 :
60728>:
> Hi!
> 
> After having found showscores.sh, I thought I could improve the performance by 
> porting it to Perl, but it seems the slow part actually is calling Pacemaker's 
> helper scripts like crm_attribute, crm_failcount, etc...

Actually, the performance gain was smaller than expected until I added a cache for the calls to external programs that read stickiness, fail count, and migration threshold. Here are the numbers:

showscores.sh (original):
real    0m46.181s
user    0m15.573s
sys     0m21.761s

showscores.pl (without cache):
real    0m46.053s
user    0m15.861s
sys     0m20.997s

showscores.pl (with cache):
real    0m25.998s
user    0m7.940s
sys     0m12.609s
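
A minimal sketch of such a cache for external program calls, written in Python rather than Perl. The name run_cached is hypothetical, and the command in the example is only a stand-in for a slow helper; a real script would pass the argument vector of crm_attribute, crm_failcount, etc. instead.

```python
import subprocess
import sys
from functools import lru_cache


@lru_cache(maxsize=None)
def run_cached(argv):
    """Run an external command once per distinct argument tuple.

    argv must be a tuple (hashable), because lru_cache uses it as the key.
    Repeated calls with the same tuple return the stored output without
    starting a new process.
    """
    result = subprocess.run(list(argv), stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout.strip()


# Stand-in command (hypothetical); it is executed only once.
cmd = (sys.executable, "-c", "print(42)")
print(run_cached(cmd), run_cached(cmd))
print(run_cached.cache_info())
```

Such a cache only helps when the same query is repeated within one run, and the cached values go stale if the cluster state changes while the script is running.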

This made me wonder whether such attributes can be retrieved more efficiently, which raises the question of how the corresponding tools actually work (those attributes are obviously not part of the CIB).
I can get the CIB via cibadmin and parse the XML if needed, but how can I get these other attributes? Tracing crm_attribute, it seems to read some partially binary files from locations like /dev/shm/qb-attrd-response-*.
I would think that _all_ relevant attributes should be part of the CIB...
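
For values that are set explicitly as meta attributes, a sketch of the "parse the XML" route could look like the following. It assumes such values appear as nvpair elements inside meta_attributes, both on a primitive and under rsc_defaults. The XML fragment is a hypothetical, heavily trimmed sample; in practice the text would come from something like "cibadmin --query". The function names are made up for illustration, and this does not cover whatever is only reachable through attrd.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CIB fragment for illustration only.
SAMPLE_CIB = """
<cib>
  <configuration>
    <rsc_defaults>
      <meta_attributes id="rsc-options">
        <nvpair id="rsc-options-rs" name="resource-stickiness" value="1"/>
      </meta_attributes>
    </rsc_defaults>
    <resources>
      <clone id="cln_DLM">
        <primitive id="prm_DLM" class="ocf" provider="pacemaker" type="controld">
          <meta_attributes id="prm_DLM-meta">
            <nvpair id="prm_DLM-mt" name="migration-threshold" value="3"/>
          </meta_attributes>
        </primitive>
      </clone>
    </resources>
  </configuration>
</cib>
"""


def meta_attributes(element):
    """Return {name: value} for nvpairs in the element's own meta_attributes."""
    return {nv.get("name"): nv.get("value")
            for nv in element.findall("./meta_attributes/nvpair")}


def collect(cib_xml):
    """Return (cluster-wide defaults, {primitive id: its meta attributes})."""
    root = ET.fromstring(cib_xml.strip())
    rsc_defaults = root.find("./configuration/rsc_defaults")
    defaults = meta_attributes(rsc_defaults) if rsc_defaults is not None else {}
    per_resource = {p.get("id"): meta_attributes(p)
                    for p in root.iter("primitive")}
    return defaults, per_resource


defaults, per_resource = collect(SAMPLE_CIB)
print(defaults)
print(per_resource)
```

One XML query per run replaces one helper call per resource and attribute, which is where the time went in the measurements above.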

The other thing I realized was that "migration threshold" and "stickiness" are both undefined for several resources (because the default values for those aren't defined either). I really wonder: Why not specify (e.g.) a default stickiness of integer 0 instead of having a magic NULL value of any type?
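
To make the difference between "undefined" and "0" concrete, here is a small sketch of a lookup chain: the resource's own value first, then the cluster-wide default, then an optional built-in fallback. The function resolve and its builtin parameter are hypothetical; passing builtin=0 illustrates the suggestion above and is not a statement about what Pacemaker does internally.

```python
def resolve(name, resource_attrs, cluster_defaults, builtin=None):
    """Return (value, used_default) for one attribute of one resource.

    value is None when nothing is set anywhere and no built-in
    fallback was supplied, which is the "magic NULL" case.
    """
    if name in resource_attrs:
        return resource_attrs[name], False
    if name in cluster_defaults:
        return cluster_defaults[name], True
    return builtin, True


# Value set on the resource itself.
print(resolve("resource-stickiness", {"resource-stickiness": 5}, {}))  # (5, False)
# Nothing set anywhere: the caller has to cope with None.
print(resolve("migration-threshold", {}, {}))                          # (None, True)
# Same lookup with a typed built-in default.
print(resolve("resource-stickiness", {}, {}, builtin=0))               # (0, True)
```

With a typed built-in default, every caller gets an integer back and needs no special case for a missing value.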

And the questions that I really tried to answer (but failed to do so) using this tool were:
Where can I see the tendency of the cluster to move (i.e.: balance) resources?
What will happen when I lower the stickiness of a specific resource by some amount?

The current output of my tool looks a bit different (there were some bugs in parsing the output of the tools in the initial version ;-), and I've implemented and used "sort by column", specifically by resource, then node, ...:

Resource               Node     Score Stickin. Fail Count Migr. Thr.
---------------------- ---- --------- -------- ---------- ----------
prm_DLM:0              h02  -INFINITY        1          0      ? (D)
prm_DLM:0              h06          1        1          0      ? (D)
prm_DLM:1              h02          1        1          0      ? (D)
prm_DLM:1              h06          0        1          0      ? (D)
prm_O2CB:0             h02  -INFINITY        1          0      ? (D)
prm_O2CB:0             h06          1        1          0      ? (D)
prm_O2CB:1             h02          1        1          0      ? (D)
prm_O2CB:1             h06  -INFINITY        1          0      ? (D)
prm_cfs_locks:0        h02  -INFINITY        1          0      ? (D)
prm_cfs_locks:0        h06          1        1          0      ? (D)
prm_cfs_locks:1        h02          1        1          0      ? (D)
prm_cfs_locks:1        h06  -INFINITY        1          0      ? (D)
prm_s02_ctdb:0         h02  -INFINITY        1          0      ? (D)
prm_s02_ctdb:0         h06          1        1          0      ? (D)
prm_s02_ctdb:1         h02          1        1          0      ? (D)
prm_s02_ctdb:1         h06  -INFINITY        1          0      ? (D)

In the table above, "?" denotes an undefined value, and the suffix " (D)" indicates that the default value is being used. So "? (D)" actually means that the resource had no value set and the default value wasn't set either, so there is no actual value (see above for the discussion of this "feature").
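
A sketch of how such a table could be produced: parse the score lines, sort by resource and then node, and render undefined and defaulted values with the "?" and " (D)" notation. The line format "native_color: RESOURCE allocation score on NODE: SCORE" is an assumption about the score listing, the sample lines are built from values quoted in this message, and all function names are hypothetical.

```python
import re

# Assumed line format of the score listing; adjust if the tool prints otherwise.
SCORE_RE = re.compile(
    r"^(?P<func>\w+): (?P<rsc>\S+) allocation score on "
    r"(?P<node>\S+): (?P<score>\S+)$")

SAMPLE = """\
native_color: prm_DLM:1 allocation score on h06: 0
native_color: prm_DLM:0 allocation score on h06: 1
native_color: prm_DLM:1 allocation score on h02: 1
native_color: prm_DLM:0 allocation score on h02: -INFINITY
clone_color: cln_DLM allocation score on h02: 10000
"""


def parse_scores(text, func="native_color"):
    """Return (resource, node, score) tuples for one scoring function."""
    rows = []
    for line in text.splitlines():
        m = SCORE_RE.match(line.strip())
        if m and m.group("func") == func:
            rows.append((m.group("rsc"), m.group("node"), m.group("score")))
    return rows


def show(value, used_default):
    """Render a value: "?" when undefined, " (D)" suffix when defaulted."""
    text = "?" if value is None else str(value)
    return text + " (D)" if used_default else text


def render(rows, stickiness=(1, False), fail_count=(0, False),
           migr_thr=(None, True)):
    """Format rows sorted by resource, then node (plain string order)."""
    return ["%-22s %-4s %9s %8s %10s %10s" % (
                rsc, node, score,
                show(*stickiness), show(*fail_count), show(*migr_thr))
            for rsc, node, score in sorted(rows, key=lambda r: (r[0], r[1]))]


for line in render(parse_scores(SAMPLE)):
    print(line)
```

The sort is purely lexical, so an instance like ":10" would be listed before ":2"; a numeric key on the instance suffix would be needed for larger clones.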

Another interesting point from the previous answers is this:
How are clone-max=2, clone-node-max=1, master-node-max=1, or master-max=1 actually implemented? Magic scores, hidden location constraints, or what?

I tried to locate good documentation for that, but failed to find any.
(In my personal opinion, once you try to document things, you'll find bugs, bad concepts, etc.)
Maybe start documenting things better, to make the product better, too.

Regards,
Ulrich

> 
> But anyway: Being quite confident what my program produces (;-)), I found 
> some odd score values for clones that run in a two node cluster. For example:
> 
> Resource               Node     Score Stickin. Fail Count Migr. Thr.
> ---------------------- ---- --------- -------- ---------- ----------
> prm_DLM:1              h02          1        0          0          0
> prm_DLM:1              h06          0        0          0          0
> prm_DLM:0              h02  -INFINITY        0          0          0
> prm_DLM:0              h06          1        0          0          0
> prm_O2CB:1             h02          1        0          0          0
> prm_O2CB:1             h06  -INFINITY        0          0          0
> prm_O2CB:0             h02  -INFINITY        0          0          0
> prm_O2CB:0             h06          1        0          0          0
> prm_cfs_locks:0        h02  -INFINITY        0          0          0
> prm_cfs_locks:0        h06          1        0          0          0
> prm_cfs_locks:1        h02          1        0          0          0
> prm_cfs_locks:1        h06  -INFINITY        0          0          0
> prm_s02_ctdb:0         h02  -INFINITY        0          0          0
> prm_s02_ctdb:0         h06          1        0          0          0
> prm_s02_ctdb:1         h02          1        0          0          0
> prm_s02_ctdb:1         h06  -INFINITY        0          0          0
> 
> For prm_DLM:1, for example, one node has score 0 and the other node has score 1, 
> but for prm_DLM:0 the host that has 1 for prm_DLM:1 has -INFINITY (not 0), 
> while the other host has the usual 1. So I guess that even without -INFINITY 
> the configuration would be stable. For prm_O2CB two nodes have -INFINITY as 
> score. For prm_cfs_locks the pattern is as usual, and for prm_s02_ctdb two 
> nodes have -INFINITY again.
> 
> I don't understand where those -INFINITY scores come from. Pacemaker is 
> SLES11 SP4 (1.1.12-f47ea56).
> 
> It might also be a bug, because when I look at a three-node cluster, I see 
> that a ":0" resource has score 1 once and 0 twice, but the corresponding ":2" 
> resource has scores 0, 1, and -INFINITY, and the ":1" resource has score 1 
> once and -INFINITY twice.
> 
> When I look at the "clone_color" scores, the prm_DLM:* primitives look as 
> expected (no -INFINITY). However, the cln_DLM clones have scores like 10000, 
> 8200 and 2200 (depending on the node).
> 
> Can someone explain, please?
> 
> Regards,
> Ulrich
> 
>