[ClusterLabs] Bug in ocf-shellfuncs, ocf_local_nodename function?
Ken Gaillot
kgaillot at redhat.com
Fri Nov 18 02:09:39 CET 2016
On 11/17/2016 11:59 AM, Israel Brewster wrote:
> This refers specifically to build version
> 5434e9646462d2c3c8f7aad2609d0ef1875839c7 of the ocf-shellfuncs file, on
> CentOS 6.8, so it might not be an issue on later builds (if any) or
> different operating systems, but it would appear that the
> ocf_local_nodename function can have issues with certain configurations.
> Specifically, I was debugging an issue I was having with a resource agent
> that I traced down to that function returning the FQDN of the machine
> rather than the actual node name, which in my case was a short name.
>
> In looking at the code, I see that the function is looking for a
> pacemaker version greater than 1.1.8, in which case it uses crm_node
> (which works), otherwise it just uses "uname -n", which returns the FQDN
> (at least in my configuration). To get the current version, it runs the
> command:
>
> local version=$(pacemakerd -$ | grep "Pacemaker .*" | awk '{ print $2 }')
>
> Which on CentOS 6.8 returns (as of today, at least):
>
> 1.1.14-8.el6_8.1
>
> Unfortunately, when that string is passed to the ocf_version_cmp
> function to compare against 1.1.8, it returns 3, for "bad format", and
> so falls back to using "uname -n", even though the version *is* greater
> than 1.1.8, and crm_node would return the proper value.
>
> Of course, if you always set up your cluster to use the FQDN of the
> servers as the node name, or more specifically always set them up such
> that the output of uname -n is the node name, then there isn't an issue
> other than perhaps an undetectably slight loss of efficiency. However, as
> I accidentally proved by doing otherwise, there is no actual requirement
> when setting up a cluster that the node names match uname -n (although
> perhaps it is considered "best practice"?), as long as they resolve to
> an IP.
Yes, it is considered a "best practice" (or at least a "safer
practice"), because issues like this tend to pop up periodically. :(
I'd recommend filing a bug against the resource-agents package, so the
version comparison can be made more intelligent.
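For illustration only, here is a minimal sketch of what a more tolerant
comparison might look like. It is not the actual ocf-shellfuncs code:
strip_release and version_ge are hypothetical helper names, and the sketch
assumes the distro release suffix always starts at the first "-" and that
the remaining fields are plain dot-separated integers.

```shell
#!/bin/sh
# Hypothetical helper (not part of ocf-shellfuncs): drop everything from the
# first "-" onward, so "1.1.14-8.el6_8.1" becomes "1.1.14".
strip_release() {
    printf '%s\n' "${1%%-*}"
}

# Hypothetical helper: succeed (return 0) if version $1 >= version $2.
# Assumes both versions are dot-separated integers once the suffix is removed;
# a missing field is treated as 0.
version_ge() {
    local v1 v2 f1 f2
    v1=$(strip_release "$1")
    v2=$(strip_release "$2")
    while [ -n "$v1" ] || [ -n "$v2" ]; do
        # Take the leading field of each version.
        f1=${v1%%.*}
        f2=${v2%%.*}
        # Advance past that field, or empty the string if it was the last one.
        case "$v1" in *.*) v1=${v1#*.} ;; *) v1= ;; esac
        case "$v2" in *.*) v2=${v2#*.} ;; *) v2= ;; esac
        f1=${f1:-0}
        f2=${f2:-0}
        [ "$f1" -gt "$f2" ] && return 0
        [ "$f1" -lt "$f2" ] && return 1
    done
    # All fields were equal.
    return 0
}
```

With something along these lines, a check such as
version_ge "1.1.14-8.el6_8.1" "1.1.8" would succeed, so the function could
go on to use crm_node -n instead of falling back to uname -n.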
>
> I've worked around this in my installation by simply modifying the
> resource agent to call crm_node directly (since I know I am running on a
> version greater than 1.1.8), but I figured I might mention it, since I
> don't get any results when trying to google the issue.
> -----------------------------------------------
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293
> -----------------------------------------------