[ClusterLabs] ocf scripts shell and local variables

Klaus Wenninger kwenning at redhat.com
Mon Aug 29 10:02:45 EDT 2016


On 08/29/2016 03:47 PM, Ken Gaillot wrote:
> On 08/29/2016 04:17 AM, Gabriele Bulfon wrote:
>> Hi Ken,
>>
>> I have been talking with the illumos guys about the shell problem.
>> They all agreed that ksh (and especially the ksh93 used in illumos) is
>> absolutely Bourne-compatible, and that the "local" variables used in the
>> ocf shells are not Bourne syntax, but probably bash-specific.
>> This means that pointing the scripts to "#!/bin/sh" is portable as long
>> as the scripts are really Bourne-shell only syntax, as any Unix variant
>> may link whatever Bourne-shell they like.
>> In this case, it should point to "#!/bin/bash" or whatever shell the
>> script was written for.
>> Also, in this case, the starting point is not the ocf-* script, but the
>> original RA (IPaddr, but almost all of them).
>>
>> What about making the code base of RA and ocf-* portable?
>> It may be just by changing them to point to bash, or with some kind of
>> configure modifier to be able to specify the shell to use.
>>
>> Meanwhile, changing the scripts by hand to #!/bin/bash worked like a
>> charm, and I will start patching.
>>
>> Gabriele
> Interesting, I thought local was POSIX, but it's not. It seems everyone
> but Solaris implemented it:
>
> http://stackoverflow.com/questions/18597697/posix-compliant-way-to-scope-variables-to-a-function-in-a-shell-script
>
> Please open an issue at:
>
> https://github.com/ClusterLabs/resource-agents/issues
>
> The simplest solution would be to require #!/bin/bash for all RAs that
> use local, but I'm not sure that's fair to the distros that support
> local in a non-bash default shell. Another possibility would be to
> modify all RAs to avoid local entirely, by using unique variable
> prefixes per function. Or, it may be possible to guard every instance of
> local with a check for ksh, which would use typeset instead. Raising the
> issue will allow some discussion of the possibilities.
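
To make the "unique variable prefixes per function" option concrete, here is a minimal sketch. The helper names (shell_supports_local, count_words) are hypothetical and not taken from resource-agents; the probe is only one assumed way to detect whether the running shell accepts "local" inside a function.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of resource-agents.

# Return 0 if the running shell accepts `local` inside a function.
# The probe runs in a subshell, so nothing it defines leaks to the caller.
shell_supports_local() {
    ( eval 'probe_fn() { local probe_var=1; }; probe_fn' ) >/dev/null 2>&1
}

# Prefix approach: no `local` at all. Every temporary variable carries
# the function name as a prefix, so it cannot clobber a caller's variable
# unless the caller deliberately uses the same prefix.
count_words() {
    count_words_n=0
    for count_words_w in $1; do
        count_words_n=$((count_words_n + 1))
    done
    echo "$count_words_n"
}

if shell_supports_local; then
    echo "this shell accepts local"
else
    echo "this shell rejects local; prefixed variables still work"
fi
```

The prefixed variables are still global, so this only avoids accidental clashes; it does not give real function scope.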

This issue probably doesn't hit us only with the RAs
(e.g. tools/cibsecret.in - just the first grep result ...).
So we should probably raise it against pacemaker as well,
especially as current testing doesn't seem well suited to
detecting it.
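
As a rough idea of what such a check could look like, here is a minimal sketch that lists scripts declaring #!/bin/sh while using "local". The function name and the regular expression are assumptions for illustration, not an existing pacemaker or resource-agents test, and a plain pattern match will miss some uses and flag some false positives.

```shell
#!/bin/sh
# Hypothetical checker; the function name and regex are illustrative only.

# Print every regular file under directory $1 whose first line is a
# #!/bin/sh shebang (optionally with arguments) and which contains a
# line that starts with the `local` keyword followed by a name.
list_sh_scripts_using_local() {
    find "$1" -type f | while IFS= read -r script; do
        shebang=$(head -n 1 "$script")
        case $shebang in
            '#!/bin/sh'|'#!/bin/sh '*) ;;
            *) continue ;;
        esac
        if grep -Eq '^[[:space:]]*local[[:space:]]+[A-Za-z_]' "$script"; then
            printf '%s\n' "$script"
        fi
    done
}
```

Usage would be something like `list_sh_scripts_using_local /usr/lib/ocf/resource.d/heartbeat`, using the RA directory that appears in the logs quoted below.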

>
>> ----------------------------------------------------------------------------------------
>> *Sonicle S.r.l. *: http://www.sonicle.com <http://www.sonicle.com/>
>> *Music: *http://www.gabrielebulfon.com <http://www.gabrielebulfon.com/>
>> *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon
>>
>>
>>
>> ----------------------------------------------------------------------------------
>>
>> From: Ken Gaillot <kgaillot at redhat.com>
>> To: gbulfon at sonicle.com Cluster Labs - All topics related to open-source
>> clustering welcomed <users at clusterlabs.org>
>> Date: 26 August 2016 15:56:02 CEST
>> Subject: Re: ocf scripts shell and local variables
>>
>>     On 08/26/2016 08:11 AM, Gabriele Bulfon wrote:
>>     > I tried adding some debug in ocf-shellfuncs, showing env and ps -ef
>>     > into the corosync.log
>>     > I suspect it's always using ksh, because in the env output I produced
>>     > I find this: KSH_VERSION=.sh.version
>>     > This is normally not present in the environment, unless ksh is running
>>     > the shell.
>>
>>     The RAs typically start with #!/bin/sh, so whatever that points to on
>>     your system is what will be used.
>>
>>     > I also tried modifying all ocf shells with "#!/usr/bin/bash" at the
>>     > beginning, no way, same output.
>>
>>     You'd have to change the RA that includes them.
>>
>>     > Any idea how I can change the used shell to support "local" variables?
>>
>>     You can either edit the #!/bin/sh line at the top of each RA, or figure
>>     out how to point /bin/sh to a Bourne-compatible shell. ksh isn't
>>     Bourne-compatible, so I'd expect lots of #!/bin/sh scripts to fail with
>>     it as the default shell.
>>
>>     > Gabriele
>>     >
>>     >
>>     >
>>     >
>>     ------------------------------------------------------------------------
>>     >
>>     >
>>     > *From:* Gabriele Bulfon <gbulfon at sonicle.com>
>>     > *To:* kgaillot at redhat.com Cluster Labs - All topics related to
>>     > open-source clustering welcomed <users at clusterlabs.org>
>>     > *Date:* 26 August 2016 10:12:13 CEST
>>     > *Subject:* Re: [ClusterLabs] ocf::heartbeat:IPaddr
>>     >
>>     >
>>     > I looked around what you suggested, inside ocf-binaries and
>>     > ocf-shellfuncs etc.
>>     > I also found these logs in corosync.log:
>>     >
>>     > Aug 25 17:50:33 [2250] crmd: notice: process_lrm_event:
>>     > xstorage1-xstorage2_wan2_IP_start_0:22 [
>>     > /usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
>>     > such file or
>>     > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[354]: local:
>>     > not found [No such file or
>>     > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[355]: local:
>>     > not found [No such file or
>>     > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[356]: local:
>>     > not found [No such file or directory]\nocf-exit-reason:Setup
>>     > problem: coul
>>     >
>>     > Aug 25 17:50:33 [2246] lrmd: notice: operation_finished:
>>     > xstorage2_wan2_IP_start_0:3613:stderr [
>>     > /usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
>>     > such file or directory] ]
>>     >
>>     > Looks like the shell is not happy with the "local" variable definition.
>>     > I tried running ocf-shellfuncs manually with sh and bash and they
>>     > all run without errors.
>>     > How can I see what shell is running these scripts?
>>     >
>>     >
>>     >
>>     >
>>     >
>>     >
>>     ----------------------------------------------------------------------------------
>>     >
>>     > From: Ken Gaillot <kgaillot at redhat.com>
>>     > To: users at clusterlabs.org
>>     > Date: 25 August 2016 18:07:42 CEST
>>     > Subject: Re: [ClusterLabs] ocf::heartbeat:IPaddr
>>     >
>>     > On 08/25/2016 10:51 AM, Gabriele Bulfon wrote:
>>     > > Hi,
>>     > >
>>     > > I'm advancing with this monster cluster on XStreamOS/illumos ;)
>>     > >
>>     > > In the previous older tests I used heartbeat, and I had these lines to
>>     > > take care of the swapping public IP addresses:
>>     > >
>>     > > primitive xstorage1_wan1_IP ocf:heartbeat:IPaddr params ip="1.2.3.4"
>>     > > cidr_netmask="255.255.255.0" nic="e1000g1"
>>     > > primitive xstorage2_wan2_IP ocf:heartbeat:IPaddr params ip="1.2.3.5"
>>     > > cidr_netmask="255.255.255.0" nic="e1000g1"
>>     > >
>>     > > location xstorage1_wan1_IP_pref xstorage1_wan1_IP 100: xstorage1
>>     > > location xstorage2_wan2_IP_pref xstorage2_wan2_IP 100: xstorage2
>>     > >
>>     > > They get configured, but then I get this in crm status:
>>     > >
>>     > > xstorage1_wan1_IP (ocf::heartbeat:IPaddr): Stopped
>>     > > xstorage2_wan2_IP (ocf::heartbeat:IPaddr): Stopped
>>     > >
>>     > > Failed Actions:
>>     > > * xstorage1_wan1_IP_start_0 on xstorage1 'not installed' (5): call=20,
>>     > > status=complete, exitreason='Setup problem: couldn't find command:
>>     > > /usr/bin/gawk',
>>     > > last-rc-change='Thu Aug 25 17:50:32 2016', queued=1ms, exec=158ms
>>     > > * xstorage2_wan2_IP_start_0 on xstorage1 'not installed' (5): call=22,
>>     > > status=complete, exitreason='Setup problem: couldn't find command:
>>     > > /usr/bin/gawk',
>>     > > last-rc-change='Thu Aug 25 17:50:33 2016', queued=1ms, exec=29ms
>>     > > * xstorage1_wan1_IP_start_0 on xstorage2 'not installed' (5): call=22,
>>     > > status=complete, exitreason='Setup problem: couldn't find command:
>>     > > /usr/bin/gawk',
>>     > > last-rc-change='Thu Aug 25 17:50:30 2016', queued=1ms, exec=36ms
>>     > > * xstorage2_wan2_IP_start_0 on xstorage2 'not installed' (5): call=20,
>>     > > status=complete, exitreason='Setup problem: couldn't find command:
>>     > > /usr/bin/gawk',
>>     > > last-rc-change='Thu Aug 25 17:50:29 2016', queued=0ms, exec=150ms
>>     > >
>>     > >
>>     > > The crm configure process already checked for the presence of the
>>     > > required IPaddr shell, and it was ok.
>>     > > Now it looks like it's looking for "/usr/bin/gawk", and that is
>>     > > actually there!
>>     > > Is there any known incompatibility with the mixed heartbeat ocf?
>>     > > Should I use corosync specific ocf files or something else?
>>     >
>>     > "heartbeat" in this case is just an OCF provider name, and has nothing
>>     > to do with the heartbeat messaging layer, other than having its origin
>>     > in the same project. There actually has been a recent proposal to
>>     > rename the provider to "clusterlabs" to better reflect the current
>>     > reality.
>>     >
>>     > The "couldn't find command" message comes from the ocf-binaries shell
>>     > functions. If you look at have_binary() there, it uses sed and which,
>>     > and I'm guessing that fails on your OS somehow. You may need to patch
>>     > it.
>>     >
>>     > > Thanks again!
>>     > >
>>     > > Gabriele
>>     > >
>>     > >
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org




