[ClusterLabs] ocf scripts shell and local variables

Dejan Muhamedagic dejanmm at fastmail.fm
Thu Sep 1 11:04:45 EDT 2016


Hi Ken,

On Wed, Aug 31, 2016 at 11:31:05AM -0500, Ken Gaillot wrote:
> On 08/30/2016 05:46 AM, Gabriele Bulfon wrote:
> > illumos (and Solaris 11) delivers ksh93, which is fully Bourne-compatible
> > but does not support "local" variables, a bash extension that is not part
> > of the Bourne shell. ksh93 supports this with the "typeset" operator
> > instead of "local".
> 
> "local" isn't Bourne or POSIX, but it isn't a bash extension either.
> Apparently, it was introduced by the original Almquist shell (ash), and
> so it is supported by both bash and dash. zsh also supports local, and
> mksh and OpenBSD ksh have a built-in alias for local='typeset'. Vanilla
> ksh (used by Solaris and derivatives) is the only shell in general use
> as /bin/sh that doesn't support it.

Thanks for doing the research :)
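
To check what a given /bin/sh does, a minimal probe along these lines could be used. It is only a sketch: the names probe_local, probe_var and has_local are made up for illustration, and it tests only whether a "local" command exists, not how the shell scopes the variable.

```sh
#!/bin/sh
# Hypothetical probe: does the shell running this script accept "local"?
# The check runs in a subshell so a failure cannot disturb the caller, and
# stderr is discarded to hide "local: not found" on shells such as ksh93.
if ( probe_local() { local probe_var=1; }; probe_local ) 2>/dev/null; then
    has_local=yes
else
    has_local=no
fi
echo "local supported: $has_local"
```

Running it through the same interpreter the RAs use (whatever /bin/sh points to) shows whether the "local: not found" errors are to be expected.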

> Unfortunately, there is no standard way to locally scope a shell
> variable, and no simple, readable way to do it in a way that runs on
> both *ash and vanilla ksh.

Yes, that sums this issue up.
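
For illustration, one workaround that stays within POSIX syntax is to give a function a subshell body, "( ... )" instead of "{ ... }". The sketch below uses made-up names (greet, name, msg) and is not taken from the resource agents. It shows the scoping effect, and also why this hardly counts as simple: every call forks a subshell, and the function can no longer set variables for its caller.

```sh
#!/bin/sh
# Hypothetical example: the body is a subshell "( ... )", not a brace group,
# so assignments made inside are discarded when the function returns.
greet() (
    name="$1"               # visible only inside the subshell
    echo "hello, $name"
)

name="outer"
msg=$(greet "world")        # capture the function's output
greet "again" >/dev/null    # direct call; "name" in the caller is untouched
echo "$msg"                 # prints "hello, world"
echo "$name"                # prints "outer"
```

Simply aliasing local to typeset is not an equivalent fix on ksh93: according to its documentation, typeset variables are function-local only in functions declared with the "function name { ...; }" syntax, not in the "name() { ...; }" form.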

Thanks,

Dejan

> 
> > This is used inside the "ocf-*" scripts.
> > 
> > Gabriele
> > 
> > ----------------------------------------------------------------------------------------
> > *Sonicle S.r.l. *: http://www.sonicle.com <http://www.sonicle.com/>
> > *Music: *http://www.gabrielebulfon.com <http://www.gabrielebulfon.com/>
> > *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon
> > 
> > 
> > 
> > ----------------------------------------------------------------------------------
> > 
> > From: Dejan Muhamedagic <dejanmm at fastmail.fm>
> > To: gbulfon at sonicle.com Cluster Labs - All topics related to open-source
> > clustering welcomed <users at clusterlabs.org>
> > Date: 30 August 2016 12:20:19 CEST
> > Subject: Re: [ClusterLabs] ocf scripts shell and local variables
> > 
> >     Hi,
> > 
> >     On Mon, Aug 29, 2016 at 05:08:35PM +0200, Gabriele Bulfon wrote:
> >     > Sure, in fact I can change all shebangs to point to /bin/bash and
> >     > it's ok.
> >     > The question is about the current shebang, /bin/sh, which may run
> >     > into trouble (as if one pointed to a generic python but used many
> >     > features specific to one version of python).
> >     > Also, the question is whether bash is a good option for RAs, being
> >     > much heavier.
> > 
> >     I'd really suggest installing a smaller shell such as /bin/dash
> >     and using that as /bin/sh. Isn't there a Bourne shell in Solaris?
> >     If you modify the RAs it could be trouble on subsequent updates.
> > 
> >     Thanks,
> > 
> >     Dejan
> > 
> >     > Gabriele
> >     > From: Dejan Muhamedagic
> >     > To: kgaillot at redhat.com Cluster Labs - All topics related to
> >     > open-source clustering welcomed
> >     > Date: 29 August 2016 16:43:52 CEST
> >     > Subject: Re: [ClusterLabs] ocf scripts shell and local variables
> >     > Hi,
> >     >
> >     > On Mon, Aug 29, 2016 at 08:47:43AM -0500, Ken Gaillot wrote:
> >     > > On 08/29/2016 04:17 AM, Gabriele Bulfon wrote:
> >     > > > Hi Ken,
> >     > > >
> >     > > > I have been talking with the illumos guys about the shell problem.
> >     > > > They all agreed that ksh (and especially the ksh93 used in illumos)
> >     > > > is absolutely Bourne-compatible, and that the "local" variables
> >     > > > used in the ocf shells are not Bourne syntax, but probably
> >     > > > bash-specific.
> >     > > > This means that pointing the scripts to "#!/bin/sh" is portable as
> >     > > > long as the scripts really use Bourne-shell-only syntax, as any
> >     > > > Unix variant may link whatever Bourne shell it likes.
> >     > > > In this case, they should point to "#!/bin/bash" or whatever shell
> >     > > > the script was written for.
> >     > > > Also, in this case, the starting point is not the ocf-* script,
> >     > > > but the original RA (IPaddr, but almost all of them).
> >     > > > What about making the code base of RA and ocf-* portable?
> >     > > > It may be just by changing them to point to bash, or with some
> >     > > > kind of configure modifier to be able to specify the shell to use.
> >     > > > Meanwhile, changing the scripts by hand to #!/bin/bash worked like
> >     > > > a charm, and I will start patching.
> >     > > >
> >     > > > Gabriele
> >     > >
> >     > > Interesting, I thought local was posix, but it's not. It seems
> >     > > everyone but solaris implemented it:
> >     > >
> >     > > http://stackoverflow.com/questions/18597697/posix-compliant-way-to-scope-variables-to-a-function-in-a-shell-script
> >     > >
> >     > > Please open an issue at:
> >     > >
> >     > > https://github.com/ClusterLabs/resource-agents/issues
> >     > >
> >     > > The simplest solution would be to require #!/bin/bash for all RAs
> >     > > that use local,
> >     >
> >     > This issue was raised many times, but note that /bin/bash is a
> >     > shell not famous for being lean: it's great for interactive use,
> >     > but not so great if you need to run a number of scripts. The
> >     > complexity in bash, which is superfluous for our use case,
> >     > doesn't go well with the basic principles of HA clusters.
> >     >
> >     > > but I'm not sure that's fair to the distros that support
> >     > > local in a non-bash default shell. Another possibility would be to
> >     > > modify all RAs to avoid local entirely, by using unique variable
> >     > > prefixes per function.
> >     >
> >     > I doubt that we could write moderately complex shell scripts
> >     > without the capability of limiting the variables' scope and still
> >     > retain our sanity.
> >     >
> >     > > Or, it may be possible to guard every instance of
> >     > > local with a check for ksh, which would use typeset instead.
> >     > > Raising the issue will allow some discussion of the possibilities.
> >     >
> >     > Just to mention that this is the first time someone reported
> >     > running a shell which doesn't support local. Perhaps installing a
> >     > shell which does is an option for them.
> >     >
> >     > Thanks,
> >     >
> >     > Dejan
> >     > From: Ken Gaillot
> >     > To: gbulfon at sonicle.com Cluster Labs - All topics related to
> >     > open-source clustering welcomed
> >     > Date: 26 August 2016 15:56:02 CEST
> >     > Subject: Re: ocf scripts shell and local variables
> >     > On 08/26/2016 08:11 AM, Gabriele Bulfon wrote:
> >     > > I tried adding some debug output in ocf-shellfuncs, writing env and
> >     > > ps -ef into the corosync.log.
> >     > > I suspect it's always using ksh, because in the env output I
> >     > > produced I find this: KSH_VERSION=.sh.version
> >     > > This is normally not present in the environment, unless ksh is
> >     > > running the shell.
> >     >
> >     > The RAs typically start with #!/bin/sh, so whatever that points to on
> >     > your system is what will be used.
> >     >
> >     > > I also tried modifying all ocf shells with "#!/usr/bin/bash" at the
> >     > > beginning; no luck, same output.
> >     >
> >     > You'd have to change the RA that includes them.
> >     >
> >     > > Any idea how I can change the shell used, so that "local" variables
> >     > > are supported?
> >     >
> >     > You can either edit the #!/bin/sh line at the top of each RA, or
> >     > figure out how to point /bin/sh to a Bourne-compatible shell. ksh
> >     > isn't Bourne-compatible, so I'd expect lots of #!/bin/sh scripts to
> >     > fail with it as the default shell.
> >     >
> >     > > Gabriele
> >     > *From:* Gabriele Bulfon
> >     > *To:* kgaillot at redhat.com Cluster Labs - All topics related to
> >     > open-source clustering welcomed
> >     > *Date:* 26 August 2016 10:12:13 CEST
> >     > *Subject:* Re: [ClusterLabs] ocf::heartbeat:IPaddr
> >     > I looked into what you suggested, inside ocf-binaries and
> >     > ocf-shellfuncs etc.
> >     > I also found these logs in corosync.log:
> >     > Aug 25 17:50:33 [2250] crmd: notice: process_lrm_event:
> >     > xstorage1-xstorage2_wan2_IP_start_0:22 [
> >     > /usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
> >     > such file or
> >     > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[354]: local:
> >     > not found [No such file or
> >     > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[355]: local:
> >     > not found [No such file or
> >     > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[356]: local:
> >     > not found [No such file or directory]\nocf-exit-reason:Setup
> >     > problem: coul
> >     > Aug 25 17:50:33 [2246] lrmd: notice: operation_finished:
> >     > xstorage2_wan2_IP_start_0:3613:stderr [
> >     > /usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
> >     > such file or directory] ]
> >     > Looks like the shell is not happy with the "local" variable
> >     > definition.
> >     > I tried running ocf-shellfuncs manually with sh and bash and they
> >     > all run without errors.
> >     > How can I see what shell is running these scripts?
> >     > From: Ken Gaillot
> >     > To: users at clusterlabs.org
> >     > Date: 25 August 2016 18:07:42 CEST
> >     > Subject: Re: [ClusterLabs] ocf::heartbeat:IPaddr
> >     > On 08/25/2016 10:51 AM, Gabriele Bulfon wrote:
> >     > > Hi,
> >     > >
> >     > > I'm advancing with this monster cluster on XStreamOS/illumos ;)
> >     > > In the previous older tests I used heartbeat, and I had these lines
> >     > > to take care of the swapping public IP addresses:
> >     > >
> >     > > primitive xstorage1_wan1_IP ocf:heartbeat:IPaddr params ip="1.2.3.4"
> >     > > cidr_netmask="255.255.255.0" nic="e1000g1"
> >     > > primitive xstorage2_wan2_IP ocf:heartbeat:IPaddr params ip="1.2.3.5"
> >     > > cidr_netmask="255.255.255.0" nic="e1000g1"
> >     > > location xstorage1_wan1_IP_pref xstorage1_wan1_IP 100: xstorage1
> >     > > location xstorage2_wan2_IP_pref xstorage2_wan2_IP 100: xstorage2
> >     > >
> >     > > They get configured, but then I get this in crm status:
> >     > >
> >     > > xstorage1_wan1_IP (ocf::heartbeat:IPaddr): Stopped
> >     > > xstorage2_wan2_IP (ocf::heartbeat:IPaddr): Stopped
> >     > > Failed Actions:
> >     > > * xstorage1_wan1_IP_start_0 on xstorage1 'not installed' (5): call=20,
> >     > > status=complete, exitreason='Setup problem: couldn't find command:
> >     > > /usr/bin/gawk',
> >     > > last-rc-change='Thu Aug 25 17:50:32 2016', queued=1ms, exec=158ms
> >     > > * xstorage2_wan2_IP_start_0 on xstorage1 'not installed' (5): call=22,
> >     > > status=complete, exitreason='Setup problem: couldn't find command:
> >     > > /usr/bin/gawk',
> >     > > last-rc-change='Thu Aug 25 17:50:33 2016', queued=1ms, exec=29ms
> >     > > * xstorage1_wan1_IP_start_0 on xstorage2 'not installed' (5): call=22,
> >     > > status=complete, exitreason='Setup problem: couldn't find command:
> >     > > /usr/bin/gawk',
> >     > > last-rc-change='Thu Aug 25 17:50:30 2016', queued=1ms, exec=36ms
> >     > > * xstorage2_wan2_IP_start_0 on xstorage2 'not installed' (5): call=20,
> >     > > status=complete, exitreason='Setup problem: couldn't find command:
> >     > > /usr/bin/gawk',
> >     > > last-rc-change='Thu Aug 25 17:50:29 2016', queued=0ms, exec=150ms
> >     > >
> >     > > The crm configure process already checked for the presence of the
> >     > > required IPaddr shell, and it was ok.
> >     > > Now it looks like it's looking for "/usr/bin/gawk", and that is
> >     > > actually there!
> >     > > Is there any known incompatibility with the mixed heartbeat ocf?
> >     > > Should I use corosync-specific ocf files or something else?
> >     >
> >     > "heartbeat" in this case is just an OCF provider name, and has
> >     > nothing to do with the heartbeat messaging layer, other than having
> >     > its origin in the same project. There actually has been a recent
> >     > proposal to rename the provider to "clusterlabs" to better reflect
> >     > the current reality.
> >     >
> >     > The "couldn't find command" message comes from the ocf-binaries
> >     > shell functions. If you look at have_binary() there, it uses sed and
> >     > which, and I'm guessing that fails on your OS somehow. You may need
> >     > to patch it.
> >     >
> >     > > Thanks again!
> >     > >
> >     > > Gabriele
> 
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org



