[ClusterLabs] Antw: Re: ocf scripts shell and local variables

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Aug 30 02:03:32 EDT 2016


>>> Gabriele Bulfon <gbulfon at sonicle.com> wrote on 29.08.2016 at 11:17 in
message <3973483.545.1472462255112.JavaMail.sonicle at www>:
> Hi Ken,
> I have been talking with the illumos guys about the shell problem.
> They all agreed that ksh (and especially the ksh93 used in illumos) is
> absolutely Bourne-compatible, and that the "local" variables used in the ocf
> shell scripts are not Bourne syntax, but probably bash-specific.

Hi!

"Bourne shell" is the one they had in the 80s; since the 90s you have had a POSIX-compatible shell (with features like "$(xxx)"). A POSIX shell is Bourne-compatible, but not the other way round. BASH is also POSIX-compatible, but not the other way round. I think nobody should use a non-POSIX-compatible shell for scripts. I know that some people insist on using csh for scripts, but that's a different tribe ;-)

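To illustrate the "$(xxx)" point, here is a small sketch of my own (not code from the resource agents; the path is only handled as a string, so it does not need to exist on your system). Both forms should work in any POSIX shell, but only the backtick form is understood by an old Bourne shell:

```shell
#!/bin/sh
# The path is treated purely as a string; nothing is read from disk.
ra_path=/usr/lib/ocf/resource.d/heartbeat/IPaddr

# POSIX command substitution: nests without any escaping.
provider=$(basename "$(dirname "$ra_path")")
echo "$provider"

# Bourne-style backticks: the inner pair has to be escaped.
legacy=`basename \`dirname $ra_path\``
echo "$legacy"
```

Both echo lines print the name of the directory that contains the RA.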

> This means that pointing the scripts to "#!/bin/sh" is portable as long as 
> the scripts are really Bourne-shell only syntax, as any Unix variant may link 
> whatever Bourne-shell they like.

No! It's bad practice that Linux uses /bin/sh for a modern shell (it should be /usr/bin/sh), but assuming that /bin/sh is just a Bourne shell is wrong.
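
Rather than assuming what /bin/sh is, a script can probe the shell that is actually running it. The following is only a sketch under my own assumptions: the names "probe", "has_local" and "flavour" are made up for illustration, and the check relies on bash setting BASH_VERSION and ksh setting KSH_VERSION.

```shell
#!/bin/sh
# Try "local" inside a subshell, so that a failure can neither abort
# nor pollute the calling script.
if (eval 'probe() { local x=1; }; probe') >/dev/null 2>&1; then
    has_local=yes
else
    has_local=no
fi

# bash sets BASH_VERSION and ksh sets KSH_VERSION; a shell that sets
# neither (dash, for example) is reported as "other".
if [ -n "${BASH_VERSION:-}" ]; then
    flavour=bash
elif [ -n "${KSH_VERSION:-}" ]; then
    flavour=ksh
else
    flavour=other
fi

echo "shell flavour: $flavour, local supported: $has_local"
```

Placed temporarily near the top of an RA, something like this would show in the logs which shell the cluster really uses to run it.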

> In this case, it should point to "#!/bin/bash" or whatever shell the script 
> was written for.

See my first comment on it.

> Also, in this case, the starting point is not the ocf-* script, but the 
> original RA (IPaddr, but almost all of them).
> What about making the code base of RA and ocf-* portable?

Compatible with the 70s?

> It may be just by changing them to point to bash, or with some kind of
> configure modifier to be able to specify the shell to use.
> Meanwhile, changing the scripts by hand to #!/bin/bash worked like a
> charm, and I will start patching.

Personally, I use /bin/sh when a script uses only POSIX shell features and /bin/bash when it uses BASH features, as far as I can tell them apart (the BASH documentation is very poor at distinguishing Bourne-shell features, POSIX features and BASH features).
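
For scripts that are meant to stay on /bin/sh, one way to get function-private variables without "local" is to write the function body as a subshell, "( ... )" instead of "{ ... }". This is a sketch of the general technique, not how the resource agents are written; "count_args" is a made-up example function.

```shell
#!/bin/sh
count=outer

# The body is a subshell, so assignments made inside it are discarded
# when the function returns.
count_args() (
    count=0
    for _arg in "$@"; do
        count=$((count + 1))
    done
    echo "$count"
)

count_args a b c                  # prints 3
echo "count is still: $count"     # prints "count is still: outer"
```

The price is that such a function cannot set any variable for its caller either, and each call forks a subshell, so it only fits functions that report their result on stdout or through the exit status.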

Regards,
Ulrich

> Gabriele
> ----------------------------------------------------------------------------------------
> Sonicle S.r.l.: http://www.sonicle.com
> Music: http://www.gabrielebulfon.com
> Quantum Mechanics: http://www.cdbaby.com/cd/gabrielebulfon
> ----------------------------------------------------------------------------------
> From: Ken Gaillot
> To: gbulfon at sonicle.com, Cluster Labs - All topics related to open-source
> clustering welcomed
> Date: 26 August 2016 15:56:02 CEST
> Subject: Re: ocf scripts shell and local variables
> On 08/26/2016 08:11 AM, Gabriele Bulfon wrote:
> > I tried adding some debug in ocf-shellfuncs, showing env and ps -ef into
> > the corosync.log
> > I suspect it's always using ksh, because in the env output I produced I
> > find this: KSH_VERSION=.sh.version
> > This is normally not present in the environment, unless ksh is running
> > the shell.
>
> The RAs typically start with #!/bin/sh, so whatever that points to on
> your system is what will be used.
>
> > I also tried modifying all ocf shells with "#!/usr/bin/bash" at the
> > beginning, no way, same output.
>
> You'd have to change the RA that includes them.
>
> > Any idea how can I change the used shell to support "local" variables?
>
> You can either edit the #!/bin/sh line at the top of each RA, or figure
> out how to point /bin/sh to a Bourne-compatible shell. ksh isn't
> Bourne-compatible, so I'd expect lots of #!/bin/sh scripts to fail with
> it as the default shell.
>
> > Gabriele
> ------------------------------------------------------------------------
> *From:* Gabriele Bulfon
> *To:* kgaillot at redhat.com, Cluster Labs - All topics related to
> open-source clustering welcomed
> *Date:* 26 August 2016 10:12:13 CEST
> *Subject:* Re: [ClusterLabs] ocf::heartbeat:IPaddr
> I looked into what you suggested, inside ocf-binaries and
> ocf-shellfuncs etc.
> I also found these logs in corosync.log:
> Aug 25 17:50:33 [2250] crmd: notice: process_lrm_event:
> xstorage1-xstorage2_wan2_IP_start_0:22 [
> /usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
> such file or
> directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[354]: local:
> not found [No such file or
> directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[355]: local:
> not found [No such file or
> directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[356]: local:
> not found [No such file or directory]\nocf-exit-reason:Setup
> problem: coul
> Aug 25 17:50:33 [2246] lrmd: notice: operation_finished:
> xstorage2_wan2_IP_start_0:3613:stderr [
> /usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
> such file or directory] ]
> Looks like the shell is not happy with the "local" variable definition.
> I tried running ocf-shellfuncs manually with sh and bash and they
> all run without errors.
> How can I see what shell is running these scripts?
> ----------------------------------------------------------------------------------
> From: Ken Gaillot
> To: users at clusterlabs.org
> Date: 25 August 2016 18:07:42 CEST
> Subject: Re: [ClusterLabs] ocf::heartbeat:IPaddr
> On 08/25/2016 10:51 AM, Gabriele Bulfon wrote:
> Hi,
> I'm advancing with this monster cluster on XStreamOS/illumos ;)
> In the previous older tests I used heartbeat, and I had these lines to
> take care of the swapping public IP addresses:
> primitive xstorage1_wan1_IP ocf:heartbeat:IPaddr params ip="1.2.3.4" cidr_netmask="255.255.255.0" nic="e1000g1"
> primitive xstorage2_wan2_IP ocf:heartbeat:IPaddr params ip="1.2.3.5" cidr_netmask="255.255.255.0" nic="e1000g1"
> location xstorage1_wan1_IP_pref xstorage1_wan1_IP 100: xstorage1
> location xstorage2_wan2_IP_pref xstorage2_wan2_IP 100: xstorage2
> They get configured, but then I get this in crm status:
> xstorage1_wan1_IP (ocf::heartbeat:IPaddr): Stopped
> xstorage2_wan2_IP (ocf::heartbeat:IPaddr): Stopped
> Failed Actions:
> * xstorage1_wan1_IP_start_0 on xstorage1 'not installed' (5): call=20, status=complete, exitreason='Setup problem: couldn't find command: /usr/bin/gawk',
>     last-rc-change='Thu Aug 25 17:50:32 2016', queued=1ms, exec=158ms
> * xstorage2_wan2_IP_start_0 on xstorage1 'not installed' (5): call=22, status=complete, exitreason='Setup problem: couldn't find command: /usr/bin/gawk',
>     last-rc-change='Thu Aug 25 17:50:33 2016', queued=1ms, exec=29ms
> * xstorage1_wan1_IP_start_0 on xstorage2 'not installed' (5): call=22, status=complete, exitreason='Setup problem: couldn't find command: /usr/bin/gawk',
>     last-rc-change='Thu Aug 25 17:50:30 2016', queued=1ms, exec=36ms
> * xstorage2_wan2_IP_start_0 on xstorage2 'not installed' (5): call=20, status=complete, exitreason='Setup problem: couldn't find command: /usr/bin/gawk',
>     last-rc-change='Thu Aug 25 17:50:29 2016', queued=0ms, exec=150ms
> The crm configure process already checked for the presence of the
> required IPaddr shell, and it was ok.
> Now it looks like it's looking for "/usr/bin/gawk", and that is
> actually there!
> Is there any known incompatibility with the mixed heartbeat ocf? Should
> I use corosync specific ocf files or something else?
> "heartbeat" in this case is just an OCF provider name, and has nothing
> to do with the heartbeat messaging layer, other than having its origin
> in the same project. There actually has been a recent proposal to rename
> the provider to "clusterlabs" to better reflect the current reality.
> The "couldn't find command" message comes from the ocf-binaries shell
> functions. If you look at have_binary() there, it uses sed and which,
> and I'm guessing that fails on your OS somehow. You may need to
> patch it.
> Thanks again!
> Gabriele






