[ClusterLabs] Antw: [EXT] Suggestions for multiple NFS mounts as LSB script

Ken Gaillot kgaillot at redhat.com
Mon Jun 29 15:04:11 EDT 2020


On Mon, 2020-06-29 at 13:20 -0400, Tony Stocker wrote:
> On Mon, Jun 29, 2020 at 11:08 AM Ulrich Windl
> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> > 
> > You could construct a script that generates the commands needed, so
> > it would
> > be rather easy to handle.
> 
> True. The initial population wouldn't be that burdensome. I was
> thinking of later when my coworkers have to add/remove mounts. I,
> honestly, don't want to be involved in that any more than I must.
> Currently they just make changes in their script and all is well.
> But more than anything I don't want them mucking about with Pacemaker
> commands (which means I would have to do updates) since once they
> break things, I'm the one who would have to fix it and explain how it
> wasn't my fault.

Advantages of having a resource per mount:
- Each is monitored individually, so you know if there are
  mount-specific issues
- If one mount fails (to start, or later), it doesn't need to affect
  the other mounts
- You can define colocation/ordering dependencies between specific
  mounts if warranted
- You can choose a failover or load-balancing model
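
For example (untested, with made-up names and paths), each mount would
be its own Filesystem resource, something like:

    # one resource per NFS mount; "fs_projects" and the paths are examples
    pcs resource create fs_projects ocf:heartbeat:Filesystem \
        device="nfsserver:/export/projects" directory="/srv/projects" \
        fstype="nfs" op monitor interval=60s

Repeat (or generate) that for each mount, and each one then gets its
own monitor and its own failure handling.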

To get around the issues you mention, you could define ACLs and allow
the others access to just the resources section (still pretty broad,
but less room for trouble). You could also write a script for simply
adding/removing mounts so they don't have to know (or be able to
misuse) cluster commands.
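
As a rough sketch of that wrapper idea (the script name and defaults
here are only placeholders, adjust to taste):

    #!/bin/sh
    # add-nfs-mount.sh -- add one NFS mount without touching pcs directly
    # usage: add-nfs-mount.sh <name> <server:/export> <mountpoint>
    set -eu
    name="$1"; device="$2"; mountpoint="$3"

    pcs resource create "fs_${name}" ocf:heartbeat:Filesystem \
        device="${device}" directory="${mountpoint}" fstype="nfs" \
        op monitor interval=60s

A matching remove script is basically just 'pcs resource delete
"fs_${name}"'.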

For background on ACLs see:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#idm47160746093920

and see the man page for pcs/crm/whatever you're using for
configuration.
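
With pcs, a minimal ACL sketch might look like this (role and user
names are placeholders, and the user also needs to be in the haclient
group):

    pcs acl role create mount_admins description="manage mount resources" \
        write xpath /cib/configuration/resources
    pcs acl user create someuser mount_admins
    pcs acl enable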


> > Have you considered using automount? It's like fstab, but it mounts
> > on demand rather than automatically at boot.
> 
> We looked at it a few years ago, but it didn't seem to react too well
> to being used in a file server (https/ftps) role and so we abandoned
> it.
> 
> > 
> > 
> > The most interesting part seems to be the question of how you
> > define (and detect) a failure that will cause a node switch.
> 
> That is a VERY good question! How many failed mounts is the critical
> number when you have 130+? If a single one fails, do you suddenly
> move everything to the other node (even though it's just as likely to
> fail there)? Do you just monitor and issue complaints? At the moment
> there's zero checking of this, so until someone complains that they
> can't reach something, we don't know that a mount isn't working
> properly -- so apparently it's not viewed as that critical. But at
> the very least, the main home directory for the https/ftps file
> server operations should be operational, or else it's all moot.
> 
> Is ocf-tester still available? I installed via 'yum' from the High
> Availability repository and don't see it. I also did a 'yum
> whatprovides *bin/ocf-tester' and no package came back. Do I have to
> manually download it from somewhere? If so, could someone provide a
> link to the most up-to-date source?
> 
> Thanks!
-- 
Ken Gaillot <kgaillot at redhat.com>


