[ClusterLabs] [Problem] In RHEL8.4beta, pgsql resource control fails.
Ken Gaillot
kgaillot at redhat.com
Thu Apr 29 18:33:44 EDT 2021
On Wed, 2021-04-28 at 19:19 +0200, Jehan-Guillaume de Rorthais wrote:
> On Wed, 28 Apr 2021 12:00:40 -0500
> Ken Gaillot <kgaillot at redhat.com> wrote:
>
> > On Wed, 2021-04-28 at 18:14 +0200, Jehan-Guillaume de Rorthais wrote:
> > > Hi all,
> > >
> > > It seems to me the concern raised by Ulrich hasn't been
> > > discussed:
> > >
> > > On Wed, 12 Apr 2021 Ulrich Windl wrote:
> > >
> > > > Personally I think an RA calling crm_mon is inherently broken:
> > > > Will it ever pass ocf-tester?
> >
> > Calling the command-line tools in an agent can be OK in some cases.
> > The main concerns are:
> >
> > * Time-of-check/time-of-use: cluster status can change immediately,
> >   so the agent should behave reasonably if a query result is
> >   incorrect at the moment it's used. Ideally there would be no case
> >   where the agent could incorrectly report success for an action.
> >
> > * No commands that *change* the configuration (other than setting
> >   node attributes) should ever be used. Otherwise there's a potential
> >   for an infinite loop between the agent and scheduler.
> >
> > * It's best to use tools' XML output when available, because that
> >   should be stable across Pacemaker releases, while the text output
> >   may not be. Aside from crm_mon, XML output is a recent addition, so
> >   some consideration must be given to backward compatibility and/or
> >   requiring a minimum Pacemaker version.
> >
> > * Only the configuration section of the CIB has a guaranteed schema.
> >   The status section can theoretically change from release to
> >   release, although in practice it has changed very little over the
> >   years.
> >
> > I don't use ocf-tester so I can't speak to that, but I suspect it
> > could work if you exported a CIB_file variable with a sample cluster
> > status beforehand. (CIB_file makes the cluster commands act as if the
> > specified file is the live CIB at the moment.)
> >
> > > Would it be possible to rely on the following command ?
> > >
> > > >   cibadmin --query --xpath "//status/node_state[@join='member']" | \
> > > >     grep -Po 'uname="\K[^"]+'
> > >
> > >
> > > Regards,
> >
> > Only full cluster nodes will have a "join" attribute, so that query
> > won't catch active remote nodes or guest nodes. Whether that's good
> > or bad depends on what you're looking for.
>
> That was an example of replacing the crm_mon dependency with a
> cibadmin one.
> AFAIU this agent, it uses crm_mon to:
>
> * look for the node hosting the promoted clone
> * look for a node's existence
> * look for a fully joined node
>
> All of these uses seem achievable by parsing the cibadmin status
> section output (or using --xpath).

I would think remote nodes and guest nodes should be considered, too,
unless the agent specifically doesn't support that.

Remote nodes and guest nodes don't join the controller layer, so they
won't have a join entry, but they can run resources.

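
As a rough illustration of widening the query, here is a minimal sketch.
It assumes that remote and guest nodes show up in the status section as
node_state entries carrying remote_node="true"; that attribute name and
the helper node_unames are assumptions for illustration, not something
the pgsql agent does today. It only shows the shape of the query and
says nothing about whether a remote node is connected at that moment.

```shell
# Extract the uname attribute from each node_state element read on stdin.
# Uses only grep -o and sed, so no PCRE (-P) support is needed.
node_unames() {
    grep -o '<node_state [^>]*>' |
        sed -n 's/.*[[:space:]]uname="\([^"]*\)".*/\1/p'
}

# XPath matching full cluster nodes that have joined, plus entries flagged
# remote_node="true" (ASSUMED marker for remote and guest nodes).
NODE_XPATH="//status/node_state[@join='member' or @remote_node='true']"

# Usage against a live cluster (not executed here); --no-children keeps
# the output down to the node_state elements themselves:
#   cibadmin --query --no-children --xpath "$NODE_XPATH" | node_unames
```

Because the result can be stale by the time it is used, the agent would
still need to behave sensibly when a listed node has since gone away.
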
> > The plus side is that it's a query and it returns XML.
>
> indeed.
>
> > The downsides are that node status can change quickly, so it could
> > theoretically be inaccurate a moment later when you use it, and the
> > status section is not guaranteed to stay in that format (though I
> > expect that particular part will).
>
> There are already version checks in the pgsql RA code for crm_mon
> anyway, relying on OCF_RESKEY_crm_feature_set.
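
A version gate of that kind could look roughly like the sketch below.
The helper names version_ge and choose_query_method are made up for
illustration, and the 3.1.0 threshold is a placeholder rather than the
feature set at which any particular XML output became available.

```shell
# Return 0 if dotted version $1 is greater than or equal to $2.
# Fields are compared numerically; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# PLACEHOLDER threshold for illustration only, not a documented requirement.
MIN_FEATURE_SET="3.1.0"

# Print "xml" when the feature set passed to the agent is new enough,
# otherwise "text" so the caller can fall back to the older parsing.
choose_query_method() {
    if version_ge "${OCF_RESKEY_crm_feature_set:-0}" "$MIN_FEATURE_SET"; then
        echo xml
    else
        echo text
    fi
}
```
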
>
> > A minor point: that query will return the entire node_state XML
> > subtree; you can add -n/--no-children to return just the node_state
> > element itself.
>
> Nice!
>
> I was playing with xmllint as well, for an expanded support of
> xmllint, but it
> would add a strong dependency.
>
> Regards,
>
--
Ken Gaillot <kgaillot at redhat.com>