[ClusterLabs] PAF with postgresql 13?
Ken Gaillot
kgaillot at redhat.com
Tue Mar 8 17:08:43 EST 2022
On Tue, 2022-03-08 at 17:20 +0100, Jehan-Guillaume de Rorthais wrote:
> Hi,
>
> Sorry, your mail was really hard to read on my side, but I think I
> understood it and will try to answer below.
>
> On Tue, 8 Mar 2022 11:45:30 +0000
> lejeczek via Users <users at clusterlabs.org> wrote:
>
> > On 08/03/2022 10:21, Jehan-Guillaume de Rorthais wrote:
> > > > op start timeout=60s \ op stop timeout=60s \ op promote
> > > > timeout=30s \
> > > > op demote timeout=120s \ op monitor interval=15s
> > > > timeout=10s role="Master" meta master-max=1 \ op monitor
> > > > interval=16s timeout=10s role="Slave" \ op notify
> > > > timeout=60s meta notify=true
> > >
> > > Because "op" appears, we are back in resource ("pgsqld") context,
> > > anything after is interpreted as resource and operation
> > > attributes, even the "meta notify=true". That's why your
> > > pgsqld-clone doesn't have the meta attribute "notify=true" set.
> >
> > Here is one-liner that should do - add, as per 'debug-'
> > suggestion, 'master-max=1'
>
> What debug- suggestion??
>
> ...
> > then do:
> >
> > -> $ pcs resource delete pgsqld
> >
> > '-clone' should get removed too, so now there are no 'pgsqld'
> > resource(s), but the cluster - weirdly, in my mind - leaves the
> > node attributes in place.
>
> indeed.
>
> > I see 'master-pgsqld' on each node and do not see why
> > 'node attributes' should be kept (let alone shown) for
> > non-existent resources (those attributes belong only to that
> > resource).
> > So, if you want to "clean" that up, perhaps because for now you
> > are not going to have/use 'pgsqlms', you can do that with:
> >
> > -> $ pcs node attribute node1 master-pgsqld="" # same for
> > remaining nodes
>
> indeed.
>
> > Now repeat your one-liner, which worked just a moment ago, and
> > you should get the exact same or similar errors (while all nodes
> > are stuck on 'slave'):
>
> You have no promotion because your PostgreSQL instances have been
> stopped in standby mode. The cluster has no way and no score to
> promote one of them.
>
> > -> $ pcs resource debug-promote pgsqld
> > crm_resource: Error performing operation: Error occurred
> > Operation force-promote for pgsqld (ocf:heartbeat:pgsqlms)
> > returned 1 (error: Can not get current node LSN location)
> > /tmp:5432 - accepting connections
>
> NEVER use "debug-promote" or other "debug-*" commands with pgsqlms, or
> any other cloned resources. AFAIK, these commands work fine for
> "stateless" resources, but do not (could not) create the required
> environment for the clone and multi-state ones.
>
> So I repeat, NEVER use "debug-promote".
>
> What you want to do is set the promotion score on the node where you
> want the promotion to happen, e.g.:
>
> pcs node attribute srv1 master-pgsqld=1001
>
> You can use "crm_attribute" or "crm_master" as well.
>
> > ocf-exit-reason:Can not get current node LSN location
>
> This one is probably because of "debug-promote".
>
> > You have to 'cib-push' to "fix" this very problem.
> > In my(admin's) opinion this is a 100% candidate for a bug -
> > whether in PCS or PAF - perhaps authors may wish to comment?
>
> Removing the node attributes with the resource might be legit from
> the Pacemaker point of view, but I'm not sure how they can track the
> dependency (ping Ken?).
Higher-level tools like pcs or crm shell could probably do it when
removing the resource (i.e. if the resource was a promotable clone,
check for and remove any node attributes of the form master-$RSC_ID).
That sounds like a good idea to me.
Pacemaker would be a bad place to do it because Pacemaker only sees the
newly modified CIB with the resource configuration gone -- it can't
know for sure whether it was a promotable clone, and it can only know
it existed at all if there are leftover status entries (causing the
resource to be listed as "orphaned"), which isn't guaranteed.
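
A rough sketch, in Python, of what such a check in a higher-level tool
might look like. The function names and the dictionary standing in for
the node attributes are made up for illustration; this is not pcs or crm
shell code, and it assumes the tool still knows, at delete time, that
the resource was a promotable clone.

```python
from typing import Dict, List, Tuple


def stale_promotion_attrs(
    node_attrs: Dict[str, Dict[str, str]],
    rsc_id: str,
    was_promotable: bool,
) -> List[Tuple[str, str]]:
    """Return (node, attribute) pairs of the form master-<rsc_id>.

    node_attrs is a hypothetical snapshot {node: {attr_name: value}}
    taken by the tool before it deletes the resource.
    """
    if not was_promotable:
        # Only promotable clones get promotion-score attributes.
        return []
    wanted = f"master-{rsc_id}"
    return [
        (node, wanted)
        for node, attrs in sorted(node_attrs.items())
        if wanted in attrs
    ]


def cleanup_commands(pairs: List[Tuple[str, str]]) -> List[str]:
    # Same form as the manual cleanup quoted earlier in this thread.
    return [f'pcs node attribute {node} {name}=""' for node, name in pairs]


# Illustrative data only.
attrs = {
    "node1": {"master-pgsqld": "1001"},
    "node2": {"master-pgsqld": "1000", "other": "x"},
    "node3": {},
}
for cmd in cleanup_commands(stale_promotion_attrs(attrs, "pgsqld", True)):
    print(cmd)
```

The point of doing this in the tool rather than in Pacemaker is that the
tool still has the old configuration in hand when it makes the change.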
>
> PAF has no way to know the resource is being deleted and cannot
> remove its node attribute beforehand.
>
> Maybe PCS can look for promotion scores and remove them during the
> "resource delete" command (ping Tomas)?
>
> Regards,
>
--
Ken Gaillot <kgaillot at redhat.com>