[ClusterLabs] PAF with postgresql 13?

Ken Gaillot kgaillot at redhat.com
Tue Mar 8 17:08:43 EST 2022


On Tue, 2022-03-08 at 17:20 +0100, Jehan-Guillaume de Rorthais wrote:
> Hi,
> 
> Sorry, your mail was really hard to read on my side, but I think I
> understood it and will try to answer below.
> 
> On Tue, 8 Mar 2022 11:45:30 +0000
> lejeczek via Users <users at clusterlabs.org> wrote:
> 
> > On 08/03/2022 10:21, Jehan-Guillaume de Rorthais wrote:
> > > > op start timeout=60s \ op stop timeout=60s \
> > > > op promote timeout=30s \ op demote timeout=120s \
> > > > op monitor interval=15s timeout=10s role="Master" meta master-max=1 \
> > > > op monitor interval=16s timeout=10s role="Slave" \
> > > > op notify timeout=60s meta notify=true
> > >
> > > Because "op" appears, we are back in resource ("pgsqld") context,
> > > anything after is interpreted as resource and operation
> > > attributes, even the "meta notify=true". That's why your
> > > pgsqld-clone doesn't have the meta attribute "notify=true" set.
> >
> > Here is a one-liner that should do - add, as per 'debug-'
> > suggestion, 'master-max=1'
> 
> What debug- suggestion??
> 
> ...
> > then do:
> > 
> > -> $ pcs resource delete pgsqld  
> > 
> > '-clone' should get removed too, so now there are no 'pgsqld'
> > resource(s), but the cluster - weirdly in my mind - leaves the
> > node attributes on.
> 
> indeed.
> 
> > I see 'master-pgsqld' with each node and do not see why
> > 'node attributes' should be kept (and certainly shown) for
> > non-existent resources (attributes that are intrinsic only to
> > those resources).
> > So, if you want to "clean" that up, since perhaps for now you are
> > not going to have/use 'pgsqlms', you can do that with:
> > 
> > -> $ pcs node attribute node1 master-pgsqld="" # same for   
> > remaining nodes
> 
> indeed.
> 
> > Now, repeat your one-liner which worked just a moment
> > ago and you should get the exact same or similar errors (while
> > all nodes are stuck on 'slave').
> 
> You have no promotion because your PostgreSQL instances have been
> stopped in standby mode. The cluster has no way and no score to
> promote one of them.
> 
> > -> $ pcs resource debug-promote pgsqld  
> > crm_resource: Error performing operation: Error occurred
> > Operation force-promote for pgsqld (ocf:heartbeat:pgsqlms) 
> > returned 1 (error: Can not get current node LSN location)
> > /tmp:5432 - accepting connections
> 
> NEVER use "debug-promote" or any other "debug-*" command with pgsqlms
> or any other cloned resource. AFAIK, these commands work fine for
> "stateless" resources, but do not (could not) create the required
> environment for clone and multi-state ones.
> 
> So I repeat, NEVER use "debug-promote".
> 
> What you want to do is set the promotion score on the node where you
> want the promotion to happen. E.g.:
> 
>   pcs node attribute srv1 master-pgsqld=1001
> 
> You can use "crm_attribute" or "crm_master" as well.
> 
> > ocf-exit-reason:Can not get current node LSN location
> 
> This one is probably because of "debug-promote".
> 
> > You have to 'cib-push' to "fix" this very problem.
> > In my(admin's) opinion this is a 100% candidate for a bug - 
> > whether in PCS or PAF - perhaps authors may wish to comment?
> 
> Removing the node attributes with the resource might be legit from
> the Pacemaker point of view, but I'm not sure how they can track the
> dependency (ping Ken?).

Higher-level tools like pcs or crm shell could probably do it when
removing the resource (i.e. if the resource was a promotable clone,
check for and remove any node attributes of the form master-$RSC_ID).
That sounds like a good idea to me.

Pacemaker would be a bad place to do it because Pacemaker only sees the
newly modified CIB with the resource configuration gone -- it can't
know for sure whether it was a promotable clone, and it can only know
it existed at all if there are leftover status entries (causing the
resource to be listed as "orphaned"), which isn't guaranteed.
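
As a rough illustration of the cleanup a higher-level tool (or an admin)
could do after deleting a promotable clone, here is a minimal sketch. It
assumes the attribute is named master-<resource id>, as in this thread,
and that the node names are passed in explicitly. The function name
clear_promotion_scores and the DRY_RUN variable are hypothetical, not
part of pcs or Pacemaker. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: clear leftover promotion-score node attributes for a
# deleted promotable clone. clear_promotion_scores and DRY_RUN are
# hypothetical names used for illustration.
#
# Usage: clear_promotion_scores RSC_ID NODE...
clear_promotion_scores() {
    rsc_id=$1
    shift
    for node in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (default): show what would be executed.
            echo "pcs node attribute $node master-$rsc_id="
        else
            # An empty value removes the node attribute, as with
            # master-pgsqld="" earlier in this thread.
            pcs node attribute "$node" "master-$rsc_id="
        fi
    done
}
```

For example, "clear_promotion_scores pgsqld node1 node2" prints one
"pcs node attribute" command per node; setting DRY_RUN=0 would run them
instead.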

> 
> PAF has no way to know the resource is being deleted and cannot
> remove its node attribute beforehand.
> 
> Maybe PCS can look for promotion scores and remove them during the
> "resource delete" command (ping Tomas)?
> 
> Regards,
> 
-- 
Ken Gaillot <kgaillot at redhat.com>


