[ClusterLabs] PAF with postgresql 13?
lejeczek
peljasz at yahoo.co.uk
Tue Mar 8 06:45:30 EST 2022
On 08/03/2022 10:21, Jehan-Guillaume de Rorthais wrote:
>> op start timeout=60s \
>> op stop timeout=60s \
>> op promote timeout=30s \
>> op demote timeout=120s \
>> op monitor interval=15s timeout=10s role="Master" meta master-max=1 \
>> op monitor interval=16s timeout=10s role="Slave" \
>> op notify timeout=60s meta notify=true
>
> Because "op" appears, we are back in resource ("pgsqld") context,
> anything after is interpreted as resource and operation attributes,
> even the "meta notify=true". That's why your pgsqld-clone doesn't
> have the meta attribute "notify=true" set.
Here is a one-liner that should do it; add 'master-max=1', as
per the 'debug-' suggestion:
-> $ pcs resource create pgsqld ocf:heartbeat:pgsqlms \
     bindir=/usr/bin pgdata=/var/lib/pgsql/data \
     op start timeout=60s op stop timeout=60s \
     op promote timeout=30s op demote timeout=120s \
     op monitor interval=15s timeout=10s role="Master" \
     op monitor interval=16s timeout=10s role="Slave" \
     op notify timeout=60s \
     promotable notify=true master-max=1 \
     && pcs constraint colocation add HA-10-1-1-226 \
        with master pgsqld-clone INFINITY \
     && pcs constraint order promote pgsqld-clone \
        then start HA-10-1-1-226 symmetrical=false kind=Mandatory \
     && pcs constraint order demote pgsqld-clone \
        then stop HA-10-1-1-226 symmetrical=false kind=Mandatory
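To confirm that 'notify=true' really landed on the clone this time,
something like the sketch below might help. It assumes Pacemaker's
'crm_resource' is on the PATH and that the clone is named
'pgsqld-clone'; the function name 'check_notify' is made up for
illustration.

```shell
# check_notify RESOURCE
# Succeeds only if the resource's meta attribute "notify" reads "true".
# Assumption: crm_resource prints just the value on stdout.
check_notify() {
    value=$(crm_resource --resource "$1" --meta --get-parameter notify 2>/dev/null)
    [ "$value" = "true" ]
}
```

Usage would be along the lines of:
check_notify pgsqld-clone && echo "notify is set"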
But this "issue" is reproducible! Once you have a working
'pgsqlms', do:
-> $ pcs resource delete pgsqld
The '-clone' resource should get removed too, so now there are
no 'pgsqld' resource(s), but the cluster, weirdly to my mind,
leaves the node attributes in place.
I see 'master-pgsqld' on each node and do not see why node
attributes should be kept (let alone shown) for non-existent
resources (those attributes are intrinsic only to those
resources).
So, if you want to "clean" that up, perhaps because for now you
are not going to have/use 'pgsqlms', you can do that with:
-> $ pcs node attribute node1 master-pgsqld="" # same for remaining nodes
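The "same for remaining nodes" part can be sketched as a small
loop. The function name 'clear_master_attr' and the node names in
the usage line are only placeholders; the attribute name is the one
shown above.

```shell
# clear_master_attr NODE...
# Clears the leftover 'master-pgsqld' node attribute on each node given.
# An empty value is what the one-node command above passes to pcs.
clear_master_attr() {
    attr="master-pgsqld"
    for node in "$@"; do
        pcs node attribute "$node" "${attr}=" || return 1
    done
}
```

Usage, with hypothetical node names:
clear_master_attr node1 node2 node3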
Now repeat your one-liner, which worked just a moment ago, and
you should get the exact same or similar errors (while all
nodes are stuck on 'slave'):
-> $ pcs resource debug-promote pgsqld
crm_resource: Error performing operation: Error occurred
Operation force-promote for pgsqld (ocf:heartbeat:pgsqlms)
returned 1 (error: Can not get current node LSN location)
/tmp:5432 - accepting connections
ocf-exit-reason:Can not get current node LSN location
--------------------
You have to 'cib-push' to "fix" this very problem.
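One reading of "cib-push" here is the usual offline-CIB workflow:
dump the CIB to a file, apply the same changes to that file with
'pcs -f', then push it in one go. The sketch below assumes that
reading; the function name and the default file path are
hypothetical, while the resource and constraint arguments are the
ones from the one-liner above.

```shell
# recreate_via_cib_push [CIB_FILE]
# Stages the pgsqld resource and its constraints in a CIB file,
# then pushes the file to the cluster in a single step.
recreate_via_cib_push() {
    cib_file="${1:-/tmp/cluster1.xml}"   # hypothetical scratch file
    pcs cluster cib "$cib_file" || return 1
    pcs -f "$cib_file" resource create pgsqld ocf:heartbeat:pgsqlms \
        bindir=/usr/bin pgdata=/var/lib/pgsql/data \
        op start timeout=60s op stop timeout=60s \
        op promote timeout=30s op demote timeout=120s \
        op monitor interval=15s timeout=10s role="Master" \
        op monitor interval=16s timeout=10s role="Slave" \
        op notify timeout=60s \
        promotable notify=true master-max=1 || return 1
    pcs -f "$cib_file" constraint colocation add HA-10-1-1-226 \
        with master pgsqld-clone INFINITY || return 1
    pcs -f "$cib_file" constraint order promote pgsqld-clone \
        then start HA-10-1-1-226 symmetrical=false kind=Mandatory || return 1
    pcs -f "$cib_file" constraint order demote pgsqld-clone \
        then stop HA-10-1-1-226 symmetrical=false kind=Mandatory || return 1
    pcs cluster cib-push "$cib_file"
}
```

Usage would be: recreate_via_cib_push /tmp/cluster1.xml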
In my (admin's) opinion this is a 100% candidate for a bug,
whether in PCS or PAF. Perhaps the authors may wish to comment?
many thanks, L.