[ClusterLabs] make promoted follow promoted resource ?
lejeczek
peljasz at yahoo.co.uk
Wed Dec 6 04:30:13 EST 2023
On 26/11/2023 12:20, Reid Wahl wrote:
> On Sun, Nov 26, 2023 at 1:32 AM lejeczek via Users
> <users at clusterlabs.org> wrote:
>> Hi guys.
>>
>> With these:
>>
>> -> $ pcs resource status REDIS-6381-clone
>> * Clone Set: REDIS-6381-clone [REDIS-6381] (promotable):
>> * Promoted: [ ubusrv2 ]
>> * Unpromoted: [ ubusrv1 ubusrv3 ]
>>
>> -> $ pcs resource status PGSQL-PAF-5433-clone
>> * Clone Set: PGSQL-PAF-5433-clone [PGSQL-PAF-5433] (promotable):
>> * Promoted: [ ubusrv1 ]
>> * Unpromoted: [ ubusrv2 ubusrv3 ]
>>
>> -> $ pcs constraint ref REDIS-6381-clone
>> Resource: REDIS-6381-clone
>> colocation-REDIS-6381-clone-PGSQL-PAF-5433-clone-INFINITY
>>
>> Basically, promoted Redis should follow promoted pgSQL, but that is not happening, although usually it does.
>> I presume pcs/cluster does something internally which results in that _colocation_ constraint being ignored for these resources.
>> I presume scoring might play a role:
>> REDIS-6385-clone with PGSQL-PAF-5435-clone (score:1001) (rsc-role:Master) (with-rsc-role:Master)
>> but usually that scoring works; only "now" it does not.
>> I would much appreciate any comments.
>> thanks, L.
>>
>> I looked at the pacemaker log (snippet below, taken after REDIS-6381-clone was re-enabled) but cannot see an explanation for this.
>> ...
>> notice: Calculated transition 110, saving inputs in /var/lib/pacemaker/pengine/pe-input-3729.bz2
>> notice: Transition 110 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-3729.bz2): Complete
>> notice: State transition S_TRANSITION_ENGINE -> S_IDLE
>> notice: State transition S_IDLE -> S_POLICY_ENGINE
>> notice: Actions: Start REDIS-6381:0 ( ubusrv2 )
>> notice: Actions: Start REDIS-6381:1 ( ubusrv3 )
>> notice: Actions: Start REDIS-6381:2 ( ubusrv1 )
>> notice: Calculated transition 111, saving inputs in /var/lib/pacemaker/pengine/pe-input-3730.bz2
>> notice: Initiating start operation REDIS-6381_start_0 locally on ubusrv2
>> notice: Requesting local execution of start operation for REDIS-6381 on ubusrv2
>> (to redis) root on none
>> pam_unix(su:session): session opened for user redis(uid=127) by (uid=0)
>> pam_sss(su:session): Request to sssd failed. Connection refused
>> pam_unix(su:session): session closed for user redis
>> pam_sss(su:session): Request to sssd failed. Connection refused
>> notice: Setting master-REDIS-6381[ubusrv2]: (unset) -> 1000
>> notice: Transition 111 aborted by status-2-master-REDIS-6381 doing create master-REDIS-6381=1000: Transient attribute change
>> INFO: demote: Setting master to 'no-such-master'
>> notice: Result of start operation for REDIS-6381 on ubusrv2: ok
>> notice: Transition 111 (Complete=4, Pending=0, Fired=0, Skipped=1, Incomplete=14, Source=/var/lib/pacemaker/pengine/pe-input-3730.bz2): Stopped
>> notice: Actions: Promote REDIS-6381:0 ( Unpromoted -> Promoted ubusrv2 )
>> notice: Actions: Start REDIS-6381:1 ( ubusrv1 )
>> notice: Actions: Start REDIS-6381:2 ( ubusrv3 )
>> notice: Calculated transition 112, saving inputs in /var/lib/pacemaker/pengine/pe-input-3731.bz2
>> notice: Initiating notify operation REDIS-6381_pre_notify_start_0 locally on ubusrv2
>> notice: Requesting local execution of notify operation for REDIS-6381 on ubusrv2
>> notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
>> notice: Initiating start operation REDIS-6381_start_0 on ubusrv1
>> notice: Initiating start operation REDIS-6381:2_start_0 on ubusrv3
>> notice: Initiating notify operation REDIS-6381_post_notify_start_0 locally on ubusrv2
>> notice: Requesting local execution of notify operation for REDIS-6381 on ubusrv2
>> notice: Initiating notify operation REDIS-6381_post_notify_start_0 on ubusrv1
>> notice: Initiating notify operation REDIS-6381:2_post_notify_start_0 on ubusrv3
>> notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
>> notice: Initiating notify operation REDIS-6381_pre_notify_promote_0 locally on ubusrv2
>> notice: Requesting local execution of notify operation for REDIS-6381 on ubusrv2
>> notice: Initiating notify operation REDIS-6381_pre_notify_promote_0 on ubusrv1
>> notice: Initiating notify operation REDIS-6381:2_pre_notify_promote_0 on ubusrv3
>> notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
>> notice: Initiating promote operation REDIS-6381_promote_0 locally on ubusrv2
>> notice: Requesting local execution of promote operation for REDIS-6381 on ubusrv2
>> notice: Result of promote operation for REDIS-6381 on ubusrv2: ok
>> notice: Initiating notify operation REDIS-6381_post_notify_promote_0 locally on ubusrv2
>> notice: Requesting local execution of notify operation for REDIS-6381 on ubusrv2
>> notice: Initiating notify operation REDIS-6381_post_notify_promote_0 on ubusrv1
>> notice: Initiating notify operation REDIS-6381:2_post_notify_promote_0 on ubusrv3
>> notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
>> notice: Setting master-REDIS-6381[ubusrv3]: (unset) -> 1
>> notice: Transition 112 aborted by status-3-master-REDIS-6381 doing create master-REDIS-6381=1: Transient attribute change
>> notice: Setting master-REDIS-6381[ubusrv1]: (unset) -> 1
>> notice: Transition 112 (Complete=25, Pending=0, Fired=0, Skipped=5, Incomplete=5, Source=/var/lib/pacemaker/pengine/pe-input-3731.bz2): Stopped
>> notice: Calculated transition 113, saving inputs in /var/lib/pacemaker/pengine/pe-input-3732.bz2
>> notice: Initiating monitor operation REDIS-6381_monitor_20000 locally on ubusrv2
>> notice: Requesting local execution of monitor operation for REDIS-6381 on ubusrv2
>> notice: Initiating monitor operation REDIS-6381_monitor_60000 on ubusrv3
>> notice: Initiating monitor operation REDIS-6381_monitor_45000 on ubusrv3
>> notice: Initiating monitor operation REDIS-6381_monitor_60000 on ubusrv1
>> notice: Initiating monitor operation REDIS-6381_monitor_45000 on ubusrv1
>> notice: Result of monitor operation for REDIS-6381 on ubusrv2: promoted
>> notice: Transition 113 (Complete=5, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-3732.bz2): Complete
>> notice: State transition S_TRANSITION_ENGINE -> S_IDLE
>>
>
How much do transient attributes matter here?
As in my earlier example:
Node Attributes:
  * Node: ubusrv1 (1):
    * master-PGSQL-PAF-5433 : 1000
    * master-PGSQL-PAF-5434 : 1001
    * master-PGSQL-PAF-5435 : -1000
    * master-REDIS-6380 : 1
    * master-REDIS-6381 : 1
    * master-REDIS-6382 : 1
    * master-REDIS-6385 : 2
  * Node: ubusrv2 (2):
    * master-PGSQL-PAF-5433 : 990
    * master-PGSQL-PAF-5434 : -1000
    * master-PGSQL-PAF-5435 : 1001
    * master-REDIS-6380 : 1
    * master-REDIS-6381 : 1
    * master-REDIS-6382 : 1
    * master-REDIS-6385 : 2
  * Node: ubusrv3 (3):
    * master-PGSQL-PAF-5433 : 1001
    * master-PGSQL-PAF-5434 : -1000
    * master-PGSQL-PAF-5435 : -1000
    * master-REDIS-6380 : 1000
    * master-REDIS-6381 : 1
    * master-REDIS-6382 : 1
    * master-REDIS-6385 : 2
-> $ pcs constraint colocation --full | grep REDIS-6380
REDIS-6380-clone with PGSQL-PAF-5434-clone (score:9999) (rsc-role:Master) (with-rsc-role:Master) (id:colocation-REDIS-6380-clone-PGSQL-PAF-5434-clone-INFINITY-1)
Right now I again have a situation where _master_ REDIS-6380
should be on ubusrv1, where master PGSQL-PAF-5434 is, if
that constraint above were honored.
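
A minimal sketch of the arithmetic behind that expectation. It assumes a deliberately naive model in which a node's promotion priority is its master-REDIS-6380 attribute plus the colocation score on the node where PGSQL-PAF-5434 is promoted. This is an assumption for illustration only, not Pacemaker's actual scheduler logic, and the names promotion_attr and naive_priority are made up here.

```python
# Values copied from the node attributes and the colocation constraint above.
promotion_attr = {"ubusrv1": 1, "ubusrv2": 1, "ubusrv3": 1000}  # master-REDIS-6380
colocation_score = 9999        # REDIS-6380-clone with PGSQL-PAF-5434-clone
pgsql_promoted_on = "ubusrv1"  # node where PGSQL-PAF-5434 is promoted


def naive_priority(node: str) -> int:
    """Hypothetical model: attribute value plus the colocation score
    on the node that hosts the promoted PGSQL-PAF-5434 instance."""
    bonus = colocation_score if node == pgsql_promoted_on else 0
    return promotion_attr[node] + bonus


# Compute the naive priority for every node and pick the highest.
priorities = {node: naive_priority(node) for node in promotion_attr}
expected = max(priorities, key=priorities.get)

print(priorities)
print("expected promoted node:", expected)
```

Under this naive model ubusrv1 would win, which is why the placement on ubusrv3 looks wrong to me. The scores the scheduler really uses may differ; they can be inspected on a live cluster with, for example, crm_simulate -sL.
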
And again, I can move it manually:
-> $ pcs resource move --master REDIS-6380-clone ubusrv1
When moved like that, Redis reports replication status as
expected and seems problem-free.
As soon as I run:
-> $ pcs resource clear REDIS-6380-clone
the cluster moves master REDIS-6380-clone back to _ubusrv3_,
where that _transient_ attribute is "strangely" high.
Do we have a doc/manual somewhere that covers those parts:
transient attributes and their role?
many thanks, L.