[ClusterLabs] make promoted follow promoted resource ?

lejeczek peljasz at yahoo.co.uk
Wed Dec 6 04:30:13 EST 2023



On 26/11/2023 12:20, Reid Wahl wrote:
> On Sun, Nov 26, 2023 at 1:32 AM lejeczek via Users
> <users at clusterlabs.org> wrote:
>> Hi guys.
>>
>> With these:
>>
>> -> $ pcs resource status REDIS-6381-clone
>>    * Clone Set: REDIS-6381-clone [REDIS-6381] (promotable):
>>      * Promoted: [ ubusrv2 ]
>>      * Unpromoted: [ ubusrv1 ubusrv3 ]
>>
>> -> $ pcs resource status PGSQL-PAF-5433-clone
>>    * Clone Set: PGSQL-PAF-5433-clone [PGSQL-PAF-5433] (promotable):
>>      * Promoted: [ ubusrv1 ]
>>      * Unpromoted: [ ubusrv2 ubusrv3 ]
>>
>> -> $ pcs constraint ref REDIS-6381-clone
>> Resource: REDIS-6381-clone
>>    colocation-REDIS-6381-clone-PGSQL-PAF-5433-clone-INFINITY
>>
>> Basically, promoted Redis should follow promoted pgSQL, but that is not happening, although it usually does.
>> I presume pcs/the cluster does something internally that results in that _colocation_ constraint being ignored for these resources.
>> I presume scoring might play a role:
>>    REDIS-6385-clone with PGSQL-PAF-5435-clone (score:1001) (rsc-role:Master) (with-rsc-role:Master)
>> but usually that scoring works; only "now" it does not.
>> I would much appreciate any comments.
>> thanks, L.
>>
>> I looked at the pacemaker log - snippet below, taken after REDIS-6381-clone was re-enabled - but cannot see an explanation for this.
>> ...
>>   notice: Calculated transition 110, saving inputs in /var/lib/pacemaker/pengine/pe-input-3729.bz2
>>   notice: Transition 110 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-3729.bz2): Complete
>>   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
>>   notice: State transition S_IDLE -> S_POLICY_ENGINE
>>   notice: Actions: Start      REDIS-6381:0         (                        ubusrv2 )
>>   notice: Actions: Start      REDIS-6381:1         (                        ubusrv3 )
>>   notice: Actions: Start      REDIS-6381:2         (                        ubusrv1 )
>>   notice: Calculated transition 111, saving inputs in /var/lib/pacemaker/pengine/pe-input-3730.bz2
>>   notice: Initiating start operation REDIS-6381_start_0 locally on ubusrv2
>>   notice: Requesting local execution of start operation for REDIS-6381 on ubusrv2
>> (to redis) root on none
>> pam_unix(su:session): session opened for user redis(uid=127) by (uid=0)
>> pam_sss(su:session): Request to sssd failed. Connection refused
>> pam_unix(su:session): session closed for user redis
>> pam_sss(su:session): Request to sssd failed. Connection refused
>>   notice: Setting master-REDIS-6381[ubusrv2]: (unset) -> 1000
>>   notice: Transition 111 aborted by status-2-master-REDIS-6381 doing create master-REDIS-6381=1000: Transient attribute change
>> INFO: demote: Setting master to 'no-such-master'
>>   notice: Result of start operation for REDIS-6381 on ubusrv2: ok
>>   notice: Transition 111 (Complete=4, Pending=0, Fired=0, Skipped=1, Incomplete=14, Source=/var/lib/pacemaker/pengine/pe-input-3730.bz2): Stopped
>>   notice: Actions: Promote    REDIS-6381:0         ( Unpromoted -> Promoted ubusrv2 )
>>   notice: Actions: Start      REDIS-6381:1         (                        ubusrv1 )
>>   notice: Actions: Start      REDIS-6381:2         (                        ubusrv3 )
>>   notice: Calculated transition 112, saving inputs in /var/lib/pacemaker/pengine/pe-input-3731.bz2
>>   notice: Initiating notify operation REDIS-6381_pre_notify_start_0 locally on ubusrv2
>>   notice: Requesting local execution of notify operation for REDIS-6381 on ubusrv2
>>   notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
>>   notice: Initiating start operation REDIS-6381_start_0 on ubusrv1
>>   notice: Initiating start operation REDIS-6381:2_start_0 on ubusrv3
>>   notice: Initiating notify operation REDIS-6381_post_notify_start_0 locally on ubusrv2
>>   notice: Requesting local execution of notify operation for REDIS-6381 on ubusrv2
>>   notice: Initiating notify operation REDIS-6381_post_notify_start_0 on ubusrv1
>>   notice: Initiating notify operation REDIS-6381:2_post_notify_start_0 on ubusrv3
>>   notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
>>   notice: Initiating notify operation REDIS-6381_pre_notify_promote_0 locally on ubusrv2
>>   notice: Requesting local execution of notify operation for REDIS-6381 on ubusrv2
>>   notice: Initiating notify operation REDIS-6381_pre_notify_promote_0 on ubusrv1
>>   notice: Initiating notify operation REDIS-6381:2_pre_notify_promote_0 on ubusrv3
>>   notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
>>   notice: Initiating promote operation REDIS-6381_promote_0 locally on ubusrv2
>>   notice: Requesting local execution of promote operation for REDIS-6381 on ubusrv2
>>   notice: Result of promote operation for REDIS-6381 on ubusrv2: ok
>>   notice: Initiating notify operation REDIS-6381_post_notify_promote_0 locally on ubusrv2
>>   notice: Requesting local execution of notify operation for REDIS-6381 on ubusrv2
>>   notice: Initiating notify operation REDIS-6381_post_notify_promote_0 on ubusrv1
>>   notice: Initiating notify operation REDIS-6381:2_post_notify_promote_0 on ubusrv3
>>   notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
>>   notice: Setting master-REDIS-6381[ubusrv3]: (unset) -> 1
>>   notice: Transition 112 aborted by status-3-master-REDIS-6381 doing create master-REDIS-6381=1: Transient attribute change
>>   notice: Setting master-REDIS-6381[ubusrv1]: (unset) -> 1
>>   notice: Transition 112 (Complete=25, Pending=0, Fired=0, Skipped=5, Incomplete=5, Source=/var/lib/pacemaker/pengine/pe-input-3731.bz2): Stopped
>>   notice: Calculated transition 113, saving inputs in /var/lib/pacemaker/pengine/pe-input-3732.bz2
>>   notice: Initiating monitor operation REDIS-6381_monitor_20000 locally on ubusrv2
>>   notice: Requesting local execution of monitor operation for REDIS-6381 on ubusrv2
>>   notice: Initiating monitor operation REDIS-6381_monitor_60000 on ubusrv3
>>   notice: Initiating monitor operation REDIS-6381_monitor_45000 on ubusrv3
>>   notice: Initiating monitor operation REDIS-6381_monitor_60000 on ubusrv1
>>   notice: Initiating monitor operation REDIS-6381_monitor_45000 on ubusrv1
>>   notice: Result of monitor operation for REDIS-6381 on ubusrv2: promoted
>>   notice: Transition 113 (Complete=5, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-3732.bz2): Complete
>>   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
>>
>
How much do transient attributes matter here?
As in my earlier example:
Node Attributes:
   * Node: ubusrv1 (1):
     * master-PGSQL-PAF-5433               : 1000
     * master-PGSQL-PAF-5434               : 1001
     * master-PGSQL-PAF-5435               : -1000
     * master-REDIS-6380                   : 1
     * master-REDIS-6381                   : 1
     * master-REDIS-6382                   : 1
     * master-REDIS-6385                   : 2
   * Node: ubusrv2 (2):
     * master-PGSQL-PAF-5433               : 990
     * master-PGSQL-PAF-5434               : -1000
     * master-PGSQL-PAF-5435               : 1001
     * master-REDIS-6380                   : 1
     * master-REDIS-6381                   : 1
     * master-REDIS-6382                   : 1
     * master-REDIS-6385                   : 2
   * Node: ubusrv3 (3):
     * master-PGSQL-PAF-5433               : 1001
     * master-PGSQL-PAF-5434               : -1000
     * master-PGSQL-PAF-5435               : -1000
     * master-REDIS-6380                   : 1000
     * master-REDIS-6381                   : 1
     * master-REDIS-6382                   : 1
     * master-REDIS-6385                   : 2

-> $ pcs constraint colocation --full | grep REDIS-6380
   REDIS-6380-clone with PGSQL-PAF-5434-clone (score:9999) (rsc-role:Master) (with-rsc-role:Master) (id:colocation-REDIS-6380-clone-PGSQL-PAF-5434-clone-INFINITY-1)

Right now I again have a situation where _master_ REDIS-6380 
should be on ubusrv1, where master PGSQL-PAF-5434 is, if the 
constraint above were honored.
And again, I can move it manually:
-> $ pcs resource move --master REDIS-6380-clone ubusrv1
When moved like that, Redis reports replication status as 
expected and seems problem-free.
As soon as I run:
-> $ pcs resource clear REDIS-6380-clone
the cluster moves master REDIS-6380-clone back to _ubusrv3_, 
where that _transient_ attribute is "strangely" high.
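
To make the mismatch easier to see, here is a minimal sketch that reads a 
"Node Attributes" listing like the one above and reports, per resource, the 
node holding the highest master-* value. The function names and the pasted 
excerpt (only the REDIS-6380 and PGSQL-PAF-5434 lines) are just for 
illustration, and it only ranks the transient attributes; it does not model 
how the cluster combines them with colocation scores.

```python
import re
from collections import defaultdict

# Excerpt of the attribute listing above (two resources only).
ATTRS = """\
Node Attributes:
   * Node: ubusrv1 (1):
     * master-PGSQL-PAF-5434               : 1001
     * master-REDIS-6380                   : 1
   * Node: ubusrv2 (2):
     * master-PGSQL-PAF-5434               : -1000
     * master-REDIS-6380                   : 1
   * Node: ubusrv3 (3):
     * master-PGSQL-PAF-5434               : -1000
     * master-REDIS-6380                   : 1000
"""


def promotion_scores(text):
    """Return {resource: {node: score}} parsed from the listing."""
    scores = defaultdict(dict)
    node = None
    for line in text.splitlines():
        m = re.match(r"\s*\* Node: (\S+)", line)
        if m:
            node = m.group(1)
            continue
        m = re.match(r"\s*\* master-(\S+)\s*:\s*(-?\d+)", line)
        if m and node:
            scores[m.group(1)][node] = int(m.group(2))
    return scores


def preferred_nodes(scores):
    """Return {resource: node with the highest master-* attribute}."""
    return {rsc: max(per_node, key=per_node.get)
            for rsc, per_node in scores.items()}


if __name__ == "__main__":
    for rsc, node in sorted(preferred_nodes(promotion_scores(ATTRS)).items()):
        print(f"{rsc}: highest master attribute on {node}")
```

Going by the attributes alone, REDIS-6380 comes out on ubusrv3 and 
PGSQL-PAF-5434 on ubusrv1, which is the split I am seeing.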

Do we have a doc/manual somewhere that covers those parts - 
transient attributes & their role?

many thanks, L.





