[ClusterLabs] Inconsistent pacemaker behavior
Abhay B
abhayyb at gmail.com
Wed May 9 00:24:48 EDT 2018
I was trying to simulate the "Corosync main process was not scheduled for xx
ms" issue.
I have a 2-node cluster with a clone resource (Master-Slave).
*TEST #1*
Initial state of the cluster resource:
NodeA - Master
NodeB - Slave
Current DC - NodeA
Now I freeze the corosync main process on the Master using the command:
# kill -STOP $(pidof corosync)
State of cluster resource:
NodeA - Master
NodeB - Master
Current DC - NodeA NodeB
Now I resume the corosync process on the Master:
# kill -CONT $(pidof corosync)
State of cluster resource:
NodeA - Master
NodeB - Slave
Current DC - NodeA
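For anyone who wants to repeat the freeze/resume steps above, they can be scripted roughly as follows. This is only a sketch: the helper name freeze_for and the 10-second duration in the example are assumptions for illustration, not part of corosync or pacemaker, and the freeze duration used in the test above is not stated.

```shell
#!/bin/sh
# Sketch: pause a process with SIGSTOP for a fixed time, then resume it
# with SIGCONT. freeze_for is a hypothetical helper name.
freeze_for() {
    pid=$1      # process ID to freeze
    secs=$2     # how long to keep it stopped, in seconds
    kill -STOP "$pid" || return 1
    sleep "$secs"
    kill -CONT "$pid"
}

# Example (as root, on the node whose corosync should be frozen):
#   freeze_for "$(pidof corosync)" 10
```

While the process is stopped, and again after it resumes, running `crm_mon -1` on each node prints a one-shot status that shows the resource roles and the current DC as that node sees them.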
*TEST #2*
I ran the same test with the following initial state of the cluster resource:
NodeA - Slave
NodeB - Master
Current DC - NodeA
The final state in this case:
NodeA - Master
NodeB - Slave
Current DC - NodeA
In the 1st test, NodeA started as Master and ended as Master.
In the 2nd test, NodeB started as Master and ended as Slave.
NodeA is the current DC in both tests.
The behavior does not look well defined in these tests: the original Master
keeps its role in the first test but loses it in the second.
This is always reproducible.
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: APPHA
cluster-recheck-interval: 2s
dc-deadtime: 5
dc-version: 1.1.16-12.el7-94ff4df
have-watchdog: false
load-threshold: 1000%
no-quorum-policy: ignore
start-failure-is-fatal: false
stonith-enabled: false
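To reproduce this configuration on a test cluster, the user-settable properties above could be applied with pcs roughly as shown below. This is a sketch that assumes the pcs 0.9 command syntax. cluster-infrastructure, cluster-name, dc-version and have-watchdog are left out because the cluster normally populates them itself.

```shell
# Sketch: apply the user-settable properties listed above (run on one node).
pcs property set cluster-recheck-interval=2s
pcs property set dc-deadtime=5
pcs property set load-threshold=1000%
pcs property set no-quorum-policy=ignore
pcs property set start-failure-is-fatal=false
pcs property set stonith-enabled=false
```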
Resource properties:
Operations: demote interval=0s timeout=30 (APPHA-demote-interval-0s)
            monitor interval=1 role=Master timeout=40 (APPHA-monitor-interval-1)
            monitor interval=2 role=Slave timeout=40 (APPHA-monitor-interval-2)
            promote interval=0s timeout=30 (APPHA-promote-interval-0s)
            start interval=0s timeout=180 (APPHA-start-interval-0s)
            stop interval=0s timeout=180 (APPHA-stop-interval-0s)
Versions of the rpms installed:
pacemaker-cli-1.1.16-12.el7.x86_64
resource-agents-3.9.5-105.el7.x86_64
pacemaker-libs-1.1.16-12.el7.x86_64
pacemaker-cluster-libs-1.1.16-12.el7.x86_64
pcs-0.9.158-6.el7.centos.x86_64
corosync-2.4.0-9.el7.x86_64
pacemaker-1.1.16-12.el7.x86_64
corosynclib-2.4.0-9.el7.x86_64
Linux distro:
CentOS Linux release 7.4.1708 (Core)
Regards,
Abhay