[ClusterLabs] Inconsistent pacemaker behavior

Abhay B abhayyb at gmail.com
Wed May 9 00:24:48 EDT 2018


I was trying to simulate the "Corosync main process was not scheduled for xx
ms" issue.
I have a 2-node cluster with a clone resource (Master-Slave).

*TEST #1*
  Initial state of the cluster resource:
    NodeA - Master
    NodeB - Slave
    Current DC - NodeA

  Now I freeze the corosync main process on the Master using the command:
  # kill -STOP $(pidof corosync)

  State of cluster resource:
    NodeA - Master
    NodeB - Master
    Current DC - NodeA NodeB

  Now I resume the corosync process on the Master:
  # kill -CONT $(pidof corosync)

  State of cluster resource:
    NodeA - Master
    NodeB - Slave
    Current DC - NodeA
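
  The freeze/resume steps above can be wrapped in a small helper so that the
  length of the pause is repeatable between runs. This is only a sketch: the
  function name freeze_for and the 10-second default are illustrative
  assumptions, not something used in the original test, and it has to be run
  as root on the node being tested.

```shell
#!/bin/sh
# freeze_for PID [SECONDS]
# Pause a process with SIGSTOP, wait, then resume it with SIGCONT.
# Hypothetical helper: the name and the default duration are assumptions.
freeze_for() {
    pid=$1
    secs=${2:-10}
    kill -STOP "$pid" || return 1   # freeze the process
    sleep "$secs"                   # keep it frozen for the chosen time
    kill -CONT "$pid"               # let it run again
}

# Usage on the node under test (as root):
#   freeze_for "$(pidof corosync)" 10
```

  Using a fixed duration makes it easier to compare runs, since how long
  corosync stays frozen may affect what the other node does in the meantime.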

*TEST #2*
  I ran the same test with the following initial state of the cluster resource:
    NodeA - Slave
    NodeB - Master
    Current DC - NodeA

  The final state in this case:
    NodeA - Master
    NodeB - Slave
    Current DC - NodeA

In the 1st test NodeA started as Master and ended as Master.
In the 2nd test NodeB started as Master and ended as Slave.
NodeA is the current DC in both tests.
So NodeA ends up as Master in both tests, regardless of which node was
Master initially; the behavior does not seem to be well defined.
This is always reproducible.
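
To capture the state at each step of these tests, the relevant lines of the
cluster status can be filtered out and saved. A minimal sketch, assuming the
status text contains lines with "Current DC", "Masters:" and "Slaves:" (which
is how I understand the one-shot crm_mon output to look on this Pacemaker
version); the function name role_summary is made up for illustration.

```shell
#!/bin/sh
# role_summary: read cluster status text on stdin and keep only the lines
# naming the DC and the Master/Slave nodes.
# Hypothetical helper; the matched strings are an assumption about the
# status output format.
role_summary() {
    grep -E 'Current DC|Masters:|Slaves:'
}

# Usage (run before the freeze, during it, and after the resume):
#   crm_mon -1 | role_summary
```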

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: APPHA
 cluster-recheck-interval: 2s
 dc-deadtime: 5
 dc-version: 1.1.16-12.el7-94ff4df
 have-watchdog: false
 load-threshold: 1000%
 no-quorum-policy: ignore
 start-failure-is-fatal: false
 stonith-enabled: false

Resource properties:
  Operations: demote interval=0s timeout=30 (APPHA-demote-interval-0s)
              monitor interval=1 role=Master timeout=40 (APPHA-monitor-interval-1)
              monitor interval=2 role=Slave timeout=40 (APPHA-monitor-interval-2)
              promote interval=0s timeout=30 (APPHA-promote-interval-0s)
              start interval=0s timeout=180 (APPHA-start-interval-0s)
              stop interval=0s timeout=180 (APPHA-stop-interval-0s)


Versions of the rpms installed:
pacemaker-cli-1.1.16-12.el7.x86_64
resource-agents-3.9.5-105.el7.x86_64
pacemaker-libs-1.1.16-12.el7.x86_64
pacemaker-cluster-libs-1.1.16-12.el7.x86_64
pcs-0.9.158-6.el7.centos.x86_64
corosync-2.4.0-9.el7.x86_64
pacemaker-1.1.16-12.el7.x86_64
corosynclib-2.4.0-9.el7.x86_64

Linux distro:
CentOS Linux release 7.4.1708 (Core)

Regards,
Abhay