[Pacemaker] trouble with quorum

Groshev Andrey greenx at yandex.ru
Wed May 22 08:25:54 EDT 2013


Hello,

I am trying to build a cluster with 2 nodes plus one quorum-only node (running corosync without pacemaker).
The sequence of actions is as follows:

1. Set up and start corosync on all THREE nodes; this works fine.
# corosync-quorumtool -l|sed 's/\..*$//'
Nodeid    Votes  Name
295521290    1  dev-cluster2-node2
312298506    1  dev-cluster2-node3
329075722    1  dev-cluster2-node4

2. Start pacemaker on the FIRST node.
3. Write the config with crmsh  .... stonith-enabled="false"
4. .... no-quorum-policy="ignore"
5. Write the main config for ocf:heartbeat:pgsql
    Like: https://github.com/t-matsuo/resource-agents/wiki/Resource-Agent-for-PostgreSQL-9.1-streaming-replication
    But with one VIP on the master PG.
    The resources start on the first node.

6. Next, sync the PG data to the SECOND node.
7. Start pacemaker on the SECOND node. The resources start there too.
8. Set no-quorum-policy="stop".

OK. All resources are running on the two nodes.
See:
# corosync-quorumtool -l|sed 's/\..*$//'
Nodeid    Votes  Name
295521290    1  dev-cluster2-node2
312298506    1  dev-cluster2-node3
329075722    1  dev-cluster2-node4

# corosync-quorumtool -s
Version:          1.4.5
Nodes:            3
Ring ID:          12440
Quorum type:      corosync_votequorum
Quorate:          Yes
Node votes:      1
Expected votes:  3
Highest expected: 3
Total votes:      3
Quorum:          2
Flags:            Quorate
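
For reference, the "Quorum: 2" line above is consistent with a simple-majority rule, floor(expected_votes / 2) + 1. A minimal sketch of that arithmetic in Python, assuming this is how the threshold is derived (the function names are only for illustration and are not part of corosync):

```python
def quorum_threshold(expected_votes: int) -> int:
    """Simple-majority threshold: strictly more than half of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """A partition is quorate when its votes reach the threshold."""
    return total_votes >= quorum_threshold(expected_votes)


# Values taken from the output above: expected_votes = 3, total votes = 3.
print(quorum_threshold(3))
print(is_quorate(3, 3))
```

With expected_votes = 3 the threshold is 2, so a partition holding 2 of the 3 votes would still be counted as quorate under this rule.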

See crm_mon:
# crm_mon -1|grep quor
Current DC: dev-cluster2-node3.unix.tensor.ru - partition with quorum

Now stop pacemaker on one node:
# service pacemaker stop

# corosync-quorumtool -s
Version:          1.4.5
Nodes:            3
Ring ID:          12440
Quorum type:      corosync_votequorum
Quorate:          Yes
Node votes:      1
Expected votes:  3
Highest expected: 3
Total votes:      3
Quorum:          2
Flags:            Quorate

Now stop corosync on that node as well.
crm_mon says quorum has been lost, but the resources are not stopped.
# crm_mon -1|grep quor
Current DC: dev-cluster2-node4.unix.tensor.ru - partition WITHOUT quorum

But corosync says that everything is fine ....
# corosync-quorumtool -l|sed 's/\..*$//'
Nodeid    Votes  Name
295521290    1  dev-cluster2-node2
329075722    1  dev-cluster2-node4

# corosync-quorumtool -s
Version:          1.4.5
Nodes:            2
Ring ID:          12440
Quorum type:      corosync_votequorum
Quorate:          Yes
Node votes:      1
Expected votes:  3
Highest expected: 3
Total votes:      2
Quorum:          2
Flags:            Quorate
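
To compare these numbers mechanically, the `corosync-quorumtool -s` output can be treated as "key: value" lines. A rough sketch in Python, with the output above pasted in as a string (parse_quorum_status is a hypothetical helper written for this example, not a corosync or pacemaker API):

```python
# Output of "corosync-quorumtool -s" after corosync was stopped on one node.
SAMPLE = """\
Version:          1.4.5
Nodes:            2
Ring ID:          12440
Quorum type:      corosync_votequorum
Quorate:          Yes
Node votes:      1
Expected votes:  3
Highest expected: 3
Total votes:      2
Quorum:          2
Flags:            Quorate
"""


def parse_quorum_status(text):
    """Turn 'Key: value' lines into a dict of stripped strings."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            status[key.strip()] = value.strip()
    return status


status = parse_quorum_status(SAMPLE)
total_votes = int(status["Total votes"])
needed = int(status["Quorum"])
# True when the remaining votes still reach the quorum threshold.
print(total_votes >= needed)
```

With 2 total votes against a quorum of 2, corosync's own numbers are self-consistent, so the disagreement is between what corosync reports and what crm_mon reports.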

Corosync config:
totem {
        version: 2
        secauth: off
        clear_node_high_bit: yes
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: 10.76.157.18
                mcastaddr: 239.94.1.56
                mcastport: 5405
                ttl: 1
        }
}
logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: no
        logfile: /var/log/cluster/corosync.log
        debug: on
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: on
        }
}

amf {
        mode: disabled
}
service {
        name: pacemaker
        ver: 1
}
quorum {
        provider: corosync_votequorum
        expected_votes: 3
        votes:  1
}


Why do corosync and pacemaker disagree about quorum here, and why are the resources not stopped?

My environment:
CentOS 6.3
corosync 1.4.5 from opensuse-ha
pacemaker 1.1.9 from http://clusterlabs.org/rpm-next/rhel-6/



