[ClusterLabs] reducing peer node death detection time

Nekrasov, Alexander alexander.nekrasov at emc.com
Wed Jun 24 18:17:50 EDT 2015


Hello,

I am trying to reduce the time between a node panic and the STONITH call on the peer node in a two-node cluster. The documentation points to the token value in the totem section of corosync.conf. My current configuration is:

totem {
        version: 2
        secauth: off
        threads: 0
        token: 1000
        token_retransmits_before_loss_const: 1
        join: 1000
        consensus: 10000
        interface {
                ringnumber: 0
                bindnetaddr: 128.221.255.100
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}

With token set to 1000 ms (1 s), about 5 seconds elapse between the actual node death and the STONITH call on the surviving node. Reducing token further, down to 100 ms, does not seem to have any effect. Is there a way to reduce this delay further?
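
For illustration, here is a sketch of the reduced-token variant of the totem section. It assumes that only token was changed and that every other value stayed exactly as in the configuration shown earlier; the file actually used for the 100 ms test is not reproduced in this message.

totem {
        version: 2
        secauth: off
        threads: 0
        # assumption: token is the only value changed for the 100 ms test
        token: 100
        token_retransmits_before_loss_const: 1
        # assumption: join and consensus left at their earlier values
        join: 1000
        consensus: 10000
        interface {
                ringnumber: 0
                bindnetaddr: 128.221.255.100
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}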

Thanks,
Alexander

Installed package versions:

corosync-debuginfo-1.4.7-0.19.6.8087.0.PTF.916981
libcorosync4-1.4.7-0.19.6.8087.0.PTF.916981
corosync-1.4.7-0.19.6.8087.0.PTF.916981

pacemaker-debuginfo-1.1.11-0.7.53.7419.2.PTF.883076
libpacemaker3-1.1.11-0.7.53.7419.2.PTF.883076
pacemaker-1.1.11-0.7.53.7419.2.PTF.883076
