[ClusterLabs] node went to standby after a single resource failure

Oscar Salvador osalvador.vilardaga at gmail.com
Mon Jun 8 08:05:19 EDT 2015


Hi guys!

I've configured two nodes with the pacemaker + corosync stack, with only
one resource (just for test purposes), and I'm seeing a strange result.

First a little bit of information:

pacemaker version: 1.1.12-1
corosync version: 2.3.4-1


# crm configure show
node 1053402612: server1 \
node 1053402613: server2
primitive IP-rsc_apache IPaddr2 \
params ip=xx.xx.xx.xy nic=eth0 cidr_netmask=255.255.255.192 \
meta migration-threshold=2 \
op monitor interval=20 timeout=60 on-fail=standby
property cib-bootstrap-options: \
last-lrm-refresh=1433763004 \
stonith-enabled=false \
no-quorum-policy=ignore


# cat corosync.conf
totem {
version: 2
token: 3000
token_retransmits_before_loss_const: 10
clear_node_high_bit: yes
crypto_cipher: none
crypto_hash: none
transport: udpu
interface {
ringnumber: 0
bindnetaddr: xx.xx.xx.xx
ttl: 1
}
}

logging {
to_logfile: yes
logfile: /var/log/corosync/corosync.log
debug: on
timestamp: on
logger_subsys {
subsys: QUORUM
debug: on
}
}

quorum {
provider: corosync_votequorum
two_node: 1
wait_for_all: 1
}

nodelist {
        node {
        ring0_addr: server1
        }
        node {
        ring0_addr: server2
        }
}


It's a simple setup: there is only one configured resource.
This resource can fail twice before it gets migrated to the other node,
and if the monitor operation for the resource times out (more than 60s),
the node will be put into standby and the resource will automatically be
migrated to the other node.
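
For reference, this is roughly how I double-check the threshold and the
current fail-count from the shell (option names may differ slightly between
pacemaker/crmsh versions, so take it as a sketch rather than exact syntax):

# crm_resource --resource IP-rsc_apache --meta --get-parameter migration-threshold
# crm_attribute --type status --node server1 --name fail-count-IP-rsc_apache --query

The first command reads back the configured migration-threshold (2 here),
the second one the fail-count attribute that pacemaker keeps for the
resource/node pair in the status section.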

At least, that is the behaviour I would expect, but for some reason that is
not what happens.

Let's say the resource IP-rsc_apache starts on server1 and we keep it
running there for a few minutes.
Then we simulate that something goes wrong with:
# ip addr del xx.xx.xx.xx/26 dev eth0
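
Just to be sure the failure is really what the resource agent sees, I
sometimes run the monitor action of IPaddr2 by hand (the paths below are
from my install and the parameter values are the ones of the primitive
above, so adjust as needed):

# export OCF_ROOT=/usr/lib/ocf
# OCF_RESKEY_ip=xx.xx.xx.xy OCF_RESKEY_nic=eth0 \
  OCF_RESKEY_cidr_netmask=255.255.255.192 \
  /usr/lib/ocf/resource.d/heartbeat/IPaddr2 monitor; echo rc=$?

After deleting the address this prints rc=7, i.e. the same 'not running' (7)
that shows up in the pengine logs further down.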

As far as I know, pacemaker should notice that this resource is not running
anymore and should increase the fail-count for that resource/node pair. The
fail-count should now be "1", and since we set the threshold to "2", the
resource should be restarted on server1.
But this doesn't happen; the resource jumps directly to server2:

Jun 08 13:47:54 [5856] server1       crmd:    debug: do_state_transition:
    All 2 cluster nodes are eligible to run resources.
Jun 08 13:47:54 [5855] server1    pengine:    debug: determine_op_status:
    IP-rsc_apache_monitor_20000 on server1 returned 'not running' (7)
instead of the expected value: 'ok' (0)
Jun 08 13:47:54 [5855] server1    pengine:  warning: unpack_rsc_op_failure:
    Processing failed op monitor for IP-rsc_apache on server1: not running
(7)
Jun 08 13:47:54 [5855] server1    pengine:     info: native_print:
 IP-rsc_apache   (ocf::heartbeat:IPaddr2):       Started server2
Jun 08 13:47:54 [5855] server1    pengine:     info: get_failcount_full:
     IP-rsc_apache has failed 1 times on server1
Jun 08 13:47:54 [5855] server1    pengine:     info:
common_apply_stickiness:   IP-rsc_apache can fail 1 more times on server1
before being forced off
Jun 08 13:47:54 [5855] server1    pengine:    debug: native_assign_node:
     Assigning server2 to IP-rsc_apache
Jun 08 13:47:54 [5855] server1    pengine:     info: LogActions:
 Leave   IP-rsc_apache   (Started server2)


As you can see here, IP-rsc_apache can still fail one more time on server1,
yet when the pengine computes the transition it assigns the resource to the
other server.
I don't really understand this, because I tested the same scenario with an
older stack (pacemaker 1.1.7 + heartbeat) and there it worked as I expected,
but not here.
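
In case it helps with debugging, the allocation scores the pengine works
with can be dumped on the DC with something like this (crm_simulate ships
with pacemaker, so it should be available here):

# crm_simulate --live-check --show-scores

That shows the current scores of IP-rsc_apache on server1 and server2, which
might explain why it prefers server2.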

crm_mon shows me:

# crm_mon -f

Migration summary:
* Node server1:
   IP-rsc_apache: migration-threshold=2 fail-count=1 last-failure='Mon Jun
 8 13:32:54 2015'
* Node server2:

Failed actions:
    IP-rsc_apache_monitor_20000 on server1 'not running' (7): call=24,
status=complete, last-rc-change='Mon Jun  8 13:32:54 2015', queued=0ms,
exec=0ms


# crm_mon -o
Last updated: Mon Jun  8 13:58:54 2015
Last change: Mon Jun  8 13:31:13 2015
Current DC: server1 (1053402612) - partition with quorum
2 Nodes configured
1 Resources configured


Node server1 (1053402612): standby (on-fail)
Online: [ server2 ]

IP-rsc_apache   (ocf::heartbeat:IPaddr2):       Started server2

Operations:
* Node server1:
   IP-rsc_apache: migration-threshold=2 fail-count=1 last-failure='Mon Jun
 8 13:32:54 2015'
    + (24) monitor: interval=20000ms rc=0 (ok)
    + (24) monitor: interval=20000ms rc=7 (not running)
    + (26) stop: rc=0 (ok)
* Node server2:
   IP-rsc_apache: migration-threshold=2
    + (19) start: rc=0 (ok)
    + (20) monitor: interval=20000ms rc=0 (ok)

Failed actions:
    IP-rsc_apache_monitor_20000 on server1 'not running' (7): call=24,
status=complete, last-rc-change='Mon Jun  8 13:32:54 2015', queued=0ms,
exec=0ms



It seems like pacemaker takes the failed monitor operation as a reason to
put the node into standby straight away, even though the fail-count is still
below the migration-threshold. But that should not happen, should it?
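
By the way, to reset everything between tests I run something like this
(crmsh syntax; I assume this is the right way to bring the node back online
and clear the failure):

# crm node online server1
# crm resource cleanup IP-rsc_apache

so the next test starts again from a clean state.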

Does somebody have a hint about this?

Thank you very much
Oscar Salvador