[ClusterLabs] node went to stand-by after one single resource-failure
osalvador.vilardaga at gmail.com
Mon Jun 8 08:58:01 EDT 2015
2015-06-08 14:23 GMT+02:00 Andrei Borzenkov <arvidjaar at gmail.com>:
> On Mon, Jun 8, 2015 at 3:05 PM, Oscar Salvador
> <osalvador.vilardaga at gmail.com> wrote:
> > Hi guys!
> > I've configured two nodes with the pacemaker + corosync stack, with only
> > one resource (just for test purposes), and I'm getting a strange result.
> > First a little bit of information:
> > pacemaker version: 1.1.12-1
> > corosync version: 2.3.4-1
> > # crm configure show
> > node 1053402612: server1 \
> > node 1053402613: server2
> > primitive IP-rsc_apache IPaddr2 \
> > params ip=xx.xx.xx.xy nic=eth0 cidr_netmask=255.255.255.192 \
> > meta migration-threshold=2 \
> > op monitor interval=20 timeout=60 on-fail=standby
> > property cib-bootstrap-options: \
> > last-lrm-refresh=1433763004 \
> > stonith-enabled=false \
> > no-quorum-policy=ignore
> > It seems like pacemaker is assuming that the monitor operation failed
> > and, because of this, decides to put the node into standby. But it
> > shouldn't do that after a single failure, should it?
> You told it to do exactly that (on-fail=standby).
> Users mailing list: Users at clusterlabs.org
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
Yes, I told it that: if the monitor operation fails, put the node in standby.
But from my point of view, the monitor operation didn't fail repeatedly; the
node was put into standby after one single failure.
I was very puzzled by this because, as I said, I tested this with an old
version of pacemaker, and it didn't have this behaviour.
Maybe I was confused because of that.
So, it is somewhat redundant to do something like this:
op monitor interval=20 timeout=60 on-fail=standby
since it will never reach the failcount of 2, no?
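As far as I understand it, on-fail=standby acts on the very first monitor
failure, so migration-threshold=2 never gets a chance to count anything. If
the goal is "move the resource away only after two failures", a sketch like
the following (hypothetical values, untested, using the default
on-fail=restart instead of on-fail=standby) should let the threshold do its
job:

```
# Sketch: keep migration-threshold meaningful by letting a failed
# monitor trigger a restart (the default) rather than node standby.
# After 2 failures the resource is banned from the node; failure-timeout
# (optional) clears the failcount after the given number of seconds.
primitive IP-rsc_apache IPaddr2 \
    params ip=xx.xx.xx.xy nic=eth0 cidr_netmask=255.255.255.192 \
    meta migration-threshold=2 failure-timeout=60 \
    op monitor interval=20 timeout=60 on-fail=restart
```

With this, the first monitor failure restarts the resource in place, and
only the second one (within failure-timeout) moves it to the other node.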