[Pacemaker] resource is too active problem in a 2-node cluster
Ajay Aggarwal
aaggarwal at verizon.com
Tue Feb 11 14:39:10 UTC 2014
Yes, we have cman (version: cman-3.0.12.1-49). We use manual fencing (I
know it is not recommended). Fencing is handled by an external
monitoring and fencing service of our own.
Perhaps the subject line "resource is too active problem in a 2-node
cluster" was misleading. The real problem is that the resource is *NOT*
too active, but pacemaker thinks it is, which leads to an undesirable
recovery procedure. See the log lines below:
Feb 04 11:27:38 [45167] gol-5-7-0 pengine: warning: unpack_rsc_op: Processing failed op monitor for GOL-HA on gol-5-7-0: unknown error (1)
Feb 04 11:27:38 [45167] gol-5-7-0 pengine: warning: unpack_rsc_op: Processing failed op monitor for GOL-HA on gol-5-7-6: unknown error (1)
Feb 04 11:27:38 [45167] gol-5-7-0 pengine: error: native_create_actions: Resource GOL-HA (ocf::script.sh) is active on 2 nodes attempting recovery
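To make the pattern in these lines easier to spot in a longer log, here is a minimal Python sketch that groups the pengine messages by resource. The regular expressions are written against the three lines quoted above only and are an assumption about the message format, not an official Pacemaker log grammar; summarize() is a hypothetical helper name. Note that both failed monitors report rc 1. One possible reading (not confirmed in this thread) is that the monitor action returns a generic error rather than "not running" (7) on a node where the resource is stopped, which would make pacemaker count the resource as present on that node.

```python
import re
from collections import defaultdict

# The three pengine lines quoted in this message.
LOG = """\
Feb 04 11:27:38 [45167] gol-5-7-0 pengine: warning: unpack_rsc_op: Processing failed op monitor for GOL-HA on gol-5-7-0: unknown error (1)
Feb 04 11:27:38 [45167] gol-5-7-0 pengine: warning: unpack_rsc_op: Processing failed op monitor for GOL-HA on gol-5-7-6: unknown error (1)
Feb 04 11:27:38 [45167] gol-5-7-0 pengine: error: native_create_actions: Resource GOL-HA (ocf::script.sh) is active on 2 nodes attempting recovery
"""

# Patterns derived from the lines above (assumed format).
FAILED_OP = re.compile(
    r"Processing failed op (\S+) for (\S+) on ([^\s:]+): (.+) \((\d+)\)"
)
TOO_ACTIVE = re.compile(r"Resource (\S+) \((\S+)\) is active on (\d+) nodes")


def summarize(log_text):
    """Return (failed ops per resource, reported active-node count per resource)."""
    failed = defaultdict(set)   # resource -> {(node, op, rc), ...}
    too_active = {}             # resource -> number of nodes reported active
    for line in log_text.splitlines():
        m = FAILED_OP.search(line)
        if m:
            op, rsc, node, _reason, rc = m.groups()
            failed[rsc].add((node, op, int(rc)))
            continue
        m = TOO_ACTIVE.search(line)
        if m:
            too_active[m.group(1)] = int(m.group(3))
    return failed, too_active


failed, too_active = summarize(LOG)
for rsc, count in too_active.items():
    nodes = sorted(node for node, _op, _rc in failed[rsc])
    print(f"{rsc}: reported active on {count} nodes; failed monitors on {nodes}")
```

For the lines above, the nodes with a failed monitor are exactly the nodes counted in the "active on 2 nodes" error.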
On 02/10/2014 09:43 PM, Digimer wrote:
> On 10/02/14 09:13 PM, Aggarwal, Ajay wrote:
>> I have a 2-node cluster with no-quorum-policy=ignore. I call these
>> nodes node-0 and node-1. In addition, I have two cluster resources
>> in a group: an IP address and an OCF script.
>
> Turning off quorum on a 2-node cluster is fine, in fact, it's
> required. However, that makes stonith all the more important. Without
> stonith, in any cluster but in particular on two node clusters, things
> will not work right.
>
> First and foremost: configure stonith and test it to make sure it works.
>
>> Pacemaker version: 1.1.10
>> Corosync version: 1-4.1-15
>> OS: CentOS 6.4
>
> With CentOS/RHEL 6, you need cman as well. Please be sure to also
> configure fence_pcmk in cluster.conf to "hook" it into pacemaker's
> real fencing.
>
>> What am I doing wrong?
> <snip>
>> <nvpair id="cib-bootstrap-options-stonith-enabled"
>> name="stonith-enabled" value="false"/>
>
> That. :)
>
> Once you have stonith working, see if the problem remains.
>
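As a quick way to confirm the two settings discussed in the quoted reply, the following Python sketch reads cluster properties out of a CIB fragment. The stonith-enabled nvpair is the one quoted above; the enclosing cluster_property_set element and the no-quorum-policy nvpair id are assumptions added to make the fragment parseable, and cluster_properties() and stonith_enabled() are hypothetical helper names. The set of accepted "true" spellings is also an assumption.

```python
import xml.etree.ElementTree as ET

# Fragment for illustration. Only the stonith-enabled nvpair is quoted in the
# thread; the wrapper element and the no-quorum-policy id are assumptions.
CIB_FRAGMENT = """
<cluster_property_set id="cib-bootstrap-options">
  <nvpair id="cib-bootstrap-options-no-quorum-policy"
          name="no-quorum-policy" value="ignore"/>
  <nvpair id="cib-bootstrap-options-stonith-enabled"
          name="stonith-enabled" value="false"/>
</cluster_property_set>
"""


def cluster_properties(xml_text):
    """Map each nvpair name to its value."""
    root = ET.fromstring(xml_text.strip())
    return {nv.get("name"): nv.get("value") for nv in root.iter("nvpair")}


def stonith_enabled(props):
    """Treat a missing property as enabled (assumed default)."""
    value = props.get("stonith-enabled", "true")
    return value.strip().lower() in {"true", "yes", "on", "1"}


props = cluster_properties(CIB_FRAGMENT)
print("no-quorum-policy:", props.get("no-quorum-policy"))
print("stonith enabled: ", stonith_enabled(props))
```

With the fragment above this reports stonith as disabled, which is the setting the reply points at.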