[Pacemaker] starting resources with failed stonith resource

Andrew Beekhof andrew at beekhof.net
Wed Jan 8 17:10:20 EST 2014


On 8 Jan 2014, at 2:41 am, Frank Van Damme <frank.vandamme at gmail.com> wrote:

> Hi list,
> 
> I recently had some trouble with a dual-node mysql cluster, which runs
> in master-slave mode with Percona resource manager. While analyzing
> what happened to the cluster, I found this in syslog (network trouble,
> the cluster lost disk/iscsi access on both nodes, this is a piece from
> the former master trying to start up again when recovering
> connectivity):
> 
> Jan  6 07:26:49 infante pengine: [3839]: notice: get_failcount:
> Failcount for MasterSlave_mysql on infante has expired (limit was 60s)
> Jan  6 07:26:49 infante pengine: [3839]: notice: get_failcount:
> Failcount for MasterSlave_mysql on infante has expired (limit was 60s)
> Jan  6 07:26:49 infante pengine: [3839]: WARN:
> common_apply_stickiness: Forcing p-stonith-ingstad away from infante
> after 1000000 failures (max=1000000)
> Jan  6 07:26:49 infante pengine: [3839]: notice: LogActions: Start
> prim_mysql:0#011(infante)
> Jan  6 07:26:49 infante pengine: [3839]: notice: LogActions: Start
> prim_mysql:1#011(ingstad)
> 
> I don't understand it: if this means that the stonith devices have
> failed a million times,

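Not necessarily. The failcount is also set to 1000000 (INFINITY) as soon as a start action fails, so this does not have to mean a million separate failures.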
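
To illustrate the point about the failcount, here is a minimal sketch in Python. It is a simplified model and not Pacemaker's actual code: the function names record_failure and forced_away are hypothetical, and the default threshold of 1000000 is taken from the "max=1000000" in the log above. The start_failure_is_fatal parameter mirrors the cluster option start-failure-is-fatal, assumed here to be at its default of true.

```python
# Simplified model for illustration only; not Pacemaker source code.
INFINITY = 1_000_000  # the value Pacemaker uses for "infinity"


def record_failure(failcount: int, action: str,
                   start_failure_is_fatal: bool = True) -> int:
    """Return the new failcount after one failed action (hypothetical helper)."""
    if action == "start" and start_failure_is_fatal:
        # A single failed start jumps straight to INFINITY.
        return INFINITY
    # Other failures increment the count, capped at INFINITY.
    return min(failcount + 1, INFINITY)


def forced_away(failcount: int, migration_threshold: int = INFINITY) -> bool:
    """True once the failcount reaches the threshold (hypothetical helper)."""
    return failcount >= migration_threshold


if __name__ == "__main__":
    count = record_failure(0, "start")
    print(count, forced_away(count))
```

In this model one failed start of p-stonith-ingstad is enough to produce the "after 1000000 failures (max=1000000)" situation, whereas a single failed monitor would only raise the count to 1.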

> why is it trying to start the mysql resource?

It depends on whether any nodes need fencing.

> It's against Pacemaker policies to start resources on a cluster without
> working stonith devices, isn't it?

Not if all nodes are present and healthy.
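
As a rough sketch of that rule, again in Python and again only a toy model: the Node class, its "clean" flag, and the function names are assumptions made for illustration, not Pacemaker data structures. The idea is that a failed stonith device only blocks things when some node actually has to be fenced.

```python
# Toy model for illustration only; not how the policy engine is implemented.
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    online: bool
    clean: bool  # False: state is uncertain, node must be fenced first


def nodes_needing_fencing(nodes: List[Node]) -> List[str]:
    """Names of nodes whose state is uncertain (hypothetical helper)."""
    return [n.name for n in nodes if not n.clean]


def can_start_resources(nodes: List[Node], stonith_usable: bool) -> bool:
    """Resources may start if nobody needs fencing, or fencing can be done."""
    pending = nodes_needing_fencing(nodes)
    return not pending or stonith_usable


if __name__ == "__main__":
    healthy = [Node("infante", True, True), Node("ingstad", True, True)]
    print(can_start_resources(healthy, stonith_usable=False))
```

With both infante and ingstad present and healthy, the model allows prim_mysql to start even though the stonith resource has failed. If one node were marked unclean, the same call would return False until fencing became possible.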

> 
> -- 
> Frank Van Damme
> Make everything as simple as possible, but not simpler. - Albert Einstein
