[Pacemaker] About failover node in a deep swap

Florian Haas florian.haas at linbit.com
Wed Feb 9 13:56:33 EST 2011


On 02/09/2011 05:47 PM, Pentarh Udi wrote:
> I noticed that Pacemaker does not correctly fail over nodes under
> heavy load when they go into deep swap or heavy I/O.
> 
> I configured two or more nodes running Apache with MaxClients set
> large enough to swap out the node, put some heavy PHP scripts on them
> (WordPress ^_^), and then ran heavy webserver benchmarks.
> 
> When a node goes into deep swap, the load average climbs into the
> thousands and the node is effectively stunned (though pings are still
> okay), yet for some reason Pacemaker does not mark the node as failed
> and does not migrate resources away.
> 
> Even worse: under certain conditions Pacemaker does start to migrate
> resources away, but they then fail to start on the other nodes (while
> under normal conditions they start fine):
> 
> httpd_start_0 (node=node1, call=32, rc=1, status=complete): unknown error
> httpd_start_0 (node=node2, call=43, rc=1, status=complete): unknown error
> 
> Sometimes there is a timeout error, sometimes there are no errors at
> all, but the result is that the resources are down.
> 
> In this case ocf::heartbeat:apache is running in a group with
> ocf::heartbeat:IPaddr2, so maybe Pacemaker fails to stop IPaddr2 and
> therefore cannot move ocf::heartbeat:apache, because they are in the
> same group.
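> 
> Roughly, the group is configured along these lines (the resource
> names, the address, and the timeouts here are just placeholders, not
> my exact values):
> 
>   primitive p_ip ocf:heartbeat:IPaddr2 \
>       params ip=192.168.0.100 cidr_netmask=24 \
>       op monitor interval=30s
>   primitive p_apache ocf:heartbeat:apache \
>       params configfile=/etc/httpd/conf/httpd.conf \
>       op monitor interval=30s timeout=60s \
>       op start timeout=120s \
>       op stop timeout=120s
>   group g_web p_ip p_apache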
> 
> Is this "normal" corosync behavior, or am I doing something wrong?
> 90% of my "down conditions" are heavy load, and corosync does not
> handle this in my case.

I get this question a lot in classes and workshops. My usual response is
this: you have a highly available application running on a particular
node. That node now freaks out in terms of load average, swap or
whatever. What's the misbehaving application? That's right, 99% of the
time it's actually your cluster-managed HA application. What's causing
that load? Your clients are. So when you fail over, it's a near
certainty that the same load spike will hit you right back on the node
that took over the service. Worse, in an active/active cluster you'll
actually see higher load, because there are now fewer nodes left to
handle the cluster's workload.

You can force a failover in that situation with a combination of node
fencing and watchdog devices, but if you do set this up, it will likely
make your problem worse.
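
For completeness, one common way to combine the two is SBD on shared
storage. A rough sketch, assuming you have a shared disk available; the
device path is a placeholder and the exact stonith resource setup
depends on your distribution:

  # load a watchdog driver (a hardware watchdog is preferable)
  modprobe softdog

  # initialize an SBD messaging partition on the shared disk
  sbd -d /dev/disk/by-id/YOUR-SHARED-DISK create

  # point a stonith resource (e.g. external/sbd) at that device,
  # then enable fencing cluster-wide
  crm configure property stonith-enabled=true

With that in place an overloaded, unresponsive node simply gets fenced,
which does nothing about why it was overloaded in the first place.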

You need to fix your scalability issue. There's little that high
availability clustering can do for you here.
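
As a starting point, cap MaxClients at what actually fits in physical
memory rather than at what the benchmark asks for. A rough prefork
sketch with placeholder numbers, sized to an assumed footprint of
roughly 2 GB available RAM and ~80 MB per PHP-heavy child (about 25
clients):

  <IfModule prefork.c>
      StartServers          5
      MinSpareServers       5
      MaxSpareServers      10
      MaxClients           25
      MaxRequestsPerChild 500
  </IfModule>

That keeps the node out of swap, which in turn keeps the cluster stack
responsive.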

Hope this helps.
Florian
