[Pacemaker] Problems with resource scaling

Fri Feb 24 22:35:21 EST 2012

Hi. I have been experimenting with resource scalability in Pacemaker. I
started with no resources and attempted to configure and start a few hundred
dummy resources (a dummy OCF script that does not load the CPU) on a
cluster of 4 virtual machines using crm configure. After adding about 200
resources, the cluster grinds to a halt: starting an additional 100
resources took about 10 minutes for crm configure to complete, and another
10 minutes for the resources to come up. The CIB process spikes to >90% CPU
as soon as a new resource is configured, and stays there for some time.
During this time, "crm status" may show all nodes as offline. After a
period of such instability, CIB CPU usage backs off and the cluster is
stable again. Is this behavior expected with the aforementioned cluster
size / resource count? Are there any parameters or knobs we may want to
look at / twiddle? My understanding was that, with recent performance
changes, resource scaling should be much higher.
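
For anyone wanting to reproduce the test, here is a minimal sketch of how a
batch of dummy primitives could be generated. The resource names (dummyNNN),
the ocf:pacemaker:Dummy agent, and the 30s monitor interval are assumptions
for illustration only; the agent and naming used in the original test may
differ.

```python
def dummy_primitives(count, start=1, agent="ocf:pacemaker:Dummy", interval="30s"):
    """Return crm shell 'primitive' lines for `count` dummy resources.

    The names (dummyNNN), agent, and monitor interval are illustrative
    assumptions, not the configuration used in the original test.
    """
    lines = []
    for i in range(start, start + count):
        lines.append(
            f"primitive dummy{i:03d} {agent} op monitor interval={interval}"
        )
    return lines


if __name__ == "__main__":
    # Print definitions for resources 201..300, i.e. the "additional 100".
    print("\n".join(dummy_primitives(100, start=201)))
```

The output can be saved to a file and loaded with "crm configure load update
<file>" instead of issuing one crm configure call per resource. Whether that
changes the CIB load seen here is untested.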

Pacemaker is version 1.1.6.3.el6, running with Corosync 1.4.1. The virtual
machines are running RH 6.2, and are all hosted on a dual-CPU Westmere
system. I have played around with the memory / CPU allocation of the
machines but haven't seen much of a difference.

Thanks!