[ClusterLabs] Fence agent executing thousands of API calls per hour

Mon Jul 30 19:47:27 EDT 2018

I've set up a number of clusters in a VMware environment, and am using the fence_vmware_rest agent for fencing (from fence-agents 4.2.1), as follows:

Stonith Devices:
 Resource: vmware_fence (class=stonith type=fence_vmware_rest)
  Attributes: ip=<host> username=<username> password=<password> ssl_insecure=1 pcmk_host_check=static-list pcmk_host_list=b-gp2-dbpg35-1;b-gp2-dbpg35-2;b-gp2-dbpg35-3
  Operations: monitor interval=60s (vmware_fence-monitor-interval-60s)

We are using a dedicated service account on the VMware side for pacemaker.

The clusters are running fine, and no failover events have happened recently. However, our VMware admin came to me asking why the pacemaker service account is logging in and executing API calls so frequently: in an environment with 3 clusters (9 nodes total), he is seeing ~1400 API calls per hour from this user. I do not see anything in corosync.log that would explain this, and my limited understanding was that the fence agent would only call the power-off and reboot APIs when pacemaker couldn't get a response from a node in the cluster. I thought that using static-list for pcmk_host_check would prevent any API calls to fetch the list of hosts, and even if those calls were happening, I would expect them to be rare. His concern is that this amount of load on the VMware hosts isn't sustainable.
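
To put that number in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch and rests on assumptions I have not verified: that the recurring 60s monitor operation is the sole source of the calls, and that each cluster runs exactly one monitor per interval for its single stonith resource. The function name and constants are just for illustration.

```python
# Rough estimate of API calls per monitor run.
# ASSUMPTIONS (unverified): the 60s monitor operation is the only source of
# API calls, and each cluster runs one monitor per interval.

CLUSTERS = 3                      # from the environment described above
MONITOR_INTERVAL_S = 60           # monitor interval=60s on vmware_fence
OBSERVED_CALLS_PER_HOUR = 1400    # approximate figure from the VMware admin


def calls_per_monitor(observed_per_hour, clusters, interval_s):
    """Return the average number of API calls each monitor run would need
    to make in order to account for the observed hourly total."""
    monitors_per_hour = clusters * (3600 / interval_s)
    return observed_per_hour / monitors_per_hour


if __name__ == "__main__":
    estimate = calls_per_monitor(
        OBSERVED_CALLS_PER_HOUR, CLUSTERS, MONITOR_INTERVAL_S
    )
    print(f"{estimate:.1f} API calls per monitor run")
```

If those assumptions hold, the observed total works out to roughly 8 API calls per monitor run, which would at least be the right order of magnitude for a periodic login, query, and logout sequence rather than for actual fencing events.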

Unfortunately, the logging available from VMware doesn't give much information - it only reports the number of API calls, not which API(s) were called.

Any ideas what might be going on?  Is there a way to get increased logging for the fence agent?

Thanks in advance,
-- 
Casey