[Pacemaker] Resource monitoring stops suddenly

Tue Apr 27 20:20:16 UTC 2010

I've written a custom plugin to monitor RAID status, and it's working
great, except that for some reason Pacemaker seems to stop dispatching checks:

xenhost1:~ # grep RaidStatus /var/log/messages | tail -n4
Apr 27 15:38:14 xenhost1 RaidStatus[25972]: [25977]: INFO: status optimal
Apr 27 15:38:44 xenhost1 RaidStatus[26130]: [26135]: INFO: status optimal
Apr 27 15:39:14 xenhost1 RaidStatus[26288]: [26293]: INFO: status optimal
Apr 27 15:39:44 xenhost1 RaidStatus[26479]: [26484]: INFO: status optimal

xenhost2:~ # grep RaidStatus /var/log/messages | tail -n4
Apr 27 16:12:16 xenhost2 RaidStatus[21738]: [21744]: INFO: status optimal
Apr 27 16:12:46 xenhost2 RaidStatus[22006]: [22011]: INFO: status optimal
Apr 27 16:13:16 xenhost2 RaidStatus[22287]: [22292]: INFO: status optimal
Apr 27 16:13:46 xenhost2 RaidStatus[22575]: [22580]: INFO: status optimal

It starts checking again if I restart the resource, so I have no idea
what's going on here.
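
As a quick way to confirm the monitor cadence from output like the above, here is a
minimal sketch that parses the syslog timestamps and reports the gap between
consecutive checks. The function name, the SAMPLE list, and the year argument are
my own assumptions for illustration (syslog lines carry no year); nothing here is
part of Pacemaker.

```python
from datetime import datetime

# Sample lines copied from the xenhost1 output above.
SAMPLE = [
    "Apr 27 15:38:14 xenhost1 RaidStatus[25972]: [25977]: INFO: status optimal",
    "Apr 27 15:38:44 xenhost1 RaidStatus[26130]: [26135]: INFO: status optimal",
    "Apr 27 15:39:14 xenhost1 RaidStatus[26288]: [26293]: INFO: status optimal",
    "Apr 27 15:39:44 xenhost1 RaidStatus[26479]: [26484]: INFO: status optimal",
]

def monitor_gaps(lines, year=2010):
    """Return the seconds elapsed between consecutive syslog-style lines.

    Assumes each line starts with a 15-character timestamp such as
    "Apr 27 15:38:14"; the year is supplied by the caller (an assumption,
    since syslog omits it).
    """
    stamps = [
        datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        for line in lines
    ]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]

if __name__ == "__main__":
    print(monitor_gaps(SAMPLE))
```

Any gap much larger than the monitor interval, or a last timestamp far behind the
current time, would mark where the checks stopped.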

I do see this in the logs though:

Apr 27 15:39:44 xenhost1 RaidStatus[26479]: [26484]: INFO: status optimal
Apr 27 15:39:44 xenhost1 attrd_updater: [26485]: info: Invoked: /usr/sbin/attrd_updater -n #health-raid -U green
Apr 27 15:39:44 xenhost1 crmd: [6620]: info: process_lrm_event: LRM operation raidstatus:0_monitor_30000 (call=65, status=1, cib-update=0, confirmed=true) Cancelled

Any idea what's happening?

-- 
Michael Brown               | `One of the main causes of the fall of
Systems Consultant          | the Roman Empire was that, lacking zero,
Net Direct Inc.             | they had no way to indicate successful
☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth