[ClusterLabs] Delayed first monitoring

emmanuel segura emi2fast at gmail.com
Wed Aug 12 12:16:07 EDT 2015


Sorry, but from my point of view the agent first checks whether the
resource is running. For example, you can see that in
/usr/lib/ocf/resource.d/heartbeat/Filesystem

The logic is:

Filesystem::start (the action passed to the agent as a parameter) ->
Filesystem_start (the function called for the start action, which
evaluates the parameters) -> Filesystem_status (called by the
previous one). If the fs is already mounted, it returns success.

So you need to check whether the resource is already started.
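
A minimal sketch of that pattern, in case it helps. This is not the
Filesystem agent's code: the demo_* functions and the state file are
hypothetical stand-ins for a real service, and the limit of 10 polls is an
assumption (in a real agent the operation timeout configured in Pacemaker
bounds the wait). Only the OCF exit code values are standard.

```shell
#!/bin/sh
# Standard OCF exit codes used by this sketch.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical service: it counts as "running" once this state file exists.
STATE_FILE="${STATE_FILE:-${TMPDIR:-/tmp}/demo_agent.$$.state}"

demo_monitor() {
    # Report success only when the service is really up.
    if [ -f "$STATE_FILE" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

demo_start_daemon() {
    # Simulate a daemon that needs about a second before it is usable.
    ( sleep 1; : > "$STATE_FILE" ) >/dev/null 2>&1 &
}

demo_start() {
    # Step 1: if the resource is already running, start is a no-op.
    if demo_monitor; then
        return $OCF_SUCCESS
    fi

    # Step 2: launch the service.
    demo_start_daemon

    # Step 3: do not return until monitor succeeds (assumed cap: 10 polls).
    tries=0
    while ! demo_monitor; do
        tries=$((tries + 1))
        if [ "$tries" -gt 10 ]; then
            return $OCF_ERR_GENERIC
        fi
        sleep 1
    done
    return $OCF_SUCCESS
}

demo_stop() {
    rm -f "$STATE_FILE"
    return $OCF_SUCCESS
}
```

With this shape, a monitor that runs right after start returns cannot see
the resource as not running, so no extra sleep is needed in the status
function.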

2015-08-12 16:14 GMT+02:00 Nekrasov, Alexander <alexander.nekrasov at emc.com>:
> 1. Pacemaker will/may call a monitor before starting a resource, in which case it expects a NOT_RUNNING response. It's just checking assumptions at that point.
>
> 2. A resource::start must only return when resource::monitor is successful. Basically the logic of a start() must follow this:
>
> start() {
>   start_daemon              # placeholder: launch the service
>   while ! monitor; do       # poll until the monitor action succeeds
>       sleep 1
>   done
>   return $OCF_SUCCESS
> }
>
>> -----Original Message-----
>> From: Miloš Kozák [mailto:milos.kozak at lejmr.com]
>> Sent: Wednesday, August 12, 2015 10:03 AM
>> To: users at clusterlabs.org
>> Subject: [ClusterLabs] Delayed first monitoring
>>
>> Hi,
>>
>> I have set up Corosync+CMAN+Pacemaker on CentOS 6.5 in order to
>> provide high availability for OpenNebula. However, I am facing a
>> strange problem which arises from my lack of knowledge.
>>
>> In the log I can see that when I create a resource based on an init
>> script, typically:
>>
>> pcs resource create httpd lsb:httpd
>>
>> The httpd daemon gets started, but the monitor is initiated at the same
>> time and the resource is identified as not running. This behaviour makes
>> sense, since starting the daemon takes some time. In this
>> particular case, I get status code 2, which in the LSB convention means
>> that the program is dead but its lock file exists. The effect of this is
>> that the httpd resource gets restarted.
>>
>> My workaround is an extra sleep in the status function of the init
>> script, but I don't like this solution at all! Do you have any idea how
>> to tackle this problem in a proper way? I expected an op attribute which
>> would specify a delay between service start and the first monitor, but I
>> could not find it.
>>
>> Thank you, Milos
>>
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org






