[ClusterLabs] Antw: Delayed first monitoring

Andrei Borzenkov arvidjaar at gmail.com
Thu Aug 13 03:26:36 EDT 2015


On Thu, Aug 13, 2015 at 10:01 AM, Miloš Kozák <milos.kozak at lejmr.com> wrote:
> However, this does not make sense at all. Presumably, Pacemaker should get along
> with LSB scripts that come from the system repository, right?
>

Let's forget about Pacemaker for a moment. You have a system startup
where service B needs service A. The initscript for service A completes
and the script for service B is started, but service A is not yet ready to
be used.

This is a bug in the startup script, irrespective of whether you use it
with Pacemaker or not.
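
As a rough illustration of what "start returns only once the service is
usable" could look like, here is a minimal sketch of a polling helper. The
names wait_until_ready, launch_daemon and daemon_is_ready are hypothetical
and are not taken from any real initscript, and the 30-second timeout is an
arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper: poll a readiness check until it succeeds or a
# timeout (in seconds) expires.
# Usage: wait_until_ready TIMEOUT_SECONDS COMMAND [ARGS...]
wait_until_ready() {
    timeout=$1
    shift
    elapsed=0
    # Re-run the check roughly once per second until it passes.
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1    # never became ready: start should report failure
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Sketch of a start action using the helper (placeholder function names):
# start() {
#     launch_daemon || return 1
#     wait_until_ready 30 daemon_is_ready
# }
```

With something along these lines, the start action reports success only
after the readiness check has passed, so a monitor that runs immediately
afterwards sees a running service.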

> Therefore, there is no way to modify the LSB script, because changes to the
> LSB script are erased by every package update.
>
>
> I believe the systematic approach is to introduce delayed monitoring
> or something like it into Pacemaker. I am rather surprised that nobody has
> run into this problem already.
>
>
> Milos
>
>
>
>
>
> On 13.8.2015 at 08:44, Ulrich Windl wrote:
>
>> I think the start script has to be fixed to return success when httpd is
>> actually running.
>>
>>>>> Miloš Kozák <milos.kozak at lejmr.com> wrote on 12.08.2015 at 16:03 in message
>> <55CB521A.8090304 at lejmr.com>:
>>>
>>> Hi,
>>>
>>> I have set up Corosync+CMAN+Pacemaker on CentOS 6.5 in order to
>>> provide high availability for OpenNebula. However, I am facing a
>>> strange problem which arises from my lack of knowledge.
>>>
>>> In the log I can see that when I create a resource based on an init
>>> script, typically:
>>>
>>> pcs resource create httpd lsb:httpd
>>>
>>> The httpd daemon gets started, but the monitor is initiated at the same time
>>> and the resource is identified as not running. This behaviour makes
>>> sense once we realize that starting the daemon takes some time. In this
>>> particular case, I get status code 2, which in the LSB convention means that
>>> the program is dead but its lock file exists. The effect of this is that the
>>> httpd resource gets restarted.
>>>
>>> My workaround is an extra sleep in the status function of the init script, but
>>> I don't like this solution at all! Do you have any idea how to tackle this
>>> problem in a proper way? I expected an op attribute which would specify a
>>> delay between service start and the first monitoring, but I could not find it.
>>>
>>> Thank you, Milos
>>>
>>>
>>> _______________________________________________
>>> Users mailing list: Users at clusterlabs.org
>>> http://clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org



