[Pacemaker] stonithd process can restart automatically but stonith plugins can't

Andrew Beekhof beekhof at gmail.com
Mon Jun 16 07:41:37 EDT 2008


On Jun 16, 2008, at 11:36 AM, Dejan Muhamedagic wrote:

> Hi Junko-san,
>
> On Mon, Jun 16, 2008 at 01:44:28PM +0900, Junko IKEDA wrote:
>> Hi,
>>
>>> A stonith resource is started only in the current stonithd
>>> instance. If the stonithd process is gone, along with it gone is
>>> the status of all its stonith resources. A started stonith
>>> resource should more properly be termed enabled and this is only
>>> valid in the current stonithd process.
>>>
>>> In other words, there's no use trying a monitor operation with a
>>> new stonithd instance: it is "empty" and will always return "not
>>> running". The only way to proceed, once crmd realises that
>>> stonithd process has died, is to consider all stonith resources
>>> which were "started" on that node as stopped and to start them
>>> again. Probably also not to update the fail_count since the
>>> resources themselves didn't fail, just the stonithd process.
>>
>> You mean, this is stonithd's correct behavior for the current
>> specifications.
>
> stonithd has no configuration itself. There's simply no other way
> stonithd can behave.
>
>> Is it possible for crmd to have stonith resources restart when stonithd
>> died/up as its design?
>
> I certainly hope so.
>
>> or should we contrive ways to do this with migration-threshold and
>> expire fail-count?

Basically, yes.
In this particular case, it probably makes sense not to set a
migration-threshold for the stonith resource.

> I'd say that it should be done by crmd. Don't know how complex it
> may be though.

By design, the CRM does not (and will not) try to understand the
resources it manages.

That the CRM also has a connection to stonithd (and knows when it
dies) does not mean that stonith resources will be treated any
differently.
We restart them (if possible given the configuration) when they fail,
just like any other resource.  That's it.
The CRM's design won't be changing to optimize its behavior for an
artificial test scenario ;-)
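
For readers following the thread, here is a toy model of the behaviour
Dejan describes: a "started" stonith resource exists only in the memory
of the current stonithd process, so a fresh instance reports "not
running" until the resource is started again. This is only a sketch.
ToyStonithd, recover_after_daemon_restart and the resource id
"stonith-node1" are made-up names for illustration, not the stonithd
or CRM API.

```python
class ToyStonithd:
    """Hypothetical stand-in for one stonithd process (not the real API).

    All state lives in memory, so it is lost when the process goes away.
    """

    def __init__(self):
        self.enabled = set()  # ids of stonith resources "started" here

    def start(self, rsc_id):
        self.enabled.add(rsc_id)

    def monitor(self, rsc_id):
        return "running" if rsc_id in self.enabled else "not running"


def recover_after_daemon_restart(new_daemon, previously_started):
    """Treat every previously started resource as stopped and start it again."""
    for rsc_id in previously_started:
        new_daemon.start(rsc_id)


# First stonithd instance: the resource is started (enabled) in it.
old = ToyStonithd()
old.start("stonith-node1")
status_before = old.monitor("stonith-node1")

# The process dies and a new, empty instance replaces it.
new = ToyStonithd()
status_after_restart = new.monitor("stonith-node1")

# Recovery: start the resource again in the new instance.
recover_after_daemon_restart(new, ["stonith-node1"])
status_after_recovery = new.monitor("stonith-node1")

print(status_before, "/", status_after_restart, "/", status_after_recovery)
```

In the real cluster the restart is simply the CRM's normal recovery of
a failed resource, which is why a migration-threshold on the stonith
resource can get in the way.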





