[Pacemaker] RFC: What part of the XML configuration do you hate the most?

Lars Marowsky-Bree lmb at suse.de
Wed Sep 17 12:20:49 EDT 2008


On 2008-09-17T10:09:21, Andrew Beekhof <beekhof at gmail.com> wrote:

> I can't help but feel this is all a work-around for badly written RAs 
> and/or overly aggressive timeouts.  There's nothing wrong with setting 
> large timeouts... if you set 1 hour and the op returns in 1 second, then we 
> don't wait around doing nothing for the other 59 minutes and 59 seconds.

Agreed. RAs shouldn't fail randomly. RAs are considered part of the
"trusted" infrastructure.

> But if you really really only want to report an error if N monitors fail in 
> M seconds (I still think this is crazy, but whatever), then simply 
> implement monitor_loop() which calls monitor() up to N times looking for 
> $OCF_SUCCESS and add:
>
>   <op id=... name="monitor_loop" timeout="M" interval=... />
>
> instead of a regular monitor op.  Or even in addition to a regular monitor 
> op with on_fail=ignore if you want.

Best idea so far.
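
For illustration, a minimal sketch of what such a monitor_loop() could
look like in an RA's shell code. This is an assumption-laden example, not
code from any existing agent: MONITOR_LOOP_TRIES, MONITOR_LOOP_DELAY and
the placeholder monitor() are hypothetical names, and a real RA would take
the OCF_* return codes from the OCF shell function library instead of
defaulting them locally.

```shell
#!/bin/sh
# Sketch only. A real RA gets these return codes from the OCF shell
# function library; they are defaulted here so the snippet stands alone.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"

# Hypothetical knobs (not standard OCF parameters): N attempts, with a
# pause in seconds between failed attempts.
: "${MONITOR_LOOP_TRIES:=3}"
: "${MONITOR_LOOP_DELAY:=1}"

# Placeholder for the agent's real monitor action.
monitor() {
    return "$OCF_SUCCESS"
}

# Call monitor() up to N times; succeed on the first $OCF_SUCCESS,
# otherwise return the exit code of the last failed attempt.
monitor_loop() {
    tries=$MONITOR_LOOP_TRIES
    rc=$OCF_ERR_GENERIC
    while [ "$tries" -gt 0 ]; do
        rc=0
        monitor || rc=$?
        if [ "$rc" -eq "$OCF_SUCCESS" ]; then
            return "$OCF_SUCCESS"
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$MONITOR_LOOP_DELAY"
        fi
    done
    return "$rc"
}
```

The op timeout M then has to cover the worst case: N runs of monitor()
plus the N-1 pauses between them. Otherwise the operation times out before
the loop has had its last try.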



Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde




