[Pacemaker] Missing lrm_opstatus
Andrew Beekhof
andrew at beekhof.net
Thu Oct 7 13:26:48 EDT 2010
On Thu, Oct 7, 2010 at 6:06 PM, Ron Kerry <rkerry at sgi.com> wrote:
> On 10/7/2010 8:00 AM, Andrew Beekhof wrote:
>>
>> On Thu, Oct 7, 2010 at 11:13 AM, Dejan Muhamedagic <dejanmm at fastmail.fm>
>> wrote:
>> > On Thu, Oct 07, 2010 at 09:49:05AM +0200, Andrew Beekhof wrote:
>> >> On Tue, Oct 5, 2010 at 1:50 PM, Dejan Muhamedagic
>> >> <dejanmm at fastmail.fm> wrote:
>> >> > Hi,
>> >> >
>> >> > On Tue, Oct 05, 2010 at 11:18:37AM +0200, Andrew Beekhof wrote:
>> >> >> Dejan: looks like something in the lrm library.
>> >> >> Any idea why the message doesn't contain lrm_opstatus?
>> >> >
>> >> > Because this monitor operation never ran. Which seems to be a
>> >> > plausible explanation since the start-delay is set to 600s.
>> >>
>> >> Isn't that what LRM_OP_PENDING is for?
>> >> I'm happy to see that at least msg_to_op() maps missing fields to that
>> >> value :-)
>> >
>> > Actually it does, it's just that the library code logs the
>> > warning and then the whole message. The missing op_status is then
>> > set to LRM_OP_PENDING.
>>
>> Yep, like I said, I was happy to see that this was the case (I looked
>> up the code).
>> Might just be simpler to set it on the server side though and avoid the
>> warning.
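To make the two options concrete, here is a minimal sketch in Python (the real code is C in the lrm library, and this is not it). `OpStatus`, `op_status_from_message`, `build_op_message` and the `op_id` key are hypothetical names; only the `lrm_opstatus` field and the idea of a pending status come from this thread.

```python
from enum import Enum


class OpStatus(Enum):
    # Illustrative values only; not taken from the lrm headers.
    PENDING = "pending"
    DONE = "done"
    ERROR = "error"


def op_status_from_message(msg):
    """Client side: treat a missing lrm_opstatus field as 'pending'."""
    raw = msg.get("lrm_opstatus")
    if raw is None:
        # The operation has not run yet, so no status was recorded.
        return OpStatus.PENDING
    return OpStatus(raw)


def build_op_message(op_id, status=None):
    """Server side: always send the field, so the client never sees a gap."""
    if status is None:
        status = OpStatus.PENDING
    return {"op_id": op_id, "lrm_opstatus": status.value}
```

Either side ends up with the same answer; filling the field in where the message is built just means the receiving side never has a missing field to warn about.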
>>
>> >
>> > BTW, using start-delay means that there's a deficiency in the RA.
>> > That attribute should be banned.
>> >
>>
>> Right, I also meant to mention that in my reply.
>> I'm still yet to see a valid use for start-delay, Ron: why is it being
>> used here?
>>
>
> Probably because we did not know any better. The intent is that the monitor
> operation not be scheduled to run until after the start operation has
> completed.
Unless the start operation returns before the resource is fully
started, this won't happen.
> The start operation for most of our RAs verifies the resource
> startup (most often by just calling the monitor function itself).
In that case, it should be safe to remove the start-delay.
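For illustration, this is roughly what a start that verifies itself looks like. It is only a sketch in Python, not code from any shipped RA: `start`, `launch`, `monitor`, `poll_interval` and `max_polls` are made-up names, and only the OCF return codes (0, 1, 7) are real.

```python
import time

# Standard OCF return codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def start(launch, monitor, poll_interval=1.0, max_polls=30, sleep=time.sleep):
    """Start the resource and return only once monitor reports it running."""
    if monitor() == OCF_SUCCESS:
        # Already running; start is expected to be idempotent.
        return OCF_SUCCESS
    launch()
    for _ in range(max_polls):
        rc = monitor()
        if rc == OCF_SUCCESS:
            return OCF_SUCCESS
        if rc != OCF_NOT_RUNNING:
            # Anything other than "not running yet" is a hard failure.
            return OCF_ERR_GENERIC
        sleep(poll_interval)
    # Never came up within the allowed number of polls.
    return OCF_ERR_GENERIC
```

A real agent would usually keep polling until the cluster's start timeout kills it rather than count polls; the bounded loop just keeps the sketch self-contained. Since start only returns success after monitor has already succeeded once, there is nothing left for start-delay to protect against.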
> So we set
> the monitor start-delay to the same value as the start timeout. We have
> been setting things up this way for quite some time and it has never
> caused us problems before. I cannot remember the history behind initially
> setting start-delay but it began way back when we were using straight
> heartbeat based builds (since pretty much the SLES10 time frame). Should
> we not do this?
Preferably not.
To be honest, I've no idea how it made it into the OCF spec in the first place.
>
> --
>
> Ron Kerry rkerry at sgi.com