[Pacemaker] [Problem] The attrd does not sometimes stop.

Mon Jan 16 22:52:35 UTC 2012

On Mon, Jan 16, 2012 at 11:42 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
> On Mon, Jan 16, 2012 at 11:30 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>> On Mon, Jan 16, 2012 at 11:27 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>> I know I could just apply the patch and be done, but I'd like to
>>> understand this so it works for the right reason.

Ok, done:

https://github.com/beekhof/pacemaker/commit/2a6b296

If I'm adding voodoo, I at least want the reason well documented so it
can be removed again if the reason goes away.

>>> On Mon, Jan 16, 2012 at 7:30 PM, Lars Ellenberg
>>> <lars.ellenberg at linbit.com> wrote:
>>>> On Mon, Jan 16, 2012 at 04:46:58PM +1100, Andrew Beekhof wrote:
>>>>> > Now we proceed to the next mainloop poll:
>>>>> >
>>>>> >> poll([{fd=7, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN|POLLPRI}, {fd=5, events=POLLIN|POLLPRI}], 3, -1
>>>>> >
>>>>> > Note the -1 (infinity timeout!)
>>>>> >
>>>>> > So even though the trigger was (presumably) set,
>>>>> > and the ->prepare() should have returned true,
>>>>> > the mainloop waits forever for "something" to happen on those file descriptors.
>>>>> >
>>>>> >
>>>>> > I suggest this:
>>>>> >
>>>>> > crm_trigger_prepare should set *timeout = 0, if trigger is set.
>>>>> >
>>>>> > Also think about this race: crm_trigger_prepare was already
>>>>> > called, only then the signal came in...
>>>>> >
>>>>> > diff --git a/lib/common/mainloop.c b/lib/common/mainloop.c
>>>>> > index 2e8b1d0..fd17b87 100644
>>>>> > --- a/lib/common/mainloop.c
>>>>> > +++ b/lib/common/mainloop.c
>>>>> > @@ -33,6 +33,13 @@ static gboolean
>>>>> >  crm_trigger_prepare(GSource * source, gint * timeout)
>>>>> >  {
>>>>> >     crm_trigger_t *trig = (crm_trigger_t *) source;
>>>>> > +    /* Do not delay signal processing by the mainloop poll stage */
>>>>> > +    if (trig->trigger)
>>>>> > +           *timeout = 0;
>>>>> > +    /* To avoid races between signal delivery and the mainloop poll stage,
>>>>> > +     * make sure we always have a finite timeout. Unit: milliseconds. */
>>>>> > +    else
>>>>> > +           *timeout = 5000; /* arbitrary */
>>>>> >
>>>>> >     return trig->trigger;
>>>>> >  }
>>>>> >
>>>>> >
>>>>> > This scenario does not let the blocked IPC off the hook, though.
>>>>> > That is still possible, both for blocking send and blocking receive,
>>>>> > so that should probably be fixed as well, somehow.
>>>>> > I'm not sure how likely this "stuck in blocking IPC" is, though.
>>>>>
>>>>> Interesting, are you sure you're in the right function though?
>>>>> trigger and signal events don't have a file descriptor... wouldn't
>>>>> these polls be for the IPC related sources and wouldn't they be
>>>>> setting their own timeout?
>>>>
>>>> http://developer.gnome.org/glib/2.30/glib-The-Main-Event-Loop.html#GSourceFuncs
>>>>
>>>> iiuc, mainloop does something similar to (oversimplified):
>>>>        timeout = -1; /* infinity */
>>>>        for s in all GSource
>>>>                tmp_timeout = -1;
>>>>                s->prepare(s, &tmp_timeout)
>>>>                if (tmp_timeout >= 0 && tmp_timeout < timeout)
>>>>                        timeout = tmp_timeout;
>>>>
>>>>        poll(GSource fd set, n, timeout);
>>>
>>> I'm looking at the glib code again now, and it still looks to me like
>>> the trigger and signal sources do not appear in this fd set.
>>> Their setup functions would have to have called g_source_add_poll()
>>> somewhere, which they don't.
>>>
>>> So I'm still not seeing why its a trigger or signal sources' fault
>>> that glib is doing a never ending call to poll().
>>> poll() is going to get called regardless of whether our prepare
>>> function returns true or not.
>>>
>>> Looking closer, crm_trigger_prepare() returning TRUE results in:
>>>                  ready_source->flags |= G_SOURCE_READY;
>>>
>>> which in turn causes:
>>>          context->timeout = 0;
>>>
>>> which is essentially what adding
>>>       if (trig->trigger)
>>>               *timeout = 0;
>>>
>>> to crm_trigger_prepare() was intended to achieve.
>>>
>>> Shouldn't the fd, ipc or wait sources (who do call g_source_add_poll()
>>> and could therefor cause poll() to block forever) have a sane timeout
>>> in their prepare functions?
>>> Or is it because the signal itself is interrupting some essential part
>>> of G_CH_prepare_int() and friends?
>>>
>>>>
>>>>        for s in all GSource
>>>>                if s->check(s)
>>>>                        s->dispatch(s, ...)
>>>>
>>>> And at some stage it also orders by priority, of course.
>>>>
>>>> Also compare with the comment above /* Sigh... */ in glue G_SIG_prepare().
>>>>
>>>> BTW, the mentioned race between signal delivery and mainloop already
>>>> doing the poll stage could potentially be solved by using
>>>
>>> Again, since nothing related to the signal source ever appears in the
>>> call to poll(), I'm not seeing where the race comes from.
>>> Or am I missing something obvious?
>>>
>>>> cl_signal_set_interrupt(SIGTERM, 1),
>>>
>>> This, combined with
>>>
>>>                /*
>>>                 * If we don't set this on, then the mainloop poll(2) call
>>>                 * will never be interrupted by this signal - which sort of
>>>                 * defeats the whole purpose of a signal handler in a
>>>                 * mainloop program
>>>                 */
>>>                cl_signal_set_interrupt(signal, TRUE);
>>>
>>> looks more relevant.
>>> But I can't escape the feeling that calling this just masks the
>>> underlying "why is there a never-ending call to poll() in the first
>>> place" issue.
>>> G_CH_prepare_int() and friends /should/ be setting timeouts so that
>>> poll() can return and any sources created by g_idle_source_new() can
>>> execute.
>>
>> Actually, thinking further, I'm pretty convinced that poll() with an
>> infinite timeout is the default mode of operation for mainloops with
>> cluster-glue's IPC and FD sources.
>> And that this is not a good thing :)
>
> Far too late, brain shutting down.
>
> ...not a good thing, because it breaks the idle stuff, but most of all
> because it requires /all/ external events to come out of that poll()
> call.
> Thats why its important that "the mainloop poll(2) call will never be
> interrupted by this signal" behaviour had to be fudged around by the
> call to siginterrupt().
>
>>>> which would mean we can condense the prepare to
>>>>        if (trig->trigger)
>>>>                *timeout = 0;
>>>>        return trig->trigger;
>>>>
>>>> Glue (and heartbeat) code base is not that, let's say, involved,
>>>> because someone had been paranoid.
>>>> But because someone had been paranoid for a reason ;-)
>
> Not sure this is an excellent example of that to be honest.
>
>>>>
>>>> Cheers,
>>>>
>>>> --
>>>> : Lars Ellenberg
>>>> : LINBIT | Your Way to High Availability
>>>> : DRBD/HA support and consulting http://www.linbit.com
>>>>
>>>> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
>>>>
>>>> _______________________________________________
>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://bugs.clusterlabs.org