[Pacemaker] [Problem] The attrd does not sometimes stop.

Andrew Beekhof andrew at beekhof.net
Mon Jan 16 07:30:27 EST 2012


On Mon, Jan 16, 2012 at 11:27 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
> I know I could just apply the patch and be done, but I'd like to
> understand this so it works for the right reason.
>
> On Mon, Jan 16, 2012 at 7:30 PM, Lars Ellenberg
> <lars.ellenberg at linbit.com> wrote:
>> On Mon, Jan 16, 2012 at 04:46:58PM +1100, Andrew Beekhof wrote:
>>> > Now we proceed to the next mainloop poll:
>>> >
>>> >> poll([{fd=7, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN|POLLPRI}, {fd=5, events=POLLIN|POLLPRI}], 3, -1
>>> >
>>> > Note the -1 (infinity timeout!)
>>> >
>>> > So even though the trigger was (presumably) set,
>>> > and the ->prepare() should have returned true,
>>> > the mainloop waits forever for "something" to happen on those file descriptors.
>>> >
>>> >
>>> > I suggest this:
>>> >
>>> > crm_trigger_prepare should set *timeout = 0, if trigger is set.
>>> >
>>> > Also think about this race: crm_trigger_prepare was already
>>> > called, only then the signal came in...
>>> >
>>> > diff --git a/lib/common/mainloop.c b/lib/common/mainloop.c
>>> > index 2e8b1d0..fd17b87 100644
>>> > --- a/lib/common/mainloop.c
>>> > +++ b/lib/common/mainloop.c
>>> > @@ -33,6 +33,13 @@ static gboolean
>>> >  crm_trigger_prepare(GSource * source, gint * timeout)
>>> >  {
>>> >     crm_trigger_t *trig = (crm_trigger_t *) source;
>>> > +    /* Do not delay signal processing by the mainloop poll stage */
>>> > +    if (trig->trigger)
>>> > +           *timeout = 0;
>>> > +    /* To avoid races between signal delivery and the mainloop poll stage,
>>> > +     * make sure we always have a finite timeout. Unit: milliseconds. */
>>> > +    else
>>> > +           *timeout = 5000; /* arbitrary */
>>> >
>>> >     return trig->trigger;
>>> >  }
>>> >
>>> >
>>> > This scenario does not let the blocked IPC off the hook, though.
>>> > That is still possible, both for blocking send and blocking receive,
>>> > so that should probably be fixed as well, somehow.
>>> > I'm not sure how likely this "stuck in blocking IPC" is, though.
>>>
>>> Interesting, are you sure you're in the right function though?
>>> trigger and signal events don't have a file descriptor... wouldn't
>>> these polls be for the IPC related sources and wouldn't they be
>>> setting their own timeout?
>>
>> http://developer.gnome.org/glib/2.30/glib-The-Main-Event-Loop.html#GSourceFuncs
>>
>> iiuc, mainloop does something similar to (oversimplified):
>>        timeout = -1; /* infinity */
>>        for s in all GSource
>>                tmp_timeout = -1;
>>                s->prepare(s, &tmp_timeout)
>>                if (tmp_timeout >= 0 && tmp_timeout < timeout)
>>                        timeout = tmp_timeout;
>>
>>        poll(GSource fd set, n, timeout);
>
> I'm looking at the glib code again now, and it still looks to me like
> the trigger and signal sources do not appear in this fd set.
> Their setup functions would have to have called g_source_add_poll()
> somewhere, which they don't.
>
> So I'm still not seeing why its a trigger or signal sources' fault
> that glib is doing a never ending call to poll().
> poll() is going to get called regardless of whether our prepare
> function returns true or not.
>
> Looking closer, crm_trigger_prepare() returning TRUE results in:
>                  ready_source->flags |= G_SOURCE_READY;
>
> which in turn causes:
>          context->timeout = 0;
>
> which is essentially what adding
>       if (trig->trigger)
>               *timeout = 0;
>
> to crm_trigger_prepare() was intended to achieve.
>
> Shouldn't the fd, ipc or wait sources (who do call g_source_add_poll()
> and could therefor cause poll() to block forever) have a sane timeout
> in their prepare functions?
> Or is it because the signal itself is interrupting some essential part
> of G_CH_prepare_int() and friends?
>
>>
>>        for s in all GSource
>>                if s->check(s)
>>                        s->dispatch(s, ...)
>>
>> And at some stage it also orders by priority, of course.
>>
>> Also compare with the comment above /* Sigh... */ in glue G_SIG_prepare().
>>
>> BTW, the mentioned race between signal delivery and mainloop already
>> doing the poll stage could potentially be solved by using
>
> Again, since nothing related to the signal source ever appears in the
> call to poll(), I'm not seeing where the race comes from.
> Or am I missing something obvious?
>
>> cl_signal_set_interrupt(SIGTERM, 1),
>
> This, combined with
>
>                /*
>                 * If we don't set this on, then the mainloop poll(2) call
>                 * will never be interrupted by this signal - which sort of
>                 * defeats the whole purpose of a signal handler in a
>                 * mainloop program
>                 */
>                cl_signal_set_interrupt(signal, TRUE);
>
> looks more relevant.
> But I can't escape the feeling that calling this just masks the
> underlying "why is there a never-ending call to poll() in the first
> place" issue.
> G_CH_prepare_int() and friends /should/ be setting timeouts so that
> poll() can return and any sources created by g_idle_source_new() can
> execute.

Actually, thinking further, I'm pretty convinced that poll() with an
infinite timeout is the default mode of operation for mainloops with
cluster-glue's IPC and FD sources.
And that this is not a good thing :)

>
>> which would mean we can condense the prepare to
>>        if (trig->trigger)
>>                *timeout = 0;
>>        return trig->trigger;
>>
>> Glue (and heartbeat) code base is not that, let's say, involved,
>> because someone had been paranoid.
>> But because someone had been paranoid for a reason ;-)
>>
>> Cheers,
>>
>> --
>> : Lars Ellenberg
>> : LINBIT | Your Way to High Availability
>> : DRBD/HA support and consulting http://www.linbit.com
>>
>> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org




More information about the Pacemaker mailing list