[Pacemaker] [Problem] The attrd does not sometimes stop.

Lars Ellenberg lars.ellenberg at linbit.com
Mon Jan 16 18:11:25 EST 2012


On Mon, Jan 16, 2012 at 11:42:32PM +1100, Andrew Beekhof wrote:
> >>> http://developer.gnome.org/glib/2.30/glib-The-Main-Event-Loop.html#GSourceFuncs
> >>>
> >>> iiuc, mainloop does something similar to (oversimplified):
> >>>        timeout = -1; /* infinity */
> >>>        for s in all GSource
> >>>                tmp_timeout = -1;
> >>>                s->prepare(s, &tmp_timeout)
> >>>                if (tmp_timeout >= 0 && tmp_timeout < timeout)
> >>>                        timeout = tmp_timeout;
> >>>
> >>>        poll(GSource fd set, n, timeout);
> >>
> >> I'm looking at the glib code again now, and it still looks to me like
> >> the trigger and signal sources do not appear in this fd set.
> >> Their setup functions would have to have called g_source_add_poll()
> >> somewhere, which they don't.
> >>
> >> So I'm still not seeing why its a trigger or signal sources' fault
> >> that glib is doing a never ending call to poll().
> >> poll() is going to get called regardless of whether our prepare
> >> function returns true or not.
> >>
> >> Looking closer, crm_trigger_prepare() returning TRUE results in:
> >>                  ready_source->flags |= G_SOURCE_READY;
> >>
> >> which in turn causes:
> >>          context->timeout = 0;
> >>
> >> which is essentially what adding
> >>       if (trig->trigger)
> >>               *timeout = 0;
> >>
> >> to crm_trigger_prepare() was intended to achieve.
> >>
> >> Shouldn't the fd, ipc or wait sources (who do call g_source_add_poll()
> >> and could therefor cause poll() to block forever) have a sane timeout
> >> in their prepare functions?

Probably should, but they usually have not.
The reasoning probably is, each GSource is responsible for *itself* only.

That is why first all sources are prepared.

If no non-fd, non-pollable source feels the need to reduce the
*timeout to something finite in its prepare(), so be it.

Besides, what is sane? 1 second? 5? 120? 240?

That's why G_CH_prepare_int() sets the *timeout to 1000,
and why I suggest to set it to 0 if prepare already knows that the
trigger is set, and to some finite amount to avoid getting stuck in
poll, in case no timeout or outher source source is active which also
set some finite timeout.

BTW, if you have an *idle* sources, prepare should set timeout to 0.

For those interested, all described below
http://developer.gnome.org/glib/2.30/glib-The-Main-Event-Loop.html#GSourceFuncs

"For idle sources, the prepare and check functions always return TRUE to
indicate that the source is always ready to be processed. The prepare
function also returns a timeout value of 0 to ensure that the poll()
call doesn't block (since that would be time wasted which could have
been spent running the idle function)."

"... timeout sources ... returns a timeout value to ensure that the
poll() call doesn't block too long ..."

"... file descriptor sources ... timeout to -1 to indicate that is does
not mind how long the poll() call blocks ... "

> >> Or is it because the signal itself is interrupting some essential part
> >> of G_CH_prepare_int() and friends?

In the provided strace, it looks like the SIGTERM
is delivered while calling some G_CH_prepare_int,
the ->prepare() used by G_main_add_IPC_Channel.

Since the signal sources are of higher priority,
we probably are passt those already in this iteration,
we will only notice the trigger in the next check(),
after the poll.

So it is vital for any non-pollable source such as signals
to set a finite timeout in their prepare(),
even if we also mark that signal siginterrupt().

> >>>        for s in all GSource
> >>>                if s->check(s)
> >>>                        s->dispatch(s, ...)
> >>>
> >>> And at some stage it also orders by priority, of course.
> >>>
> >>> Also compare with the comment above /* Sigh... */ in glue G_SIG_prepare().
> >>>
> >>> BTW, the mentioned race between signal delivery and mainloop already
> >>> doing the poll stage could potentially be solved by using
> >>> cl_signal_set_interrupt(SIGTERM, 1),

As I just wrote above, that race is not solved at all.
Only the (necessarily set) finite timeout of the poll
would be shortened in that case.

> >> But I can't escape the feeling that calling this just masks the
> >> underlying "why is there a never-ending call to poll() in the first
> >> place" issue.
> >> G_CH_prepare_int() and friends /should/ be setting timeouts so that
> >> poll() can return and any sources created by g_idle_source_new() can
> >> execute.
> >
> > Actually, thinking further, I'm pretty convinced that poll() with an
> > infinite timeout is the default mode of operation for mainloops with
> > cluster-glue's IPC and FD sources.
> > And that this is not a good thing :)

Well, if there are *only* pollable sources, it is.
If there are any other sources, they should have set
their limit on what they think is an acceptable timeout
int their prepare().

> Far too late, brain shutting down.

 ;-)

> ...not a good thing, because it breaks the idle stuff,

see above, explanation on developer.gnome.org,
idle stuff is expected to set timeout 0 (or "just a few ms").

> but most of all because it requires /all/ external events to come out
> of that poll() call.

If you have any non-pollable event sources,
they need to set a finite timeout in their prepare().

> >>> which would mean we can condense the prepare to

Nope, I have already corrected myself, not sufficient.

And, actually, if the prepare() returns true,
it will be dispatched() right there, anyways, I think,
which makes the conditional *timeout = 0 useless.

So what G_SIG_prepare() does it just the right thing:
unconditionally set *timeout = 1000 for a maximum
signal callback dispatch delay of one second.

(plus whatever time it takes to call all the other prepare
functions, plut whatever time it takes to return from the
dispatch functions... so better not spend too much time there,
that's why my first suspicion in this thread was those blocking
IPC calls in some dispatch callbacks)
	    
	    Lars





More information about the Pacemaker mailing list