[Pacemaker] How to deal with unix signals in a glib mainloop (was: [Problem] The attrd does not sometimes stop.)

renayama19661014 at ybb.ne.jp renayama19661014 at ybb.ne.jp
Wed Feb 1 20:45:38 EST 2012


Hi Andrew,

> It should already be in the main repo for 1.1 and we can backport to
> pacemaker-1.0

I confirmed that the following fix is included in Pacemaker 1.1.
 * https://github.com/ClusterLabs/pacemaker/commit/2a6b296b7ca42a1b671563f5ab73853ff2a8fcef#lib/common

I look forward to the next release of Pacemaker1.0.

Many Thanks!
Hideo Yamauchi.

--- On Thu, 2012/2/2, Andrew Beekhof <andrew at beekhof.net> wrote:

> On Wed, Feb 1, 2012 at 1:57 PM,  <renayama19661014 at ybb.ne.jp> wrote:
> > Hi Lars,
> > Hi Andrew,
> >
> > I confirmed that the problem does not occur with Andrew's patch.
> >  * https://github.com/beekhof/pacemaker/commit/2a6b296
> > The test I ran was a repeated start/stop cycle.
> >
> > Try 1.  405 start/stop cycles, all succeeded.
> > Try 2.  407 start/stop cycles, all succeeded.
> > Try 3.  1228 start/stop cycles, all succeeded. (The count is higher because this run spanned the weekend.)
> > Try 4.  408 start/stop cycles, all succeeded.
> > Try 5.  418 start/stop cycles, all succeeded.
> >
> > Andrew's patch resolves the problem.
> >
> > I hope this patch will be merged into Pacemaker 1.1 and Pacemaker 1.0.
> 
> It should already be in the main repo for 1.1 and we can backport to
> pacemaker-1.0
> 
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> > --- On Wed, 2012/1/25, renayama19661014 at ybb.ne.jp <renayama19661014 at ybb.ne.jp> wrote:
> >
> >> Hi Lars,
> >> Hi Andrew,
> >>
> >> I confirmed that the problem does not occur with Lars's patch.
> >> The test I ran was a repeated start/stop cycle.
> >>
> >> I ran the test five times.
> >>
> >> The results are as follows.
> >>
> >>  Try 1.  420 start/stop cycles, all succeeded.
> >>  Try 2.  396 start/stop cycles, all succeeded.
> >>  Try 3.  412 start/stop cycles, all succeeded.
> >>  Try 4.  1221 start/stop cycles, all succeeded. (The count is higher because this run spanned the weekend.)
> >>  Try 5.  420 start/stop cycles, all succeeded.
> >>
> >> This test environment is one in which the problem reproduces readily.
> >> I think that Lars's patch solves the problem.
> >>
> >> I will run the same test with Andrew's patch.
> >> I will test a little more and then report the final result.
> >>
> >> Best Regards,
> >> Hideo Yamauchi.
> >>
> >> --- On Fri, 2012/1/20, renayama19661014 at ybb.ne.jp <renayama19661014 at ybb.ne.jp> wrote:
> >>
> >> > Hi Lars,
> >> > Hi Andrew,
> >> >
> >> > I am now testing Lars's patch in the environment where the problem reproduces.
> >> > * The msgfromIPC_ll patch is not applied.
> >> > * The crm_trigger_prepare patch is applied.
> >> >
> >> > So far the problem has not reappeared in several days of testing.
> >> >
> >> > I will test a little more and then report the final result.
> >> > Afterwards I intend to run the same test with Andrew's patch.
> >> >  * https://github.com/beekhof/pacemaker/commit/2a6b296
> >> >
> >> > Best Regards,
> >> > Hideo Yamauchi.
> >> >
> >> >
> >> > --- On Fri, 2012/1/20, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
> >> >
> >> > > On Fri, Jan 20, 2012 at 09:21:58AM +1100, Angus Salkeld wrote:
> >> > > > On 19/01/12 22:23 +0100, Lars Ellenberg wrote:
> >> > > > >On Tue, Jan 17, 2012 at 12:13:37AM +0100, Lars Ellenberg wrote:
> >> > > > >>On Tue, Jan 17, 2012 at 09:52:35AM +1100, Andrew Beekhof wrote:
> >> > > > >>>
> >> > > > >>> Ok, done:
> >> > > > >>>
> >> > > > >>> https://github.com/beekhof/pacemaker/commit/2a6b296
> >> > > > >>>
> >> > > > >>> If I'm adding voodoo, I at least want the reason well documented so it
> >> > > > >>> can be removed again if the reason goes away.
> >> > > > >>
> >> > > > >>That about sums it up, then ;-)
> >> > > > >
> >> > > > >But as having to do this was just "too ugly to be true",
> >> > > > >I dug a little deeper...
> >> > > > >
> >> > > > >The way to do this is obviously to use the glib api ;-)
> >> > > > >http://developer.gnome.org/glib/2.30/glib-UNIX-specific-utilities-and-integration.html#g-unix-signal-add-full
> >> > > > >
> >> > > > >(Since glib 2.30, yay; if you don't have that yet, read on anyways)
> >> > > > >
> >> > > > >What it does internally, and what other people have also done for a long
> >> > > > >time  to solve this and similar problems, is:
> >> > > > >
> >> > > > >Add to the main context a "wakeup pipe",
> >> > > > >which is an eventfd if available,
> >> > > > >or an actual pipe if not.
> >> > > > >If it is a pipe, set those file descriptors non-blocking.
> >> > > > >And, of course, add the eventfd (or the read end of the pipe)
> >> > > > >to the poll loop (with default priority, btw,
> >> > > > >which is good enough to have the poll terminate).
> >> > > > >
> >> > > > >That is done internally when creating the main context.
> >> > > > >http://git.gnome.org/browse/glib/tree/glib/gmain.c#n548
> >> > > > >http://git.gnome.org/browse/glib/tree/glib/gwakeup.c#n138
> >> > > > >
> >> > > > >(the line numbers are correct for glib master as of today,
> >> > > > >which should correspond to 41fbf42)
> >> > > > >
> >> > > > >The g_unix_signal_handler then sets the triggers variables,
> >> > > > >and calls g_wakeup_signal(that internal wakeup source),
> >> > > > >which simply posts an event to the eventfd,
> >> > > > >or does a short (1 byte) write to the write end of the pipe.
> >> > > > >http://git.gnome.org/browse/glib/tree/glib/gmain.c#n4442
> >> > > > >http://git.gnome.org/browse/glib/tree/glib/gwakeup.c#n230
> >> > > > >
> >> > > > >Problem solved, without having to do a full check() everything,
> >> > > > >prepare() everything, and poll() again cycle every 500ms.
> >> > > > >
> >> > > > >"back in those days", when this mechanism was not really there,
> >> > > > >you could do all that "by hand".
> >> > > > >And people did. Very common idiom in glib and other mainloop
> >> > > > >applications, also frequently used to "signal" availability of work
> >> > > > >or completion of tasks between threads.
> >> > > > >
> >> > > > >static int my_wakeup_fds[2] = { -1, -1 };
> >> > > > >
> >> > > > >Then just pipe2(my_wakeup_fds, O_NONBLOCK), add my_wakeup_fds[0] as
> >> > > > >normal read fd source, and add a write(my_wakeup_fds[1], "", 1); to the
> >> > > > >signal handlers.
> >> > > >
> >> > > > signalfd makes this much easier too "man 2 signalfd"
> >> > >
> >> > > See https://bugzilla.gnome.org/show_bug.cgi?id=652072#c32
> >> > > following (or the whole bug, if you like).
> >> > >
> >> > > Also, pipes are portable.
> >> > >
> >> > >     Lars
> >> > >
> >> > >
> >> > > _______________________________________________
> >> > > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >> > >
> >> > > Project Home: http://www.clusterlabs.org
> >> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> > > Bugs: http://bugs.clusterlabs.org
> >> > >
> >> >
> >>
> >
> 