[Pacemaker] [Problem]When Pacemaker uses a new version of glib, g_source_remove fails.

Andrew Beekhof andrew at beekhof.net
Mon Oct 6 22:06:41 EDT 2014


On 7 Oct 2014, at 1:03 pm, renayama19661014 at ybb.ne.jp wrote:

> Hi Andrew,
> 
>>> These problems seem to be caused by the following glib change.
>>>   * https://github.com/GNOME/glib/commit/393503ba5bdc7c09cd46b716aaf3d2c63a6c7f9c
>>  
>> The glib behaviour on Ubuntu seems reasonable: removing a source multiple times 
>> IS a valid error.
>> I need the stack trace to know where/how this situation can occur in pacemaker.
> 
> 
> As far as I have confirmed, Pacemaker does not remove a source more than once.
> On Ubuntu (glib 2.40), the error occurs even on the first removal.

Not quite. Returning FALSE from the callback also removes the source from glib.
So your test case effectively removes t1 twice: once implicitly by returning FALSE in timer_func1(), and then again explicitly in timer_func3().

> 
> To avoid the error on Ubuntu, it seems necessary to check that the source still exists before removing it.
> The following also works with the glib in RHEL 6.x (and RHEL 7.0).
> 
>         if (g_main_context_find_source_by_id (NULL, t1) != NULL) {
>                 g_source_remove(t1);
>         }
> 
> I will send you the stack trace once I have acquired it.
> 
> Many Thanks!
> Hideo Yamauchi.
> 
> ----- Original Message -----
>> From: Andrew Beekhof <andrew at beekhof.net>
>> To: renayama19661014 at ybb.ne.jp; The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
>> Cc: 
>> Date: 2014/10/7, Tue 09:44
>> Subject: Re: [Pacemaker] [Problem]When Pacemaker uses a new version of glib, g_source_remove fails.
>> 
>> 
>> On 6 Oct 2014, at 4:09 pm, renayama19661014 at ybb.ne.jp wrote:
>> 
>>> Hi All,
>>> 
>>> When I run the following sample on RHEL 6.5 (glib2-2.22.5-7.el6) and 
>>> Ubuntu 14.04 (libglib2.0-0:amd64 2.40.0-2), the behaviour differs.
>>> 
>>>   * Sample : test2.c
>>> {{{
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <glib.h>
>>> #include <sys/times.h>
>>> guint t1, t2, t3;
>>> gboolean timer_func2(gpointer data){
>>>          printf("TIMER EXPIRE!2\n");
>>>          fflush(stdout);
>>>          return FALSE;
>>> }
>>> gboolean timer_func1(gpointer data){
>>>          clock_t         ret;
>>>          struct tms buff;
>>> 
>>>          ret = times(&buff);
>>>          printf("TIMER EXPIRE!1 %d\n", (int)ret);
>>>          fflush(stdout);
>>>          return FALSE;
>>> }
>>> gboolean timer_func3(gpointer data){
>>>          printf("TIMER EXPIRE 3!\n");
>>>          fflush(stdout);
>>>          printf("remove timer1!\n");
>>> 
>>>          fflush(stdout);
>>>          g_source_remove(t1);
>>>          printf("remove timer2!\n");
>>>          fflush(stdout);
>>>          g_source_remove(t2);
>>>          printf("remove timer3!\n");
>>>          fflush(stdout);
>>>          g_source_remove(t3);
>>>          return FALSE;
>>> }
>>> int main(int argc, char** argv){
>>>          GMainLoop *m;
>>>          clock_t         ret;
>>>          struct tms buff;
>>>          /* g_main_loop_new()/g_main_loop_run() replace the deprecated g_main_new()/g_main_run() */
>>>          m = g_main_loop_new(NULL, FALSE);
>>>          t1 = g_timeout_add(1000, timer_func1, NULL);
>>>          t2 = g_timeout_add(60000, timer_func2, NULL);
>>>          t3 = g_timeout_add(5000, timer_func3, NULL);
>>>          ret = times(&buff);
>>>          printf("START! %d\n", (int)ret);
>>>          g_main_loop_run(m);
>>>          return 0;
>>> }
>>> 
>>> }}}
>>>   * Result
>>> ---- RHEL6.5(glib2-2.22.5-7.el6) ---- 
>>> [root at snmp1 ~]# ./test2
>>> START! 429576012
>>> TIMER EXPIRE!1 429576112
>>> TIMER EXPIRE 3!
>>> remove timer1!
>>> remove timer2!
>>> remove timer3!
>>> 
>>> ---- Ubuntu14.04(libglib2.0-0:amd64 2.40.0-2) ----
>>> root at a1be102:~# ./test2
>>> START! 1718163089
>>> TIMER EXPIRE!1 1718163189
>>> TIMER EXPIRE 3!
>>> remove timer1!
>>> 
>>> (process:1410): GLib-CRITICAL **: Source ID 1 was not found when attempting to remove it
>>> remove timer2!
>>> remove timer3!
>>> 
>>> 
>>> These problems seem to be caused by the following glib change.
>>>   * https://github.com/GNOME/glib/commit/393503ba5bdc7c09cd46b716aaf3d2c63a6c7f9c
>> 
>> The glib behaviour on Ubuntu seems reasonable: removing a source multiple times 
>> IS a valid error.
>> I need the stack trace to know where/how this situation can occur in pacemaker.
>> 
>>> 
>>> Before this change, g_source_remove() could be called without complaint on a 
>>> timer that had already fired and completed; after the change, the same call 
>>> produces an error.
>>> 
>>> As a result, we get the following crit error in Pacemaker environments that 
>>> use a new version of glib.
>>> 
>>> lrmd[1632]:    error: crm_abort: crm_glib_handler: Forked child 1840 to record non-fatal assert at logging.c:73 : Source ID 51 was not found when attempting to remove it
>>> lrmd[1632]:    crit: crm_glib_handler: GLib: Source ID 51 was not found when attempting to remove it
>>> 
>>> Pacemaker seems to need some way of handling this, considering the following:
>>>   * Distributions that ship a new version of glib, including Ubuntu.
>>>   * Future glib upgrades in RHEL.
>>> 
>>> Similar problems have been reported on the mailing list.
>>>   * http://www.gossamer-threads.com/lists/linuxha/pacemaker/91333#91333
>>>   * http://www.gossamer-threads.com/lists/linuxha/pacemaker/92408
>>> 
>>> Best Regards,
>>> Hideo Yamauchi.
>>> 
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>> 
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>> 
