[Pacemaker] [Problem] attrd sometimes does not stop.

Andrew Beekhof andrew at beekhof.net
Wed Nov 2 14:49:46 UTC 2011


On Tue, Oct 18, 2011 at 12:19 PM,  <renayama19661014 at ybb.ne.jp> wrote:
> Hi,
>
> attrd sometimes fails to stop.
>
> Step1. Start a two-node cluster.
> Step2. Stop the first node (/etc/init.d/heartbeat stop).
> Step3. Shortly afterwards, stop the second node (/etc/init.d/heartbeat
> stop).
>
> The attrd catches the TERM signal, but does not stop.

There's no evidence that it actually catches it, only that it is sent.
I've seen it before but never figured out why it occurs.
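
If it happens again, one way to tell "sent" from "caught" would be to look
at the signal masks in /proc/<pid>/status while attrd is hung. A rough
sketch, assuming Linux; the function names are made up for illustration and
are not part of Pacemaker or heartbeat:

```python
import signal


def signal_state(status_text, signum=signal.SIGTERM):
    """Decode the signal masks found in the text of /proc/<pid>/status.

    Returns a dict saying whether signum is pending, blocked, ignored
    or caught. Linux-specific; field names are those used by procfs.
    """
    fields = {
        "SigPnd": "pending",         # pending for this thread
        "ShdPnd": "pending_shared",  # pending for the whole process
        "SigBlk": "blocked",
        "SigIgn": "ignored",
        "SigCgt": "caught",          # a handler is installed
    }
    bit = 1 << (signum - 1)  # signal N is bit N-1 of each hex mask
    state = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in fields:
            state[fields[key]] = bool(int(value.strip(), 16) & bit)
    return state


def signal_state_of_pid(pid, signum=signal.SIGTERM):
    """Convenience wrapper: read /proc/<pid>/status for a live process."""
    with open("/proc/%d/status" % pid) as f:
        return signal_state(f.read(), signum)
```

Calling signal_state_of_pid() with the PID of the hung attrd would show
whether TERM is stuck pending behind a blocked mask, or whether a handler is
installed at all. It does not say why the main loop never acts on it.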

>
> (snip)
> Oct  5 02:37:38 hpdb0201 crmd: [12238]: info: do_exit: [crmd] stopped (0)
> Oct  5 02:37:38 hpdb0201 cib: [12234]: WARN: send_ipc_message: IPC Channel to
> 12238 is not connected
> Oct  5 02:37:38 hpdb0201 cib: [12234]: WARN: send_via_callback_channel:
> Delivery of reply to client 12238/0dbc9e28-d90d-4335-b9c4-9dd3fcb38163 failed
> Oct  5 02:37:38 hpdb0201 cib: [12234]: WARN: do_local_notify: A-Sync reply to
> crmd failed: reply failed
> Oct  5 02:37:38 hpdb0201 heartbeat: [12223]: info: killing
> /usr/lib64/heartbeat/attrd process group 12237 with signal 15
> Oct  5 02:47:03 hpdb0201 cib: [12234]: info: cib_stats: Processed 97 operations
> (4123.00us average, 0% utilization) in the last 10min
> Oct  5 07:15:25 hpdb0201 ccm: [12233]: WARN: G_CH_check_int: working on IPC
> channel took 1010 ms (> 100 ms)
> Oct  5 07:15:26 hpdb0201 ccm: [12233]: WARN: G_CH_check_int: working on IPC
> channel took 1010 ms (> 100 ms)
> Oct  5 07:15:37 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch:
> Dispatch function for check for signals was delayed 1030 ms (> 1010 ms) before
> being called (GSource: 0xd28010)
> Oct  5 07:15:37 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch:
> started at 431583547 should have started at 431583444
> Oct  5 07:15:44 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch:
> Dispatch function for send local status was delayed 1030 ms (> 1010 ms) before
> being called (GSource: 0xd27dd0)
> Oct  5 07:15:44 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch:
> started at 431584254 should have started at 431584151
> Oct  5 07:15:44 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch:
> Dispatch function for check for signals was delayed 1030 ms (> 1010 ms) before
> being called (GSource: 0xd28010)
> Oct  5 07:15:44 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch:
> started at 431584254 should have started at 431584151
> Oct  5 07:16:59 hpdb0201 heartbeat: [12223]: WARN: G_CH_check_int: working on
> write child took 1010 ms (> 100 ms)
> Oct  5 07:17:14 hpdb0201 stonithd: [12236]: WARN: G_CH_check_int: working on
> Heartbeat API channel took 1010 ms (> 100 ms)
> Oct  5 07:19:41 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch:
> Dispatch function for send local status was delayed 1030 ms (> 1010 ms) before
> being called (GSource: 0xd27dd0)
> Oct  5 07:19:41 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch:
> started at 431607988 should have started at 431607885
> Oct  5 07:19:41 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch:
> Dispatch function for check for signals was delayed 1030 ms (> 1010 ms) before
> being called (GSource: 0xd28010)
> Oct  5 07:19:41 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch:
> started at 431607988 should have started at 431607885
> (snip)
>
> We have tried to reproduce the problem, but it rarely recurs.
>
> The same problem was reported in the following email,
> but that discussion ended without a conclusion.
>
>  * http://www.gossamer-threads.com/lists/linuxha/pacemaker/62147
>
> The problem occurred with the following combination:
>  * pacemaker-1.0.11
>  * resource-agents-3.9.2
>  * cluster-glue-1.0.7
>  * heartbeat-3.0.5
>
> I have filed this in Bugzilla:
>  * http://bugs.clusterlabs.org/show_bug.cgi?id=5004
>
> Best Regards,
> Hideo Yamauchi.



