[Pacemaker] lots of timeout/rexmit messages after failed stonith and manual reboot
    Johan Verrept
    Johan.Verrept at able.be
    Mon Oct  5 09:55:32 EDT 2009

Hi,
    While I was playing with the RA, the stonith operation failed at some
point (it didn't find the host in gethosts), so I rebooted the other node
manually. The result was a flood of messages in my logs:
15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
Dispatch function for retransmit request was delayed 2750 ms (> 1000 ms)
before being called (GSource: 0x959e298)
15:53:10 SYSLOG info heartbeat [2748]: info: Gmain_timeout_dispatch:
started at 429631770 should have started at 429631495
15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
Dispatch function for retransmit request took too long to execute: 20 ms
(> 10 ms) (GSource: 0x959e298)
15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
Dispatch function for retransmit request was delayed 2740 ms (> 1000 ms)
before being called (GSource: 0x959e300)
15:53:10 SYSLOG info heartbeat [2748]: info: Gmain_timeout_dispatch:
started at 429631772 should have started at 429631498
15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
Dispatch function for retransmit request took too long to execute: 30 ms
(> 10 ms) (GSource: 0x959e300)
15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
Dispatch function for retransmit request was delayed 2750 ms (> 1000 ms)
before being called (GSource: 0x959e368)
15:53:10 SYSLOG info heartbeat [2748]: info: Gmain_timeout_dispatch:
started at 429631775 should have started at 429631500
15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
Dispatch function for retransmit request took too long to execute: 30 ms
(> 10 ms) (GSource: 0x959e368)
15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
Dispatch function for retransmit request was delayed 2750 ms (> 1000 ms)
before being called (GSource: 0x959e3d0)
15:53:10 SYSLOG info heartbeat [2748]: info: Gmain_timeout_dispatch:
started at 429631778 should have started at 429631503
15:53:10 SYSLOG warning heartbeat [2748]: WARN: Gmain_timeout_dispatch:
Dispatch function for retransmit request took too long to execute: 20 ms
(> 10 ms) (GSource: 0x959e3d0)
The rebooted node, meanwhile, reported:
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 251
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 242
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 252
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 251
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 314
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 252
requested. 131 is max.
I got about 100 of these per second.
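In case it helps to quantify the flood, here is a minimal Python sketch that tallies the "Rexmit of seq" warnings from a log excerpt. The function name summarize_rexmit and the regular expression are my own assumptions, written against the message format shown above; this is not a heartbeat tool. The sample input is the six warnings quoted in this mail, including the line wrapping.

```python
import re
from collections import Counter

# Matches the "Rexmit of seq N requested. M is max." warning as quoted above.
# \s+ tolerates the line wrap that the mail client inserted after the seq number.
REXMIT_RE = re.compile(r"Rexmit of seq (\d+)\s+requested\.\s+(\d+) is max\.")


def summarize_rexmit(log_text):
    """Tally retransmit requests and count those above the reported max seq.

    Hypothetical helper for illustration only.
    """
    per_seq = Counter()
    beyond_max = 0
    for seq, max_seq in REXMIT_RE.findall(log_text):
        per_seq[int(seq)] += 1
        if int(seq) > int(max_seq):
            beyond_max += 1
    return {
        "total": sum(per_seq.values()),
        "per_seq": dict(per_seq),
        "beyond_max": beyond_max,
    }


# The six warnings from the rebooted node, copied from this mail.
sample = """\
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 251
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 242
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 252
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 251
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 314
requested. 131 is max.
15:53:10 SYSLOG warning heartbeat [2721]: WARN: Rexmit of seq 252
requested. 131 is max.
"""

summary = summarize_rexmit(sample)
print(summary)
```

On this excerpt, every requested sequence number is above the 131 that the rebooted node reports as its maximum, and some sequence numbers are requested more than once.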
What happened? How do I clean up something like this without rebooting
my cluster? 
	J.
    
    