[Pacemaker] Critical: Monitor operation of IPaddr2 timing out, taking more than 60s. Fails to recover.

Parshvi parshvi.17 at gmail.com
Fri Aug 10 01:44:47 EDT 2012


Mario Penners <mario.penners at ...> writes:

> 
> Hi Parshvi,
> 
> just a quick-shot and without analyzing your mail in detail: find
> attached an edited version of the IPaddr2 RA.
> 
> I was trying to use the original script a while ago, and basically
> nothing worked: it did not recognize link failures (because of the way
> the test was implemented, it only works if you have no more than
> 1 IP per interface), there was no proper support for bonding, the
> IP addresses would not be shifted ....
> 
> I made some (very minor) changes to get the script working for us. Just
> give it a shot if you want; maybe replacing the RA will already solve
> your problem.
> 
> Cheers,
> Mario  
> 
> On Thu, 2012-08-09 at 05:44 +0000, Parshvi wrote:
> > Parshvi <parshvi.17 at ...> writes:
> > 
> > > 
> > > Hi,
> > > 
> > > The monitor operation of IPaddr2 rsc agent is timing out.
> > > Interval: 5s
> > > Timeout: 60s
> > > The timeout was increased from an earlier 20s to now 60s. Even then, there are
> > > multiple logs of monitor op. timing out.
> > > 
> > > 1) What can cause the monitor to take so long ?
> > > 2) Looking at the pe-input, what contributes to the operation time ? Is it just
> > > the exec-time or exec-time + queue-time ?
> > > 3) Any solution proposed ?
> > > 
> 

Thanks Mario for your input.

There are some more findings:
1) The monitor is not timing out in all environments. I have been through some 
of the forum mails, and came across people talking about "heavy load on the 
system" wrt the timeout issue.
2) Could somebody explain what exactly we are referring to when we say "heavy 
load"? Also, how does it affect an operation's execution?
3) THE MONITOR OPERATION IS TIMING OUT ON OTHER RESOURCES TOO (ALONG WITH 
IPADDR2).
4) None of these operations were timing out in a local environment.

I added some logging to the IPaddr2 resource agent script.
In the function ip_monitor(), I print the date on entering and on exiting the 
function.
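
For reference, the logging was roughly of the following shape. This is only a 
minimal sketch and not the actual patch: trace_monitor, ip_monitor_body and 
the MONITOR_TRACE_LOG path are hypothetical names used for illustration, and 
ip_monitor_body merely stands in for the real body of ip_monitor() in the 
agent.

```shell
#!/bin/sh
# Hypothetical trace file; any writable path would do.
MONITOR_TRACE_LOG="${MONITOR_TRACE_LOG:-/tmp/ipaddr2-monitor-trace.log}"

# Append "<enter|exit> monitor <date>" to the trace file.
trace_monitor() {
    echo "$1 monitor $(date)" >> "$MONITOR_TRACE_LOG"
}

# Stand-in for the real monitor logic of the resource agent (assumption).
ip_monitor_body() {
    return 0
}

# Wrap the monitor logic with an entry and an exit timestamp,
# preserving the return code of the monitor logic itself.
ip_monitor() {
    trace_monitor enter
    ip_monitor_body
    rc=$?
    trace_monitor exit
    return $rc
}
```

With this in place, each monitor call leaves one "enter" and one "exit" line 
in the trace file, so the time spent inside the function can be compared with 
the time between invocations.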
This is what I observed with:
interval=5s
timeout=60s

enter monitor Thu Aug 9 06:26:28 AST 2012
exit monitor Thu Aug 9 06:26:28 AST 2012

enter monitor Thu Aug 9 06:26:36 AST 2012
exit monitor Thu Aug 9 06:26:36 AST 2012

[The next monitor was invoked after 71 seconds]

enter monitor Thu Aug 9 06:27:47 AST 2012
exit monitor Thu Aug 9 06:27:47 AST 2012

enter monitor Thu Aug 9 06:27:52 AST 2012
exit monitor Thu Aug 9 06:27:52 AST 2012
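
The gaps between invocations can be pulled out of such a trace with a small 
helper. The following is a sketch only: monitor_gaps is a hypothetical name, 
it assumes the line format shown above (time of day in the sixth field) and 
that all entries fall on the same day.

```shell
#!/bin/sh
# Read trace lines on stdin and print the number of seconds between
# consecutive "enter monitor" entries, one gap per line.
monitor_gaps() {
    awk '
        $1 == "enter" && $2 == "monitor" {
            # Field 6 is HH:MM:SS; convert it to seconds since midnight.
            split($6, hms, ":")
            now = hms[1] * 3600 + hms[2] * 60 + hms[3]
            if (seen) print now - prev
            prev = now
            seen = 1
        }
    '
}
```

Fed the eight lines above, it prints 8, 71 and 5, i.e. the monitor itself 
returns within the same second each time, but one invocation started 71 
seconds after the previous one instead of roughly 5.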





