[ClusterLabs] Constant stop/start of resource in spite of interval=0

Sat May 18 15:20:24 EDT 2019

On Sat, 18 May 2019, Kadlecsik József wrote:

> On Sat, 18 May 2019, Andrei Borzenkov wrote:
> 
> > 18.05.2019 18:34, Kadlecsik József wrote:
> 
> > > We have a resource agent which creates IP tunnels. In spite of the 
> > > configuration setting
> > > 
> > > primitive tunnel-eduroam ocf:local:tunnel \
> > >         params ....
> > >         op start timeout=120s interval=0 \
> > >         op stop timeout=300s interval=0 \
> > >         op monitor timeout=30s interval=30s depth=0 \
> > >         meta target-role=Started
> > > order bifur-eduroam-ipv4-before-tunnel-eduroam \
> > > 	Mandatory: bifur-eduroam-ipv4 tunnel-eduroam
> > > colocation tunnel-eduroam-on-bifur-eduroam-ipv4 inf: tunnel-eduroam \
> > > 	bifur-eduroam-ipv4:Started
> > > 
> > > the resource is restarted again and again. According to the debug logs:
> > > 
> > > May 16 14:20:35 [3052] bifur1       lrmd:    debug: recurring_action_timer:
> > >      Scheduling another invocation of tunnel-eduroam_monitor_30000
> > > May 16 14:20:35 [3052] bifur1       lrmd:    debug: operation_finished: 
> > > tunnel-eduroam_monitor_30000:62066 - exited with rc=0
> > > May 16 14:20:35 [3052] bifur1       lrmd:    debug: operation_finished: 
> > > tunnel-eduroam_monitor_30000:62066:stderr [ -- empty -- ]
> > > May 16 14:20:35 [3052] bifur1       lrmd:    debug: operation_finished: 
> > > tunnel-eduroam_monitor_30000:62066:stdout [ -- empty -- ]
> > > May 16 14:20:35 [3052] bifur1       lrmd:    debug: log_finished:       
> > > finished - rsc:tunnel-eduroam action:monitor call_id:1045 pid:62066 
> > > exit-code:0 exec-time:0ms queue-time:0ms
> > > May 16 14:21:04 [3054] bifur1    pengine:     info: native_print:       
> > > tunnel-eduroam  (ocf::local:tunnel):    Started bifur1
> > > May 16 14:21:04 [3054] bifur1    pengine:     info: 
> > > check_action_definition:
> > >     Parameters to tunnel-eduroam_start_0 on bifur1 changed: was 
> > > 94afff0ff7cfc62f7cb1d5bf5b4d83aa vs. now f2317cad3d54cec5d7d7aa7d0bf35cf8 
> > > (restart:3.0.11) 0:0;48:3:0:73562fd6-1fe2-4930-8c6e-5953b82ebb32
> > 
> > This means that the instance attributes changed; in this case pacemaker
> > restarts the resource to apply the new values. Turning on trace level will
> > hopefully show what exactly is being changed. You can also dump the CIB
> > before and after a restart and compare the two.
> 
> The strange thing is that the new value never seems to be stored. Here is 
> just the "was-now" part of the log lines:
> 
> was 94afff0ff7cfc62f7cb1d5bf5b4d83aa vs. now f2317cad3d54cec5d7d7aa7d0bf35cf8
> was 94afff0ff7cfc62f7cb1d5bf5b4d83aa vs. now f2317cad3d54cec5d7d7aa7d0bf35cf8
> was 94afff0ff7cfc62f7cb1d5bf5b4d83aa vs. now f2317cad3d54cec5d7d7aa7d0bf35cf8
> ...
> 
> However, after issuing "cibadmin --query --local", the whole flipping 
> stopped! :-) Thanks!

No, I was wrong: it still repeats roughly every 15 minutes. The diff between 
two CIB XML dumps doesn't tell me much, so I'm going to enable tracing.
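
A raw diff of two full CIB dumps is noisy, so one way to narrow it down is to
compare only the attributes configured on the resource itself. The following
is a minimal Python sketch under stated assumptions, not a definitive tool: it
assumes the dumps were saved with something like "cibadmin --query > before.xml"
(the file names and the script name are hypothetical), and it looks only at the
instance_attributes and meta_attributes nvpairs of the primitive. If these turn
out to be identical in both dumps, the changing digest presumably comes from
somewhere else and tracing is still needed.

```python
import sys
import xml.etree.ElementTree as ET

# Attribute sets of a primitive that are compared (a choice made for this sketch).
ATTR_SETS = ("instance_attributes", "meta_attributes")


def resource_params(cib_xml, rsc_id):
    """Collect the nvpairs configured on the primitive with id rsc_id.

    Keys look like "instance_attributes/<name>" so that instance and meta
    attributes with the same name do not overwrite each other.
    """
    root = ET.fromstring(cib_xml)
    params = {}
    for prim in root.iter("primitive"):
        if prim.get("id") != rsc_id:
            continue
        for attr_set in prim:
            if attr_set.tag not in ATTR_SETS:
                continue
            for nv in attr_set.iter("nvpair"):
                params[f"{attr_set.tag}/{nv.get('name')}"] = nv.get("value")
    return params


def diff_params(before, after):
    """Return {key: (old, new)} for every key whose value differs."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


# Usage (hypothetical names): python3 cibparamdiff.py before.xml after.xml tunnel-eduroam
if __name__ == "__main__" and len(sys.argv) == 4:
    with open(sys.argv[1], "rb") as f:
        old = resource_params(f.read(), sys.argv[3])
    with open(sys.argv[2], "rb") as f:
        new = resource_params(f.read(), sys.argv[3])
    changes = diff_params(old, new)
    if not changes:
        print("no configured attribute differs between the two dumps")
    for key, (was, now) in changes.items():
        print(f"{key}: was {was!r} vs. now {now!r}")
```

Attributes that exist in only one of the dumps are reported with None on the
other side, which also catches a parameter that appears or disappears between
two restarts.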

Best regards,
Jozsef
--
E-mail : kadlecsik.jozsef at wigner.mta.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics, Hungarian Academy of Sciences
         H-1525 Budapest 114, POB. 49, Hungary