[ClusterLabs] Constant stop/start of resource in spite of interval=0

Andrei Borzenkov arvidjaar at gmail.com
Sat May 18 14:15:57 EDT 2019


18.05.2019 18:34, Kadlecsik József wrote:
> Hello,
> 
> We have a resource agent which creates IP tunnels. In spite of the 
> configuration setting
> 
> primitive tunnel-eduroam ocf:local:tunnel \
>         params ....
>         op start timeout=120s interval=0 \
>         op stop timeout=300s interval=0 \
>         op monitor timeout=30s interval=30s depth=0 \
>         meta target-role=Started
> order bifur-eduroam-ipv4-before-tunnel-eduroam \
> 	Mandatory: bifur-eduroam-ipv4 tunnel-eduroam
> colocation tunnel-eduroam-on-bifur-eduroam-ipv4 inf: tunnel-eduroam \
> 	bifur-eduroam-ipv4:Started
> 
> the resource is restarted again and again. According to the debug logs:
> 
> May 16 14:20:35 [3052] bifur1       lrmd:    debug: recurring_action_timer:
>      Scheduling another invocation of tunnel-eduroam_monitor_30000
> May 16 14:20:35 [3052] bifur1       lrmd:    debug: operation_finished: 
> tunnel-eduroam_monitor_30000:62066 - exited with rc=0
> May 16 14:20:35 [3052] bifur1       lrmd:    debug: operation_finished: 
> tunnel-eduroam_monitor_30000:62066:stderr [ -- empty -- ]
> May 16 14:20:35 [3052] bifur1       lrmd:    debug: operation_finished: 
> tunnel-eduroam_monitor_30000:62066:stdout [ -- empty -- ]
> May 16 14:20:35 [3052] bifur1       lrmd:    debug: log_finished:       
> finished - rsc:tunnel-eduroam action:monitor call_id:1045 pid:62066 
> exit-code:0 exec-time:0ms queue-time:0ms
> May 16 14:21:04 [3054] bifur1    pengine:     info: native_print:       
> tunnel-eduroam  (ocf::local:tunnel):    Started bifur1
> May 16 14:21:04 [3054] bifur1    pengine:     info: 
> check_action_definition:
>     Parameters to tunnel-eduroam_start_0 on bifur1 changed: was 
> 94afff0ff7cfc62f7cb1d5bf5b4d83aa vs. now f2317cad3d54cec5d7d7aa7d0bf35cf8 
> (restart:3.0.11) 0:0;48:3:0:73562fd6-1fe2-4930-8c6e-5953b82ebb32

This means that the instance attributes (resource parameters) changed.
In this case Pacemaker restarts the resource to apply the new values.
Turning on trace-level logging should show exactly what is being
changed. You can also dump the CIB before and after the restart and
compare the two dumps.
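
To compare two CIB dumps (for example, saved with `cibadmin --query >
before.xml` and again after the restart), something like the following
Python sketch may help. It is only an illustration: the function names
are made up here, and it assumes the resource is a plain `primitive`
whose parameters are stored as `nvpair` elements under
`instance_attributes` in the dump.

```python
import xml.etree.ElementTree as ET


def resource_params(cib_xml, rsc_id):
    """Return {name: value} for the instance attributes of one primitive.

    Hypothetical helper: assumes parameters are stored as <nvpair>
    children of <instance_attributes> under <primitive id=rsc_id>.
    """
    root = ET.fromstring(cib_xml)
    for prim in root.iter("primitive"):
        if prim.get("id") == rsc_id:
            return {
                nv.get("name"): nv.get("value")
                for ia in prim.findall("instance_attributes")
                for nv in ia.findall("nvpair")
            }
    return {}  # resource not found in this dump


def changed_params(before_xml, after_xml, rsc_id):
    """Return {name: (old, new)} for every parameter that differs.

    A parameter present in only one dump shows up with None on the
    other side.
    """
    before = resource_params(before_xml, rsc_id)
    after = resource_params(after_xml, rsc_id)
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(before.keys() | after.keys())
        if before.get(name) != after.get(name)
    }
```

Usage would be along the lines of
`changed_params(open("before.xml").read(), open("after.xml").read(), "tunnel-eduroam")`.
An empty result means the stored parameters are identical in both dumps,
in which case the trace logs are the next place to look.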

> May 16 14:21:04 [3054] bifur1    pengine:    debug: native_assign_node: 
> Assigning bifur1 to tunnel-eduroam
> May 16 14:21:04 [3054] bifur1    pengine:     info: RecurringOp:         
> Start recurring monitor (30s) for tunnel-eduroam on bifur1
> May 16 14:21:04 [3054] bifur1    pengine:   notice: LogActions: Restart 
> tunnel-eduroam  (Started bifur1)
> May 16 14:21:04 [3055] bifur1       crmd:   notice: te_rsc_command:     
> Initiating stop operation tunnel-eduroam_stop_0 locally on bifur1 | action 
> 50
> May 16 14:21:04 [3055] bifur1       crmd:    debug: 
> stop_recurring_action_by_rsc:       Cancelling op 1045 for tunnel-eduroam 
> (tunnel-eduroam:1045)
> May 16 14:21:04 [3055] bifur1       crmd:    debug: cancel_op:  Cancelling 
> op 1045 for tunnel-eduroam (tunnel-eduroam:1045)
> May 16 14:21:04 [3052] bifur1       lrmd:     info: 
> cancel_recurring_action:    Cancelling ocf operation 
> tunnel-eduroam_monitor_30000
> May 16 14:21:04 [3052] bifur1       lrmd:    debug: log_finished:       
> finished - rsc:tunnel-eduroam action:monitor call_id:1045  exit-code:0 
> exec-time:0ms queue-time:0ms
> May 16 14:21:04 [3055] bifur1       crmd:    debug: cancel_op:  Op 1045 
> for tunnel-eduroam (tunnel-eduroam:1045): cancelled
> May 16 14:21:04 [3055] bifur1       crmd:     info: do_lrm_rsc_op:      
> Performing key=50:4:0:73562fd6-1fe2-4930-8c6e-5953b82ebb32 
> op=tunnel-eduroam_stop_0
> May 16 14:21:04 [3052] bifur1       lrmd:     info: log_execute:        
> executing - rsc:tunnel-eduroam action:stop call_id:1047
> May 16 14:21:04 [3055] bifur1       crmd:     info: process_lrm_event:  
> Result of monitor operation for tunnel-eduroam on bifur1: Cancelled | 
> call=1045 key=tunnel-eduroam_monitor_30000 confirmed=true
> ...
> 
> From where does the restart operation come? Why does it happen? The IP 
> address is at the same node where the tunnel resource is already running.
> 
> Best regards,
> Jozsef
> --
> E-mail : kadlecsik.jozsef at wigner.mta.hu
> PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
> Address: Wigner Research Centre for Physics, Hungarian Academy of Sciences
>          H-1525 Budapest 114, POB. 49, Hungary