[ClusterLabs] A bug? (SLES15 SP2 with "crm resource refresh")

Ken Gaillot kgaillot at redhat.com
Fri Jan 8 11:38:49 EST 2021


On Fri, 2021-01-08 at 11:46 +0100, Ulrich Windl wrote:
> Hi!
> 
> Trying to reproduce a problem that had occurred in the past after a
> "crm resource refresh" ("reprobe"), I noticed something on the
> DC  that looks odd to me:
> 
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: Forcing the
> status of all resources to be redetected
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  warning:
> new_event_notification (4478-26817-13): Broken pipe (32)

As an aside, "Broken pipe" just means the client disconnected before it
got all of the results back from the controller. It's not really a
problem. There has been some discussion about changing "Broken pipe" to
something like "Other side disconnected".

> ### We had that before, already...
> 
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: State
> transition S_IDLE -> S_POLICY_ENGINE
> Jan 08 11:13:21 h16 pacemaker-schedulerd[4477]:  notice: Watchdog
> will be used via SBD if fencing is required and stonith-watchdog-
> timeout is nonzero
> Jan 08 11:13:21 h16 pacemaker-schedulerd[4477]:  notice:  *
> Start      prm_stonith_sbd                      (             h16 )
> Jan 08 11:13:21 h16 pacemaker-schedulerd[4477]:  notice:  *
> Start      prm_DLM:0                            (             h18 )
> Jan 08 11:13:21 h16 pacemaker-schedulerd[4477]:  notice:  *
> Start      prm_DLM:1                            (             h19 )
> Jan 08 11:13:21 h16 pacemaker-schedulerd[4477]:  notice:  *
> Start      prm_DLM:2                            (             h16 )
> ...
> 
> ### So basically an announcement to START everything that's running
> (everything is running); shouldn't that be "monitoring" (probe)
> instead?

Pacemaker schedules all of the actions that could be needed to bring
the cluster to the desired state (per the configuration). However,
later actions depend on earlier actions returning particular results,
and the transition is recalculated if they don't.

For clean-ups, Pacemaker schedules probes and assumes they will all
return "not running", so it schedules starts to occur after them. That
is what the logs above show.
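
If you want to see that plan for yourself, crm_simulate can show what
the scheduler would do without touching the cluster. For example (just
a suggestion, run on any node against the live CIB), right after a
refresh and before the probe results come back it would show the same
probe-then-start plan:

  # Read-only: show the actions the scheduler would plan for the
  # current live CIB
  crm_simulate --live-check --simulate

  # Optionally save the transition as a dot file to inspect the
  # probe -> start ordering the log messages above correspond to
  crm_simulate --live-check --save-dotfile /tmp/transition.dot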

However:

> 
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: Initiating
> monitor operation prm_stonith_sbd_monitor_0 on h19
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: Initiating
> monitor operation prm_stonith_sbd_monitor_0 on h18
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: Initiating
> monitor operation prm_stonith_sbd_monitor_0 locally on h16
> ...
> ### So _probes_ are started,
> 
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: Transition 139
> aborted by operation prm_testVG_testLV_activate_monitor_0 'modify' on
> h16: Event failed
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: Transition 139
> action 7 (prm_testVG_testLV_activate_monitor_0 on h16): expected 'not
> running' but got 'ok'
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: Transition 139
> action 19 (prm_testVG_testLV_activate_monitor_0 on h18): expected
> 'not running' but got 'ok'
> Jan 08 11:13:21 h16 pacemaker-controld[4478]:  notice: Transition 139
> action 31 (prm_testVG_testLV_activate_monitor_0 on h19): expected
> 'not running' but got 'ok'
> ...
> ### That's odd, because the clone WAS running on each node. (Similar 
> results were reported for other clones)

The probes don't return "not running"; they return "ok", since the
resources are actually running. That prevents the starts from actually
happening, and everything is recalculated with the new probe results.

Pacemaker doesn't expect the probes to return "ok" because the clean-up
cleared all of the information that would have led it to expect that.
In other words, Pacemaker doesn't remember any state from before the
clean-up. That's why it needs the probes: to find out what the current
state actually is.
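
For what it's worth, the information the clean-up removes is the
recorded operation history in the status section of the CIB. You can
look at it before and after a refresh with something along these lines
(the resource name is just taken from your config below):

  # Recorded operations for one resource -- this history is what a
  # refresh/clean-up wipes, which is why new probes are needed
  crm_resource --list-all-operations --resource prm_testVG_testLV_activate

  # Or dump the whole status section; the history is the lrm_rsc_op
  # entries in there
  cibadmin -Q -o status

Right after the refresh those entries are gone, and the probe results
repopulate them.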

To change it to be more "human", we'd have to change clean-up to mark
existing state as obsolete rather than remove it entirely. Then
Pacemaker could see that new probes are needed, but use the last known
result as the expected result. That could save some recalculation and
make the logs easier to follow, but it would complicate the resource
history handling and wouldn't change the end result.

> Jan 08 11:13:43 h16 pacemaker-controld[4478]:  notice: Transition 140
> (Complete=34, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-79.bz2): Complete
> Jan 08 11:13:43 h16 pacemaker-controld[4478]:  notice: State
> transition S_TRANSITION_ENGINE -> S_IDLE
> ### So in the end nothing was actually started, but those messages
> are quite confusing.
> 
> Pacemaker version was "(version 2.0.4+20200616.2deceaa3a-3.3.1-
> 2.0.4+20200616.2deceaa3a)" on all three nodes (latest for SLES).
> 
> 
> For reference here are the primitives that had odd result:
> primitive prm_testVG_testLV_activate LVM-activate \
>         params vgname=testVG lvname=testLV vg_access_mode=lvmlockd activation_mode=shared \
>         op start timeout=90s interval=0 \
>         op stop timeout=90s interval=0 \
>         op monitor interval=60s timeout=90s \
>         meta priority=9000
> clone cln_testVG_activate prm_testVG_testLV_activate \
>         meta interleave=true priority=9800 target-role=Started
> primitive prm_lvmlockd lvmlockd \
>         op start timeout=90 interval=0 \
>         op stop timeout=100 interval=0 \
>         op monitor interval=60 timeout=90 \
>         meta priority=9800
> clone cln_lvmlockd prm_lvmlockd \
>         meta interleave=true priority=9800
> order ord_lvmlockd__lvm_activate Mandatory: cln_lvmlockd ( cln_testVG_activate )
> colocation col_lvm_activate__lvmlockd inf: ( cln_testVG_activate ) cln_lvmlockd
> ### lvmlockd similarly depends on DLM (order, colocation), so I don't
> see a problem
> 
> Finally:
> h16:~ # vgs
>   VG      #PV #LV #SN Attr   VSize   VFree
>   sys       1   3   0 wz--n- 222.50g      0
>   testVG    1   1   0 wz--ns 299.81g 289.81g
> 
> 
> Regards,
> Ulrich
> 
> 
-- 
Ken Gaillot <kgaillot at redhat.com>


