[ClusterLabs] resource going to blocked status while we restart service via systemctl twice

Klaus Wenninger kwenning at redhat.com
Mon Apr 17 05:16:34 EDT 2023


On Mon, Apr 17, 2023 at 9:25 AM S Sathish S via Users <users at clusterlabs.org>
wrote:

> Hi Team,
>
>
>
> The TEST_node1 resource goes into blocked status when we restart the service
> via systemctl twice in quick succession, i.e. the second restart is issued
> before the first systemctl command has completed.
>
> In the older Pacemaker version 2.0.2 we don't see this issue; we are only
> observing it on the latest Pacemaker version 2.1.5.
>

I'm not sure which change in particular in 2.1.5 would have created the
behavioral change in your configuration. (I remember a discussion about
reacting to systemd events in Pacemaker, but on a quick check of the
sources I didn't find anything already implemented.)
But basically, AFAIK you are not expected to interfere with resources that
are under Pacemaker's control via anything other than the Pacemaker
administration tooling, high- or low-level (e.g. pcs, crmsh, crm_resource, ...).
Otherwise you will see unexpected behavior. That said, if you manage to do
the restart within a single monitoring interval on the Pacemaker side, you
may get away without any impact there.
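
For example, a restart driven through Pacemaker rather than systemd could
look like this (just a sketch, using the resource name from your output
below):

  # restart the resource under Pacemaker's control
  pcs resource restart TEST_node1

  # or, with the lower-level tooling
  crm_resource --restart --resource TEST_node1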

Klaus


>
> [root@node1 ~]# pcs resource status TEST_node1
>   * TEST_node1      (ocf::provider:TEST_RA):  Started node1
> [root@node1 ~]# systemctl restart TESTec
> [root@node1 ~]# cat /var/pid/TEST.pid
> 271466
> [root@node1 ~]# systemctl restart TESTec
> [root@node1 ~]# cat /var/pid/TEST.pid
> 271466
> [root@node1 ~]# pcs resource status TEST_node1
>   * TEST_node1      (ocf::provider:TEST_RA):  FAILED node1 (blocked)
> [root@node1 ~]#
>
> [root@node1 ~]# pcs resource config TEST_node1
> Resource: TEST_node1 (class=ocf provider=provider type=TEST_RA)
>   Meta Attributes: TEST_node1-meta_attributes
>     failure-timeout=120s
>     migration-threshold=5
>     priority=60
>   Operations:
>     migrate_from: TEST_node1-migrate_from-interval-0s
>       interval=0s
>       timeout=20
>     migrate_to: TEST_node1-migrate_to-interval-0s
>       interval=0s
>       timeout=20
>     monitor: TEST_node1-monitor-interval-10s
>       interval=10s
>       timeout=120s
>       on-fail=restart
>     reload: TEST_node1-reload-interval-0s
>       interval=0s
>       timeout=20
>     start: TEST_node1-start-interval-0s
>       interval=0s
>       timeout=120s
>       on-fail=restart
>     stop: TEST_node1-stop-interval-0s
>       interval=0s
>       timeout=120s
>       on-fail=block
> [root@node1 ~]#
>
> Thanks and Regards,
>
> S Sathish S
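
As a side note on the "blocked" status: the configuration above sets
on-fail=block for the stop operation, so when Pacemaker's recovery attempt
(presumably triggered by a failed monitor after the manual restart) cannot
stop the resource cleanly, Pacemaker stops touching the resource until the
failure is cleared manually. A minimal recovery sketch, again using the
resource name from the post:

  # clear the failure history so Pacemaker manages the resource again
  pcs resource cleanup TEST_node1

  # or, with the lower-level tooling
  crm_resource --cleanup --resource TEST_node1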