[ClusterLabs] Pacemaker multi-state resource stop not running although "pcs status" indicates "Stopped"
ChittaNagaraj, Raghav
Raghav.ChittaNagaraj at dell.com
Fri Aug 13 15:46:47 EDT 2021
Hello Team,
Hope you are doing well.
We are running into an issue with multi-state resources: when the corosync process is killed on a node, Pacemaker does not run the resource's stop function on that node, yet it fails over and starts the resource on another node in the cluster.
Note that the actual resource names and hostnames below have been changed from the originals.
Snippet of pcs status before corosync is killed:
$ hostname
pace_node_a
Snippet of "pcs status":
colocated-resource (ocf::xxx:colocated-resource): Started pace_node_a
Master/Slave Set: main-multi-state-resource [main-multi]
Masters: [ pace_node_a ]
Stopped: [ pace_node_b ]
Next, we killed the corosync process on "pace_node_a" using kill -9.
Resulting snippet of "pcs status":
colocated-resource (ocf::xxx:colocated-resource): Started pace_node_b
Master/Slave Set: main-multi-state-resource [main-multi]
Stopped: [ pace_node_a ]
Masters: [ pace_node_b ]
As you can see, pcs status indicates that "main-multi-state-resource" stopped on "pace_node_a", where corosync was killed, and started on "pace_node_b". Although the reported status is as expected, the underlying resource managed by "main-multi-state-resource" never stopped on "pace_node_a". There were also no logs from crmd or other components showing that a stop was even attempted on "pace_node_a". Interestingly, the crmd logs did indicate that the colocated resource, "colocated-resource", was being stopped, and there is evidence that the resource managed by "colocated-resource" actually stopped.
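To make the before/after comparison explicit, here is a minimal Python sketch that extracts the role reported for each node from the two snippets above. The function names (reported_roles, role_changes) are hypothetical helpers written for this mail and not part of pcs or Pacemaker, and the parsing assumes the role lines look exactly like the ones quoted here ("Masters:", "Slaves:", "Stopped:"). It only shows what the cluster reports, not what is actually running on a node, which is the discrepancy described above.

```python
import re

# Matches role lines such as "Masters: [ pace_node_a ]".
# Assumption: the status text uses the same labels as the snippets in this mail.
ROLE_LINE = re.compile(r"^\s*(Masters|Slaves|Stopped):\s*\[\s*(.*?)\s*\]\s*$")


def reported_roles(status_text):
    """Return {node: role} for the role lines found in a pcs status snippet."""
    roles = {}
    for line in status_text.splitlines():
        match = ROLE_LINE.match(line)
        if match:
            role, nodes = match.groups()
            for node in nodes.split():
                roles[node] = role
    return roles


def role_changes(before, after):
    """Return {node: (old_role, new_role)} for nodes whose reported role changed."""
    old, new = reported_roles(before), reported_roles(after)
    return {
        node: (old.get(node), new.get(node))
        for node in sorted(set(old) | set(new))
        if old.get(node) != new.get(node)
    }


# The two snippets quoted in this mail (names already anonymized).
BEFORE = """\
colocated-resource (ocf::xxx:colocated-resource): Started pace_node_a
Master/Slave Set: main-multi-state-resource [main-multi]
Masters: [ pace_node_a ]
Stopped: [ pace_node_b ]
"""

AFTER = """\
colocated-resource (ocf::xxx:colocated-resource): Started pace_node_b
Master/Slave Set: main-multi-state-resource [main-multi]
Stopped: [ pace_node_a ]
Masters: [ pace_node_b ]
"""

if __name__ == "__main__":
    for node, (old_role, new_role) in role_changes(BEFORE, AFTER).items():
        print(f"{node}: {old_role} -> {new_role}")
```

Any node that moves to "Stopped" in this output is one where we would expect the stop function to have run; on "pace_node_a" it did not.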
Is this a known issue?
Please let us know if any additional information is needed.
Thanks for your help!
-Raghav