[ClusterLabs] pacemaker-controld getting respawned

S Sathish S s.s.sathish at ericsson.com
Fri Jan 3 08:23:48 EST 2020


Hi Team,

The pacemaker-controld process is being restarted frequently. The reason for the failure appears to be a disconnect from the CIB, an internal error, or high CPU load on the system; the errors are recorded in our system logs. The Pacemaker and Corosync versions installed on the system are listed below.

Kindly let us know why we are getting the errors below on this system.

corosync-2.4.4 -->  https://github.com/corosync/corosync/tree/v2.4.4
pacemaker-2.0.2 --> https://github.com/ClusterLabs/pacemaker/tree/Pacemaker-2.0.2

[root@vmc0621 ~]# ps -eo pid,lstart,cmd  | grep -iE 'corosync|pacemaker' | grep -v grep
2039 Wed Dec 25 15:56:15 2019 corosync
3048 Wed Dec 25 15:56:15 2019 /usr/sbin/pacemakerd -f
3101 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-based
3102 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-fenced
3103 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-execd
3104 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-attrd
3105 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-schedulerd
25371 Tue Dec 31 17:38:53 2019 /usr/libexec/pacemaker/pacemaker-controld
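
The listing above can also be checked mechanically. The following is a minimal Python sketch, not part of Pacemaker, that parses `ps -eo pid,lstart,cmd` output and reports every process that started later than pacemakerd, which is what a respawned child looks like. The names PS_OUTPUT, parse_ps and started_after_parent are hypothetical, and the sketch assumes the default C/English locale for day and month names.

```python
import os
from datetime import datetime

# Sample copied from the `ps -eo pid,lstart,cmd` listing in this post.
PS_OUTPUT = """\
2039 Wed Dec 25 15:56:15 2019 corosync
3048 Wed Dec 25 15:56:15 2019 /usr/sbin/pacemakerd -f
3101 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-based
3102 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-fenced
3103 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-execd
3104 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-attrd
3105 Wed Dec 25 15:56:15 2019 /usr/libexec/pacemaker/pacemaker-schedulerd
25371 Tue Dec 31 17:38:53 2019 /usr/libexec/pacemaker/pacemaker-controld
"""


def parse_ps(text):
    """Return (pid, start_time, command) tuples, one per well-formed line."""
    rows = []
    for line in text.splitlines():
        # pid, five lstart fields (weekday month day time year), then the command.
        fields = line.split(None, 6)
        if len(fields) < 7:
            continue
        started = datetime.strptime(" ".join(fields[1:6]), "%a %b %d %H:%M:%S %Y")
        rows.append((int(fields[0]), started, fields[6]))
    return rows


def started_after_parent(rows, parent="pacemakerd"):
    """Return rows whose start time is later than that of the parent daemon."""
    parent_start = next(
        start for _, start, cmd in rows
        if os.path.basename(cmd.split()[0]) == parent
    )
    return [row for row in rows if row[1] > parent_start]


if __name__ == "__main__":
    for pid, start, cmd in started_after_parent(parse_ps(PS_OUTPUT)):
        print(pid, start.isoformat(sep=" "), cmd)
```

On the sample data only pacemaker-controld (PID 25371) is reported, since every other daemon still carries the original Dec 25 start time.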


In the system message logs:

Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Node update 4419 failed: Timer expired (-62)
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Node update 4420 failed: Timer expired (-62)
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Input I_ERROR received in state S_IDLE from crmd_node_update_complete
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: State transition S_IDLE -> S_RECOVERY
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: warning: Fast-tracking shutdown in response to errors
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: warning: Not voting in election, we're in state S_RECOVERY
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Input I_ERROR received in state S_RECOVERY from node_list_update_callback
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Input I_TERMINATE received in state S_RECOVERY from do_recover
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Stopped 0 recurring operations at shutdown (12 remaining)
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:241 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:261 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:249 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:258 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:253 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:250 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:244 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_OCC:237 (XXX_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:264 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:270 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:238 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Recurring action XXX_vmc0621:267 (XXX_vmc0621_monitor_10000) incomplete at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: 12 resources were active at shutdown
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Disconnected from the executor
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Disconnected from Corosync
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: Disconnected from the CIB manager
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Could not recover from internal error
Dec 30 10:02:37 vmc0621 pacemakerd[3048]: error: pacemaker-controld[7517] exited with status 1 (Error occurred)
Dec 30 10:02:37 vmc0621 pacemakerd[3048]: notice: Respawning failed child process: pacemaker-controld
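
To make the order of events in the excerpt above easier to follow, here is a minimal Python sketch that splits such syslog lines into fields, counts messages per severity, and picks out the first error and any respawn notices. The regular expression is an assumption based only on the line layout shown in this post, and the names LINE_RE, SAMPLE, parse and summarize are hypothetical; it is not an official Pacemaker tool.

```python
import re
from collections import Counter

# Assumed layout: "<Mon DD HH:MM:SS> <host> <daemon>[<pid>]: <level>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+)\[(?P<pid>\d+)\]:\s+(?P<level>\w+):\s+(?P<msg>.*)$"
)

# A few lines copied verbatim from the log excerpt in this post.
SAMPLE = """\
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Node update 4419 failed: Timer expired (-62)
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Input I_ERROR received in state S_IDLE from crmd_node_update_complete
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: notice: State transition S_IDLE -> S_RECOVERY
Dec 30 10:02:37 vmc0621 pacemaker-controld[7517]: error: Could not recover from internal error
Dec 30 10:02:37 vmc0621 pacemakerd[3048]: error: pacemaker-controld[7517] exited with status 1 (Error occurred)
Dec 30 10:02:37 vmc0621 pacemakerd[3048]: notice: Respawning failed child process: pacemaker-controld
"""


def parse(text):
    """Return one dict per line that matches the assumed layout."""
    matches = (LINE_RE.match(line) for line in text.splitlines())
    return [m.groupdict() for m in matches if m]


def summarize(entries):
    """Return (severity counts, first error entry or None, respawn entries)."""
    levels = Counter(e["level"] for e in entries)
    first_error = next((e for e in entries if e["level"] == "error"), None)
    respawns = [
        e for e in entries
        if e["msg"].startswith("Respawning failed child process")
    ]
    return levels, first_error, respawns


if __name__ == "__main__":
    levels, first_error, respawns = summarize(parse(SAMPLE))
    print(dict(levels))
    print("first error:", first_error["msg"])
    print("respawn notices:", len(respawns))
```

In the sample, the earliest error is the "Node update 4419 failed: Timer expired (-62)" line, so the timed-out node update looks like the place to start; the later errors and the respawn follow from it in the same second.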

Please let us know if any further logs are required from our end.

Thanks and Regards,
S Sathish S