[ClusterLabs] Missing success log message for resource migration
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Fri Feb 19 05:06:43 EST 2021
Hi!
Inspecting the logs after the cluster had rebalanced resources, I'm wondering:
It looks as if pacemaker-controld on the DC logs a success message when a migration it runs locally succeeds, but not when a migration on another node does.
Actions planned:
Migrate prm_xen_test-jeos1 ( h16 -> h18 )
Migrate prm_xen_test-jeos2 ( h18 -> h19 )
Migrate prm_xen_test-jeos4 ( h16 -> h18 )
A local migration looks like this (h19 being the DC):
h19 pacemaker-controld[7539]: notice: Initiating migrate_from operation prm_xen_test-jeos2_migrate_from_0 locally on h19
h19 pacemaker-controld[7539]: notice: Result of migrate_from operation for prm_xen_test-jeos2 on h19: ok
But for a non-local migration it looks quite different:
h19 pacemaker-controld[7539]: notice: Initiating migrate_to operation prm_xen_test-jeos1_migrate_to_0 on h16
h19 pacemaker-controld[7539]: notice: Initiating migrate_to operation prm_xen_test-jeos4_migrate_to_0 on h16
h19 pacemaker-controld[7539]: notice: Initiating migrate_from operation prm_xen_test-jeos4_migrate_from_0 on h18
h19 pacemaker-controld[7539]: notice: Initiating migrate_from operation prm_xen_test-jeos1_migrate_from_0 on h18
h19 pacemaker-controld[7539]: notice: Initiating stop operation prm_xen_test-jeos4_stop_0 on h16
h19 pacemaker-controld[7539]: notice: Initiating stop operation prm_xen_test-jeos1_stop_0 on h16
h19 pacemaker-controld[7539]: notice: Initiating monitor operation prm_xen_test-jeos4_monitor_600000 on h18
h19 pacemaker-controld[7539]: notice: Initiating monitor operation prm_xen_test-jeos1_monitor_600000 on h18
h19 pacemaker-controld[7539]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
(no more messages for a while)
On the remote nodes I see:
h18 pacemaker-controld[7131]: notice: Result of migrate_from operation for prm_xen_test-jeos4 on h18: ok
h18 pacemaker-controld[7131]: notice: Result of migrate_from operation for prm_xen_test-jeos1 on h18: ok
h16 pacemaker-controld[7223]: notice: Result of migrate_to operation for prm_xen_test-jeos4 on h16: ok
h16 pacemaker-controld[7223]: notice: Result of migrate_to operation for prm_xen_test-jeos1 on h16: ok
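To check this over a longer log, one could pair each "Initiating" line with a "Result of" line from the same host. This is only a rough sketch: the regular expressions assume the message wording is exactly as quoted in this mail, and the function name unmatched_operations is made up for illustration, not part of any Pacemaker tool.

```python
import re

# Both patterns assume the exact message wording quoted in this mail.
INIT_RE = re.compile(
    r"Initiating (?P<op>\w+) operation "
    r"(?P<rsc>\S+?)_(?P=op)_\d+(?: locally)? on (?P<node>\S+)"
)
RESULT_RE = re.compile(
    r"Result of (?P<op>\w+) operation for (?P<rsc>\S+) on (?P<node>\S+): \S+"
)


def unmatched_operations(lines, logging_host):
    """Return (op, resource, node) tuples that logging_host initiated
    but for which logging_host itself logged no result line."""
    initiated = []
    results = set()
    for line in lines:
        # Only look at messages written by the host of interest (e.g. the DC).
        if not line.startswith(logging_host + " "):
            continue
        m = INIT_RE.search(line)
        if m:
            initiated.append((m["op"], m["rsc"], m["node"]))
            continue
        m = RESULT_RE.search(line)
        if m:
            results.add((m["op"], m["rsc"], m["node"]))
    return [key for key in initiated if key not in results]
```

Fed with the h19 lines above, this would report every operation sent to h16 and h18, but not the migrate_from that ran locally on h19.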
The other thing I noticed is that a start operation is logged with execution time like this:
h19 pacemaker-execd[7536]: notice: prm_cron_snap_test-jeos2 start (call 170, PID 11079) exited with status 0 (execution time 55ms, queue time 0ms)
But for a migration the DC logs no such timing info; it is only logged on the remote node. Considering that migrations can time out, wouldn't that be useful information to have as well?
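As a workaround, the timing could be collected from the pacemaker-execd lines on each node. Again only a sketch: the pattern assumes the format of the start message quoted above (times in ms), and parse_execd_timing is a name made up for illustration.

```python
import re

# Assumes the pacemaker-execd message format quoted above, with times in ms.
EXECD_RE = re.compile(
    r"^(?P<host>\S+) pacemaker-execd\[\d+\]: notice: "
    r"(?P<rsc>\S+) (?P<op>\S+) \(call \d+, PID \d+\) "
    r"exited with status \d+ "
    r"\(execution time (?P<exec_ms>\d+)ms, queue time (?P<queue_ms>\d+)ms\)"
)


def parse_execd_timing(line):
    """Return host, resource, operation and timings from one execd line,
    or None if the line does not match the assumed format."""
    m = EXECD_RE.search(line)
    if m is None:
        return None
    return {
        "host": m["host"],
        "resource": m["rsc"],
        "op": m["op"],
        "exec_ms": int(m["exec_ms"]),
        "queue_ms": int(m["queue_ms"]),
    }
```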
Regards,
Ulrich