[ClusterLabs] Missing success log message for resource migration

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Fri Feb 19 05:06:43 EST 2021


Hi!

While inspecting the logs after the cluster had rebalanced resources, I noticed something odd:
It looks as if pacemaker-controld on the DC logs a result message when a migration operation that ran locally succeeds, but not when one that ran on another node does.

Actions planned:
Migrate    prm_xen_test-jeos1            ( h16 -> h18 )
Migrate    prm_xen_test-jeos2            ( h18 -> h19 )
Migrate    prm_xen_test-jeos4            ( h16 -> h18 )

For an operation that runs locally, the log looks like this (h19 being the DC):

h19 pacemaker-controld[7539]:  notice: Initiating migrate_from operation prm_xen_test-jeos2_migrate_from_0 locally on h19
h19 pacemaker-controld[7539]:  notice: Result of migrate_from operation for prm_xen_test-jeos2 on h19: ok

But for a non-local migration it looks quite different:
h19 pacemaker-controld[7539]:  notice: Initiating migrate_to operation prm_xen_test-jeos1_migrate_to_0 on h16
h19 pacemaker-controld[7539]:  notice: Initiating migrate_to operation prm_xen_test-jeos4_migrate_to_0 on h16

h19 pacemaker-controld[7539]:  notice: Initiating migrate_from operation prm_xen_test-jeos4_migrate_from_0 on h18
h19 pacemaker-controld[7539]:  notice: Initiating migrate_from operation prm_xen_test-jeos1_migrate_from_0 on h18

h19 pacemaker-controld[7539]:  notice: Initiating stop operation prm_xen_test-jeos4_stop_0 on h16
h19 pacemaker-controld[7539]:  notice: Initiating stop operation prm_xen_test-jeos1_stop_0 on h16

h19 pacemaker-controld[7539]:  notice: Initiating monitor operation prm_xen_test-jeos4_monitor_600000 on h18
h19 pacemaker-controld[7539]:  notice: Initiating monitor operation prm_xen_test-jeos1_monitor_600000 on h18

h19 pacemaker-controld[7539]:  notice: State transition S_TRANSITION_ENGINE -> S_IDLE
(no more messages for a while)

On the remote nodes I see:
h18 pacemaker-controld[7131]:  notice: Result of migrate_from operation for prm_xen_test-jeos4 on h18: ok
h18 pacemaker-controld[7131]:  notice: Result of migrate_from operation for prm_xen_test-jeos1 on h18: ok

h16 pacemaker-controld[7223]:  notice: Result of migrate_to operation for prm_xen_test-jeos4 on h16: ok
h16 pacemaker-controld[7223]:  notice: Result of migrate_to operation for prm_xen_test-jeos1 on h16: ok
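
For now, matching each "Initiating" line from the DC with its "Result of" line has to be done across the logs of all nodes. Below is a rough Python sketch of that pairing. It assumes the exact message wording quoted above (other Pacemaker versions may word these lines differently), and the function name pair_operations is made up for illustration:

```python
import re

# Patterns modelled on the controld lines quoted in this message (an assumption,
# not an official log format specification).
INITIATED = re.compile(
    r"^(?P<host>\S+) pacemaker-controld\[\d+\]:\s+notice: "
    r"Initiating (?P<op>\S+) operation (?P<key>\S+)(?: locally)? on (?P<node>\S+)$"
)
RESULT = re.compile(
    r"^(?P<host>\S+) pacemaker-controld\[\d+\]:\s+notice: "
    r"Result of (?P<op>\S+) operation for (?P<rsc>\S+) on (?P<node>\S+): (?P<status>.+)$"
)


def pair_operations(lines):
    """Pair each initiated operation with the result line found on any node.

    Returns a list of ((resource, operation, node), result) tuples, where
    result is (logging_host, status) or None if no result line was seen.
    """
    initiated = []
    results = {}
    for raw in lines:
        line = raw.strip()
        m = INITIATED.match(line)
        if m:
            # The operation key looks like <resource>_<operation>_<interval>.
            rsc = m["key"].rpartition("_" + m["op"] + "_")[0]
            if rsc:
                initiated.append((rsc, m["op"], m["node"]))
            continue
        m = RESULT.match(line)
        if m:
            results[(m["rsc"], m["op"], m["node"])] = (m["host"], m["status"])
    return [(key, results.get(key)) for key in initiated]
```

Feeding it the concatenated logs of h16, h18 and h19 would list every initiated operation together with the host that logged its result, and None where no result line exists anywhere.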

The other thing I noticed is that a start operation is logged with execution time like this:
h19 pacemaker-execd[7536]:  notice: prm_cron_snap_test-jeos2 start (call 170, PID 11079) exited with status 0 (execution time 55ms, queue time 0ms)

But for a migration the DC logs no such timing information; it is logged only on the node that ran the operation. Considering that migrations can time out, wouldn't that be useful information to have on the DC as well?
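
As a stop-gap, the timing can be collected from the pacemaker-execd logs of each node. The sketch below is only an illustration: it assumes migration operations are logged by pacemaker-execd in the same format as the start line quoted above, with both times given in milliseconds, and the name parse_exec_line is hypothetical:

```python
import re

# Pattern modelled on the single pacemaker-execd line quoted in this message.
# Lines with a different wording or time unit simply do not match.
EXEC_LINE = re.compile(
    r"^(?P<host>\S+) pacemaker-execd\[\d+\]:\s+notice: "
    r"(?P<rsc>\S+) (?P<op>\S+) \(call (?P<call>\d+), PID (?P<pid>\d+)\) "
    r"exited with status (?P<status>\d+) "
    r"\(execution time (?P<exec_ms>\d+)ms, queue time (?P<queue_ms>\d+)ms\)$"
)


def parse_exec_line(line):
    """Return the timing fields of one execd result line, or None if it does not match."""
    m = EXEC_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "host": m["host"],
        "resource": m["rsc"],
        "operation": m["op"],
        "status": int(m["status"]),
        "exec_ms": int(m["exec_ms"]),
        "queue_ms": int(m["queue_ms"]),
    }
```

Filtering the parsed entries for the migrate_to and migrate_from operations would then show how close each migration came to its timeout.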

Regards,
Ulrich




