[ClusterLabs] Antw: Resources wont start on new node unless it is the only active node
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Wed Nov 9 09:33:02 CET 2016
>>> Ryan Anstey <ryan at treasuremart.net> wrote on 08.11.2016 at 19:54 in message
<CAPj0oHxtumVgCbGO7ff6xTtd+=3FZbWiSBVmrHOCFuBRLMR-pg at mail.gmail.com>:
> I've been running a ceph cluster with pacemaker for a few months now.
> Everything has been working normally, but when I added a fourth node it
> won't work like the others, even though their OS is the same and the
Welcome to the club ;-)
> configs are all synced via salt. I also don't understand pacemaker that
> well since I followed a guide for it. If anyone could steer me in the right
> direction I would greatly appreciate it. Thank you!
I would start by examining the cluster status (I use "crm_mon -1Arfj"). Is everything online? Do you get the same status from each node?
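To compare the view from each node, here is a rough Python sketch. It assumes you have saved the one-shot crm_mon output of every node as text and that the output contains the usual "Online: [ ... ]" line; the function names and the sample node lists are made up for illustration only.

```python
import re


def online_nodes(status_text):
    """Collect the node names from every 'Online: [ ... ]' line."""
    nodes = set()
    for match in re.finditer(r"^[ \t]*Online:[ \t]*\[(.*?)\]", status_text, re.MULTILINE):
        nodes.update(match.group(1).split())
    return nodes


def views_agree(status_by_node):
    """Return (agree, views): agree is True if all nodes list the same online set."""
    views = {node: online_nodes(text) for node, text in status_by_node.items()}
    distinct = {frozenset(v) for v in views.values()}
    return len(distinct) <= 1, views


if __name__ == "__main__":
    # Hypothetical captured output, NOT taken from the poster's cluster.
    captured = {
        "h3": "Online: [ h3 ]\nOFFLINE: [ h4 ]\n",
        "h4": "Online: [ h3 h4 ]\n",
    }
    agree, views = views_agree(captured)
    for node, seen in sorted(views.items()):
        print(node, "sees online:", sorted(seen))
    print("all nodes agree:", agree)
```

If the nodes disagree about who is online, that points at membership/communication rather than at the resource agents.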
>
> - My resources only start if the new node is the only active node.
> - Once started on new node, if they are moved back to one of the original
> nodes, it won't go back to the new one.
> - My resources work 100% if I start them manually (without pacemaker).
> - (In the logs/configs below, my resources are named "unifi", "rbd_unifi"
> being the main one that's not working.)
>
> Log when cleaning up the resource on the NEW node:
>
> Nov 08 09:25:20 h4 Filesystem(fs_unifi)[18044]: WARNING: Couldn't find
> device [/dev/rbd/rbd/unifi]. Expected /dev/??? to exist
> Nov 08 09:25:20 h4 lrmd[3564]: notice: lxc_unifi_monitor_0:18018:stderr [
> unifi doesn't exist ]
> Nov 08 09:25:20 h4 crmd[3567]: notice: Operation lxc_unifi_monitor_0: not
> running (node=h4, call=484, rc=7, cib-update=390, confirmed=true)
> Nov 08 09:25:20 h4 crmd[3567]: notice: h4-lxc_unifi_monitor_0:484 [ unifi
> doesn't exist\n ]
> Nov 08 09:25:20 h4 crmd[3567]: notice: Operation fs_unifi_monitor_0: not
> running (node=h4, call=480, rc=7, cib-update=391, confirmed=true)
> Nov 08 09:25:20 h4 crmd[3567]: notice: Operation rbd_unifi_monitor_0: not
> running (node=h4, call=476, rc=7, cib-update=392, confirmed=true)
>
> Log when cleaning up the resource on the OLD node:
>
> Nov 08 09:21:18 h3 crmd[11394]: warning: No match for shutdown action on
> 167838209
This indicates a node communication problem!
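A number like 167838209 may be a corosync node ID that was auto-generated from the node's IPv4 address (this happens when no nodeid is configured). That is only an assumption about this cluster, as is the byte order used below. Under that assumption, a short Python sketch decodes it to an address you can compare against the ring addresses of your nodes:

```python
import ipaddress


def nodeid_to_ipv4(nodeid):
    """Interpret a 32-bit node ID as a big-endian IPv4 address (assumption)."""
    return str(ipaddress.IPv4Address(nodeid))


if __name__ == "__main__":
    # 167838209 is the ID from the log line quoted above.
    print(nodeid_to_ipv4(167838209))  # prints 10.1.2.1
```

If the decoded address does not belong to any of your nodes, or belongs to the new one, check the corosync configuration and membership on that node first.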
[...]
Regards,
Ulrich