[ClusterLabs] Pacemaker shutting down peer node
Ken Gaillot
kgaillot at redhat.com
Thu Jun 15 14:53:00 EDT 2017
On 06/15/2017 12:38 AM, Jaz Khan wrote:
> Hi,
>
> I have been encountering this serious issue for the past couple of months.
> I really have no idea why pacemaker sends a shutdown signal to the peer
> node and it goes down. This is very strange and I am very worried.
>
> This is not happening daily, but it does happen reliably every few
> days.
>
> Version:
> Pacemaker 1.1.16
> Corosync 2.4.2
>
> Please help me out with this bug! Below is the log message.
>
>
>
> Jun 14 15:52:23 apex1 crmd[18733]: notice: State transition S_IDLE ->
> S_POLICY_ENGINE
> Jun 14 15:52:23 apex1 pengine[18732]: notice: On loss of CCM Quorum: Ignore
>
> Jun 14 15:52:23 apex1 pengine[18732]: notice: Scheduling Node ha-apex2
> for shutdown
This is not fencing, but a clean shutdown. Normally this only happens
in response to a user request.
Check the logs on both nodes from before this point, to find the
earliest indication that the node was going to shut down.
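As a rough aid for that log search, here is a minimal Python sketch that
scans syslog-style lines and reports the first one whose message mentions
"shutdown". The function name first_shutdown_mention and the regular
expression are illustrative assumptions, not part of Pacemaker; the sample
lines are the ones quoted in this message, with the mail wrapping undone.

```python
import re

# Assumed line layout, taken from the quoted log:
#   "Jun 14 15:52:23 apex1 crmd[18733]: notice: ..."
# Lines without a "[pid]" part (such as kernel messages) are skipped.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+)\[\d+\]:\s+(?P<msg>.*)$"
)


def first_shutdown_mention(lines, keyword="shutdown"):
    """Return (stamp, host, daemon, msg) for the first matching line, else None.

    Hypothetical helper: it only does a case-insensitive substring match on
    the message text, so it is a starting point, not a diagnosis.
    """
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and keyword in m.group("msg").lower():
            return (m.group("stamp"), m.group("host"),
                    m.group("daemon"), m.group("msg"))
    return None


sample = [
    "Jun 14 15:52:23 apex1 crmd[18733]: notice: State transition S_IDLE -> S_POLICY_ENGINE",
    "Jun 14 15:52:23 apex1 pengine[18732]: notice: On loss of CCM Quorum: Ignore",
    "Jun 14 15:52:23 apex1 pengine[18732]: notice: Scheduling Node ha-apex2 for shutdown",
]

print(first_shutdown_mention(sample))
```

To use it on real data, pass the lines of each node's system log (the
location depends on your syslog configuration) and compare the timestamps
of the first hit on each node; whatever is logged just before that hit is
the place to look.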
>
> Jun 14 15:52:23 apex1 pengine[18732]: notice: Move vip#011(Started
> ha-apex2 -> ha-apex1)
> Jun 14 15:52:23 apex1 pengine[18732]: notice: Move
> filesystem#011(Started ha-apex2 -> ha-apex1)
> Jun 14 15:52:23 apex1 pengine[18732]: notice: Move samba#011(Started
> ha-apex2 -> ha-apex1)
> Jun 14 15:52:23 apex1 pengine[18732]: notice: Move
> database#011(Started ha-apex2 -> ha-apex1)
> Jun 14 15:52:23 apex1 pengine[18732]: notice: Calculated transition
> 1744, saving inputs in /var/lib/pacemaker/pengine/pe-input-123.bz2
> Jun 14 15:52:23 apex1 crmd[18733]: notice: Initiating stop operation
> vip_stop_0 on ha-apex2
> Jun 14 15:52:23 apex1 crmd[18733]: notice: Initiating stop operation
> samba_stop_0 on ha-apex2
> Jun 14 15:52:23 apex1 crmd[18733]: notice: Initiating stop operation
> database_stop_0 on ha-apex2
> Jun 14 15:52:26 apex1 crmd[18733]: notice: Initiating stop operation
> filesystem_stop_0 on ha-apex2
> Jun 14 15:52:27 apex1 kernel: drbd apexdata apex2.br:
> peer( Primary -> Secondary )
> Jun 14 15:52:27 apex1 crmd[18733]: notice: Initiating start operation
> filesystem_start_0 locally on ha-apex1
>
> Jun 14 15:52:27 apex1 crmd[18733]: notice: do_shutdown of peer ha-apex2
> is complete
>
> Jun 14 15:52:27 apex1 attrd[18731]: notice: Node ha-apex2 state is now lost
> Jun 14 15:52:27 apex1 attrd[18731]: notice: Removing all ha-apex2
> attributes for peer loss
> Jun 14 15:52:27 apex1 attrd[18731]: notice: Lost attribute writer ha-apex2
> Jun 14 15:52:27 apex1 attrd[18731]: notice: Purged 1 peers with id=2
> and/or uname=ha-apex2 from the membership cache
> Jun 14 15:52:27 apex1 stonith-ng[18729]: notice: Node ha-apex2 state is
> now lost
> Jun 14 15:52:27 apex1 stonith-ng[18729]: notice: Purged 1 peers with
> id=2 and/or uname=ha-apex2 from the membership cache
> Jun 14 15:52:27 apex1 cib[18728]: notice: Node ha-apex2 state is now lost
> Jun 14 15:52:27 apex1 cib[18728]: notice: Purged 1 peers with id=2
> and/or uname=ha-apex2 from the membership cache
>
>
>
> Best regards,
> Jaz. K