[ClusterLabs] corosync doesn't start any resource
Ken Gaillot
kgaillot at redhat.com
Mon Jun 18 10:03:40 EDT 2018
On Fri, 2018-06-15 at 14:45 +0200, Stefan Krueger wrote:
> Hello,
>
> corosync doesn't start any resource and I don't know why. I tried to
> stop/start the cluster, and I also tried to reboot it, but that
> doesn't help. Also, in the logs I don't find anything that could be
> useful, IMHO.
>
> It would be very nice if someone can help me.
>
> pcs status
> Cluster name: zfs-vmstorage
> Stack: corosync
> Current DC: zfs-serv3 (version 1.1.16-94ff4df) - partition with quorum
> Last updated: Fri Jun 15 14:42:32 2018
> Last change: Fri Jun 15 14:17:23 2018 by root via cibadmin on zfs-serv3
>
> 2 nodes configured
> 3 resources configured
>
> Online: [ zfs-serv3 zfs-serv4 ]
>
> Full list of resources:
>
> nfs-server (systemd:nfs-server): Stopped
> vm_storage (ocf::heartbeat:ZFS): Stopped
> ha-ip (ocf::heartbeat:IPaddr2): Stopped
>
> Daemon Status:
> corosync: active/enabled
> pacemaker: active/enabled
> pcsd: active/enabled
>
>
>
>
> pcs config
> Cluster Name: zfs-vmstorage
> Corosync Nodes:
> zfs-serv3 zfs-serv4
> Pacemaker Nodes:
> zfs-serv3 zfs-serv4
>
> Resources:
> Resource: nfs-server (class=systemd type=nfs-server)
> Operations: start interval=0s timeout=100 (nfs-server-start-interval-0s)
> stop interval=0s timeout=100 (nfs-server-stop-interval-0s)
> monitor interval=60 timeout=100 (nfs-server-monitor-interval-60)
> Resource: vm_storage (class=ocf provider=heartbeat type=ZFS)
> Attributes: pool=vm_storage importargs="-d /dev/disk/by-vdev/"
> Operations: monitor interval=5s timeout=30s (vm_storage-monitor-interval-5s)
> start interval=0s timeout=90 (vm_storage-start-interval-0s)
> stop interval=0s timeout=90 (vm_storage-stop-interval-0s)
> Resource: ha-ip (class=ocf provider=heartbeat type=IPaddr2)
> Attributes: ip=172.16.101.73 cidr_netmask=16
> Operations: start interval=0s timeout=20s (ha-ip-start-interval-0s)
> stop interval=0s timeout=20s (ha-ip-stop-interval-0s)
> monitor interval=10s timeout=20s (ha-ip-monitor-interval-10s)
>
> Stonith Devices:
> Fencing Levels:
>
> Location Constraints:
> Ordering Constraints:
> Resource Sets:
> set nfs-server vm_storage ha-ip action=start (id:pcs_rsc_set_nfs-server_vm_storage_ha-ip) (id:pcs_rsc_order_set_nfs-server_vm_storage_ha-ip)
> set ha-ip nfs-server vm_storage action=stop (id:pcs_rsc_set_ha-ip_nfs-server_vm_storage) (id:pcs_rsc_order_set_ha-ip_nfs-server_vm_storage)
> Colocation Constraints:
> Resource Sets:
> set ha-ip nfs-server vm_storage (id:colocation-ha-ip-nfs-server-INFINITY-0) setoptions score=INFINITY (id:colocation-ha-ip-nfs-server-INFINITY)
> Ticket Constraints:
>
> Alerts:
> No alerts defined
>
> Resources Defaults:
> resource-stickiness: 100
> Operations Defaults:
> No defaults set
>
> Cluster Properties:
> cluster-infrastructure: corosync
> cluster-name: zfs-vmstorage
> dc-version: 1.1.16-94ff4df
> have-watchdog: false
> last-lrm-refresh: 1528814481
> no-quorum-policy: ignore
It's recommended to let no-quorum-policy default when using corosync 2,
and instead set "two_node: 1" in corosync.conf. In the old days, it was
necessary for pacemaker to ignore quorum with two nodes, but now,
corosync handles it better. With two_node, both nodes will need to be
online before the cluster can run, but once up, either node can go down
and the cluster will maintain quorum.
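
For example, the quorum section of corosync.conf would then look
something like this (a minimal sketch; merge it into your existing
file and restart corosync on both nodes):

    quorum {
        provider: corosync_votequorum
        # two_node implies wait_for_all: both nodes must be seen once
        # at startup, after which either node alone keeps quorum
        two_node: 1
    }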
> stonith-enabled: false
Without stonith, the cluster will be unable to recover from certain
failure scenarios, and there is a possibility of data corruption from a
split-brain situation. It's a good idea to get stonith configured and
tested before adding any resources to a cluster.
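
For example, with IPMI-capable servers it could look something like
this (a sketch only; the fence agent, addresses, and credentials below
are placeholders for whatever your hardware actually provides):

    # one fence device per node; fence_ipmilan is just an example agent
    pcs stonith create fence-serv3 fence_ipmilan \
        pcmk_host_list=zfs-serv3 ipaddr=<serv3-bmc-ip> \
        login=<user> passwd=<password>
    pcs stonith create fence-serv4 fence_ipmilan \
        pcmk_host_list=zfs-serv4 ipaddr=<serv4-bmc-ip> \
        login=<user> passwd=<password>

    # re-enable stonith once the devices are in place
    pcs property set stonith-enabled=true

You can then test each device with "pcs stonith fence <node>" before
relying on it.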
>
> Quorum:
> Options:
>
>
>
> and here are the Log-files
>
> https://paste.debian.net/hidden/9376add7/
>
> best regards
> Stefan
As of the end of that log file, the cluster does intend to start the
resources:
Jun 15 14:29:11 [5623] zfs-serv3 pengine: notice: LogActions: Start nfs-server (zfs-serv3)
Jun 15 14:29:11 [5623] zfs-serv3 pengine: notice: LogActions: Start vm_storage (zfs-serv3)
Jun 15 14:29:11 [5623] zfs-serv3 pengine: notice: LogActions: Start ha-ip (zfs-serv3)
Later logs would show whether the start was successful or not.
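
If the starts are failing, you can also run an agent by hand on a node
to see its errors directly, and grep the detail log for the operation
results, e.g. (the log path here is an assumption; it varies by
distro):

    # run the resource agent in the foreground with verbose output
    pcs resource debug-start vm_storage --full

    # look for the results of the start operations on the DC
    grep -E 'nfs-server|vm_storage|ha-ip' /var/log/corosync/corosync.log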
--
Ken Gaillot <kgaillot at redhat.com>