[ClusterLabs] How many nodes does a Red Hat cluster support?
Umar Draz
unix.co at gmail.com
Wed Apr 27 15:25:37 EDT 2022
Hi
I am running a 3-node cluster on AWS VMs, where I plan to use all 3 nodes
for my websites. The issue is that only 2 nodes at a time can mount the LVM
volume, not all 3. Here is the pcs status output:
[root at g2fs-1 ~]# pcs status --full
Cluster name: wp-cluster
Cluster Summary:
  * Stack: corosync
  * Current DC: g2fs-1 (1) (version 2.1.2-4.el8-ada5c3b36e2) - partition with quorum
  * Last updated: Wed Apr 27 19:12:48 2022
  * Last change: Tue Apr 26 01:07:34 2022 by root via cibadmin on g2fs-1
  * 3 nodes configured
  * 13 resource instances configured

Node List:
  * Online: [ g2fs-1 (1) g2fs-2 (2) g2fs-3 (3) ]
Full List of Resources:
  * Clone Set: locking-clone [locking]:
    * Resource Group: locking:0:
      * dlm (ocf::pacemaker:controld): Started g2fs-1
      * lvmlockd (ocf::heartbeat:lvmlockd): Started g2fs-1
    * Resource Group: locking:1:
      * dlm (ocf::pacemaker:controld): Started g2fs-3
      * lvmlockd (ocf::heartbeat:lvmlockd): Started g2fs-3
    * Resource Group: locking:2:
      * dlm (ocf::pacemaker:controld): Started g2fs-2
      * lvmlockd (ocf::heartbeat:lvmlockd): Started g2fs-2
  * Clone Set: shared_vg1-clone [shared_vg1]:
    * Resource Group: shared_vg1:0:
      * sharedlv1 (ocf::heartbeat:LVM-activate): Started g2fs-3
      * sharedfs1 (ocf::heartbeat:Filesystem): Started g2fs-3
    * Resource Group: shared_vg1:1:
      * sharedlv1 (ocf::heartbeat:LVM-activate): Started g2fs-2
      * sharedfs1 (ocf::heartbeat:Filesystem): Started g2fs-2
    * Resource Group: shared_vg1:2:
      * sharedlv1 (ocf::heartbeat:LVM-activate): Stopped
      * sharedfs1 (ocf::heartbeat:Filesystem): Stopped
  * wpfence (stonith:fence_aws): Started g2fs-1
Migration Summary:
  * Node: g2fs-1 (1):
    * sharedfs1: migration-threshold=1000000 fail-count=1000000 last-failure='Tue Apr 26 01:07:46 2022'

Failed Resource Actions:
  * sharedfs1_start_0 on g2fs-1 'error' (1): call=158, status='complete', exitreason='Couldn't mount device [/dev/shared_vg1/shared_lv1] as /mnt/webgfs', last-rc-change='Tue Apr 26 01:07:45 2022', queued=0ms, exec=806ms
Tickets:

PCSD Status:
  g2fs-1: Online
  g2fs-2: Online
  g2fs-3: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root at g2fs-1 ~]#
Now if I just stop g2fs-2 or g2fs-3, then node g2fs-1 successfully mounts the
lvm volume; but if I power g2fs-3 back on, g2fs-3 will not mount the lvm
volume until I shut down either g2fs-2 or g2fs-1.
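One thing I have not checked yet: as far as I understand, GFS2 only allows as
many simultaneous mounts as the filesystem has journals (mkfs.gfs2 -j N sets
the count at creation time), so if shared_lv1 was created with only 2 journals
that would match exactly what I am seeing. This is roughly how I would check
and, if needed, add a journal (a sketch only; the device and mount point are
the ones from my config, and gfs2_jadd has to be run against a GFS2 filesystem
that is already mounted):

# list the journals the filesystem currently has
gfs2_edit -p jindex /dev/shared_vg1/shared_lv1 | grep journal

# if only journal0 and journal1 show up, add one more
# (run on a node where /mnt/webgfs is currently mounted)
gfs2_jadd -j 1 /mnt/webgfs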
Here is the resource config:
[root at g2fs-1 ~]# pcs resource config
Clone: locking-clone
  Meta Attrs: interleave=true
  Group: locking
    Resource: dlm (class=ocf provider=pacemaker type=controld)
      Operations: monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
                  start interval=0s timeout=90s (dlm-start-interval-0s)
                  stop interval=0s timeout=100s (dlm-stop-interval-0s)
    Resource: lvmlockd (class=ocf provider=heartbeat type=lvmlockd)
      Operations: monitor interval=30s on-fail=fence (lvmlockd-monitor-interval-30s)
                  start interval=0s timeout=90s (lvmlockd-start-interval-0s)
                  stop interval=0s timeout=90s (lvmlockd-stop-interval-0s)
Clone: shared_vg1-clone
  Meta Attrs: interleave=true
  Group: shared_vg1
    Resource: sharedlv1 (class=ocf provider=heartbeat type=LVM-activate)
      Attributes: activation_mode=shared lvname=shared_lv1 vg_access_mode=lvmlockd vgname=shared_vg1
      Operations: monitor interval=30s timeout=90s (sharedlv1-monitor-interval-30s)
                  start interval=0s timeout=90s (sharedlv1-start-interval-0s)
                  stop interval=0s timeout=90s (sharedlv1-stop-interval-0s)
    Resource: sharedfs1 (class=ocf provider=heartbeat type=Filesystem)
      Attributes: device=/dev/shared_vg1/shared_lv1 directory=/mnt/webgfs fstype=gfs2 options=noatime
      Operations: monitor interval=10s on-fail=fence (sharedfs1-monitor-interval-10s)
                  start interval=0s timeout=60s (sharedfs1-start-interval-0s)
                  stop interval=0s timeout=60s (sharedfs1-stop-interval-0s)
[root at g2fs-1 ~]#
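(For reference, the configuration above corresponds to roughly the following
pcs commands, along the lines of the usual RHEL 8 shared-GFS2 procedure; the
names and options are taken from my config dump, so treat this as a sketch
rather than the exact command history:)

pcs resource create dlm --group locking ocf:pacemaker:controld op monitor interval=30s on-fail=fence
pcs resource create lvmlockd --group locking ocf:heartbeat:lvmlockd op monitor interval=30s on-fail=fence
pcs resource clone locking interleave=true
pcs resource create sharedlv1 --group shared_vg1 ocf:heartbeat:LVM-activate lvname=shared_lv1 vgname=shared_vg1 activation_mode=shared vg_access_mode=lvmlockd
pcs resource create sharedfs1 --group shared_vg1 ocf:heartbeat:Filesystem device="/dev/shared_vg1/shared_lv1" directory="/mnt/webgfs" fstype="gfs2" options=noatime op monitor interval=10s on-fail=fence
pcs resource clone shared_vg1 interleave=true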
And here is the stonith config:
Resource: wpfence (class=stonith type=fence_aws)
  Attributes: access_key=AKIA5CLSSLOEXEKUNMXI
              pcmk_host_map=g2fs-1:i-021b24d1343c1d5ea;g2fs-2:i-0015e5229b0139462;g2fs-3:i-0381c42de4515696f
              pcmk_reboot_retries=4 pcmk_reboot_timeout=480 power_timeout=240
              region=us-east-1 secret_key=IWKug66AZwb/q7PM00bJb2QtGlfceumdz3eO8TIF
  Operations: monitor interval=60s (wpfence-monitor-interval-60s)
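(The wpfence resource is running on g2fs-1; if fencing needs to be verified,
it can be exercised manually with a command like the one below, which should
hard-reboot the target node through the AWS API. Any node name from
pcmk_host_map works here; g2fs-3 is just an example:)

pcs stonith fence g2fs-3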
So the question is: does a Red Hat cluster only support mounting on 2 nodes at
a time, or have I not configured it properly?