<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hello, dear friends,</p>
<p>I need your help to get hot failover of an iSCSI target working under
a Pacemaker/Corosync cluster, where the iSCSI device is backed by a
two-node DRBD replication.</p>
<p>I've got the Pacemaker/Corosync cluster working and the DRBD
replication working as well, but I'm stuck at the iSCSI part. I can
manually start tgtd on one node, so the VCSA recognizes the iSCSI disk,
I can create a VMFS/storage object on it, and then create a test VM on
that VMFS. <br>
</p>
<p>But when I switch the DRBD Primary/Secondary roles, the test VM keeps
running while its underlying disk becomes read-only. As far as I know,
tgtd should be handled by Pacemaker so that it is automatically started
on the current DRBD Primary, but in my setup it sadly is NOT.</p>
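<p>(To be precise about what "switch" means here: I change the DRBD roles
by hand, e.g. by putting the current Primary node into standby or with
drbdadm directly, and then start tgtd manually on the new Primary. The
commands below are only a rough sketch of that manual procedure, in case
the procedure itself is already wrong:)</p>
<p># e.g. via Pacemaker:<br>
pcs cluster standby drbd0-ha.s-ka.local<br>
# or directly with DRBD (when Pacemaker is not managing the roles):<br>
drbdadm secondary iscsivg01   # on the current Primary<br>
drbdadm primary iscsivg01     # on the former Secondary<br>
systemctl start tgtd          # manually, on the new Primary<br>
</p>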
<p><br>
</p>
<p>I've gone through all kinds of resources/manuals/documents, but they
are all mixed up with extra information, other systems, or other
software versions.<br>
</p>
<p>My BEST reference so far (the configuration closest to mine) is this
URL: <a class="moz-txt-link-freetext" href="https://nnc3.com/mags/LJ_1994-2014/LJ/217/11275.html">https://nnc3.com/mags/LJ_1994-2014/LJ/217/11275.html</a></p>
<p>The differences between my setup and that article are, I think, that I
don't have an LVM volume on top of DRBD but only a raw iSCSI disk, and
that I have to translate the CRM commands into PCS commands.</p>
<p>But after i "copied" the configuration from this article, my
cluster can not start anymore, i've tried remove the LVM resource
(which caused a "device not found" error), but the resource group
still can't start and without any explicit "reason" from
Pacemaker.<br>
</p>
<p><br>
</p>
<font size="+3"><b>1</b></font>. The whole configuration is under a
two node ESXi 6.5 Cluster, which has a VCSA one one ESXi host
installed. <br>
<p>I have attached a simple diagram, which may describe the deployment
better.</p>
<p><font size="+3">2</font>. start point:<br>
</p>
<p>The involved hosts are all mapped through local DNS, which also
includes the floating VIP; the local domain is s-ka.local:<br>
</p>
<hr width="100%" size="2">
<p>firewall: fw01.s-ka.local. IN A 192.168.95.249<br>
<br>
vcsa: vc01.s-ka.local. IN A 192.168.95.30 <br>
esxi: esx01.s-ka.local. IN A 192.168.95.5<br>
esxi: esx02.s-ka.local. IN A 192.168.95.7<br>
<br>
drbd: drbd0.s-ka.local. IN A 192.168.95.45<br>
drbd: drbd1.s-ka.local. IN A 192.168.95.47<br>
vip: ipstor0.s-ka.local. IN A 192.168.95.48<br>
<br>
heartbeat: drbd0-ha.s-ka.local. IN A 192.168.96.45<br>
heartbeat: drbd1-ha.s-ka.local. IN A 192.168.96.47<br>
</p>
<hr width="100%" size="2">
<p><br>
</p>
<p>Both DRBD servers are CentOS 7.5; the installed packages are listed
here:</p>
<hr width="100%" size="2">
<p>[root@drbd0 ~]# cat /etc/centos-release<br>
CentOS Linux release 7.5.1804 (Core) <br>
</p>
<p>[root@drbd0 ~]# uname -a<br>
Linux drbd0.s-ka.local 3.10.0-862.9.1.el7.x86_64 #1 SMP Mon Jul 16
16:29:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux<br>
</p>
<p>[root@drbd1 ~]# yum list installed|grep pacemaker<br>
pacemaker.x86_64 1.1.18-11.el7_5.3 @updates<br>
pacemaker-cli.x86_64 1.1.18-11.el7_5.3 @updates<br>
pacemaker-cluster-libs.x86_64 1.1.18-11.el7_5.3 @updates<br>
pacemaker-libs.x86_64 1.1.18-11.el7_5.3 @updates<br>
</p>
<p>[root@drbd1 ~]# yum list installed|grep coro<br>
corosync.x86_64 2.4.3-2.el7_5.1 @updates<br>
corosynclib.x86_64 2.4.3-2.el7_5.1 @updates<br>
</p>
<p>[root@drbd1 ~]# yum list installed|grep drbd<br>
drbd90-utils.x86_64 9.3.1-1.el7.elrepo @elrepo<br>
kmod-drbd90.x86_64 9.0.14-1.el7_5.elrepo @elrepo<br>
</p>
<p>[root@drbd1 ~]# yum list installed|grep -i scsi<br>
lsscsi.x86_64 0.27-6.el7 @anaconda<br>
scsi-target-utils.x86_64 1.0.55-4.el7 @epel<br>
<br>
</p>
<hr width="100%" size="2">
<p><br>
</p>
<p><font size="+3">3</font>. configurations</p>
<p><font size="+2">3.1</font> ok first the drbd configuration</p>
<hr width="100%" size="2">
<p>[root@drbd1 ~]# cat /etc/drbd.conf <br>
# You can find an example in
/usr/share/doc/drbd.../drbd.conf.example<br>
<br>
include "drbd.d/global_common.conf";<br>
include "drbd.d/*.res";</p>
<p>[root@drbd1 ~]# cat /etc/drbd.d/r0.res <br>
resource iscsivg01 {<br>
protocol C;<br>
device /dev/drbd0;<br>
disk /dev/vg0/ipstor0;<br>
flexible-meta-disk internal;<br>
on drbd0.s-ka.local {<br>
#volume 0 {<br>
#device /dev/drbd0;<br>
#disk /dev/vg0/ipstor0;<br>
#flexible-meta-disk internal;<br>
#}<br>
address 192.168.96.45:7788;<br>
}<br>
on drbd1.s-ka.local {<br>
#volume 0 {<br>
#device /dev/drbd0;<br>
#disk /dev/vg0/ipstor0;<br>
#flexible-meta-disk internal;<br>
#}<br>
address 192.168.96.47:7788;<br>
}<br>
}<br>
<br>
</p>
<hr width="100%" size="2">
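<p>(For completeness, the DRBD resource was initialized in the usual way;
the commands below are quoted from memory, so please treat them only as a
rough sketch of what I ran:)</p>
<p># on both nodes:<br>
drbdadm create-md iscsivg01<br>
drbdadm up iscsivg01<br>
# on drbd0 only, to kick off the initial sync:<br>
drbdadm primary --force iscsivg01<br>
</p>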
<p><font size="+2">3.2</font> then the drbd device</p>
<hr width="100%" size="2">
<p>[root@drbd1 ~]# lsblk<br>
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT<br>
sda 8:0 0 25G 0 disk <br>
├─sda1 8:1 0 1G 0 part /boot<br>
└─sda2 8:2 0 24G 0 part <br>
├─centos-root 253:0 0 22G 0 lvm /<br>
└─centos-swap 253:1 0 2G 0 lvm [SWAP]<br>
sdb 8:16 0 500G 0 disk <br>
└─sdb1 8:17 0 500G 0 part <br>
└─vg0-ipstor0 253:2 0 500G 0 lvm <br>
└─drbd0 147:0 0 500G 1 disk <br>
sr0 11:0 1 1024M 0 rom <br>
</p>
<p>[root@drbd1 ~]# tree /dev/drbd<br>
/dev/drbd<br>
├── by-disk<br>
│ └── vg0<br>
│ └── ipstor0 -> ../../../drbd0<br>
└── by-res<br>
└── iscsivg01<br>
└── 0 -> ../../../drbd0<br>
<br>
4 directories, 2 files<br>
</p>
<hr width="100%" size="2">
<p><font size="+2">3.3</font>drbd status</p>
<hr width="100%" size="2">
<p>[root@drbd1 ~]# drbdadm status<br>
iscsivg01 role:Secondary<br>
disk:UpToDate<br>
drbd0.s-ka.local role:Primary<br>
peer-disk:UpToDate</p>
<p>[root@drbd0 ~]# drbdadm status<br>
iscsivg01 role:Primary<br>
disk:UpToDate<br>
drbd1.s-ka.local role:Secondary<br>
peer-disk:UpToDate</p>
<p>[root@drbd0 ~]# cat /proc/drbd<br>
version: 9.0.14-1 (api:2/proto:86-113)<br>
GIT-hash: 62f906cf44ef02a30ce0c148fec223b40c51c533 build by
mockbuild@, 2018-05-04 03:32:42<br>
Transports (api:16): tcp (9.0.14-1)<br>
</p>
<hr width="100%" size="2">
<p><font size="+2">3.4</font> Corosync configuration</p>
<hr width="100%" size="2">
<p>[root@drbd0 corosync]# cat /etc/corosync/corosync.conf<br>
totem {<br>
version: 2<br>
cluster_name: cluster1<br>
secauth: off<br>
transport: udpu<br>
}<br>
<br>
nodelist {<br>
node {<br>
ring0_addr: drbd0-ha.s-ka.local<br>
nodeid: 1<br>
}<br>
<br>
node {<br>
ring0_addr: drbd1-ha.s-ka.local<br>
nodeid: 2<br>
}<br>
}<br>
<br>
quorum {<br>
provider: corosync_votequorum<br>
two_node: 1<br>
}<br>
<br>
logging {<br>
to_logfile: yes<br>
logfile: /var/log/cluster/corosync.log<br>
to_syslog: yes<br>
}<br>
</p>
<hr width="100%" size="2">
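<p>(For reference, this corosync.conf was generated by pcs when I created
the cluster; roughly the following commands, again quoted from memory:)</p>
<p>pcs cluster auth drbd0-ha.s-ka.local drbd1-ha.s-ka.local -u hacluster<br>
pcs cluster setup --name cluster1 drbd0-ha.s-ka.local drbd1-ha.s-ka.local<br>
pcs cluster start --all<br>
# test setup, no fencing configured yet:<br>
pcs property set stonith-enabled=false<br>
pcs property set no-quorum-policy=ignore<br>
</p>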
<p><br>
</p>
<p><font size="+2">3.5</font> Corosync status:</p>
<hr width="100%" size="2">
<p>[root@drbd0 corosync]# systemctl status corosync<br>
● corosync.service - Corosync Cluster Engine<br>
Loaded: loaded (/usr/lib/systemd/system/corosync.service;
enabled; vendor preset: disabled)<br>
Active: active (running) since Sun 2018-10-14 02:58:01 CEST; 2
days ago<br>
Docs: man:corosync<br>
man:corosync.conf<br>
man:corosync_overview<br>
Process: 1095 ExecStart=/usr/share/corosync/corosync start
(code=exited, status=0/SUCCESS)<br>
Main PID: 1167 (corosync)<br>
CGroup: /system.slice/corosync.service<br>
└─1167 corosync<br>
<br>
Oct 14 02:58:00 drbd0.s-ka.local corosync[1167]: [MAIN ]
Completed service synchronization, ready to provide service.<br>
Oct 14 02:58:01 drbd0.s-ka.local corosync[1095]: Starting Corosync
Cluster Engine (corosync): [ OK ]<br>
Oct 14 02:58:01 drbd0.s-ka.local systemd[1]: Started Corosync
Cluster Engine.<br>
Oct 14 10:46:03 drbd0.s-ka.local corosync[1167]: [TOTEM ] A new
membership (192.168.96.45:384) was formed. Members left: 2<br>
Oct 14 10:46:03 drbd0.s-ka.local corosync[1167]: [QUORUM]
Members[1]: 1<br>
Oct 14 10:46:03 drbd0.s-ka.local corosync[1167]: [MAIN ]
Completed service synchronization, ready to provide service.<br>
Oct 14 10:46:22 drbd0.s-ka.local corosync[1167]: [TOTEM ] A new
membership (192.168.96.45:388) was formed. Members joined: 2<br>
Oct 14 10:46:22 drbd0.s-ka.local corosync[1167]: [CPG ]
downlist left_list: 0 received in state 0<br>
Oct 14 10:46:22 drbd0.s-ka.local corosync[1167]: [QUORUM]
Members[2]: 1 2<br>
Oct 14 10:46:22 drbd0.s-ka.local corosync[1167]: [MAIN ]
Completed service synchronization, ready to provide service.</p>
<hr width="100%" size="2">
<p><font size="+2">3.6</font> tgtd configuration:</p>
<hr width="100%" size="2">
<p>[root@drbd0 corosync]# cat /etc/tgt/targets.conf <br>
# This is a sample config file for tgt-admin.<br>
#<br>
# The "#" symbol disables the processing of a line.<br>
<br>
# Set the driver. If not specified, defaults to "iscsi".<br>
default-driver iscsi<br>
<br>
# Set iSNS parameters, if needed<br>
#iSNSServerIP 192.168.111.222<br>
#iSNSServerPort 3205<br>
#iSNSAccessControl On<br>
#iSNS On<br>
<br>
# Continue if tgtadm exits with non-zero code (equivalent of<br>
# --ignore-errors command line option)<br>
#ignore-errors yes<br>
<br>
<br>
&lt;target iqn.2018-08.s-ka.local:disk.1&gt;<br>
lun 10<br>
backing-store /dev/drbd0<br>
initiator-address 192.168.96.0/24<br>
initiator-address 192.168.95.0/24<br>
target-address 192.168.95.48<br>
&lt;/target&gt;<br>
</p>
<hr width="100%" size="2">
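<p>(When tgtd is started by hand on the current Primary, the target and
LUN from this config do appear; e.g. they can be listed with:)</p>
<p>tgtadm --lld iscsi --mode target --op show<br>
</p>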
<p><br>
</p>
<p><font size="+2">3.7</font> tgtd has been on both server disabled,
only startable from current Primary DRBD Node.</p>
<hr width="100%" size="2">
<p>Secondary Node:</p>
<p>[root@drbd1 ~]# systemctl status tgtd<br>
● tgtd.service - tgtd iSCSI target daemon<br>
Loaded: loaded (/usr/lib/systemd/system/tgtd.service; disabled;
vendor preset: disabled)<br>
Active: inactive (dead)<br>
[root@drbd1 ~]# systemctl restart tgtd<br>
Job for tgtd.service failed because the control process exited
with error code. See "systemctl status tgtd.service" and
"journalctl -xe" for details.</p>
<p><br>
</p>
<p>Primary Node:</p>
<p>[root@drbd0 corosync]# systemctl status tgtd<br>
● tgtd.service - tgtd iSCSI target daemon<br>
Loaded: loaded (/usr/lib/systemd/system/tgtd.service; disabled;
vendor preset: disabled)<br>
Active: inactive (dead)<br>
[root@drbd0 corosync]# systemctl restart tgtd<br>
[root@drbd0 corosync]# systemctl status tgtd<br>
● tgtd.service - tgtd iSCSI target daemon<br>
Loaded: loaded (/usr/lib/systemd/system/tgtd.service; disabled;
vendor preset: disabled)<br>
Active: active (running) since Tue 2018-10-16 14:09:47 CEST;
2min 29s ago<br>
Process: 22300 ExecStartPost=/usr/sbin/tgtadm --op update --mode
sys --name State -v ready (code=exited, status=0/SUCCESS)<br>
Process: 22272 ExecStartPost=/usr/sbin/tgt-admin -e -c
$TGTD_CONFIG (code=exited, status=0/SUCCESS)<br>
Process: 22271 ExecStartPost=/usr/sbin/tgtadm --op update --mode
sys --name State -v offline (code=exited, status=0/SUCCESS)<br>
Process: 22270 ExecStartPost=/bin/sleep 5 (code=exited,
status=0/SUCCESS)<br>
Main PID: 22269 (tgtd)<br>
CGroup: /system.slice/tgtd.service<br>
└─22269 /usr/sbin/tgtd -f<br>
<br>
Oct 16 14:09:42 drbd0.s-ka.local systemd[1]: Starting tgtd iSCSI
target daemon...<br>
Oct 16 14:09:42 drbd0.s-ka.local tgtd[22269]: tgtd:
iser_ib_init(3436) Failed to initialize RDMA; load kernel modules?<br>
Oct 16 14:09:42 drbd0.s-ka.local tgtd[22269]: tgtd:
work_timer_start(146) use timer_fd based scheduler<br>
Oct 16 14:09:42 drbd0.s-ka.local tgtd[22269]: tgtd:
bs_init_signalfd(267) could not open backing-store module
directory /usr/lib64/tgt/backing-store<br>
Oct 16 14:09:42 drbd0.s-ka.local tgtd[22269]: tgtd: bs_init(386)
use signalfd notification<br>
Oct 16 14:09:47 drbd0.s-ka.local tgtd[22269]: tgtd:
device_mgmt(246) sz:16 params:path=/dev/drbd0<br>
Oct 16 14:09:47 drbd0.s-ka.local tgtd[22269]: tgtd:
bs_thread_open(408) 16<br>
Oct 16 14:09:47 drbd0.s-ka.local systemd[1]: Started tgtd iSCSI
target daemon.<br>
</p>
<hr width="100%" size="2">
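<p>(To be explicit, "disabled" above means I ran roughly the following on
both nodes, so that only Pacemaker, or I manually, will start the
target:)</p>
<p>systemctl stop tgtd<br>
systemctl disable tgtd<br>
</p>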
<p><font size="+2">3.8</font> it was until this point all working,
but if i switched the DRBD Primary Node, it won't work anymore
(FileSystem of test Node became read-only)</p>
<p>so i changed the pcs configuration according to the previously
mentioned article: <br>
</p>
<hr width="100%" size="2">
<p>> pcs resource create p_iscsivg01 ocf:heartbeat:LVM
volgrpname="vg0" op monitor interval="30"</p>
<p>> pcs resource group add p_iSCSI p_iscsivg01 p_iSCSITarget
p_iSCSILogicalUnit ClusterIP</p>
<p>> pcs constraint order start ipstor0Clone then start p_iSCSI
then start ipstor0Clone:Master</p>
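<p>(What I am not sure about is whether these commands from the article
are even right for my case without LVM. My guess, from other examples, is
that it should rather be a colocation of the group with the DRBD Master
plus an ordering after the promote, roughly like below, but please
correct me if that is wrong:)</p>
<p>pcs constraint colocation add p_iSCSI with master ipstor0Clone INFINITY<br>
pcs constraint order promote ipstor0Clone then start p_iSCSI<br>
</p>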
<p><br>
</p>
<p>[root@drbd0 ~]# pcs status<br>
Cluster name: cluster1<br>
Stack: corosync<br>
Current DC: drbd0-ha.s-ka.local (version
1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum<br>
Last updated: Sun Oct 14 01:38:18 2018<br>
Last change: Sun Oct 14 01:37:58 2018 by root via cibadmin on
drbd0-ha.s-ka.local<br>
<br>
2 nodes configured<br>
6 resources configured<br>
<br>
Online: [ drbd0-ha.s-ka.local drbd1-ha.s-ka.local ]<br>
<br>
Full list of resources:<br>
<br>
Master/Slave Set: ipstor0Clone [ipstor0]<br>
Masters: [ drbd0-ha.s-ka.local ]<br>
Slaves: [ drbd1-ha.s-ka.local ]<br>
Resource Group: p_iSCSI<br>
p_iscsivg01 (ocf::heartbeat:LVM): Stopped<br>
p_iSCSITarget (ocf::heartbeat:iSCSITarget): Stopped<br>
p_iSCSILogicalUnit
(ocf::heartbeat:iSCSILogicalUnit): Stopped<br>
ClusterIP (ocf::heartbeat:IPaddr2): Stopped<br>
<br>
Failed Actions:<br>
* p_iSCSILogicalUnit_start_0 on drbd0-ha.s-ka.local 'unknown
error' (1): call=42, status=complete, exitreason='',<br>
last-rc-change='Sun Oct 14 01:20:38 2018', queued=0ms,
exec=28ms<br>
* p_iSCSITarget_start_0 on drbd0-ha.s-ka.local 'unknown error'
(1): call=40, status=complete, exitreason='',<br>
last-rc-change='Sun Oct 14 00:54:36 2018', queued=0ms,
exec=23ms<br>
* p_iscsivg01_start_0 on drbd0-ha.s-ka.local 'unknown error'
(1): call=48, status=complete, exitreason='Volume group
[iscsivg01] does not exist or contains error! Volume group
"iscsivg01" not found',<br>
last-rc-change='Sun Oct 14 01:32:49 2018', queued=0ms,
exec=47ms<br>
* p_iSCSILogicalUnit_start_0 on drbd1-ha.s-ka.local 'unknown
error' (1): call=41, status=complete, exitreason='',<br>
last-rc-change='Sun Oct 14 01:20:38 2018', queued=0ms,
exec=31ms<br>
* p_iSCSITarget_start_0 on drbd1-ha.s-ka.local 'unknown error'
(1): call=39, status=complete, exitreason='',<br>
last-rc-change='Sun Oct 14 00:54:36 2018', queued=0ms,
exec=24ms<br>
* p_iscsivg01_start_0 on drbd1-ha.s-ka.local 'unknown error'
(1): call=47, status=complete, exitreason='Volume group
[iscsivg01] does not exist or contains error! Volume group
"iscsivg01" not found',<br>
last-rc-change='Sun Oct 14 01:32:49 2018', queued=0ms,
exec=50ms<br>
<br>
<br>
Daemon Status:<br>
corosync: active/enabled<br>
pacemaker: active/enabled<br>
pcsd: active/enabled<br>
[root@drbd0 ~]# <br>
</p>
<hr width="100%" size="2">
<p><br>
</p>
<p><font size="+2">3.9</font> since the "device not found" error, so
i remove the LVM, it looks like this now: <br>
</p>
<p>actually it was changed between /dev/drbd/by-disk and
/dev/drbd/by-res, but no effects<br>
</p>
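<p>(the path change itself was done with pcs resource update, roughly:)</p>
<p>pcs resource update p_iSCSILogicalUnit path=/dev/drbd/by-res/iscsivg01/0<br>
</p>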
<hr width="100%" size="2">
<p>[root@drbd0 corosync]# pcs status<br>
Cluster name: cluster1<br>
Stack: corosync<br>
Current DC: drbd0-ha.s-ka.local (version
1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum<br>
Last updated: Tue Oct 16 14:18:09 2018<br>
Last change: Sun Oct 14 02:06:36 2018 by root via cibadmin on
drbd0-ha.s-ka.local<br>
<br>
2 nodes configured<br>
5 resources configured<br>
<br>
Online: [ drbd0-ha.s-ka.local drbd1-ha.s-ka.local ]<br>
<br>
Full list of resources:<br>
<br>
Master/Slave Set: ipstor0Clone [ipstor0]<br>
Masters: [ drbd0-ha.s-ka.local ]<br>
Slaves: [ drbd1-ha.s-ka.local ]<br>
Resource Group: p_iSCSI<br>
p_iSCSITarget (ocf::heartbeat:iSCSITarget): Stopped<br>
p_iSCSILogicalUnit (ocf::heartbeat:iSCSILogicalUnit):
Stopped<br>
ClusterIP (ocf::heartbeat:IPaddr2): Stopped<br>
<br>
Failed Actions:<br>
* p_iSCSITarget_start_0 on drbd0-ha.s-ka.local 'unknown error'
(1): call=12, status=complete, exitreason='',<br>
last-rc-change='Sun Oct 14 02:58:04 2018', queued=1ms,
exec=58ms<br>
* p_iSCSITarget_start_0 on drbd1-ha.s-ka.local 'unknown error'
(1): call=12, status=complete, exitreason='',<br>
last-rc-change='Sun Oct 14 10:47:06 2018', queued=0ms,
exec=22ms<br>
<br>
<br>
Daemon Status:<br>
corosync: active/enabled<br>
pacemaker: active/enabled<br>
pcsd: active/enabled<br>
[root@drbd0 corosync]# <br>
</p>
<hr width="100%" size="2">
<p><font size="+2">3.10</font> i've tried with "pcs resouce
debug-start xxx --full" on the DRBD Primary Node, <br>
</p>
<hr width="100%" size="2">
<p>[root@drbd0 corosync]# pcs resource debug-start p_iSCSI --full<br>
Error: unable to debug-start a group, try one of the group's
resource(s) (p_iSCSITarget,p_iSCSILogicalUnit,ClusterIP)<br>
</p>
<p>[root@drbd0 corosync]# pcs resource debug-start p_iSCSITarget
--full<br>
Operation start for p_iSCSITarget (ocf:heartbeat:iSCSITarget)
returned: 'ok' (0)<br>
> stderr: DEBUG: p_iSCSITarget start : 0<br>
</p>
<p>[root@drbd0 corosync]# pcs resource debug-start
p_iSCSILogicalUnit --full<br>
Operation start for p_iSCSILogicalUnit
(ocf:heartbeat:iSCSILogicalUnit) returned: 'unknown error' (1)<br>
> stderr: ERROR: tgtadm: this logical unit number already
exists<br>
</p>
<p>[root@drbd0 corosync]# pcs resource debug-start ClusterIP --full<br>
Operation start for ClusterIP (ocf:heartbeat:IPaddr2) returned:
'ok' (0)<br>
> stderr: INFO: Adding inet address 192.168.95.48/32 with
broadcast address 192.168.95.255 to device ens192<br>
> stderr: INFO: Bringing device ens192 up<br>
> stderr: INFO: /usr/libexec/heartbeat/send_arp -i 200 -c 5
-p /var/run/resource-agents/send_arp-192.168.95.48 -I ens192 -m
auto 192.168.95.48<br>
[root@drbd0 corosync]# <br>
<br>
</p>
<hr width="100%" size="2">
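<p>(The "logical unit number already exists" error is, I assume, a
leftover from my manual tgtd test in 3.7. I would guess it can be cleared
with something like the command below, but I am not sure that this is the
real problem:)</p>
<p>tgtadm --lld iscsi --mode logicalunit --op delete --tid 1 --lun 10<br>
</p>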
<p><font size="+2">3.11</font> as you may seen, there are errors,
but "p_iSCSITarget" was successfully startet. but "pcs status"
show still "stopped"</p>
<hr width="100%" size="2">
<p>[root@drbd0 corosync]# pcs status<br>
Cluster name: cluster1<br>
Stack: corosync<br>
Current DC: drbd0-ha.s-ka.local (version
1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum<br>
Last updated: Tue Oct 16 14:22:38 2018<br>
Last change: Sun Oct 14 02:06:36 2018 by root via cibadmin on
drbd0-ha.s-ka.local<br>
<br>
2 nodes configured<br>
5 resources configured<br>
<br>
Online: [ drbd0-ha.s-ka.local drbd1-ha.s-ka.local ]<br>
<br>
Full list of resources:<br>
<br>
Master/Slave Set: ipstor0Clone [ipstor0]<br>
Masters: [ drbd0-ha.s-ka.local ]<br>
Slaves: [ drbd1-ha.s-ka.local ]<br>
Resource Group: p_iSCSI<br>
p_iSCSITarget (ocf::heartbeat:iSCSITarget): Stopped<br>
p_iSCSILogicalUnit (ocf::heartbeat:iSCSILogicalUnit):
Stopped<br>
ClusterIP (ocf::heartbeat:IPaddr2): Stopped<br>
<br>
Failed Actions:<br>
* p_iSCSITarget_start_0 on drbd0-ha.s-ka.local 'unknown error'
(1): call=12, status=complete, exitreason='',<br>
last-rc-change='Sun Oct 14 02:58:04 2018', queued=1ms,
exec=58ms<br>
* p_iSCSITarget_start_0 on drbd1-ha.s-ka.local 'unknown error'
(1): call=12, status=complete, exitreason='',<br>
last-rc-change='Sun Oct 14 10:47:06 2018', queued=0ms,
exec=22ms<br>
<br>
<br>
Daemon Status:<br>
corosync: active/enabled<br>
pacemaker: active/enabled<br>
pcsd: active/enabled<br>
[root@drbd0 corosync]# <br>
</p>
<hr width="100%" size="2">
<p><font size="+2">3.12</font> the pcs config is:</p>
<hr width="100%" size="2">
<p>[root@drbd0 corosync]# pcs config<br>
Cluster Name: cluster1<br>
Corosync Nodes:<br>
drbd0-ha.s-ka.local drbd1-ha.s-ka.local<br>
Pacemaker Nodes:<br>
drbd0-ha.s-ka.local drbd1-ha.s-ka.local<br>
<br>
Resources:<br>
Master: ipstor0Clone<br>
Meta Attrs: master-node-max=1 clone-max=2 notify=true
master-max=1 clone-node-max=1 <br>
Resource: ipstor0 (class=ocf provider=linbit type=drbd)<br>
Attributes: drbd_resource=iscsivg01<br>
Operations: demote interval=0s timeout=90
(ipstor0-demote-interval-0s)<br>
monitor interval=60s (ipstor0-monitor-interval-60s)<br>
notify interval=0s timeout=90
(ipstor0-notify-interval-0s)<br>
promote interval=0s timeout=90
(ipstor0-promote-interval-0s)<br>
reload interval=0s timeout=30
(ipstor0-reload-interval-0s)<br>
start interval=0s timeout=240
(ipstor0-start-interval-0s)<br>
stop interval=0s timeout=100
(ipstor0-stop-interval-0s)<br>
Group: p_iSCSI<br>
Resource: p_iSCSITarget (class=ocf provider=heartbeat
type=iSCSITarget)<br>
Attributes: implementation=tgt
iqn=iqn.2018-08.s-ka.local:disk.1 tid=1<br>
Operations: monitor interval=30 timeout=60
(p_iSCSITarget-monitor-interval-30)<br>
start interval=0 timeout=60
(p_iSCSITarget-start-interval-0)<br>
stop interval=0 timeout=60
(p_iSCSITarget-stop-interval-0)<br>
Resource: p_iSCSILogicalUnit (class=ocf provider=heartbeat
type=iSCSILogicalUnit)<br>
Attributes: implementation=tgt lun=10
path=/dev/drbd/by-disk/vg0/ipstor0
target_iqn=iqn.2018-08.s-ka.local:disk.1<br>
Operations: monitor interval=30 timeout=60
(p_iSCSILogicalUnit-monitor-interval-30)<br>
start interval=0 timeout=60
(p_iSCSILogicalUnit-start-interval-0)<br>
stop interval=0 timeout=60
(p_iSCSILogicalUnit-stop-interval-0)<br>
Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)<br>
Attributes: cidr_netmask=32 ip=192.168.95.48<br>
Operations: monitor interval=30s
(ClusterIP-monitor-interval-30s)<br>
start interval=0s timeout=20s
(ClusterIP-start-interval-0s)<br>
stop interval=0s timeout=20s
(ClusterIP-stop-interval-0s)<br>
<br>
Stonith Devices:<br>
Fencing Levels:<br>
<br>
Location Constraints:<br>
Ordering Constraints:<br>
start ipstor0Clone then start p_iSCSI (kind:Mandatory)<br>
Colocation Constraints:<br>
Ticket Constraints:<br>
<br>
Alerts:<br>
No alerts defined<br>
<br>
Resources Defaults:<br>
migration-threshold: 1<br>
Operations Defaults:<br>
No defaults set<br>
<br>
Cluster Properties:<br>
cluster-infrastructure: corosync<br>
cluster-name: cluster1<br>
dc-version: 1.1.18-11.el7_5.3-2b07d5c5a9<br>
have-watchdog: false<br>
last-lrm-refresh: 1539474248<br>
no-quorum-policy: ignore<br>
stonith-enabled: false<br>
<br>
Quorum:<br>
Options:<br>
[root@drbd0 corosync]# <br>
<br>
</p>
<hr width="100%" size="2">
<p><br>
</p>
<p><font size="+3">4</font>. so i am out of hands. don't what to do,
may just dive into pacemaker's source code?? <br>
</p>
<p>I hope to get some feedback or tips from you. Thank you very much in
advance :)<br>
</p>
<p><br>
</p>
<p>Best Regards</p>
<p>Zhang<br>
</p>
</body>
</html>