[ClusterLabs] How to set up fencing/stonith

Casey & Gina caseyandgina at icloud.com
Fri May 18 18:33:40 UTC 2018


I think that I finally managed to get fencing working!  To do this, I've (for now) stuck with the stock Ubuntu package for pcs but used crmsh to create the fencing resource.  I've had a lot of trouble trying to get a newer pcs compiled and working, and I don't yet know what I'm doing well enough to overcome that.  I'm thinking of reporting pcs's lack of support for the external stonith plugins as a bug to Ubuntu, in hopes that they can update the packaged version.  I think I should also be able to just edit the relevant XML into the cib without the help of crmsh, although I'll have to research how to do that later.  It may well be easier to use fence_vmware_soap instead if I can figure out how to make that work at some point, so I'm still keen to figure out the problems with that too.
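
In case it's useful, my rough guess at the raw-XML route is below.  This is an untested sketch: the ids are just placeholders for whatever crmsh/pcs would normally generate, and the parameter values are the same ones from my crm command further down.

------
cat > vfencing.xml <<'EOF'
<primitive id="vfencing" class="stonith" type="external/vcenter">
  <instance_attributes id="vfencing-instance_attributes">
    <nvpair id="vfencing-VI_SERVER" name="VI_SERVER" value="10.124.137.100"/>
    <nvpair id="vfencing-VI_CREDSTORE" name="VI_CREDSTORE" value="/etc/pacemaker/vicredentials.xml"/>
    <nvpair id="vfencing-HOSTLIST" name="HOSTLIST" value="d-gp2-dbpg0-1=d-gp2-dbpg0-1;d-gp2-dbpg0-2=d-gp2-dbpg0-2;d-gp2-dbpg0-3=d-gp2-dbpg0-3"/>
    <nvpair id="vfencing-RESETPOWERON" name="RESETPOWERON" value="0"/>
  </instance_attributes>
  <operations>
    <op id="vfencing-monitor-60s" name="monitor" interval="60s"/>
  </operations>
</primitive>
EOF
# -C creates the object, -o resources scopes it to the resources section of the cib
cibadmin -C -o resources -x vfencing.xml
------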

I wish that I knew how to diagnose what was going wrong from the logs (quoted below) or some debugging mode of pacemaker, but I took a wild guess and installed vCLI to a prefix of /usr (its default) instead of /usr/local (where I'd prefer it, since it's not installed by apt).  Once this was done, I added the fencing resource with crmsh, and it didn't start failing between the nodes as it had before.  I was then able to use `stonith_admin -F` and `stonith_admin -U` to power a node in the cluster off and on.  I can't tell you how exciting that was to finally see!
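
For reference, the test I ran was along these lines (the target here is just one of my standby nodes):

------
# -F ("fence") powers the target node off via the configured stonith device,
# -U ("unfence") powers it back on
stonith_admin -F d-gp2-dbpg0-3
stonith_admin -U d-gp2-dbpg0-3
------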

Sadly, my excitement was quickly squashed.  I proceeded to add the PostgreSQL and VIP resources to the cluster just as I had done before without fencing, and everything looked good when I checked `pcs status`.  So then I logged in to vSphere and powered off the primary node, expecting the VIP and PostgreSQL to come up on one of the standby nodes.  Instead, I ended up with this:

------
Node d-gp2-dbpg0-1: UNCLEAN (offline)
Online: [ d-gp2-dbpg0-2 d-gp2-dbpg0-3 ]

Full list of resources:

 vfencing       (stonith:external/vcenter):     Started[ d-gp2-dbpg0-1 d-gp2-dbpg0-2 ]
 postgresql-master-vip  (ocf::heartbeat:IPaddr2):       Started d-gp2-dbpg0-1 (UNCLEAN)
 Master/Slave Set: postgresql-ha [postgresql-10-main]
     postgresql-10-main (ocf::heartbeat:pgsqlms):       Master d-gp2-dbpg0-1 (UNCLEAN)
     Slaves: [ d-gp2-dbpg0-2 d-gp2-dbpg0-3 ]
------

Why does it show above that the vfencing resource is started on nodes 1 and 2, when node 1 is down?  Why is it not started on node 3?  Prior to powering off node 1, it said that it was started only on node 1 - is that a misconfiguration on my part, or normal?

Most importantly, what's keeping a standby from taking over after the primary is powered off?

Strangely, when I power node 1 back on and run `pcs cluster start` on it, the cluster ends up promoting node 2 as the primary, but with errors reported on node 1:

------
Online: [ d-gp2-dbpg0-1 d-gp2-dbpg0-2 d-gp2-dbpg0-3 ]

Full list of resources:

 vfencing       (stonith:external/vcenter):     Started d-gp2-dbpg0-2
 postgresql-master-vip  (ocf::heartbeat:IPaddr2):       Started d-gp2-dbpg0-2
 Master/Slave Set: postgresql-ha [postgresql-10-main]
     postgresql-10-main (ocf::heartbeat:pgsqlms):       FAILED Master d-gp2-dbpg0-1
     Masters: [ d-gp2-dbpg0-2 ]
     Slaves: [ d-gp2-dbpg0-3 ]

Failed Actions:
* postgresql-10-main_monitor_0 on d-gp2-dbpg0-1 'master (failed)' (9): call=14, status=complete, exitreason='Instance "postgresql-10-main" controldata indicates a running primary instance, the instance
has probably crashed',
    last-rc-change='Fri May 18 18:29:51 2018', queued=0ms, exec=90ms
------

Here is the full list of commands that I have used to configure the cluster after a fresh installation:

------
crm configure primitive vfencing stonith::external/vcenter params VI_SERVER="10.124.137.100" VI_CREDSTORE="/etc/pacemaker/vicredentials.xml" HOSTLIST="d-gp2-dbpg0-1=d-gp2-dbpg0-1;d-gp2-dbpg0-2=d-gp2-dbpg0-2;d-gp2-dbpg0-3=d-gp2-dbpg0-3" RESETPOWERON="0" op monitor interval="60s"
pcs cluster cib /tmp/dbpg.xml
pcs -f /tmp/dbpg.xml property set stonith-enabled=true
pcs -f /tmp/dbpg.xml resource defaults migration-threshold=5
pcs -f /tmp/dbpg.xml resource defaults resource-stickiness=10
pcs -f /tmp/dbpg.xml resource create postgresql-master-vip ocf:heartbeat:IPaddr2 ip=10.124.164.250 cidr_netmask=22 op monitor interval=10s
pcs -f /tmp/dbpg.xml resource create postgresql-10-main ocf:heartbeat:pgsqlms bindir="/usr/lib/postgresql/10/bin" pgdata="/var/lib/postgresql/10/main" pghost="/var/run/postgresql" pgport=5432 recovery_template="/etc/postgresql/10/main/recovery.conf" start_opts="-c config_file=/etc/postgresql/10/main/postgresql.conf" op start timeout=60s op stop timeout=60s op promote timeout=30s op demote timeout=120s op monitor interval=15s timeout=10s role="Master" op monitor interval=16s timeout=10s role="Slave" op notify timeout=60s
pcs -f /tmp/dbpg.xml resource master postgresql-ha postgresql-10-main notify=true
pcs -f /tmp/dbpg.xml constraint colocation add postgresql-master-vip with master postgresql-ha INFINITY
pcs -f /tmp/dbpg.xml constraint order promote postgresql-ha then start postgresql-master-vip symmetrical=false kind=Mandatory
pcs -f /tmp/dbpg.xml constraint order demote postgresql-ha then stop postgresql-master-vip symmetrical=false kind=Mandatory
pcs cluster cib-push /tmp/dbpg.xml
------

Here is the output of `pcs status` before powering off the primary:

------
Online: [ d-gp2-dbpg0-1 d-gp2-dbpg0-2 d-gp2-dbpg0-3 ]

Full list of resources:

 vfencing       (stonith:external/vcenter):     Started d-gp2-dbpg0-1
 postgresql-master-vip  (ocf::heartbeat:IPaddr2):       Started d-gp2-dbpg0-1
 Master/Slave Set: postgresql-ha [postgresql-10-main]
     Masters: [ d-gp2-dbpg0-1 ]
     Slaves: [ d-gp2-dbpg0-2 d-gp2-dbpg0-3 ]
------

As always, thank you all for any help that you can provide,
-- 
Casey

> On May 16, 2018, at 12:41 PM, Casey & Gina <caseyandgina at icloud.com> wrote:
> 
> I tried adding the stonith configuration with the crmsh command as follows:
> 
> crm configure primitive vfencing stonith::external/vcenter params VI_SERVER="vcenter.imovetv.com" VI_CREDSTORE="/etc/pacemaker/vicredentials.xml" HOSTLIST="d-gp2-dbpg0-1=d-gp2-dbpg0-1;d-gp2-dbpg0-2=d-gp2-dbpg0-2;d-gp2-dbpg0-3=d-gp2-dbpg0-3" RESETPOWERON="0" op monitor interval="60s"
> 
> ...along with `pcs property set stonith-enabled=true`
> 
> I really don't want to use the crm command as ultimately I don't want to have both pcs and crmsh installed, but am just trying to keep moving forward in whatever way I can figure out for now - I hoped that I could move forward using this command temporarily and figure out the proper pcs equivalent later.  It looks as though it's working for a short period, but then fails.  When it first starts up, it says that it's started on a different node than the primary, for some reason.  In this case, my primary node is 1, and vfencing initially attempts to start on node 2.  Here's the initial state:
> 
> ------
> Master/Slave Set: postgresql-ha [postgresql-10-main]
>     Masters: [ d-gp2-dbpg0-1 ]
>     Slaves: [ d-gp2-dbpg0-2 d-gp2-dbpg0-3 ]
> postgresql-master-vip  (ocf::heartbeat:IPaddr2):       Started d-gp2-dbpg0-1
> vfencing       (stonith:external/vcenter):     Started d-gp2-dbpg0-2
> ------
> 
> 
> After a few seconds, it changes to this:
> 
> ------
> Master/Slave Set: postgresql-ha [postgresql-10-main]
>     Masters: [ d-gp2-dbpg0-1 ]
>     Slaves: [ d-gp2-dbpg0-2 d-gp2-dbpg0-3 ]
> postgresql-master-vip  (ocf::heartbeat:IPaddr2):       Started d-gp2-dbpg0-1
> vfencing       (stonith:external/vcenter):     Stopped
> 
> Failed Actions:
> * vfencing_monitor_60000 on d-gp2-dbpg0-2 'unknown error' (1): call=19, status=Timed Out, exitreason='none',
>    last-rc-change='Wed May 16 18:07:23 2018', queued=0ms, exec=20005ms
> ------
> 
> 
> After that, it attempts to start on node 3, with the same problem, then lastly it tries to start on node 1.  Ultimately it ends up looking like this:
> 
> ------
> Master/Slave Set: postgresql-ha [postgresql-10-main]
>     Masters: [ d-gp2-dbpg0-1 ]
>     Slaves: [ d-gp2-dbpg0-2 d-gp2-dbpg0-3 ]
> postgresql-master-vip  (ocf::heartbeat:IPaddr2):       Started d-gp2-dbpg0-1
> vfencing       (stonith:external/vcenter):     Stopped
> 
> Failed Actions:
> * vfencing_start_0 on d-gp2-dbpg0-1 'unknown error' (1): call=27, status=Timed Out, exitreason='none',
>    last-rc-change='Wed May 16 18:10:08 2018', queued=1ms, exec=20005ms
> * vfencing_start_0 on d-gp2-dbpg0-2 'unknown error' (1): call=24, status=Timed Out, exitreason='none',
>    last-rc-change='Wed May 16 18:07:43 2018', queued=0ms, exec=20008ms
> * vfencing_start_0 on d-gp2-dbpg0-3 'unknown error' (1): call=24, status=Timed Out, exitreason='none',
>    last-rc-change='Wed May 16 18:08:24 2018', queued=0ms, exec=20006ms
> ------
> 
> 
> I'm attempting to have a simple 3-node cluster wherein PostgreSQL is accessed via a secondary IP that is assigned to the master.  That configuration seemed to be working fine without stonith enabled.  Just trying to get stonith added to the setup.  Here is the output of `pcs config` after stonith is enabled with the aforementioned crm command:
> 
> ------
> Cluster Name: d-gp2-dbpg0
> Corosync Nodes:
> d-gp2-dbpg0-1 d-gp2-dbpg0-2 d-gp2-dbpg0-3
> Pacemaker Nodes:
> d-gp2-dbpg0-1 d-gp2-dbpg0-2 d-gp2-dbpg0-3
> 
> Resources:
> Master: postgresql-ha
>  Meta Attrs: notify=true 
>  Resource: postgresql-10-main (class=ocf provider=heartbeat type=pgsqlms)
>   Attributes: bindir=/usr/lib/postgresql/10/bin pgdata=/var/lib/postgresql/10/main pghost=/var/run/postgresql pgport=5432 recovery_template=/etc/postgresql/10/main/recovery.conf start_opts="-c config_file=/etc/postgresql/10/main/postgresql.conf"
>   Operations: start interval=0s timeout=60s (postgresql-10-main-start-interval-0s)
>               stop interval=0s timeout=60s (postgresql-10-main-stop-interval-0s)
>               promote interval=0s timeout=30s (postgresql-10-main-promote-interval-0s)
>               demote interval=0s timeout=120s (postgresql-10-main-demote-interval-0s)
>               monitor interval=15s role=Master timeout=10s (postgresql-10-main-monitor-interval-15s)
>               monitor interval=16s role=Slave timeout=10s (postgresql-10-main-monitor-interval-16s)
>               notify interval=0s timeout=60s (postgresql-10-main-notify-interval-0s)
> Resource: postgresql-master-vip (class=ocf provider=heartbeat type=IPaddr2)
>  Attributes: ip=10.124.164.250 cidr_netmask=22
>  Operations: start interval=0s timeout=20s (postgresql-master-vip-start-interval-0s)
>              stop interval=0s timeout=20s (postgresql-master-vip-stop-interval-0s)
>              monitor interval=10s (postgresql-master-vip-monitor-interval-10s)
> 
> Stonith Devices:
> Resource: vfencing (class=stonith type=external/vcenter)
>  Attributes: VI_SERVER=vcenter.imovetv.com VI_CREDSTORE=/etc/pacemaker/vicredentials.xml HOSTLIST=d-gp2-dbpg0-1=d-gp2-dbpg0-1;d-gp2-dbpg0-2=d-gp2-dbpg0-2;d-gp2-dbpg0-3=d-gp2-dbpg0-3 RESETPOWERON=0
>  Operations: monitor interval=60s (vfencing-monitor-60s)
> Fencing Levels:
> 
> Location Constraints:
>  Resource: postgresql-ha
>    Enabled on: d-gp2-dbpg0-1 (score:INFINITY) (role: Master) (id:cli-prefer-postgresql-ha)
> Ordering Constraints:
>  promote postgresql-ha then start postgresql-master-vip (kind:Mandatory) (non-symmetrical) (id:order-postgresql-ha-postgresql-master-vip-Mandatory)
>  demote postgresql-ha then stop postgresql-master-vip (kind:Mandatory) (non-symmetrical) (id:order-postgresql-ha-postgresql-master-vip-Mandatory-1)
> Colocation Constraints:
>  postgresql-master-vip with postgresql-ha (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-postgresql-master-vip-postgresql-ha-INFINITY)
> 
> Resources Defaults:
> migration-threshold: 5
> resource-stickiness: 10
> Operations Defaults:
> No defaults set
> 
> Cluster Properties:
> cluster-infrastructure: corosync
> cluster-name: d-gp2-dbpg0
> dc-version: 1.1.14-70404b0
> have-watchdog: false
> stonith-enabled: true
> Node Attributes:
> d-gp2-dbpg0-1: master-postgresql-10-main=1001
> d-gp2-dbpg0-2: master-postgresql-10-main=1000
> d-gp2-dbpg0-3: master-postgresql-10-main=990
> ------
> 
> 
> Here's the data from corosync.log on node 2 when it initially says that vfencing is started there but then fails.  As far as I can see this doesn't give any clue as to why vfencing fails:
> 
> ------
> May 16 18:12:48 [7507] d-gp2-dbpg0-2       crmd:   notice: process_lrm_event:   Operation vfencing_start_0: ok (node=d-gp2-dbpg0-2, call=17, rc=0, cib-update=16, confirmed=true)
> May 16 18:12:48 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.23 2
> May 16 18:12:48 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.24 (null)
> May 16 18:12:48 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=24
> May 16 18:12:48 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='vfencing']/lrm_rsc_op[@id='vfencing_last_0']:  @operation_key=vfencing_start_0, @operation=start, @transition-key=44:0:0:7a1320b5-e505-49da-aabb-09102396e6e3, @transition-magic=0:0;44:0:0:7a1320b5-e505-49da-aabb-09102396e6e3, @call-id=17, @rc-code=0, @last-run=1526494367, @last-rc-change=1526494367, @exec-time=1445
> May 16 18:12:48 [7504] d-gp2-dbpg0-2       lrmd:     info: log_execute: executing - rsc:postgresql-10-main action:notify call_id:18
> May 16 18:12:48 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-2/crmd/16, version=0.55.24)
> pgsqlms(postgresql-10-main)[7788]:      2018/05/16_18:12:48  INFO: Promoting instance on node "d-gp2-dbpg0-1"
> pgsqlms(postgresql-10-main)[7788]:      2018/05/16_18:12:48  INFO: Current node TL#LSN: 7#83886080
> May 16 18:12:48 [7504] d-gp2-dbpg0-2       lrmd:     info: log_finished:        finished - rsc:postgresql-10-main action:notify call_id:18 pid:7788 exit-code:0 exec-time:164ms queue-time:0ms
> May 16 18:12:48 [7507] d-gp2-dbpg0-2       crmd:   notice: process_lrm_event:   Operation postgresql-10-main_notify_0: ok (node=d-gp2-dbpg0-2, call=18, rc=0, cib-update=0, confirmed=true)
> May 16 18:12:49 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.24 2
> May 16 18:12:49 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.25 (null)
> May 16 18:12:49 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=25
> May 16 18:12:49 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postgresql-10-main']/lrm_rsc_op[@id='postgresql-10-main_last_0']:  @operation_key=postgresql-10-main_promote_0, @operation=promote, @transition-key=12:0:0:7a1320b5-e505-49da-aabb-09102396e6e3, @transition-magic=0:0;12:0:0:7a1320b5-e505-49da-aabb-09102396e6e3, @call-id=18, @last-run=1526494369, @last-rc-change=1526494369, @exec-time=381
> May 16 18:12:49 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-1/crmd/16, version=0.55.25)
> May 16 18:12:54 [7507] d-gp2-dbpg0-2       crmd:     info: crm_procfs_pid_of:   Found cib active as process 7502
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:     info: st_child_term:       Child 7828 timed out, sending SIGTERM                                                                    
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:   notice: stonith_action_async_done:   Child process 7828 performing action 'monitor' timed out with signal 15
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:  warning: log_action:  fence_legacy[7828] stderr: [ Smartmatch is experimental at /usr/lib/stonith/plugins/external/vcenter line 34. ]
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:  warning: log_action:  fence_legacy[7828] stderr: [ Smartmatch is experimental at /usr/lib/stonith/plugins/external/vcenter line 115. ]
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:  warning: log_action:  fence_legacy[7828] stderr: [ Smartmatch is experimental at /usr/lib/stonith/plugins/external/vcenter line 152. ]
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:  warning: log_action:  fence_legacy[7828] stderr: [ Smartmatch is experimental at /usr/lib/stonith/plugins/external/vcenter line 34. ]
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:  warning: log_action:  fence_legacy[7828] stderr: [ Smartmatch is experimental at /usr/lib/stonith/plugins/external/vcenter line 115. ]         
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:  warning: log_action:  fence_legacy[7828] stderr: [ Smartmatch is experimental at /usr/lib/stonith/plugins/external/vcenter line 152. ]
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:   notice: log_operation:       Operation 'monitor' [7828] for device 'vfencing' returned: -62 (Timer expired)
> May 16 18:13:08 [7503] d-gp2-dbpg0-2 stonith-ng:  warning: log_operation:       vfencing:7828 [ Performing: stonith -t external/vcenter -S ]
> May 16 18:13:08 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/crmd/17)
> May 16 18:13:08 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.25 2
> May 16 18:13:08 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.26 (null)
> May 16 18:13:08 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=26
> May 16 18:13:08 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      ++ /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='vfencing']:  <lrm_rsc_op id="vfencing_last
> _failure_0" operation_key="vfencing_monitor_60000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="45:0:0:7a1320b5-e505-49da-aabb-09102396e6e3" transit
> ion-magic="2:1;45:0:0:7a1320b5-e505-49da-aabb-09102396e6e3" on_node="d-gp2-dbpg0-2" call-id="19" rc-code="1" op-
> May 16 18:13:08 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-2/crmd/17, version=0.55.26)
> May 16 18:13:09 [7504] d-gp2-dbpg0-2       lrmd:     info: log_execute: executing - rsc:postgresql-10-main action:notify call_id:20
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.26 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.27 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=27
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      ++ /cib/status/node_state[@id='2']/transient_attributes[@id='2']/instance_attributes[@id='status-2']:  <nvpair id="status-
> 2-fail-count-vfencing" name="fail-count-vfencing" value="1"/>
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-1/attrd/4, version=0.55.27)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.27 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.28 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=28
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      ++ /cib/status/node_state[@id='2']/transient_attributes[@id='2']/instance_attributes[@id='status-2']:  <nvpair id="status-
> 2-last-failure-vfencing" name="last-failure-vfencing" value="1526494389"/>
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-1/attrd/5, version=0.55.28)
> May 16 18:13:09 [7504] d-gp2-dbpg0-2       lrmd:     info: log_finished:        finished - rsc:postgresql-10-main action:notify call_id:20 pid:7932 exit-code:0 exec-time:131ms queue-time:0ms
> May 16 18:13:09 [7507] d-gp2-dbpg0-2       crmd:   notice: process_lrm_event:   Operation postgresql-10-main_notify_0: ok (node=d-gp2-dbpg0-2, call=20, rc=0, cib-update=0, confirmed=true)
> May 16 18:13:09 [7504] d-gp2-dbpg0-2       lrmd:     info: log_execute: executing - rsc:vfencing action:stop call_id:23
> May 16 18:13:09 [7504] d-gp2-dbpg0-2       lrmd:     info: log_finished:        finished - rsc:vfencing action:stop call_id:23  exit-code:0 exec-time:1ms queue-time:0ms
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/crmd/18)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.28 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.29 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=29
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='vfencing']/lrm_rsc_op[@id='vfencing_last_0
> ']:  @operation_key=vfencing_stop_0, @operation=stop, @transition-key=2:1:0:7a1320b5-e505-49da-aabb-09102396e6e3, @transition-magic=0:0;2:1:0:7a1320b5-e505-49da-aabb-09102396e6e3, @call-id=23, @last-run
> =1526494389, @last-rc-change=1526494389, @exec-time=1
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-2/crmd/18, version=0.55.29)
> May 16 18:13:09 [7507] d-gp2-dbpg0-2       crmd:   notice: process_lrm_event:   Operation vfencing_stop_0: ok (node=d-gp2-dbpg0-2, call=23, rc=0, cib-update=18, confirmed=true)
> May 16 18:13:09 [7504] d-gp2-dbpg0-2       lrmd:     info: log_execute: executing - rsc:vfencing action:start call_id:24
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.29 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.30 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=30
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      ++ /cib/status/node_state[@id='3']/lrm[@id='3']/lrm_resources/lrm_resource[@id='postgresql-10-main']:  <lrm_rsc_op id="pos
> tgresql-10-main_monitor_16000" operation_key="postgresql-10-main_monitor_16000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="6:1:0:7a1320b5-e505-49d
> a-aabb-09102396e6e3" transition-magic="0:0;6:1:0:7a1320b5-e505-49da-aabb-09102396e6e3" on_node="d-gp2-dbpg0-3" c
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-3/crmd/46, version=0.55.30)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.30 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.31 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=31
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postgresql-master-vip']/lrm_rsc_op[@id='po
> stgresql-master-vip_last_0']:  @operation_key=postgresql-master-vip_start_0, @operation=start, @transition-key=39:1:0:7a1320b5-e505-49da-aabb-09102396e6e3, @transition-magic=0:0;39:1:0:7a1320b5-e505-49d
> a-aabb-09102396e6e3, @call-id=21, @rc-code=0, @last-run=1526494389, @last-rc-change=1526494389, @exec-time=151
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-1/crmd/17, version=0.55.31)
> May 16 18:13:09 [7507] d-gp2-dbpg0-2       crmd:     info: process_lrm_event:   Operation postgresql-10-main_monitor_16000: ok (node=d-gp2-dbpg0-2, call=21, rc=0, cib-update=19, confirmed=false)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/crmd/19)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.31 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.55.32 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=32
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      ++ /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='postgresql-10-main']:  <lrm_rsc_op id="pos
> tgresql-10-main_monitor_16000" operation_key="postgresql-10-main_monitor_16000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="14:1:0:7a1320b5-e505-49
> da-aabb-09102396e6e3" transition-magic="0:0;14:1:0:7a1320b5-e505-49da-aabb-09102396e6e3" on_node="d-gp2-dbpg0-2"
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-2/crmd/19, version=0.55.32)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.55.32 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.56.0 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @epoch=56, @num_updates=0
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib/configuration/nodes/node[@id='3']/instance_attributes[@id='nodes-3']/nvpair[@id='nodes-3-master-postgresql-10-main
> ']:  @value=-1000
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=d-gp2-dbpg0-1/crm_attribute/4, version=0.56.0)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_file_backup:     Archived previous version as /var/lib/pacemaker/cib/cib-58.raw
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_file_write_with_digest:  Wrote version 0.56.0 of the CIB to disk (digest: 97896bb83f45f54a5a1bd7cd10546536)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_file_write_with_digest:  Reading cluster configuration file /var/lib/pacemaker/cib/cib.4Dt4WC (digest: /var/lib/pacemaker/cib/cib.rCAONF)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.56.0 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.56.1 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=1
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postgresql-master-vip']:  <lrm_rsc_op id="
> postgresql-master-vip_monitor_10000" operation_key="postgresql-master-vip_monitor_10000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="40:1:0:7a1320b
> 5-e505-49da-aabb-09102396e6e3" transition-magic="0:0;40:1:0:7a1320b5-e505-49da-aabb-09102396e6e3" on_node="d-gp2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-1/crmd/18, version=0.56.1)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.56.1 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.57.0 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @epoch=57, @num_updates=0
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib/configuration/nodes/node[@id='2']/instance_attributes[@id='nodes-2']/nvpair[@id='nodes-2-master-postgresql-10-main
> ']:  @value=-1000
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=d-gp2-dbpg0-1/crm_attribute/4, version=0.57.0)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_file_backup:     Archived previous version as /var/lib/pacemaker/cib/cib-59.raw
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_file_write_with_digest:  Wrote version 0.57.0 of the CIB to disk (digest: 450d6e0939f45b7034a091cd792a16c1)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_file_write_with_digest:  Reading cluster configuration file /var/lib/pacemaker/cib/cib.kEOMJI (digest: /var/lib/pacemaker/cib/cib.GUtkPL)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: --- 0.57.0 2
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      Diff: +++ 0.57.1 (null)
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      +  /cib:  @num_updates=1
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_perform_op:      ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postgresql-10-main']:  <lrm_rsc_op id="pos
> tgresql-10-main_monitor_15000" operation_key="postgresql-10-main_monitor_15000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="11:1:8:7a1320b5-e505-49
> da-aabb-09102396e6e3" transition-magic="0:8;11:1:8:7a1320b5-e505-49da-aabb-09102396e6e3" on_node="d-gp2-dbpg0-1"
> May 16 18:13:09 [7502] d-gp2-dbpg0-2        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=d-gp2-dbpg0-1/crmd/19, version=0.57.1)
> ------
> 
> 
>> On May 16, 2018, at 11:01 AM, Casey & Gina <caseyandgina at icloud.com> wrote:
>> 
>>> On May 16, 2018, at 10:43 AM, Casey & Gina <caseyandgina at icloud.com> wrote:
>>> 
>>> Thank you and Andrei for the advice...
>>> 
>>>> the pcs alternative commands are:
>>>> 
>>>> pcs stonith create vfencing external/vcenter \
>>>> VI_SERVER=10.1.1.1 VI_CREDSTORE=/etc/vicredentials.xml \
>>>> HOSTLIST="hostname1=vmname1;hostname2=vmname2" RESETPOWERON=0 \
>>>> op monitor interval=60s
>>> 
>>> When I attempt the above (with different server, credstore path, and hostlist of course), I get the following error:
>>> 
>>> Error: Unable to create resource 'stonith:external/vcenter', it is not installed on this system (use --force to override)
>> 
>> If I try with --force just to see what would end up different in the xml versus after the crm command, I get this error, quite ironically:
>> 
>> Error: unable to create resource/fence device 'vfencing', 'vfencing' already exists on this system
>> 
>> So it complains that it's not on the system when I don't try to force it, and that it is if I try to force it?!
> 


