[ClusterLabs] fence_vbox Unable to connect/login to fencing device

Ken Gaillot kgaillot at redhat.com
Thu Jul 6 15:29:12 UTC 2017


On 07/06/2017 10:13 AM, ArekW wrote:
> Hi,
> 
> It seems that my fence_vbox is running, but there are errors in the
> logs every few minutes like:
> 
> Jul  6 12:51:12 nfsnode1 fence_vbox: Unable to connect/login to fencing device
> Jul  6 12:51:13 nfsnode1 stonith-ng[7899]: warning: fence_vbox[30220]
> stderr: [ Unable to connect/login to fencing device ]
> Jul  6 12:51:13 nfsnode1 stonith-ng[7899]: warning: fence_vbox[30220]
> stderr: [  ]
> Jul  6 12:51:13 nfsnode1 stonith-ng[7899]: warning: fence_vbox[30220]
> stderr: [  ]
> 
> Eventually, after some time, pcs status shows Failed Actions:
> 
> # pcs status --full
> Cluster name: nfscluster
> Stack: corosync
> Current DC: nfsnode1 (1) (version 1.1.15-11.el7_3.5-e174ec8) -
> partition with quorum
> Last updated: Thu Jul  6 13:02:52 2017          Last change: Thu Jul
> 6 13:00:33 2017 by root via crm_resource on nfsnode1
> 
> 2 nodes and 11 resources configured
> 
> Online: [ nfsnode1 (1) nfsnode2 (2) ]
> 
> Full list of resources:
> 
> Master/Slave Set: StorageClone [Storage]
>      Storage    (ocf::linbit:drbd):     Master nfsnode1
>      Storage    (ocf::linbit:drbd):     Master nfsnode2
>      Masters: [ nfsnode1 nfsnode2 ]
> Clone Set: dlm-clone [dlm]
>      dlm        (ocf::pacemaker:controld):      Started nfsnode1
>      dlm        (ocf::pacemaker:controld):      Started nfsnode2
>      Started: [ nfsnode1 nfsnode2 ]
> vbox-fencing   (stonith:fence_vbox):   Started nfsnode1
> Clone Set: ClusterIP-clone [ClusterIP] (unique)
>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started nfsnode1
>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started nfsnode2
> Clone Set: StorageFS-clone [StorageFS]
>      StorageFS  (ocf::heartbeat:Filesystem):    Started nfsnode1
>      StorageFS  (ocf::heartbeat:Filesystem):    Started nfsnode2
>      Started: [ nfsnode1 nfsnode2 ]
> Clone Set: WebSite-clone [WebSite]
>      WebSite    (ocf::heartbeat:apache):        Started nfsnode1
>      WebSite    (ocf::heartbeat:apache):        Started nfsnode2
>      Started: [ nfsnode1 nfsnode2 ]
> 
> Failed Actions:
> * vbox-fencing_start_0 on nfsnode1 'unknown error' (1): call=157,
> status=Error, exitreason='none',
>     last-rc-change='Thu Jul  6 13:58:04 2017', queued=0ms, exec=11947ms
> * vbox-fencing_start_0 on nfsnode2 'unknown error' (1): call=57,
> status=Error, exitreason='none',
>     last-rc-change='Thu Jul  6 13:58:16 2017', queued=0ms, exec=11953ms
> 
> The fence was created with the command:
> 
> pcs -f stonith_cfg stonith create vbox-fencing fence_vbox \
>     ip=10.0.2.2 ipaddr=10.0.2.2 login=AW23321 username=AW23321 \
>     identity_file=/root/.ssh/id_rsa host_os=windows \
>     pcmk_host_check=static-list pcmk_host_list="centos1 centos2" \
>     vboxmanage_path="/cygdrive/c/Program\ Files/Oracle/VirtualBox/VBoxManage" \
>     op monitor interval=5
> 
> where centos1 and centos2 are the VBox machine names (not hostnames). I
> duplicated the login/username parameters because both are indicated as
> required in the fence_vbox stonith description.
> 
> Then I updated the configuration and set:
> 
> pcs stonith update vbox-fencing pcmk_host_list="nfsnode1 nfsnode2"
> pcs stonith update vbox-fencing \
>     pcmk_host_map="nfsnode1:centos1;nfsnode2:centos2"
> 
> where nfsnode1 and nfsnode2 are the hostnames
> 
> I'm not sure which config is correct, but both show Failed Actions after
> some time.

You only need one of pcmk_host_list or pcmk_host_map. Use pcmk_host_list
if fence_vbox recognizes the node names used by the cluster, or
pcmk_host_map if fence_vbox knows the nodes by other names. In this
case, it looks like you want to tell fence_vbox to use "centos2" when
the cluster wants to fence nfsnode2, so your pcmk_host_map is the right
choice.
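
For example, something like this (untested sketch, reusing the names from
your own commands; if I remember correctly, pcs removes an option when you
set it to an empty value):

  pcs stonith update vbox-fencing \
      pcmk_host_map="nfsnode1:centos1;nfsnode2:centos2" \
      pcmk_host_list=

You can then check what the device ended up with via
"pcs stonith show vbox-fencing".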

> I've successfully tested the fence connection to the VBox host with:
> fence_vbox --ip 10.0.2.2 --username=AW23321 \
>     --identity-file=/root/.ssh/id_rsa --plug=centos2 --host-os=windows \
>     --action=status \
>     --vboxmanage-path="/cygdrive/c/Program\ Files/Oracle/VirtualBox/VBoxManage"
> 
> Why does the above configuration work as a standalone command but not
> in pcs?

Two main possibilities: either you haven't expressed those same options
correctly in the cluster configuration, or you have some permissions on
the command line that the cluster doesn't have (maybe SELinux, or file
permissions, or ...).
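
For the second case, a quick way to narrow it down (rough sketch; note
that the fence command below will really reboot the target node):

  # look for recent SELinux denials on the node that tried to run the agent
  ausearch -m avc -ts recent

  # have the cluster itself run the agent, then compare its log messages
  # with your working manual fence_vbox invocation
  pcs stonith fence nfsnode2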



