[ClusterLabs] fence_vbox Unable to connect/login to fencing device

ArekW arkaduis at gmail.com
Fri Jul 7 06:02:55 UTC 2017


Hi,
I did a little research on the scripts:

/usr/sbin/fence_vbox
def main():
...
conn = fence_login(options)

fence_login is defined in fencing.py and should invoke the function
_login_ssh_with_identity_file:

/usr/share/fence/fencing.py
def _login_ssh_with_identity_file(options):
...
command = '%s %s %s@%s -i %s -p %s' % \
        (options["--ssh-path"], force_ipvx, options["--username"], options["--ip"], \
         options["--identity-file"], options["--ipport"])

The username and ip parameters are used here (not login and ipaddr as in the
fence agent description), so I used:

pcs stonith create vbox-fencing fence_vbox ip=10.0.2.2 username=AW23321
identity_file=/root/.ssh/id_rsa host_os=windows
vboxmanage_path="/cygdrive/c/Program\ Files/Oracle/VirtualBox/VBoxManage"
pcmk_host_map="nfsnode1:centos1;nfsnode2:centos2" ssh=true inet4_only=true
op monitor interval=5 --force

I still got the same warning in messages:
Jul  7 07:52:24 nfsnode1 stonith-ng[6244]: warning: fence_vbox[21564]
stderr: [ Unable to connect/login to fencing device ]
Jul  7 07:52:24 nfsnode1 stonith-ng[6244]: warning: fence_vbox[21564]
stderr: [  ]
Jul  7 07:52:24 nfsnode1 stonith-ng[6244]: warning: fence_vbox[21564]
stderr: [  ]

"Standalone" test is working with the same parameters:
[root@nfsnode1 nfsinfo]# fence_vbox --ip 10.0.2.2 --username=AW23321
--identity-file=/root/.ssh/id_rsa --plug=centos2 --host-os=windows
--action=status --vboxmanage-path="/cygdrive/c/Program\
Files/Oracle/VirtualBox/VBoxManage" -4 -x
Status: ON
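
As far as I understand, the cluster does not call the agent with command-line
flags like the test above; stonith-ng passes the parameters to fence_vbox as
key=value lines on stdin. A rough sketch to reproduce that kind of call (the
option names are my assumption from the resource definition above; fence_vbox
-o metadata lists the accepted names):

import subprocess

# Rough reproduction of a stonith-style call: options go in as key=value
# lines on stdin instead of command-line flags. Names are assumptions taken
# from the stonith resource definition above.
stdin_opts = "\n".join([
    "action=status",
    "ip=10.0.2.2",
    "username=AW23321",
    "identity_file=/root/.ssh/id_rsa",
    "host_os=windows",
    "vboxmanage_path=/cygdrive/c/Program Files/Oracle/VirtualBox/VBoxManage",
    "plug=centos2",
    "ssh=1",
    "inet4_only=1",
]) + "\n"

proc = subprocess.Popen(["/usr/sbin/fence_vbox"], stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate(stdin_opts.encode())
print(proc.returncode)
print(out.decode())
print(err.decode())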

I could add more debug output to the scripts.
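
For example, a temporary line in /usr/share/fence/fencing.py right after the
command is built (a minimal sketch, not tested) would make the exact invocation
show up in those stonith-ng "stderr:" lines:

# inside _login_ssh_with_identity_file(), just after "command = ..." is set
import sys
sys.stderr.write("fence_vbox debug: about to run: %s\n" % command)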


Regards,
Arek

2017-07-06 17:31 GMT+02:00 Ken Gaillot <kgaillot at redhat.com>:

> On 07/06/2017 10:29 AM, Ken Gaillot wrote:
> > On 07/06/2017 10:13 AM, ArekW wrote:
> >> Hi,
> >>
> >> It seems that the fence_vbox resource is running, but there are errors in
> >> the logs every few minutes like:
> >>
> >> Jul  6 12:51:12 nfsnode1 fence_vbox: Unable to connect/login to fencing
> device
> >> Jul  6 12:51:13 nfsnode1 stonith-ng[7899]: warning: fence_vbox[30220]
> >> stderr: [ Unable to connect/login to fencing device ]
> >> Jul  6 12:51:13 nfsnode1 stonith-ng[7899]: warning: fence_vbox[30220]
> >> stderr: [  ]
> >> Jul  6 12:51:13 nfsnode1 stonith-ng[7899]: warning: fence_vbox[30220]
> >> stderr: [  ]
> >>
> >> Eventually, after some time, pcs status shows Failed Actions:
> >>
> >> # pcs status --full
> >> Cluster name: nfscluster
> >> Stack: corosync
> >> Current DC: nfsnode1 (1) (version 1.1.15-11.el7_3.5-e174ec8) -
> >> partition with quorum
> >> Last updated: Thu Jul  6 13:02:52 2017          Last change: Thu Jul
> >> 6 13:00:33 2017 by root via crm_resource on nfsnode1
> >>
> >> 2 nodes and 11 resources configured
> >>
> >> Online: [ nfsnode1 (1) nfsnode2 (2) ]
> >>
> >> Full list of resources:
> >>
> >> Master/Slave Set: StorageClone [Storage]
> >>      Storage    (ocf::linbit:drbd):     Master nfsnode1
> >>      Storage    (ocf::linbit:drbd):     Master nfsnode2
> >>      Masters: [ nfsnode1 nfsnode2 ]
> >> Clone Set: dlm-clone [dlm]
> >>      dlm        (ocf::pacemaker:controld):      Started nfsnode1
> >>      dlm        (ocf::pacemaker:controld):      Started nfsnode2
> >>      Started: [ nfsnode1 nfsnode2 ]
> >> vbox-fencing   (stonith:fence_vbox):   Started nfsnode1
> >> Clone Set: ClusterIP-clone [ClusterIP] (unique)
> >>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started nfsnode1
> >>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started nfsnode2
> >> Clone Set: StorageFS-clone [StorageFS]
> >>      StorageFS  (ocf::heartbeat:Filesystem):    Started nfsnode1
> >>      StorageFS  (ocf::heartbeat:Filesystem):    Started nfsnode2
> >>      Started: [ nfsnode1 nfsnode2 ]
> >> Clone Set: WebSite-clone [WebSite]
> >>      WebSite    (ocf::heartbeat:apache):        Started nfsnode1
> >>      WebSite    (ocf::heartbeat:apache):        Started nfsnode2
> >>      Started: [ nfsnode1 nfsnode2 ]
> >>
> >> Failed Actions:
> >> * vbox-fencing_start_0 on nfsnode1 'unknown error' (1): call=157,
> >> status=Error, exitreason='none',
> >>     last-rc-change='Thu Jul  6 13:58:04 2017', queued=0ms, exec=11947ms
> >> * vbox-fencing_start_0 on nfsnode2 'unknown error' (1): call=57,
> >> status=Error, exitreason='none',
> >>     last-rc-change='Thu Jul  6 13:58:16 2017', queued=0ms, exec=11953ms
> >>
> >> The fence was created with command:
> >> pcs -f stonith_cfg stonith create vbox-fencing fence_vbox ip=10.0.2.2
> >> ipaddr=10.0.2.2 login=AW23321 username=AW23321
> >> identity_file=/root/.ssh/id_rsa host_os=windows
> >> pcmk_host_check=static-list pcmk_host_list="centos1 centos2"
> >> vboxmanage_path="/cygdrive/c/Program\
> >> Files/Oracle/VirtualBox/VBoxManage" op monitor interval=5
> >>
> >> where centos1 and centos2 are the VirtualBox machine names (not hostnames). I
> >> used duplicated login/username parameters because both are indicated as
> >> required in the fence_vbox stonith description.
> >>
> >> Then I updated the configuration and set:
> >>
> >> pcs stonith update vbox-fencing  pcmk_host_list="nfsnode1 nfsnode2"
> >> pcs stonith update vbox-fencing
> >> pcmk_host_map="nfsnode1:centos1;nfsnode2:centos2"
> >>
> >> where nfsnode1 and nfsnode2 are the hostnames
> >>
> >> I'm not sure which config is correct, but both show Failed Actions after
> >> some time.
> >
> > You only need one of pcmk_host_list or pcmk_host_map. Use pcmk_host_list
> > if fence_vbox recognizes the node names used by the cluster, or
> > pcmk_host_map if fence_vbox knows the nodes by other names. In this
> > case, it looks like you want to tell fence_vbox to use "centos2" when
> > the cluster wants to fence nfsnode2, so your pcmk_host_map is the right
> > choice.
> >
> >> I've successfully tested the fence connection to the VBox host with:
> >> fence_vbox --ip 10.0.2.2 --username=AW23321
> >> --identity-file=/root/.ssh/id_rsa --plug=centos2 --host-os=windows
> >> --action=status --vboxmanage-path="/cygdrive/c/Program\
> >> Files/Oracle/VirtualBox/VBoxManage"
> >>
> >> Why does the above configuration work as a standalone command but not
> >> in pcs?
> > Two main possibilities: you haven't expressed those identical options in
> > the cluster configuration correctly; or, you have some permissions on
> > the command line that the cluster doesn't have (maybe SELinux, or file
> > permissions, or ...).
>
> Forgot one other possibility: the status shows that the *start* action
> is what failed, not a fence action. Check the fence_vbox source code to
> see what start does, and try to do that manually step by step.
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>