[Pacemaker] Stonith issue with fence_virsh

Beo Banks beo.banks at googlemail.com
Thu Oct 24 03:38:02 EDT 2013


Hi,

I have enabled the debug option, and I now use the IP address instead of the hostname:

primitive stonith-zarafa02 stonith:fence_virsh \
        params pcmk_host_list="zarafa02" pcmk_host_check="static-list" \
               action="reboot" ipaddr="*.*.*.*" secure="true" login="root" \
               identity_file="/root/.ssh/id_rsa" debug="/var/log/stonith.log" \
               verbose="true" \


It seems that fence_virsh can establish the connection, but it doesn't
issue the reboot command when it is run by Pacemaker.
Is there any way to find out how Pacemaker runs the command?

/var/log/stonith.log
[EXPECT]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 16    zarafa02                   running

[EXPECT]# ^C
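
To see at a glance which commands were actually sent in a log like this one, a small sketch can help. It assumes that every sent command follows an "[EXPECT]# " prompt, as in the excerpts in this mail; sent_commands is a hypothetical helper name, not part of fence-agents.

```shell
#!/bin/sh
# Sketch: list the virsh commands recorded in a fence_virsh debug log.
# Assumption: each command the agent sent appears after an "[EXPECT]# "
# prompt at the start of a line, as in the excerpts in this mail.
sent_commands() {
    # $1 = path to the debug log
    sed -n 's/^\[EXPECT]# \(virsh .*\)$/\1/p' "$1"
}

# Usage:
#   sent_commands /var/log/stonith.log
```

For the excerpt above this would print only "virsh list --all", which makes the missing destroy/start steps easy to spot.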



It should look like this (as seen when run from the command line):

[EXPECT]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 16    zarafa02                   running

[EXPECT]# virsh destroy zarafa02
Domain zarafa02 destroyed

[EXPECT]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     zarafa02                   shut off

[EXPECT]# virsh start zarafa02
Domain zarafa02 started

[EXPECT]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 17    zarafa02                   running

[EXPECT]# Success: Rebooted
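
One way to get closer to how Pacemaker drives the agent, as a sketch rather than a confirmed recipe: fence agents also accept their options as name=value lines on standard input, and as far as I know that is how stonith-ng hands over the params of the primitive. The helper below only prints such an option block. fence_stdin_options is a hypothetical name, and port=<domain> is an assumption about what gets passed for the victim node.

```shell
#!/bin/sh
# Sketch: print fence_virsh options in the name=value form that fence
# agents read from standard input. The parameter names are the ones used
# in the primitive in this mail; "port" (the libvirt domain to fence) is
# an assumption about what is passed for the victim node.
fence_stdin_options() {
    # $1 = libvirt domain name, $2 = hypervisor address
    printf '%s\n' \
        "action=reboot" \
        "ipaddr=$2" \
        "login=root" \
        "secure=true" \
        "identity_file=/root/.ssh/id_rsa" \
        "port=$1"
}

# Usage (as root on the cluster node; this really reboots the guest):
#   fence_stdin_options zarafa02 <hypervisor-ip> | fence_virsh
```

If the agent behaves differently with options on stdin than with the command-line switches, that difference should narrow down the problem. To exercise the full path through stonith-ng instead, `stonith_admin --reboot zarafa02` can be used; both variants really reboot the guest.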





2013/10/23 Beo Banks <beo.banks at googlemail.com>

> I use /etc/hosts.
>
> ping host01 works.
>
>
> Maybe that's the reason: I tested fence_virsh as root.
>
> primitive stonith-zarafa02 stonith:fence_virsh \
>         params pcmk_host_list="zarafa02" pcmk_host_check="static-list" \
>                action="reboot" ipaddr="host02" secure="true" login="root" \
>                identity_file="/root/.ssh/id_rsa" \
>         op monitor interval="300s" \
>         op start interval="0" timeout="60s" \
>         meta failure-timeout="180s"
>
>
> As which user does Pacemaker run the fence device? As root?
>
> From the command line (works):
> node1# fence_virsh -a host02 -l root -x -k /root/.ssh/id_rsa -o reboot -v -n zarafa02
>
> Is there a mistake in my config?
>
>
>
>
>
> Oct 23 13:23:54 zarafa02 corosync[1700]:   [MAIN  ] Corosync Cluster
> Engine ('1.4.1'): started and ready to provide service.
> Oct 23 13:23:54 zarafa02 corosync[1700]:   [MAIN  ] Corosync built-in
> features: nss dbus rdma snmp
> Oct 23 13:23:54 zarafa02 corosync[1700]:   [MAIN  ] Successfully read main
> configuration file '/etc/corosync/corosync.conf'.
> Oct 23 13:23:54 zarafa02 corosync[1700]:   [TOTEM ] Initializing transport
> (UDP/IP Multicast).
> Oct 23 13:23:54 zarafa02 corosync[1700]:   [TOTEM ] Initializing
> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Oct 23 13:23:54 zarafa02 corosync[1700]:   [TOTEM ] Initializing transport
> (UDP/IP Multicast).
> Oct 23 13:23:54 zarafa02 corosync[1700]:   [TOTEM ] Initializing
> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [TOTEM ] The network interface
> [10.0.0.22] is now up.
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> process_ais_conf: Reading configure
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] ERROR:
> process_ais_conf: You have configured a cluster using the Pacemaker plugin
> for Corosync. The plugin is not supported in this environment and will be
> removed very soon.
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] ERROR:
> process_ais_conf:  Please see Chapter 8 of 'Clusters from Scratch' (
> http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> config_find_init: Local handle: 5880381755227111425 for logging
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> config_find_next: Processing additional logging options...
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Found 'off' for option: debug
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Found 'yes' for option: to_logfile
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Found '/var/log/cluster/corosync.log' for option: logfile
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Found 'yes' for option: to_syslog
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Defaulting to 'daemon' for option: syslog_facility
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> config_find_init: Local handle: 4835695805891346434 for quorum
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> config_find_next: No additional configuration supplied for: quorum
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> No default for option: provider
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> config_find_init: Local handle: 4552499517957603331 for service
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> config_find_next: Processing additional service options...
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Found '1' for option: ver
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> process_ais_conf: Enabling MCP mode: Use the Pacemaker init script to
> complete Pacemaker startup
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Defaulting to 'pcmk' for option: clustername
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Defaulting to 'no' for option: use_logd
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: get_config_opt:
> Defaulting to 'no' for option: use_mgmtd
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_startup:
> CRM: Initialized
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] Logging: Initialized
> pcmk_startup
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_startup:
> Maximum core file size is: 18446744073709551615
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_startup:
> Service: 9
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_startup:
> Local hostname: zarafa02
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info:
> pcmk_update_nodeid: Local node id: 369098762
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> Creating entry for node 369098762 born on 0
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> 0x141afc0 Node 369098762 now known as zarafa02 (was: (null))
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> Node zarafa02 now has 1 quorum votes (was 0)
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> Node 369098762/zarafa02 is now: member
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [SERV  ] Service engine loaded:
> Pacemaker Cluster Manager 1.1.8
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [SERV  ] Service engine loaded:
> corosync extended virtual synchrony service
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [SERV  ] Service engine loaded:
> corosync configuration service
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [SERV  ] Service engine loaded:
> corosync cluster closed process group service v1.01
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [SERV  ] Service engine loaded:
> corosync cluster config database access v1.01
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [SERV  ] Service engine loaded:
> corosync profile loading service
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [SERV  ] Service engine loaded:
> corosync cluster quorum service v0.1
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [MAIN  ] Compatibility mode set
> to whitetank.  Using V1 and V2 of the synchronization engine.
> Oct 23 13:23:55 zarafa02 corosync[1700]:   [TOTEM ] The network interface
> [0.0.0.0] is now up.
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [TOTEM ] Incrementing problem
> counter for seqid 1 iface 0.0.0.0to [1 of 10]
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [TOTEM ] Incrementing problem
> counter for seqid 1 iface 0.0.0.0to [1 of 10]
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] notice:
> pcmk_peer_update: Transitional membership event on ring 268: memb=0, new=0,
> lost=0
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] notice:
> pcmk_peer_update: Stable membership event on ring 268: memb=1, new=1, lost=0
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info:
> pcmk_peer_update: NEW:  zarafa02 369098762
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info:
> pcmk_peer_update: MEMB: zarafa02 369098762
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [TOTEM ] A processor joined or
> left the membership and a new membership was formed.
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [CPG   ] chosen downlist:
> sender r(0) ip(10.0.0.22) r(1) ip(0.0.0.0) ; members(old:0 left:0)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [MAIN  ] Completed service
> synchronization, ready to provide service.
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] notice:
> pcmk_peer_update: Transitional membership event on ring 272: memb=1, new=0,
> lost=0
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info:
> pcmk_peer_update: memb: zarafa02 369098762
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] notice:
> pcmk_peer_update: Stable membership event on ring 272: memb=2, new=1, lost=0
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> Creating entry for node 352321546 born on 272
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> Node 352321546/unknown is now: member
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info:
> pcmk_peer_update: NEW:  .pending. 352321546
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info:
> pcmk_peer_update: MEMB: .pending. 352321546
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info:
> pcmk_peer_update: MEMB: zarafa02 369098762
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info:
> send_member_notification: Sending membership update 272 to 0 children
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> 0x141afc0 Node 369098762 ((null)) born on: 272
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [TOTEM ] A processor joined or
> left the membership and a new membership was formed.
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> 0x14247d0 Node 352321546 (zarafa01) born on: 248
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> 0x14247d0 Node 352321546 now known as zarafa01 (was: (null))
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info: update_member:
> Node zarafa01 now has 1 quorum votes (was 0)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] info:
> send_member_notification: Sending membership update 272 to 0 children
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.crmd failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [CPG   ] chosen downlist:
> sender r(0) ip(10.0.0.21) r(1) ip(0.0.0.0) ; members(old:1 left:0)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [MAIN  ] Completed service
> synchronization, ready to provide service.
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.crmd failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:23:56 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:23:58 zarafa02 corosync[1700]:   [TOTEM ] ring 1 active with no
> faults(1)
> Oct 23 13:23:59 zarafa02 kernel: drbd: initialized. Version: 8.3.15
> (api:88/proto:86-97)
> Oct 23 13:23:59 zarafa02 kernel: drbd: GIT-hash:
> 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by phil at Build64R6,
> 2012-12-20 20:09:51
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: Starting worker thread (from
> cqueue [1987])
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: disk( Diskless -> Attaching )
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: Found 4 transactions (11
> active extents) in activity log.
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: Method to ensure write
> ordering: flush
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: max BIO size = 131072
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: drbd_bm_resize called with
> capacity == 524270872
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: resync bitmap: bits=65533859
> words=1023967 pages=2000
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: size = 250 GB (262135436 KB)
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: bitmap READ of 2000 pages
> took 11 jiffies
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: recounting of set bits took
> additional 8 jiffies
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: 0 KB (0 bits) marked
> out-of-sync by on disk bit-map.
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: disk( Attaching ->
> Consistent )
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: attached to UUIDs
> 455F3C39E924BE3A:0000000000000000:6F4C52EE2A21D036:6F4B52EE2A21D037
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: Starting worker thread (from
> cqueue [1987])
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: disk( Diskless -> Attaching )
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: Found 4 transactions (6
> active extents) in activity log.
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: Method to ensure write
> ordering: flush
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: max BIO size = 131072
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: drbd_bm_resize called with
> capacity == 1258251896
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: resync bitmap:
> bits=157281487 words=2457524 pages=4800
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: size = 600 GB (629125948 KB)
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: bitmap READ of 4800 pages
> took 26 jiffies
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: recounting of set bits took
> additional 21 jiffies
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: 0 KB (0 bits) marked
> out-of-sync by on disk bit-map.
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: disk( Attaching ->
> Consistent )
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: attached to UUIDs
> C3F30C77423EA3D4:0000000000000000:6120BBFBBB30210C:611FBBFBBB30210D
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: conn( StandAlone ->
> Unconnected )
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: Starting receiver thread
> (from drbd0_worker [1998])
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: receiver (re)started
> Oct 23 13:23:59 zarafa02 kernel: block drbd0: conn( Unconnected ->
> WFConnection )
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: conn( StandAlone ->
> Unconnected )
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: Starting receiver thread
> (from drbd1_worker [2017])
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: receiver (re)started
> Oct 23 13:23:59 zarafa02 kernel: block drbd1: conn( Unconnected ->
> WFConnection )
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: Handshake successful: Agreed
> network protocol version 97
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: conn( WFConnection ->
> WFReportParams )
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: Starting asender thread
> (from drbd0_receiver [2033])
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: data-integrity-alg:
> <not-used>
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: drbd_sync_handshake:
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: self
> 455F3C39E924BE3A:0000000000000000:6F4C52EE2A21D036:6F4B52EE2A21D037 bits:0
> flags:0
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: peer
> 857412ABB0CE0739:455F3C39E924BE3B:6F4C52EE2A21D037:6F4B52EE2A21D037 bits:0
> flags:0
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: uuid_compare()=-1 by rule 50
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: peer( Unknown -> Primary )
> conn( WFReportParams -> WFBitMapT ) disk( Consistent -> Outdated ) pdsk(
> DUnknown -> UpToDate )
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: Handshake successful: Agreed
> network protocol version 97
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: conn( WFConnection ->
> WFReportParams )
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: Starting asender thread
> (from drbd1_receiver [2037])
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: data-integrity-alg:
> <not-used>
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: drbd_sync_handshake:
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: self
> C3F30C77423EA3D4:0000000000000000:6120BBFBBB30210C:611FBBFBBB30210D bits:0
> flags:0
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: peer
> 227D17F2074428C9:C3F30C77423EA3D5:6120BBFBBB30210D:611FBBFBBB30210D bits:0
> flags:0
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: uuid_compare()=-1 by rule 50
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: peer( Unknown -> Primary )
> conn( WFReportParams -> WFBitMapT ) disk( Consistent -> Outdated ) pdsk(
> DUnknown -> UpToDate )
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: conn( WFBitMapT ->
> WFSyncUUID )
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: updated sync uuid
> 45603C39E924BE3A:0000000000000000:6F4C52EE2A21D036:6F4B52EE2A21D037
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: helper command:
> /sbin/drbdadm before-resync-target minor-0
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: helper command:
> /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: Began resync as SyncTarget
> (will sync 0 KB [0 bits set]).
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: Resync done (total 1 sec;
> paused 0 sec; 0 K/sec)
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: updated UUIDs
> 857412ABB0CE0738:0000000000000000:45603C39E924BE3A:455F3C39E924BE3B
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: conn( SyncTarget ->
> Connected ) disk( Inconsistent -> UpToDate )
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: helper command:
> /sbin/drbdadm after-resync-target minor-0
> Oct 23 13:24:00 zarafa02 crm-unfence-peer.sh[2108]: invoked for mysql
> Oct 23 13:24:00 zarafa02 cibadmin[2112]:   notice: crm_log_args: Invoked:
> cibadmin -Ql
> Oct 23 13:24:00 zarafa02 crm-unfence-peer.sh[2108]: Could not establish
> cib_rw connection: Connection refused (111)
> Oct 23 13:24:00 zarafa02 crm-unfence-peer.sh[2108]: Signon to CIB failed:
> Transport endpoint is not connected
> Oct 23 13:24:00 zarafa02 crm-unfence-peer.sh[2108]: Init failed, could not
> perform requested operations
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: helper command:
> /sbin/drbdadm after-resync-target minor-0 exit code 1 (0x100)
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: bitmap WRITE of 2000 pages
> took 20 jiffies
> Oct 23 13:24:00 zarafa02 kernel: block drbd0: 0 KB (0 bits) marked
> out-of-sync by on disk bit-map.
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: conn( WFBitMapT ->
> WFSyncUUID )
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: updated sync uuid
> C3F40C77423EA3D4:0000000000000000:6120BBFBBB30210C:611FBBFBBB30210D
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: helper command:
> /sbin/drbdadm before-resync-target minor-1
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: helper command:
> /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: conn( WFSyncUUID ->
> SyncTarget ) disk( Outdated -> Inconsistent )
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: Began resync as SyncTarget
> (will sync 0 KB [0 bits set]).
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: Resync done (total 1 sec;
> paused 0 sec; 0 K/sec)
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: updated UUIDs
> 227D17F2074428C8:0000000000000000:C3F40C77423EA3D4:C3F30C77423EA3D5
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: conn( SyncTarget ->
> Connected ) disk( Inconsistent -> UpToDate )
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: helper command:
> /sbin/drbdadm after-resync-target minor-1
> Oct 23 13:24:00 zarafa02 crm-unfence-peer.sh[2121]: invoked for zarafa
> Oct 23 13:24:00 zarafa02 cibadmin[2129]:   notice: crm_log_args: Invoked:
> cibadmin -Ql
> Oct 23 13:24:00 zarafa02 crm-unfence-peer.sh[2121]: Could not establish
> cib_rw connection: Connection refused (111)
> Oct 23 13:24:00 zarafa02 crm-unfence-peer.sh[2121]: Signon to CIB failed:
> Transport endpoint is not connected
> Oct 23 13:24:00 zarafa02 crm-unfence-peer.sh[2121]: Init failed, could not
> perform requested operations
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: helper command:
> /sbin/drbdadm after-resync-target minor-1 exit code 1 (0x100)
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: bitmap WRITE of 4800 pages
> took 183 jiffies
> Oct 23 13:24:00 zarafa02 kernel: block drbd1: 0 KB (0 bits) marked
> out-of-sync by on disk bit-map.
> Oct 23 13:24:00 zarafa02 abrtd: Init complete, entering main loop
> Oct 23 13:24:01 zarafa02 pacemakerd[2252]:   notice: crm_add_logfile:
> Additional logging available in /var/log/cluster/corosync.log
> Oct 23 13:24:01 zarafa02 pacemakerd[2252]:   notice: main: Starting
> Pacemaker 1.1.8-7.el6 (Build: 394e906):  generated-manpages agent-manpages
> ascii-docs publican-docs ncurses libqb-logging libqb-ipc  corosync-plugin
> cman
> Oct 23 13:24:01 zarafa02 pacemakerd[2252]:   notice: get_local_node_name:
> Defaulting to uname(2).nodename for the local classic openais (with plugin)
> node name
> Oct 23 13:24:01 zarafa02 pacemakerd[2252]:   notice:
> update_node_processes: 0x8e76f0 Node 369098762 now known as zarafa02, was:
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:24:01 zarafa02 pacemakerd[2252]:   notice:
> update_node_processes: 0x8e82a0 Node 352321546 now known as zarafa01, was:
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.crmd failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.stonith-ng failed: ipc delivery
> failed (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:24:01 zarafa02 cib[2258]:   notice: crm_add_logfile: Additional
> logging available in /var/log/cluster/corosync.log
> Oct 23 13:24:01 zarafa02 stonith-ng[2259]:   notice: crm_add_logfile:
> Additional logging available in /var/log/cluster/corosync.log
> Oct 23 13:24:01 zarafa02 stonith-ng[2259]:   notice: crm_cluster_connect:
> Connecting to cluster infrastructure: classic openais (with plugin)
> Oct 23 13:24:01 zarafa02 pengine[2262]:   notice: crm_add_logfile:
> Additional logging available in /var/log/cluster/corosync.log
> Oct 23 13:24:01 zarafa02 attrd[2261]:   notice: crm_add_logfile:
> Additional logging available in /var/log/cluster/corosync.log
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_ipc:
> Recorded connection 0x1430ee0 for stonith-ng/0
> Oct 23 13:24:01 zarafa02 attrd[2261]:   notice: crm_cluster_connect:
> Connecting to cluster infrastructure: classic openais (with plugin)
> Oct 23 13:24:01 zarafa02 lrmd[2260]:   notice: crm_add_logfile: Additional
> logging available in /var/log/cluster/corosync.log
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] WARN:
> route_ais_message: Sending message to local.cib failed: ipc delivery failed
> (rc=-2)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_ipc:
> Recorded connection 0x1435260 for attrd/0
> Oct 23 13:24:01 zarafa02 crmd[2263]:   notice: crm_add_logfile: Additional
> logging available in /var/log/cluster/corosync.log
> Oct 23 13:24:01 zarafa02 crmd[2263]:   notice: main: CRM Git Version:
> 394e906
> Oct 23 13:24:01 zarafa02 attrd[2261]:   notice: main: Starting mainloop...
> Oct 23 13:24:01 zarafa02 cib[2258]:   notice: crm_cluster_connect:
> Connecting to cluster infrastructure: classic openais (with plugin)
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_ipc:
> Recorded connection 0x14395e0 for cib/0
> Oct 23 13:24:01 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_ipc:
> Sending membership update 272 to cib
> Oct 23 13:24:01 zarafa02 cib[2258]:   notice: ais_dispatch_message:
> Membership 272: quorum acquired
> Oct 23 13:24:01 zarafa02 cib[2258]:   notice: crm_update_peer_state:
> crm_update_ais_node: Node zarafa01[352321546] - state is now member
> Oct 23 13:24:01 zarafa02 cib[2258]:   notice: crm_update_peer_state:
> crm_update_ais_node: Node zarafa02[369098762] - state is now member
> Oct 23 13:24:02 zarafa02 stonith-ng[2259]:   notice: setup_cib: Watching
> for stonith topology changes
> Oct 23 13:24:02 zarafa02 crmd[2263]:   notice: crm_cluster_connect:
> Connecting to cluster infrastructure: classic openais (with plugin)
> Oct 23 13:24:02 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_ipc:
> Recorded connection 0x1440080 for crmd/0
> Oct 23 13:24:02 zarafa02 corosync[1700]:   [pcmk  ] info: pcmk_ipc:
> Sending membership update 272 to crmd
> Oct 23 13:24:02 zarafa02 crmd[2263]:   notice: ais_dispatch_message:
> Membership 272: quorum acquired
> Oct 23 13:24:02 zarafa02 crmd[2263]:   notice: crm_update_peer_state:
> crm_update_ais_node: Node zarafa01[352321546] - state is now member
> Oct 23 13:24:02 zarafa02 crmd[2263]:   notice: crm_update_peer_state:
> crm_update_ais_node: Node zarafa02[369098762] - state is now member
> Oct 23 13:24:02 zarafa02 crmd[2263]:   notice: do_started: The local CRM
> is operational
> Oct 23 13:24:03 zarafa02 stonith-ng[2259]:   notice:
> stonith_device_register: Added 'stonith-zarafa01' to the device list (1
> active devices)
> Oct 23 13:24:04 zarafa02 stonith-ng[2259]:   notice:
> stonith_device_register: Added 'stonith-zarafa02' to the device list (2
> active devices)
> Oct 23 13:24:04 zarafa02 crmd[2263]:   notice: do_state_transition: State
> transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE
> origin=do_cl_join_finalize_respond ]
>
>
>
> 2013/10/23 Michael Schwartzkopff <ms at sys4.de>
>
>> On Wednesday, 23 October 2013, 12:39:35, Beo Banks wrote:
>> > hi,
>> >
>> > thanks for the answer.
>> >
>> > pacemaker and corosync are running on both nodes.
>> >
>> > chkconfig | grep corosync
>> > corosync        0:off   1:off   2:on    3:on    4:on    5:on    6:off
>> > chkconfig | grep pacemaker
>> > pacemaker       0:off   1:off   2:on    3:on    4:on    5:on    6:off
>>
>> Looks good. What does the boot log say? Any signs of corosync?
>>
>> > @ssh key
>> > No, I created the keys without a passphrase.
>> > Maybe the config is wrong, but I have checked it many times and I can't
>> > find any issue.
>>
>> Is the key readable by the user that executes the fencing command? Is the
>> host list correct? How does the fencing agent know to fence "host2" (see
>> your command line)? How does it know what IP address to use? Is your
>> /etc/hosts correct?
>>
>> > btw. your book "clusterbau: hochverfügbarkeit mit linux version3" is very
>> > helpful.
>>
>> Thanks. If you like it, you could write a 5-star review on Amazon!
>>
>> > btw. SELinux is disabled, and iptables can't be the reason because
>> > fence_virsh works from the command line, right?
>>
>> It uses virsh for communication. That depends on how you set it up, but
>> normally it goes over SSH. Check your setup.
>>
>> Kind regards,
>>
>> Michael Schwartzkopff
>>
>> --
>> [*] sys4 AG
>>
>> http://sys4.de, +49 (89) 30 90 46 64, +49 (162) 165 0044
>> Franziskanerstraße 15, 81669 München
>>
>> Registered office: München, Amtsgericht München: HRB 199263
>> Executive board: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
>> Chairman of the supervisory board: Florian Kirstein
>>
>