Dear Dejan,

The test sequence is:

1. The service is running on ServerA (tibcodb).
2. The network cable on ServerA is pulled out.
3. ServerB (tibcodb2) fences ServerA; ServerA reboots.
4. ServerB takes over the service.
5. ServerA restarts and the network comes back up.
6. ServerA fences ServerB and takes over the service.
7. ServerB reboots.

My question is: after step 5, when ServerA has restarted and the network is back up, ServerA should not have fenced ServerB again, but it did. What happened between step 5 and step 6?

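For reference, whether a freshly booted node will shoot a peer it has not yet seen is governed by cluster properties rather than by the resources themselves; the "On loss of CCM Quorum: Ignore" line in the log below suggests no-quorum-policy=ignore is in effect here. Our crm_config section is not included in this mail, so the following is only a sketch of how such properties are set with the crm shell, with illustrative values rather than the ones actually configured:

    # Illustrative values only - not taken from the real cluster configuration.
    # no-quorum-policy=ignore lets a partition without quorum keep running (and fencing);
    # startup-fencing (default true) tells the DC to fence nodes it has never seen.
    crm configure property stonith-enabled=true \
        stonith-timeout=60s \
        no-quorum-policy=ignore \
        startup-fencing=true
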
Here's the log of ServerA between step 5 and step 6:

Jun 1 11:30:27 tibcodb syslog-ng[3361]: syslog-ng starting up; version='2.0.9'
Jun 1 11:30:30 tibcodb ifup: lo
Jun 1 11:30:30 tibcodb ifup: lo
Jun 1 11:30:30 tibcodb ifup: IP address: 127.0.0.1/8
Jun 1 11:30:30 tibcodb ifup:
Jun 1 11:30:30 tibcodb ifup:
Jun 1 11:30:30 tibcodb ifup: IP address: 127.0.0.2/8
Jun 1 11:30:30 tibcodb ifup:
Jun 1 11:30:30 tibcodb ifup: eth0 device: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
Jun 1 11:30:30 tibcodb ifup: eth0
Jun 1 11:30:30 tibcodb ifup: IP address: 10.224.1.89/24
Jun 1 11:30:30 tibcodb ifup:
Jun 1 11:30:31 tibcodb SuSEfirewall2: SuSEfirewall2 not active
Jun 1 11:30:31 tibcodb ifup: eth1 device: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
Jun 1 11:30:31 tibcodb ifup: No configuration found for eth1
Jun 1 11:30:32 tibcodb kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363023.852:2): operation="profile_load" name="/bin/ping" name2="default" pid=3061
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363023.872:3): operation="profile_load" name="/sbin/klogd" name2="default" pid=3094
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363023.912:4): operation="profile_load" name="/sbin/syslog-ng" name2="default" pid=3102
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363023.952:5): operation="profile_load" name="/sbin/syslogd" name2="default" pid=3116
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363024.196:6): operation="profile_load" name="/usr/sbin/avahi-daemon" name2="default" pid=3133
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363024.256:7): operation="profile_load" name="/usr/sbin/identd" name2="default" pid=3134
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363024.452:8): operation="profile_load" name="/usr/sbin/mdnsd" name2="default" pid=3135
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363024.521:9): operation="profile_load" name="/usr/sbin/nscd" name2="default" pid=3136
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363024.584:10): operation="profile_load" name="/usr/sbin/ntpd" name2="default" pid=3137
Jun 1 11:30:32 tibcodb kernel: type=1505 audit(1275363024.620:11): operation="profile_load" name="/usr/sbin/traceroute" name2="default" pid=3138
Jun 1 11:30:32 tibcodb kernel: SoftDog: cannot register miscdev on minor=130 (err=-16)
Jun 1 11:30:32 tibcodb kernel: IA-32 Microcode Update Driver: v1.14a <tigran@aivazian.fsnet.co.uk>
Jun 1 11:30:32 tibcodb kernel: firmware: requesting intel-ucode/06-17-0a
Jun 1 11:30:32 tibcodb kernel: firmware: requesting intel-ucode/06-17-0a
Jun 1 11:30:32 tibcodb kernel: firmware: requesting intel-ucode/06-17-0a
Jun 1 11:30:32 tibcodb kernel: firmware: requesting intel-ucode/06-17-0a
Jun 1 11:30:32 tibcodb kernel: firmware: requesting intel-ucode/06-17-0a
Jun 1 11:30:32 tibcodb kernel: firmware: requesting intel-ucode/06-17-0a
Jun 1 11:30:32 tibcodb kernel: firmware: requesting intel-ucode/06-17-0a
Jun 1 11:30:32 tibcodb kernel: firmware: requesting intel-ucode/06-17-0a
Jun 1 11:30:32 tibcodb kernel: NET: Registered protocol family 10
Jun 1 11:30:32 tibcodb kernel: lo: Disabled Privacy Extensions
Jun 1 11:30:32 tibcodb kernel: bnx2: eth0: using MSI
Jun 1 11:30:32 tibcodb kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Jun 1 11:30:33 tibcodb kernel: bnx2: eth0 NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
Jun 1 11:30:33 tibcodb kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Jun 1 11:30:35 tibcodb auditd[4875]: Started dispatcher: /sbin/audispd pid: 4877
Jun 1 11:30:35 tibcodb smartd[4848]: smartd 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM) Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net
Jun 1 11:30:35 tibcodb smartd[4848]: Opened configuration file /etc/smartd.conf
Jun 1 11:30:35 tibcodb audispd: priority_boost_parser called with: 4
Jun 1 11:30:35 tibcodb smartd[4848]: Drive: DEVICESCAN, implied '-a' Directive on line 26 of file /etc/smartd.conf
Jun 1 11:30:35 tibcodb smartd[4848]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sda, opened
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sda, Bad IEC (SMART) mode page, err=5, skip device
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sdb, opened
Jun 1 11:30:35 tibcodb audispd: af_unix plugin initialized
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sdb, IE (SMART) not enabled, skip device Try 'smartctl -s on /dev/sdb' to turn on SMART features
Jun 1 11:30:35 tibcodb audispd: audispd initialized with q_depth=80 and 1 active plugins
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sdc, opened
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sdc, IE (SMART) not enabled, skip device Try 'smartctl -s on /dev/sdc' to turn on SMART features
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sdd, opened
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sdd, IE (SMART) not enabled, skip device Try 'smartctl -s on /dev/sdd' to turn on SMART features
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sde, opened
Jun 1 11:30:35 tibcodb smartd[4848]: Device: /dev/sde, IE (SMART) not enabled, skip device Try 'smartctl -s on /dev/sde' to turn on SMART features
Jun 1 11:30:35 tibcodb smartd[4848]: Unable to monitor any SMART enabled devices. Try debug (-d) option. Exiting...
Jun 1 11:30:36 tibcodb kernel: device-mapper: table: 253:2: multipath: error getting device
Jun 1 11:30:36 tibcodb kernel: device-mapper: ioctl: error adding target to table
Jun 1 11:30:36 tibcodb auditd[4875]: Init complete, auditd 1.7.7 listening for events (startup state disable)
Jun 1 11:30:36 tibcodb kernel: device-mapper: table: 253:2: multipath: error getting device
Jun 1 11:30:36 tibcodb kernel: device-mapper: ioctl: error adding target to table
Jun 1 11:30:36 tibcodb sshd[4959]: Server listening on 0.0.0.0 port 22.
Jun 1 11:30:36 tibcodb sshd[4959]: Server listening on :: port 22.
Jun 1 11:30:37 tibcodb slapd[4951]: @(#) $OpenLDAP: slapd 2.4.12 (Feb 23 2009 18:39:24) $ abuild@crumb:/usr/src/packages/BUILD/openldap-2.4.12/servers/slapd
Jun 1 11:30:38 tibcodb logger: /etc/init.d/xdm: No changes for /etc/X11/xdm/Xservers
Jun 1 11:30:38 tibcodb logger: /etc/init.d/xdm: No changes for /etc/X11/xdm/xdm-config
Jun 1 11:30:39 tibcodb sbd: [5004]: info: tibcodb owns slot 0
Jun 1 11:30:39 tibcodb sbd: [5004]: info: Monitoring slot 0
Jun 1 11:30:39 tibcodb sbd: [5011]: notice: Using watchdog device: /dev/watchdog
Jun 1 11:30:39 tibcodb sbd: [5011]: info: Set watchdog timeout to 5 seconds.
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] AIS Executive Service RELEASE 'subrev 1152 version 0.80'
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] AIS Executive Service: started and ready to provide service.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Token Timeout (5000 ms) retransmit timeout (490 ms)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] token hold (382 ms) retransmits before loss (10 retrans)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] join (1000 ms) send_join (45 ms) consensus (2500 ms) merge (200 ms)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] window size per rotation (50 messages) maximum messages per rotation (20 messages)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] send threads (0 threads)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] RRP token expired timeout (490 ms)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] RRP token problem counter (2000 ms)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] RRP threshold (10 problem count)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] RRP mode set to none.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] heartbeat_failures_allowed (0)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] max_network_delay (50 ms)
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] The network interface [10.224.1.89] is now up.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Created or loaded sequence id 100.10.224.1.89 for this ring.
Jun 1 11:30:39 tibcodb slapd[5010]: hdb_monitor_db_open: monitoring disabled; configure monitor database to enable
Jun 1 11:30:39 tibcodb slapd[5010]: slapd starting
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] entering GATHER state from 15.
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: process_ais_conf: Reading configure
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: config_find_next: Processing additional logging options...
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: get_config_opt: Found 'off' for option: debug
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: get_config_opt: Found 'yes' for option: to_syslog
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: get_config_opt: Found 'daemon' for option: syslog_facility
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: config_find_next: Processing additional service options...
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: get_config_opt: Found 'no' for option: use_logd
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: get_config_opt: Found 'yes' for option: use_mgmtd
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_plugin_init: CRM: Initialized
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] Logging: Initialized pcmk_plugin_init
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_plugin_init: Service: 9
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_plugin_init: Local node id: 1
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_plugin_init: Local hostname: tibcodb
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: update_member: Creating entry for node 1 born on 0
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: update_member: 0x73e2f0 Node 1 now known as tibcodb (was: (null))
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: update_member: Node tibcodb now has 1 quorum votes (was 0)
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: update_member: Node 1/tibcodb is now: member
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: spawn_child: Forked child 5026 for process stonithd
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: spawn_child: Forked child 5027 for process cib
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: spawn_child: Forked child 5028 for process lrmd
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: spawn_child: Forked child 5029 for process attrd
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: spawn_child: Forked child 5030 for process pengine
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: spawn_child: Forked child 5031 for process crmd
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: spawn_child: Forked child 5032 for process mgmtd
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_startup: CRM: Initialized
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] Service initialized 'Pacemaker Cluster Manager'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais extended virtual synchrony service'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais cluster membership service B.01.01'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais availability management framework B.01.01'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais checkpoint service B.01.01'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais event service B.01.01'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais distributed locking service B.01.01'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais message service B.01.01'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais configuration service'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais cluster closed process group service v1.01'
Jun 1 11:30:39 tibcodb openais[5018]: [SERV ] Service initialized 'openais cluster config database access v1.01'
Jun 1 11:30:39 tibcodb openais[5018]: [SYNC ] Not using a virtual synchrony filter.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Creating commit token because I am the rep.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Saving state aru 0 high seq received 0
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Storing new sequence id for ring 68
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] entering COMMIT state.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] entering RECOVERY state.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] position [0] member 10.224.1.89:
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] previous ring seq 100 rep 10.224.1.89
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] aru 0 high delivered 0 received flag 1
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Did not need to originate any messages in recovery.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Sending initial ORF token
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] CLM CONFIGURATION CHANGE
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] New Configuration:
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] Members Left:
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] Members Joined:
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] notice: pcmk_peer_update: Transitional membership event on ring 104: memb=0, new=0, lost=0
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] CLM CONFIGURATION CHANGE
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] New Configuration:
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] r(0) ip(10.224.1.89)
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] Members Left:
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] Members Joined:
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] r(0) ip(10.224.1.89)
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] notice: pcmk_peer_update: Stable membership event on ring 104: memb=1, new=1, lost=0
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_peer_update: NEW: tibcodb 1
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_peer_update: MEMB: tibcodb 1
Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: update_member: Node tibcodb now has process list: 00000000000000000000000000053312 (340754)
Jun 1 11:30:39 tibcodb openais[5018]: [SYNC ] This node is within the primary component and will provide service.
Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] entering OPERATIONAL state.
Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] got nodejoin message 10.224.1.89
Jun 1 11:30:39 tibcodb lrmd: [5028]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Jun 1 11:30:39 tibcodb mgmtd: [5032]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Jun 1 11:30:39 tibcodb mgmtd: [5032]: debug: Enabling coredumps
Jun 1 11:30:39 tibcodb mgmtd: [5032]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Jun 1 11:30:39 tibcodb mgmtd: [5032]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Jun 1 11:30:39 tibcodb pengine: [5030]: info: crm_log_init: Changed active directory to /var/lib/heartbeat/cores/hacluster
Jun 1 11:30:39 tibcodb attrd: [5029]: info: crm_log_init: Changed active directory to /var/lib/heartbeat/cores/hacluster
Jun 1 11:30:39 tibcodb cib: [5027]: info: crm_log_init: Changed active directory to /var/lib/heartbeat/cores/hacluster
Jun 1 11:30:39 tibcodb mgmtd: [5032]: WARN: lrm_signon: can not initiate connection
Jun 1 11:30:39 tibcodb mgmtd: [5032]: info: login to lrm: 0, ret:0
Jun 1 11:30:39 tibcodb attrd: [5029]: info: main: Starting up....
Jun 1 11:30:39 tibcodb attrd: [5029]: info: init_ais_connection: Creating connection to our AIS plugin
Jun 1 11:30:39 tibcodb cib: [5027]: info: G_main_add_TriggerHandler: Added signal manual handler
Jun 1 11:30:39 tibcodb cib: [5027]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jun 1 11:30:39 tibcodb attrd: [5029]: info: init_ais_connection: AIS connection established
Jun 1 11:30:39 tibcodb stonithd: [5026]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Jun 1 11:30:39 tibcodb stonithd: [5026]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Jun 1 11:30:39 tibcodb stonithd: [5026]: info: init_ais_connection: Creating connection to our AIS plugin
Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_ipc: Recorded connection 0x748500 for attrd/5029
Jun 1 11:30:39 tibcodb attrd: [5029]: info: get_ais_nodeid: Server details: id=1 uname=tibcodb
Jun 1 11:30:39 tibcodb attrd: [5029]: info: crm_new_peer: Node tibcodb now has id: 1
Jun 1 11:30:39 tibcodb attrd: [5029]: info: crm_new_peer: Node 1 is now known as tibcodb
Jun 1 11:30:39 tibcodb pengine: [5030]: info: main: Starting pengine
Jun 1 11:30:39 tibcodb crmd: [5031]: info: crm_log_init: Changed active directory to /var/lib/heartbeat/cores/hacluster
Jun 1 11:30:40 tibcodb crmd: [5031]: info: main: CRM Hg Version: 0080ec086ae9c20ad5c4c3562000c0ad68374f0a
Jun 1 11:30:39 tibcodb lrmd: [5028]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jun 1 11:30:39 tibcodb cib: [5027]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
Jun 1 11:30:40 tibcodb crmd: [5031]: info: crmd_init: Starting crmd
Jun 1 11:30:40 tibcodb crmd: [5031]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jun 1 11:30:39 tibcodb stonithd: [5026]: info: init_ais_connection: AIS connection established
Jun 1 11:30:40 tibcodb openais[5018]: [crm ] info: pcmk_ipc: Recorded connection 0x7fc714034210 for stonithd/5026
Jun 1 11:30:40 tibcodb lrmd: [5028]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Jun 1 11:30:40 tibcodb lrmd: [5028]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Jun 1 11:30:40 tibcodb lrmd: [5028]: info: Started.
Jun 1 11:30:40 tibcodb stonithd: [5026]: info: get_ais_nodeid: Server details: id=1 uname=tibcodb
Jun 1 11:30:40 tibcodb stonithd: [5026]: info: crm_new_peer: Node tibcodb now has id: 1
Jun 1 11:30:40 tibcodb stonithd: [5026]: info: crm_new_peer: Node 1 is now known as tibcodb
Jun 1 11:30:40 tibcodb stonithd: [5026]: notice: /usr/lib64/heartbeat/stonithd start up successfully.
Jun 1 11:30:40 tibcodb stonithd: [5026]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jun 1 11:30:40 tibcodb cib: [5027]: info: startCib: CIB Initialization completed successfully
Jun 1 11:30:40 tibcodb cib: [5027]: info: init_ais_connection: Creating connection to our AIS plugin
Jun 1 11:30:40 tibcodb cib: [5027]: info: init_ais_connection: AIS connection established
Jun 1 11:30:40 tibcodb openais[5018]: [crm ] info: pcmk_ipc: Recorded connection 0x7fc714034540 for cib/5027
Jun 1 11:30:40 tibcodb cib: [5027]: info: get_ais_nodeid: Server details: id=1 uname=tibcodb
Jun 1 11:30:40 tibcodb cib: [5027]: info: crm_new_peer: Node tibcodb now has id: 1
Jun 1 11:30:40 tibcodb cib: [5027]: info: crm_new_peer: Node 1 is now known as tibcodb
Jun 1 11:30:40 tibcodb openais[5018]: [crm ] info: pcmk_ipc: Sending membership update 104 to cib
Jun 1 11:30:40 tibcodb cib: [5027]: info: cib_init: Starting cib mainloop
Jun 1 11:30:40 tibcodb cib: [5027]: info: ais_dispatch: Membership 104: quorum still lost
Jun 1 11:30:40 tibcodb cib: [5027]: info: crm_update_peer: Node tibcodb: id=1 state=member (new) addr=r(0) ip(10.224.1.89) (new) votes=1 (new) born=0 seen=104 proc=00000000000000000000000000053312 (new)
Jun 1 11:30:40 tibcodb cib: [5043]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-39.raw
Jun 1 11:30:40 tibcodb mgmtd: [5032]: info: init_crm
Jun 1 11:30:41 tibcodb cib: [5043]: info: write_cib_contents: Wrote version 0.175.0 of the CIB to disk (digest: af622b2c9a5e54f7233a32c58f4dacfc)
Jun 1 11:30:41 tibcodb crmd: [5031]: info: do_cib_control: CIB connection established
Jun 1 11:30:41 tibcodb crmd: [5031]: info: init_ais_connection: Creating connection to our AIS plugin
Jun 1 11:30:41 tibcodb crmd: [5031]: info: init_ais_connection: AIS connection established
Jun 1 11:30:41 tibcodb openais[5018]: [crm ] info: pcmk_ipc: Recorded connection 0x7fc714034770 for crmd/5031
Jun 1 11:30:41 tibcodb openais[5018]: [crm ] info: pcmk_ipc: Sending membership update 104 to crmd
Jun 1 11:30:41 tibcodb crmd: [5031]: info: get_ais_nodeid: Server details: id=1 uname=tibcodb
Jun 1 11:30:41 tibcodb crmd: [5031]: info: crm_new_peer: Node tibcodb now has id: 1
Jun 1 11:30:41 tibcodb crmd: [5031]: info: crm_new_peer: Node 1 is now known as tibcodb
Jun 1 11:30:41 tibcodb crmd: [5031]: info: do_ha_control: Connected to the cluster
Jun 1 11:30:41 tibcodb crmd: [5031]: info: do_started: Delaying start, CCM (0000000000100000) not connected
Jun 1 11:30:41 tibcodb crmd: [5031]: info: crmd_init: Starting crmd's mainloop
Jun 1 11:30:41 tibcodb crmd: [5031]: info: config_query_callback: Checking for expired actions every 900000ms
Jun 1 11:30:41 tibcodb openais[5018]: [crm ] info: update_expected_votes: Expected quorum votes 1024 -> 2
Jun 1 11:30:41 tibcodb crmd: [5031]: info: ais_dispatch: Membership 104: quorum still lost
Jun 1 11:30:41 tibcodb crmd: [5031]: info: crm_update_peer: Node tibcodb: id=1 state=member (new) addr=r(0) ip(10.224.1.89) (new) votes=1 (new) born=0 seen=104 proc=00000000000000000000000000053312 (new)
Jun 1 11:30:41 tibcodb mgmtd: [5032]: debug: main: run the loop...
Jun 1 11:30:41 tibcodb mgmtd: [5032]: info: Started.
Jun 1 11:30:41 tibcodb crmd: [5031]: info: do_started: The local CRM is operational
Jun 1 11:30:41 tibcodb crmd: [5031]: info: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
Jun 1 11:30:41 tibcodb cib: [5043]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.Tt2bKz (digest: /var/lib/heartbeat/crm/cib.V4G9lD)
Jun 1 11:30:42 tibcodb crmd: [5031]: info: ais_dispatch: Membership 104: quorum still lost
Jun 1 11:30:44 tibcodb kernel: eth0: no IPv6 routers present
Jun 1 11:30:44 tibcodb /usr/sbin/cron[5157]: (CRON) STARTUP (V5.0)
Jun 1 11:30:44 tibcodb kernel: bootsplash: status on console 0 changed to on
Jun 1 11:30:48 tibcodb gdm-simple-greeter[5275]: libglade-WARNING: Unexpected element <requires-version> inside <glade-interface>.
Jun 1 11:30:49 tibcodb attrd: [5029]: info: main: Sending full refresh
Jun 1 11:30:49 tibcodb attrd: [5029]: info: main: Starting mainloop...
Jun 1 11:30:52 tibcodb crmd: [5031]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped!
Jun 1 11:30:52 tibcodb crmd: [5031]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Jun 1 11:30:52 tibcodb crmd: [5031]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
Jun 1 11:30:52 tibcodb crmd: [5031]: info: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
Jun 1 11:30:52 tibcodb crmd: [5031]: info: do_te_control: Registering TE UUID: 90a94734-8dd3-438a-b180-9e6ff221cde9
Jun 1 11:30:52 tibcodb crmd: [5031]: WARN: cib_client_add_notify_callback: Callback already present
Jun 1 11:30:52 tibcodb crmd: [5031]: info: set_graph_functions: Setting custom graph functions
Jun 1 11:30:52 tibcodb crmd: [5031]: info: unpack_graph: Unpacked transition -1: 0 actions in 0 synapses
Jun 1 11:30:52 tibcodb crmd: [5031]: info: do_dc_takeover: Taking over DC status for this partition
Jun 1 11:30:52 tibcodb cib: [5027]: info: cib_process_readwrite: We are now in R/W mode
Jun 1 11:30:52 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_master for section 'all' (origin=local/crmd/6, version=0.175.0): ok (rc=0)
Jun 1 11:30:52 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/7, version=0.175.0): ok (rc=0)
Jun 1 11:30:52 tibcodb crmd: [5031]: info: join_make_offer: Making join offers based on membership 104
Jun 1 11:30:52 tibcodb crmd: [5031]: info: do_dc_join_offer_all: join-1: Waiting on 1 outstanding join acks
Jun 1 11:30:52 tibcodb crmd: [5031]: info: ais_dispatch: Membership 104: quorum still lost
Jun 1 11:30:52 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/9, version=0.175.0): ok (rc=0)
Jun 1 11:30:52 tibcodb crmd: [5031]: info: config_query_callback: Checking for expired actions every 900000ms
Jun 1 11:30:52 tibcodb crmd: [5031]: info: update_dc: Set DC to tibcodb (3.0.1)
Jun 1 11:30:52 tibcodb crmd: [5031]: info: ais_dispatch: Membership 104: quorum still lost
Jun 1 11:30:52 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/12, version=0.175.0): ok (rc=0)
Jun 1 11:30:52 tibcodb crmd: [5031]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
Jun 1 11:30:52 tibcodb crmd: [5031]: info: do_state_transition: All 1 cluster nodes responded to the join offer.
Jun 1 11:30:52 tibcodb crmd: [5031]: info: do_dc_join_finalize: join-1: Syncing the CIB from tibcodb to the rest of the cluster
Jun 1 11:30:52 tibcodb crmd: [5031]: info: te_connect_stonith: Attempting connection to fencing daemon...
Jun 1 11:30:52 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/15, version=0.175.0): ok (rc=0)
Jun 1 11:30:52 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/16, version=0.175.0): ok (rc=0)
Jun 1 11:30:53 tibcodb crmd: [5031]: info: te_connect_stonith: Connected
Jun 1 11:30:53 tibcodb crmd: [5031]: info: update_attrd: Connecting to attrd...
Jun 1 11:30:53 tibcodb crmd: [5031]: info: update_attrd: Updating terminate=<none> via attrd for tibcodb
Jun 1 11:30:53 tibcodb crmd: [5031]: info: update_attrd: Updating shutdown=<none> via attrd for tibcodb
Jun 1 11:30:53 tibcodb attrd: [5029]: info: find_hash_entry: Creating hash entry for terminate
Jun 1 11:30:53 tibcodb attrd: [5029]: info: find_hash_entry: Creating hash entry for shutdown
Jun 1 11:30:53 tibcodb crmd: [5031]: info: do_dc_join_ack: join-1: Updating node state to member for tibcodb
Jun 1 11:30:53 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/17, version=0.175.0): ok (rc=0)
Jun 1 11:30:53 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='tibcodb']/transient_attributes (origin=local/crmd/18, version=0.175.0): ok (rc=0)
Jun 1 11:30:53 tibcodb crmd: [5031]: info: erase_xpath_callback: Deletion of "//node_state[@uname='tibcodb']/transient_attributes": ok (rc=0)
Jun 1 11:30:53 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='tibcodb']/lrm (origin=local/crmd/19, version=0.175.0): ok (rc=0)
Jun 1 11:30:53 tibcodb crmd: [5031]: info: erase_xpath_callback: Deletion of "//node_state[@uname='tibcodb']/lrm": ok (rc=0)
Jun 1 11:30:53 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='tibcodb']/lrm (origin=local/crmd/20, version=0.175.0): ok (rc=0)
Jun 1 11:30:53 tibcodb crmd: [5031]: info: erase_xpath_callback: Deletion of "//node_state[@uname='tibcodb']/lrm": ok (rc=0)
Jun 1 11:30:53 tibcodb crmd: [5031]: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
Jun 1 11:30:53 tibcodb crmd: [5031]: info: do_state_transition: All 1 cluster nodes are eligible to run resources.
Jun 1 11:30:53 tibcodb crmd: [5031]: info: do_dc_join_final: Ensuring DC, quorum and node attributes are up-to-date
Jun 1 11:30:53 tibcodb attrd: [5029]: info: attrd_local_callback: Sending full refresh (origin=crmd)
Jun 1 11:30:53 tibcodb crmd: [5031]: info: crm_update_quorum: Updating quorum status to false (call=24)
Jun 1 11:30:53 tibcodb attrd: [5029]: info: attrd_trigger_update: Sending flush op to all hosts for: terminate
Jun 1 11:30:53 tibcodb crmd: [5031]: info: abort_transition_graph: do_te_invoke:190 - Triggered transition abort (complete=1) : Peer Cancelled
Jun 1 11:30:53 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/22, version=0.175.1): ok (rc=0)
Jun 1 11:30:53 tibcodb crmd: [5031]: info: do_pe_invoke: Query 25: Requesting the current CIB: S_POLICY_ENGINE
Jun 1 11:30:53 tibcodb cib: [5027]: info: log_data_element: cib:diff: - <cib have-quorum="1" admin_epoch="0" epoch="175" num_updates="1" />
Jun 1 11:30:53 tibcodb cib: [5027]: info: log_data_element: cib:diff: + <cib have-quorum="0" dc-uuid="tibcodb" admin_epoch="0" epoch="176" num_updates="1" />
Jun 1 11:30:53 tibcodb cib: [5027]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/24, version=0.176.1): ok (rc=0)
Jun 1 11:30:53 tibcodb attrd: [5029]: info: attrd_trigger_update: Sending flush op to all hosts for: shutdown
Jun 1 11:30:53 tibcodb crmd: [5031]: info: abort_transition_graph: need_abort:59 - Triggered transition abort (complete=1) : Non-status change
Jun 1 11:30:53 tibcodb crmd: [5031]: info: need_abort: Aborting on change to have-quorum
Jun 1 11:30:53 tibcodb crmd: [5031]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1275363053-7, seq=104, quorate=0
Jun 1 11:30:53 tibcodb crmd: [5031]: info: do_pe_invoke: Query 26: Requesting the current CIB: S_POLICY_ENGINE
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: unpack_config: On loss of CCM Quorum: Ignore
Jun 1 11:30:53 tibcodb crmd: [5031]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1275363053-8, seq=104, quorate=0
Jun 1 11:30:53 tibcodb pengine: [5030]: info: determine_online_status: Node tibcodb is online
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: clone_print: Clone Set: Connected
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: print_list: Stopped: [ ping:0 ping:1 ]
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: group_print: Resource Group: DB
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: native_print: FileSystem (ocf::heartbeat:Filesystem): Stopped
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: native_print: ServiceIP (ocf::heartbeat:IPaddr): Stopped
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: native_print: Instance (ocf::heartbeat:oracle): Stopped
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: native_print: Listener (ocf::heartbeat:oralsnr): Stopped
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: clone_print: Clone Set: Fence
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: print_list: Stopped: [ sbd-stonith:0 sbd-stonith:1 ]
Jun 1 11:30:53 tibcodb pengine: [5030]: WARN: native_color: Resource ping:1 cannot run anywhere
Jun 1 11:30:53 tibcodb pengine: [5030]: WARN: native_color: Resource sbd-stonith:1 cannot run anywhere
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (10s) for ping:0 on tibcodb
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (20s) for FileSystem on tibcodb
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (5s) for ServiceIP on tibcodb
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (30s) for Instance on tibcodb
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (10s) for Listener on tibcodb
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (15s) for sbd-stonith:0 on tibcodb
Jun 1 11:30:53 tibcodb pengine: [5030]: WARN: stage6: Scheduling Node tibcodb2 for STONITH
Jun 1 11:30:53 tibcodb pengine: [5030]: info: native_start_constraints: Ordering ping:0_start_0 after tibcodb2 recovery
Jun 1 11:30:53 tibcodb pengine: [5030]: info: native_start_constraints: Ordering FileSystem_start_0 after tibcodb2 recovery
Jun 1 11:30:53 tibcodb pengine: [5030]: info: native_start_constraints: Ordering ServiceIP_start_0 after tibcodb2 recovery
Jun 1 11:30:53 tibcodb pengine: [5030]: info: native_start_constraints: Ordering Instance_start_0 after tibcodb2 recovery
Jun 1 11:30:53 tibcodb pengine: [5030]: info: native_start_constraints: Ordering Listener_start_0 after tibcodb2 recovery
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: LogActions: Start ping:0 (tibcodb)
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: LogActions: Leave resource ping:1 (Stopped)
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: LogActions: Start FileSystem (tibcodb)
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: LogActions: Start ServiceIP (tibcodb)
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: LogActions: Start Instance (tibcodb)
Jun 1 11:30:53 tibcodb pengine: [5030]: notice: LogActions: Start Listener (tibcodb)
Jun 1 11:30:53 tibcodb cib: [5287]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-40.raw
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: LogActions: Start sbd-stonith:0 (tibcodb)
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: LogActions: Leave resource sbd-stonith:1 (Stopped)
Jun 1 11:30:54 tibcodb crmd: [5031]: info: handle_response: pe_calc calculation pe_calc-dc-1275363053-7 is obsolete
Jun 1 11:30:54 tibcodb cib: [5287]: info: write_cib_contents: Wrote version 0.176.0 of the CIB to disk (digest: 2b59e1bbb7de885b6f9dbf5fd1a1d1cc)
Jun 1 11:30:54 tibcodb pengine: [5030]: WARN: process_pe_message: Transition 0: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-265.bz2
Jun 1 11:30:54 tibcodb pengine: [5030]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: unpack_config: On loss of CCM Quorum: Ignore
Jun 1 11:30:54 tibcodb pengine: [5030]: info: determine_online_status: Node tibcodb is online
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: clone_print: Clone Set: Connected
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: print_list: Stopped: [ ping:0 ping:1 ]
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: group_print: Resource Group: DB
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: native_print: FileSystem (ocf::heartbeat:Filesystem): Stopped
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: native_print: ServiceIP (ocf::heartbeat:IPaddr): Stopped
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: native_print: Instance (ocf::heartbeat:oracle): Stopped
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: native_print: Listener (ocf::heartbeat:oralsnr): Stopped
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: clone_print: Clone Set: Fence
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: print_list: Stopped: [ sbd-stonith:0 sbd-stonith:1 ]
Jun 1 11:30:54 tibcodb pengine: [5030]: WARN: native_color: Resource ping:1 cannot run anywhere
Jun 1 11:30:54 tibcodb pengine: [5030]: WARN: native_color: Resource sbd-stonith:1 cannot run anywhere
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (10s) for ping:0 on tibcodb
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (20s) for FileSystem on tibcodb
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (5s) for ServiceIP on tibcodb
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (30s) for Instance on tibcodb
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (10s) for Listener on tibcodb
Jun 1 11:30:54 tibcodb pengine: [5030]: notice: RecurringOp: Start recurring monitor (15s) for sbd-stonith:0 on tibcodb
Jun 1 11:30:54 tibcodb pengine: [5030]: WARN: stage6: Scheduling Node tibcodb2 for STONITH
Jun 1 11:30:55 tibcodb cib: [5287]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.cJgG8Z (digest: /var/lib/heartbeat/crm/cib.ILxKlF)
Jun 1 11:30:55 tibcodb pengine: [5030]: info: native_start_constraints: Ordering ping:0_start_0 after tibcodb2 recovery
Jun 1 11:30:55 tibcodb pengine: [5030]: info: native_start_constraints: Ordering FileSystem_start_0 after tibcodb2 recovery
Jun 1 11:30:55 tibcodb pengine: [5030]: info: native_start_constraints: Ordering ServiceIP_start_0 after tibcodb2 recovery
Jun 1 11:30:55 tibcodb pengine: [5030]: info: native_start_constraints: Ordering Instance_start_0 after tibcodb2 recovery
Jun 1 11:30:55 tibcodb pengine: [5030]: info: native_start_constraints: Ordering Listener_start_0 after tibcodb2 recovery
Jun 1 11:30:55 tibcodb pengine: [5030]: notice: LogActions: Start ping:0 (tibcodb)
Jun 1 11:30:55 tibcodb pengine: [5030]: notice: LogActions: Leave resource ping:1 (Stopped)
Jun 1 11:30:55 tibcodb pengine: [5030]: notice: LogActions: Start FileSystem (tibcodb)
Jun 1 11:30:55 tibcodb pengine: [5030]: notice: LogActions: Start ServiceIP (tibcodb)
Jun 1 11:30:56 tibcodb pengine: [5030]: notice: LogActions: Start Instance (tibcodb)
Jun 1 11:30:56 tibcodb pengine: [5030]: notice: LogActions: Start Listener (tibcodb)
Jun 1 11:30:56 tibcodb pengine: [5030]: notice: LogActions: Start sbd-stonith:0 (tibcodb)
Jun 1 11:30:56 tibcodb pengine: [5030]: notice: LogActions: Leave resource sbd-stonith:1 (Stopped)
Jun 1 11:30:56 tibcodb crmd: [5031]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Jun 1 11:30:56 tibcodb crmd: [5031]: info: unpack_graph: Unpacked transition 1: 37 actions in 37 synapses
Jun 1 11:30:56 tibcodb crmd: [5031]: info: do_te_invoke: Processing graph 1 (ref=pe_calc-dc-1275363053-8) derived from /var/lib/pengine/pe-warn-266.bz2
Jun 1 11:30:56 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 4: monitor ping:0_monitor_0 on tibcodb (local)
Jun 1 11:30:56 tibcodb crmd: [5031]: info: do_lrm_rsc_op: Performing key=4:1:7:90a94734-8dd3-438a-b180-9e6ff221cde9 op=ping:0_monitor_0 )
Jun 1 11:30:56 tibcodb lrmd: [5028]: info: rsc:ping:0: monitor
Jun 1 11:30:56 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 5: monitor FileSystem_monitor_0 on tibcodb (local)
Jun 1 11:30:56 tibcodb crmd: [5031]: info: do_lrm_rsc_op: Performing key=5:1:7:90a94734-8dd3-438a-b180-9e6ff221cde9 op=FileSystem_monitor_0 )
Jun 1 11:30:56 tibcodb lrmd: [5028]: info: rsc:FileSystem: monitor
Jun 1 11:30:56 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 6: monitor ServiceIP_monitor_0 on tibcodb (local)
Jun 1 11:30:56 tibcodb crmd: [5031]: info: do_lrm_rsc_op: Performing key=6:1:7:90a94734-8dd3-438a-b180-9e6ff221cde9 op=ServiceIP_monitor_0 )
Jun 1 11:30:56 tibcodb lrmd: [5028]: info: rsc:ServiceIP: monitor
Jun 1 11:30:56 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 7: monitor Instance_monitor_0 on tibcodb (local)
Jun 1 11:30:56 tibcodb crmd: [5031]: info: do_lrm_rsc_op: Performing key=7:1:7:90a94734-8dd3-438a-b180-9e6ff221cde9 op=Instance_monitor_0 )
Jun 1 11:30:56 tibcodb lrmd: [5028]: info: rsc:Instance: monitor
Jun 1 11:30:56 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 8: monitor Listener_monitor_0 on tibcodb (local)
Jun 1 11:30:56 tibcodb crmd: [5031]: info: do_lrm_rsc_op: Performing key=8:1:7:90a94734-8dd3-438a-b180-9e6ff221cde9 op=Listener_monitor_0 )
Jun 1 11:30:56 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 9: monitor sbd-stonith:0_monitor_0 on tibcodb (local)
Jun 1 11:30:56 tibcodb lrmd: [5028]: notice: lrmd_rsc_new(): No lrm_rprovider field in message
Jun 1 11:30:56 tibcodb crmd: [5031]: info: do_lrm_rsc_op: Performing key=9:1:7:90a94734-8dd3-438a-b180-9e6ff221cde9 op=sbd-stonith:0_monitor_0 )
Jun 1 11:30:56 tibcodb pengine: [5030]: WARN: process_pe_message: Transition 1: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-266.bz2
Jun 1 11:30:56 tibcodb pengine: [5030]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
Jun 1 11:30:57 tibcodb crmd: [5031]: info: process_lrm_event: LRM operation ping:0_monitor_0 (call=2, rc=7, cib-update=27, confirmed=true) complete not running
Jun 1 11:30:57 tibcodb lrmd: [5028]: info: RA output: (Instance:monitor:stderr) 2010/06/01_11:30:57 INFO: Oracle environment for SID BpmDBp does not exist
Jun 1 11:30:57 tibcodb crmd: [5031]: info: process_lrm_event: LRM operation Instance_monitor_0 (call=5, rc=7, cib-update=28, confirmed=true) complete not running
Jun 1 11:30:57 tibcodb crmd: [5031]: info: match_graph_event: Action ping:0_monitor_0 (4) confirmed on tibcodb (rc=0)
Jun 1 11:30:57 tibcodb crmd: [5031]: info: match_graph_event: Action Instance_monitor_0 (7) confirmed on tibcodb (rc=0)
Jun 1 11:30:57 tibcodb crmd: [5031]: info: process_lrm_event: LRM operation FileSystem_monitor_0 (call=3, rc=7, cib-update=29, confirmed=true) complete not running
Jun 1 11:30:57 tibcodb crmd: [5031]: info: process_lrm_event: LRM operation ServiceIP_monitor_0 (call=4, rc=7, cib-update=30, confirmed=true) complete not running
Jun 1 11:30:57 tibcodb crmd: [5031]: info: match_graph_event: Action FileSystem_monitor_0 (5) confirmed on tibcodb (rc=0)
Jun 1 11:30:57 tibcodb crmd: [5031]: info: match_graph_event: Action ServiceIP_monitor_0 (6) confirmed on tibcodb (rc=0)
Jun 1 11:30:57 tibcodb lrmd: [5028]: info: rsc:Listener: monitor
Jun 1 11:30:57 tibcodb lrmd: [5028]: info: rsc:sbd-stonith:0: monitor
Jun 1 11:30:57 tibcodb crmd: [5031]: info: process_lrm_event: LRM operation sbd-stonith:0_monitor_0 (call=7, rc=7, cib-update=31, confirmed=true) complete not running
Jun 1 11:30:57 tibcodb crmd: [5031]: info: match_graph_event: Action sbd-stonith:0_monitor_0 (9) confirmed on tibcodb (rc=0)
Jun 1 11:30:57 tibcodb lrmd: [5028]: info: RA output: (Listener:monitor:stderr) 2010/06/01_11:30:57 INFO: Oracle environment for SID BpmDBp does not exist
Jun 1 11:30:57 tibcodb crmd: [5031]: info: process_lrm_event: LRM operation Listener_monitor_0 (call=6, rc=7, cib-update=32, confirmed=true) complete not running
Jun 1 11:30:57 tibcodb crmd: [5031]: info: match_graph_event: Action Listener_monitor_0 (8) confirmed on tibcodb (rc=0)
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 3: probe_complete probe_complete on tibcodb (local) - no waiting
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 2 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 14 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 26 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 42 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 32 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 12 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 41 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 30 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 40 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 28: start sbd-stonith:0_start_0 on tibcodb (local)
Jun 1 11:30:57 tibcodb crmd: [5031]: info: do_lrm_rsc_op: Performing key=28:1:0:90a94734-8dd3-438a-b180-9e6ff221cde9 op=sbd-stonith:0_start_0 )
Jun 1 11:30:57 tibcodb lrmd: [5028]: info: rsc:sbd-stonith:0: start
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 39 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 27 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 24 fired and confirmed
Jun 1 11:30:57 tibcodb lrmd: [5383]: info: Try to start STONITH resource <rsc_id=sbd-stonith:0> : Device=external/sbd
Jun 1 11:30:57 tibcodb stonithd: [5026]: info: sbd-stonith:0 stonith resource started
Jun 1 11:30:57 tibcodb lrmd: [5028]: debug: stonithRA plugin: provider attribute is not needed and will be ignored.
Jun 1 11:30:57 tibcodb crmd: [5031]: info: process_lrm_event: LRM operation sbd-stonith:0_start_0 (call=8, rc=0, cib-update=36, confirmed=true) complete ok
Jun 1 11:30:57 tibcodb crmd: [5031]: info: match_graph_event: Action sbd-stonith:0_start_0 (28) confirmed on tibcodb (rc=0)
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_rsc_command: Initiating action 29: monitor sbd-stonith:0_monitor_15000 on tibcodb (local)
Jun 1 11:30:57 tibcodb crmd: [5031]: info: do_lrm_rsc_op: Performing key=29:1:0:90a94734-8dd3-438a-b180-9e6ff221cde9 op=sbd-stonith:0_monitor_15000 )
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 31 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_pseudo_action: Pseudo action 34 fired and confirmed
Jun 1 11:30:57 tibcodb crmd: [5031]: info: te_fence_node: Executing reboot fencing operation (35) on tibcodb2 (timeout=60000)
Jun 1 11:30:57 tibcodb stonithd: [5026]: info: client tengine [pid: 5031] requests a STONITH operation RESET on node tibcodb2
Jun 1 11:30:57 tibcodb stonithd: [5026]: info: stonith_operate_locally::2683: sending fencing op RESET for tibcodb2 to sbd-stonith:0 (external/sbd) (pid=5398)
Jun 1 11:30:57 tibcodb sbd: [5400]: info: tibcodb2 owns slot 1
Jun 1 11:30:57 tibcodb sbd: [5400]: info: Writing reset to node slot tibcodb2
Jun 1 11:31:07 tibcodb sbd: [5400]: info: reset successfully delivered to tibcodb2
Jun 1 11:31:07 tibcodb stonithd: [5026]: info: Succeeded to STONITH the node tibcodb2: optype=RESET. whodoit: tibcodb

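In case it is useful for correlating with the sbd messages above, the slots on the shared sbd device can be inspected directly on either node. This is only a generic sketch of the sbd command line, assuming the /dev/dm-1 device named in the configuration quoted below:

    # Assumes sbd_device=/dev/dm-1 as in the configuration below; adjust if different.
    sbd -d /dev/dm-1 dump    # print the on-disk header, including the watchdog/msgwait timeouts
    sbd -d /dev/dm-1 list    # list the node slots and any message (e.g. "reset") written to them
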
2010/6/9 Dejan Muhamedagic <dejanmm@fastmail.fm>:

Hi,
On Tue, Jun 08, 2010 at 10:00:37AM +0800, ben180 wrote:<br>
> Dear all,<br>
><br>
> There are two nodes in my customer's environment. We installed SuSE<br>
> Linux Enterprise Server 11 and HAE on the two node. The cluster is for<br>
> oracle database service HA purpose.<br>
> We have set clone resource for pingd, and constraints for detecting if<br>
> network interface is down. And we added a resource group which<br>
> includes Filesystem, IPAddr, oracle, oralsnr primitives, and use sbd<br>
> for fence device.<br>
><br>
> There's our settings :<br>
><br>
> - <resources><br>
> - <clone id="Connected"><br>
> - <meta_attributes id="Connected-meta_attributes"><br>
> <nvpair id="nvpair-c25cd67c-f681-4652-8007-64c0e50fabe6"<br>
> name="clone-max" value="2" /><br>
> <nvpair id="nvpair-28b5d065-7984-424d-b5a8-fb8fc7b2f6dc"<br>
> name="target-role" value="Started" /><br>
> </meta_attributes><br>
> - <primitive class="ocf" id="ping" provider="pacemaker" type="pingd"><br>
> - <meta_attributes id="ping-meta_attributes"><br>
> <nvpair id="nvpair-e7f1cec6-f5a7-4db2-b20b-0002ad31d9fa"<br>
> name="target-role" value="Started" /><br>
> </meta_attributes><br>
> - <operations id="ping-operations"><br>
> <op id="ping-op-monitor-10" interval="10" name="monitor"<br>
> start-delay="1m" timeout="20" /><br>
> </operations><br>
> - <instance_attributes id="ping-instance_attributes"><br>
> <nvpair id="nvpair-cab0ee56-cd6c-47b7-b206-a19bce16d445"<br>
> name="dampen" value="5" /><br>
> <nvpair id="nvpair-c660e572-40ba-4166-9293-6e99f5d024e8"<br>
> name="host_list" value="10.224.1.254" /><br>
> </instance_attributes><br>
> </primitive><br>
> </clone><br>
> - <group id="DB"><br>
> - <meta_attributes id="DB-meta_attributes"><br>
> <nvpair id="nvpair-a0ce4033-555a-40c3-8e92-191552596a97"<br>
> name="target-role" value="Started" /><br>
> </meta_attributes><br>
> - <primitive class="ocf" id="FileSystem" provider="heartbeat" type="Filesystem"><br>
> - <meta_attributes id="FileSystem-meta_attributes"><br>
> <nvpair id="nvpair-6e46d65a-86d4-41d4-9c7f-7ea502ca9f36"<br>
> name="target-role" value="started" /><br>
> </meta_attributes><br>
> - <operations id="FileSystem-operations"><br>
> <op id="FileSystem-op-monitor-20" interval="20" name="monitor"<br>
> start-delay="10" timeout="40" /><br>
> </operations><br>
> - <instance_attributes id="FileSystem-instance_attributes"><br>
> <nvpair id="nvpair-99da66a3-ebdf-4c3b-9647-05a065ff8309"<br>
> name="device" value="/dev/dm-0" /><br>
> <nvpair id="nvpair-c882d532-3fc5-41a4-b1a3-6b03b2b3d54d"<br>
> name="directory" value="/oracle" /><br>
> <nvpair id="nvpair-643ad766-eb95-4667-8b33-452f8266ba10"<br>
> name="fstype" value="ext3" /><br>
> </instance_attributes><br>
> </primitive><br>
> - <primitive class="ocf" id="ServiceIP" provider="heartbeat" type="IPaddr"><br>
> - <meta_attributes id="ServiceIP-meta_attributes"><br>
> <nvpair id="nvpair-03afe5cc-226f-43db-b1e5-ee2f5f1cb66e"<br>
> name="target-role" value="Started" /><br>
> </meta_attributes><br>
> - <operations id="ServiceIP-operations"><br>
> <op id="ServiceIP-op-monitor-5s" interval="5s" name="monitor"<br>
> start-delay="1s" timeout="20s" /><br>
> </operations><br>
> - <instance_attributes id="ServiceIP-instance_attributes"><br>
> <nvpair id="nvpair-10b45737-aa05-4a7f-9469-b1f75e138834" name="ip"<br>
> value="10.224.1.138" /><br>
> </instance_attributes><br>
> </primitive><br>
> - <primitive class="ocf" id="Instance" provider="heartbeat" type="oracle"><br>
> - <meta_attributes id="Instance-meta_attributes"><br>
> <nvpair id="nvpair-2bbbe865-1339-4cbf-8add-8aa107736260"<br>
> name="target-role" value="Started" /><br>
> <nvpair id="nvpair-6cfd1675-ce77-4c05-8031-242ed176b890"<br>
> name="failure-timeout" value="1" /><br>
> <nvpair id="nvpair-e957ff0a-c40e-494d-b691-02d3ea67440b"<br>
> name="migration-threshold" value="1" /><br>
> <nvpair id="nvpair-4283547a-4c34-4b26-b82f-50730dc4c4fa"<br>
> name="resource-stickiness" value="INFINITY" /><br>
> <nvpair id="nvpair-9478fb7c-e1aa-405e-b4a9-c031971bc612"<br>
> name="is-managed" value="true" /><br>
> </meta_attributes><br>
> - <operations id="Instance-operations"><br>
> <op enabled="true" id="Instance-op-monitor-120" interval="30"<br>
> name="monitor" role="Started" start-delay="1m" timeout="240" /><br>
> </operations><br>
> - <instance_attributes id="Instance-instance_attributes"><br>
> <nvpair id="nvpair-30288e1c-e9e9-4360-b658-045f2f353704" name="sid"<br>
> value="BpmDBp" /><br>
> </instance_attributes><br>
> </primitive><br>
> - <primitive class="ocf" id="Listener" provider="heartbeat" type="oralsnr"><br>
> - <meta_attributes id="Listener-meta_attributes"><br>
> <nvpair id="nvpair-f6219b53-5d6a-42cb-8dec-d8a17b0c240c"<br>
> name="target-role" value="Started" /><br>
> <nvpair id="nvpair-ae38b2bd-b3ee-4a5a-baec-0a998ca7742d"<br>
> name="failure-timeout" value="1" /><br>
> </meta_attributes><br>
> - <operations id="Listener-operations"><br>
> <op id="Listener-op-monitor-10" interval="10" name="monitor"<br>
> start-delay="10" timeout="30" /><br>
> </operations><br>
> - <instance_attributes id="Listener-instance_attributes"><br>
> <nvpair id="nvpair-96615bed-b8a1-4385-a61c-0f399225e63e" name="sid"<br>
> value="BpmDBp" /><br>
> </instance_attributes><br>
> </primitive><br>
> </group><br>
> - <clone id="Fence"><br>
> - <meta_attributes id="Fence-meta_attributes"><br>
> <nvpair id="nvpair-59471c10-ec9d-4eb8-becc-ef6d91115614"<br>
> name="clone-max" value="2" /><br>
> <nvpair id="nvpair-6af8cf4a-d96f-449b-9625-09a10c206a5f"<br>
> name="target-role" value="Started" /><br>
> </meta_attributes><br>
> - <primitive class="stonith" id="sbd-stonith" type="external/sbd"><br>
> - <meta_attributes id="sbd-stonith-meta_attributes"><br>
> <nvpair id="nvpair-8c7cfdb7-ec19-4fcc-a7f8-39a9957326d2"<br>
> name="target-role" value="Started" /><br>
> </meta_attributes><br>
> - <operations id="sbd-stonith-operations"><br>
> <op id="sbd-stonith-op-monitor-15" interval="15" name="monitor"<br>
> start-delay="15" timeout="15" /><br>
> </operations><br>
> - <instance_attributes id="sbd-stonith-instance_attributes"><br>
> <nvpair id="nvpair-724e62b3-0b26-4778-807e-a457dbd2fe42"<br>
> name="sbd_device" value="/dev/dm-1" /><br>
> </instance_attributes><br>
> </primitive><br>
> </clone><br>
> </resources><br>
<br>
> After some failover test, we found something strange. Said that if the<br>
> oracle service is running on node1. First we pull out the node1's<br>
> network cable, we can see the node1 is fenced by node2, resulting<br>
> node1 rebooting. Second the oracle db service will failover to node2,<br>
> and this is what we expected. But third, after the node1 boot up and<br>
> the network is up again, node2 is fenced by node1, and finally oracle<br>
> db service is failback to node1---this is NOT what we want.<br>
<br>
Yes, I can imagine that.

> We found after node1 rebooting, it seems that node1 cannot communicate<br>
> with node2 via TOTEM. And node1 uses sbd to fence node2 and get the<br>
> resource back. Is there something wrong with my settings?<br>
<br>
Can't say for sure because cluster properties are missing.

> Or someone<br>
> could give me some advice about this situation?<br>
><br>
> I've attached the log on both nodes and the pacemaker's settings, if<br>
> you can, please pay some attention in node1's log below (tibcodb is<br>
> node1 and tibcodb2 is node2) :<br>
><br>
> Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] Token Timeout (5000 ms)<br>
> retransmit timeout (490 ms)<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [TOTEM] The network interface<br>
> [10.224.1.89] is now up.<br>
> .......................................................<br>
> ......................................................<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] CLM CONFIGURATION CHANGE<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] New Configuration:<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] r(0) ip(10.224.1.89)<br>
> <======================== Why node2 ( tibcodb2 : 10.224.1.90 ) is not<br>
> recognized, and only node1 is recognized?<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] Members Left:<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] Members Joined:<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [CLM ] r(0) ip(10.224.1.89)<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [crm ] notice:<br>
> pcmk_peer_update: Stable membership event on ring 104: memb=1, new=1,<br>
> lost=0<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_peer_update:<br>
> NEW: tibcodb 1<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [crm ] info: pcmk_peer_update:<br>
> MEMB: tibcodb 1<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [MAIN ] info: update_member:<br>
> Node tibcodb now has process list: 00000000000000000000000000053312<br>
> (340754)<br>
> Jun 1 11:30:39 tibcodb openais[5018]: [SYNC ] This node is within the<br>
> primary component and will provide service.<br>
> ........................................<br>
> .........................................<br>
> Jun 1 11:30:57 tibcodb stonithd: [5026]: info: client tengine [pid:<br>
> 5031] requests a STONITH operation RESET on node tibcodb2 <==== Why<br>
> node1 want to fence node2 ?<br>
<br>
Don't know, that part of the logs is missing.

> Jun 1 11:30:57 tibcodb stonithd: [5026]: info:<br>
> stonith_operate_locally::2683: sending fencing op RESET for tibcodb2<br>
> to sbd-stonith:0 (external/sbd) (pid=5398)<br>
> Jun 1 11:30:57 tibcodb sbd: [5400]: info: tibcodb2 owns slot 1<br>
> Jun 1 11:30:57 tibcodb sbd: [5400]: info: Writing reset to node slot tibcodb2<br>
> Jun 1 11:31:07 tibcodb sbd: [5400]: info: reset successfully<br>
> delivered to tibcodb2<br>
><br>
><br>
> Please, any help would be appreciated.<br>
<br>
The most likely reason is that there was something wrong with the
network. But it's really hard to say without full logs and<br>
configuration and so on. You can prepare a hb_report. This being<br>
SLES, best would be to open a call with your representative.<br>
<br>
Thanks,<br>
<br>
Dejan<br>
<br>
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker