<div dir="ltr">Hello,<div><br></div><div>The context : </div><div> Red Hat Enterprise Linux Server release 5.7<br></div><div> corosynclib-1.2.7-1.1.el5.x86_64</div><div> corosync-1.2.7-1.1.el5.x86_64</div><div> pacemaker-1.0.10-1.4.el5.x86_64</div><div> pacemaker-libs-1.0.10-1.4.el5.x86_64</div><div> 2 nodes, both on same ESX server</div><div><br></div><div>I've lost of processor joined of left the membership message but can't understand why, because the 2 hosts are up and running, and when the corosync try to start the cluster's ressource he can't because the are already up on the first node. </div><div>We can see "Another DC detected" so the communication between the 2 VM is OK.</div><div><br></div><div>I've tried to raise totem parameter, without success.</div><div><br></div><div>Here are some log's extract :<br></div><div><br></div><div>service corosync restart at 11:35 :</div><div><br></div><div>grep TOTEM corosync.log</div><div><div>Apr 10 11:35:56 corosync [TOTEM ] Initializing transport (UDP/IP).</div><div>Apr 10 11:35:56 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).</div><div>Apr 10 11:35:56 corosync [TOTEM ] The network interface [10.10.72.7] is now up.</div><div>Apr 10 11:35:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 11:35:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:13:07 corosync [TOTEM ] A processor failed, forming new configuration.</div><div>Apr 10 13:13:08 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:13:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:13:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:31:39 corosync [TOTEM ] A processor failed, forming new configuration.</div><div>Apr 10 13:31:40 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:31:41 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:34:53 corosync [TOTEM ] A processor failed, forming new configuration.</div><div>Apr 10 13:34:54 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:34:55 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:34:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:47:59 corosync [TOTEM ] A processor failed, forming new configuration.</div><div>Apr 10 13:48:00 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:48:01 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:48:01 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:55:35 corosync [TOTEM ] A processor failed, forming new configuration.</div><div>Apr 10 13:55:36 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:55:37 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:55:38 corosync [TOTEM ] A processor failed, forming new configuration.</div><div>Apr 10 13:55:39 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:55:42 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:57:54 corosync [TOTEM ] A processor failed, forming new configuration.</div><div>Apr 10 13:57:55 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 13:57:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 14:01:03 corosync [TOTEM ] A processor failed, forming new configuration.</div><div>Apr 10 14:01:04 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 14:01:05 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 14:01:06 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div><div>Apr 10 14:01:06 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.</div></div><div><br></div><div>This comes so often !</div><div><br></div><div><br></div><div>grep ERROR corosync.log<br></div><div><div>Apr 10 13:13:10 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: crmd_ha_msg_filter: Another DC detected: vif5_7 (op=noop)</div><div>Apr 10 13:31:42 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: crmd_ha_msg_filter: Another DC detected: vif5_7 (op=noop)</div><div>Apr 10 13:34:55 <a href="http://host2.exemple.com">host2.exemple.com</a> pengine: [26529]: ERROR: unpack_rsc_op: Hard error - routing-jboss_stop_0 failed with rc=2: Preventing routing-jboss from re-starting on <a href="http://host2.exemple.com">host2.exemple.com</a></div><div>Apr 10 13:34:55 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: te_graph_trigger: Transition failed: terminated</div><div>Apr 10 13:34:56 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: crmd_ha_msg_filter: Another DC detected: vif5_7 (op=noop)</div><div>Apr 10 13:48:01 <a href="http://host2.exemple.com">host2.exemple.com</a> pengine: [26529]: ERROR: unpack_rsc_op: Hard error - routing-jboss_stop_0 failed with rc=2: Preventing routing-jboss from re-starting on <a href="http://host2.exemple.com">host2.exemple.com</a></div><div>Apr 10 13:48:01 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: te_graph_trigger: Transition failed: terminated</div><div>Apr 10 13:48:01 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: crmd_ha_msg_filter: Another DC detected: vif5_7 (op=noop)</div><div>Apr 10 13:55:39 <a href="http://host2.exemple.com">host2.exemple.com</a> pengine: [26529]: ERROR: unpack_rsc_op: Hard error - routing-jboss_stop_0 failed with rc=2: Preventing routing-jboss from re-starting on <a href="http://host2.exemple.com">host2.exemple.com</a></div><div>Apr 10 13:55:39 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: te_graph_trigger: Transition failed: terminated</div><div>Apr 10 13:57:56 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: crmd_ha_msg_filter: Another DC detected: vif5_7 (op=noop)</div><div>Apr 10 14:01:05 <a href="http://host2.exemple.com">host2.exemple.com</a> pengine: [26529]: ERROR: unpack_rsc_op: Hard error - routing-jboss_stop_0 failed with rc=2: Preventing routing-jboss from re-starting on <a href="http://host2.exemple.com">host2.exemple.com</a></div><div>Apr 10 14:01:05 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: te_graph_trigger: Transition failed: terminated</div><div>Apr 10 14:01:06 <a href="http://host2.exemple.com">host2.exemple.com</a> crmd: [26530]: ERROR: crmd_ha_msg_filter: Another DC detected: vif5_7 (op=noop)</div></div><div><br></div><div>grep WARN corosync.log</div><div><div>Apr 10 13:13:08 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: check_dead_member: Our DC node (vif5_7) left the cluster</div><div>Apr 10 13:13:08 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: cib_client_add_notify_callback: Callback already present</div><div>Apr 10 13:13:09 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2812.8 -> 0.2812.9 not applied to 0.2813.1: current "epoch" is greater than required</div><div>Apr 10 13:13:09 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2812.9 -> 0.2813.1 not applied to 0.2813.1: current "epoch" is greater than required</div><div>Apr 10 13:13:09 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:13:09 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_diff_notify: Local-only Change (client:crmd, call: 139): -1.-1.-1 (Application of an update diff failed)</div><div>Apr 10 13:13:09 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:13:09 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:13:09 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:14:29 <a href="http://host2.example.com">host2.example.com</a> lrmd: [27113]: WARN: For LSB init script, no additional parameters are needed.</div><div>Apr 10 13:31:40 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: check_dead_member: Our DC node (vif5_7) left the cluster</div><div>Apr 10 13:31:40 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: cib_client_add_notify_callback: Callback already present</div><div>Apr 10 13:31:41 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:31:41 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:31:41 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:31:41 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:34:54 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: check_dead_member: Our DC node (vif5_7) left the cluster</div><div>Apr 10 13:34:54 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: cib_client_add_notify_callback: Callback already present</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: match_down_event: No match for shutdown action on vif5_7</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> pengine: [26529]: WARN: unpack_rsc_op: Processing failed op routing-jboss_stop_0 on tango2.luxlait.lan: invalid parameter (2)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> pengine: [26529]: WARN: common_apply_stickiness: Forcing routing-jboss away from tango2.luxlait.lan after 1000000 failures (max=1000000)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: run_graph: Transition 0 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=4, Source=/var/lib/pengine/pe-input-87782.bz2): Terminated</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Graph 0 (5 actions in 5 synapses): batch-limit=30 jobs, network-delay=60000ms</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 0 is pending (priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 9]: Pending (id: vifGroup_start_0, type: pseduo, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 11]: Completed (id: vifGroup_stop_0, type: pseduo, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 12]: Pending (id: vifGroup_stopped_0, type: pseduo, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 1 was confirmed (priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 2 is pending (priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 12]: Pending (id: vifGroup_stopped_0, type: pseduo, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 6]: Pending (id: routing-jboss_stop_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 11]: Completed (id: vifGroup_stop_0, type: pseduo, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 3 is pending (priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 4]: Pending (id: clusterIP_start_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 9]: Pending (id: vifGroup_start_0, type: pseduo, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 4 is pending (priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 5]: Pending (id: clusterIP_monitor_30000, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:34:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 4]: Pending (id: clusterIP_start_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:34:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2817.1 -> 0.2817.2 not applied to 0.2819.1: current "epoch" is greater than required</div><div>Apr 10 13:34:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2817.2 -> 0.2817.3 not applied to 0.2819.1: current "epoch" is greater than required</div><div>Apr 10 13:34:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2817.3 -> 0.2817.4 not applied to 0.2819.1: current "epoch" is greater than required</div><div>Apr 10 13:34:56 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: do_log: FSA: Input I_JOIN_OFFER from route_message() received in state S_ELECTION</div><div>Apr 10 13:34:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2817.4 -> 0.2818.1 not applied to 0.2819.1: current "epoch" is greater than required</div><div>Apr 10 13:48:00 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: check_dead_member: Our DC node (vif5_7) left the cluster</div><div>Apr 10 13:48:00 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: cib_client_add_notify_callback: Callback already present</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: match_down_event: No match for shutdown action on vif5_7</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> pengine: [26529]: WARN: unpack_rsc_op: Processing failed op routing-jboss_stop_0 on tango2.luxlait.lan: invalid parameter (2)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> pengine: [26529]: WARN: common_apply_stickiness: Forcing routing-jboss away from tango2.luxlait.lan after 1000000 failures (max=1000000)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: run_graph: Transition 1 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=4, Source=/var/lib/pengine/pe-input-87783.bz2): Terminated</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Graph 1 (5 actions in 5 synapses): batch-limit=30 jobs, network-delay=60000ms</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 0 is pending (priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 9]: Pending (id: vifGroup_start_0, type: pseduo, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 11]: Completed (id: vifGroup_stop_0, type: pseduo, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 12]: Pending (id: vifGroup_stopped_0, type: pseduo, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 1 was confirmed (priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 2 is pending (priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 12]: Pending (id: vifGroup_stopped_0, type: pseduo, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 6]: Pending (id: routing-jboss_stop_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 11]: Completed (id: vifGroup_stop_0, type: pseduo, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 3 is pending (priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 4]: Pending (id: clusterIP_start_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 9]: Pending (id: vifGroup_start_0, type: pseduo, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 4 is pending (priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 5]: Pending (id: clusterIP_monitor_30000, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 4]: Pending (id: clusterIP_start_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2821.1 -> 0.2821.2 not applied to 0.2822.1: current "epoch" is greater than required</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2821.2 -> 0.2821.3 not applied to 0.2822.1: current "epoch" is greater than required</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: do_log: FSA: Input I_JOIN_OFFER from route_message() received in state S_ELECTION</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2821.3 -> 0.2821.4 not applied to 0.2822.1: current "epoch" is greater than required</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2821.4 -> 0.2822.1 not applied to 0.2822.1: current "epoch" is greater than required</div><div>Apr 10 13:48:01 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_diff_notify: Local-only Change (client:crmd, call: 283): -1.-1.-1 (Application of an update diff failed, requesting a full refresh)</div><div>Apr 10 13:55:36 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: check_dead_member: Our DC node (vif5_7) left the cluster</div><div>Apr 10 13:55:36 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: cib_client_add_notify_callback: Callback already present</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> pengine: [26529]: WARN: unpack_rsc_op: Processing failed op routing-jboss_stop_0 on tango2.luxlait.lan: invalid parameter (2)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> pengine: [26529]: WARN: common_apply_stickiness: Forcing routing-jboss away from tango2.luxlait.lan after 1000000 failures (max=1000000)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: run_graph: Transition 2 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=4, Source=/var/lib/pengine/pe-input-87784.bz2): Terminated</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Graph 2 (5 actions in 5 synapses): batch-limit=30 jobs, network-delay=60000ms</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 0 is pending (priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 9]: Pending (id: vifGroup_start_0, type: pseduo, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 11]: Completed (id: vifGroup_stop_0, type: pseduo, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 12]: Pending (id: vifGroup_stopped_0, type: pseduo, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 1 was confirmed (priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 2 is pending (priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 12]: Pending (id: vifGroup_stopped_0, type: pseduo, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 6]: Pending (id: routing-jboss_stop_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 11]: Completed (id: vifGroup_stop_0, type: pseduo, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 3 is pending (priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 4]: Pending (id: clusterIP_start_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 9]: Pending (id: vifGroup_start_0, type: pseduo, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 4 is pending (priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 5]: Pending (id: clusterIP_monitor_30000, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:55:39 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 4]: Pending (id: clusterIP_start_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 13:55:42 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: do_log: FSA: Input I_RELEASE_DC from do_election_count_vote() received in state S_INTEGRATION</div><div>Apr 10 13:55:42 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: update_dc: New DC vif5_7 is not tango2.luxlait.lan</div><div>Apr 10 13:55:42 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: do_cl_join_offer_respond: Discarding offer from vif5_7 (expected tango2.luxlait.lan)</div><div>Apr 10 13:57:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: check_dead_member: Our DC node (vif5_7) left the cluster</div><div>Apr 10 13:57:55 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: cib_client_add_notify_callback: Callback already present</div><div>Apr 10 13:57:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:57:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:57:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:57:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_server_process_diff: Not requesting full refresh in slave mode.</div><div>Apr 10 13:57:56 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_diff_notify: Local-only Change (client:crmd, call: 356): -1.-1.-1 (Application of an update diff failed, requesting a full refresh)</div><div>Apr 10 14:01:04 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: check_dead_member: Our DC node (vif5_7) left the cluster</div><div>Apr 10 14:01:04 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: cib_client_add_notify_callback: Callback already present</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: match_down_event: No match for shutdown action on vif5_7</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> pengine: [26529]: WARN: unpack_rsc_op: Processing failed op routing-jboss_stop_0 on tango2.luxlait.lan: invalid parameter (2)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> pengine: [26529]: WARN: common_apply_stickiness: Forcing routing-jboss away from tango2.luxlait.lan after 1000000 failures (max=1000000)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: run_graph: Transition 3 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=4, Source=/var/lib/pengine/pe-input-87785.bz2): Terminated</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Graph 3 (5 actions in 5 synapses): batch-limit=30 jobs, network-delay=60000ms</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 0 is pending (priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 9]: Pending (id: vifGroup_start_0, type: pseduo, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 11]: Completed (id: vifGroup_stop_0, type: pseduo, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 12]: Pending (id: vifGroup_stopped_0, type: pseduo, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 1 was confirmed (priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 2 is pending (priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 12]: Pending (id: vifGroup_stopped_0, type: pseduo, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 6]: Pending (id: routing-jboss_stop_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 11]: Completed (id: vifGroup_stop_0, type: pseduo, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 3 is pending (priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 4]: Pending (id: clusterIP_start_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 9]: Pending (id: vifGroup_start_0, type: pseduo, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_graph: Synapse 4 is pending (priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: [Action 5]: Pending (id: clusterIP_monitor_30000, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 14:01:05 <a href="http://host2.example.com">host2.example.com</a> crmd: [26530]: WARN: print_elem: * [Input 4]: Pending (id: clusterIP_start_0, loc: tango2.luxlait.lan, priority: 0)</div><div>Apr 10 14:01:06 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2832.1 -> 0.2832.2 not applied to 0.2834.1: current "epoch" is greater than required</div><div>Apr 10 14:01:06 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2832.2 -> 0.2832.3 not applied to 0.2834.1: current "epoch" is greater than required</div><div>Apr 10 14:01:06 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2832.3 -> 0.2832.4 not applied to 0.2834.1: current "epoch" is greater than required</div><div>Apr 10 14:01:06 <a href="http://host2.example.com">host2.example.com</a> cib: [26526]: WARN: cib_process_diff: Diff 0.2832.4 -> 0.2833.1 not applied to 0.2834.1: current "epoch" is greater than required</div></div><div><br></div><div><br></div><div>Multicast is no more used because of errors on other customers.</div><div><br></div><div><br></div><div>corosync.conf :</div><div><div>compatibility: whitetank</div><div><br></div><div>aisexec {</div><div> user: root</div><div> group: root</div><div>}</div><div>service {</div><div> # Load the Pacemaker Cluster Resource Manager</div><div> name: pacemaker</div><div> ver: 0</div><div>}</div><div>totem {</div><div> version: 2</div><div> secauth: on</div><div> threads: 0</div><div> interface {</div><div> ringnumber: 0</div><div> bindnetaddr: 10.10.72.0</div><div> #mcastaddr: 226.94.1.1</div><div> mcastport: 5405</div><div> broadcast: yes</div><div> token: 10000</div><div> }</div><div>}</div><div><br></div><div>logging {</div><div> fileline: off</div><div> to_stderr: no</div><div> to_logfile: yes</div><div> to_syslog: no</div><div> logfile: /var/log/cluster/corosync.log</div><div> debug: off</div><div> timestamp: on</div><div> logger_subsys {</div><div> subsys: AMF</div><div> debug: off</div><div> }</div><div>}</div><div><br></div><div>amf {</div><div> mode: disabled</div><div>}</div></div><div><br></div><div><br></div><div><br></div><div>crm configuration:</div><div><div>node <a href="http://host2.exemple.com">host2.exemple.com</a> \</div><div> attributes standby="off"</div><div>node vif5_7 \</div><div> attributes standby="off"</div><div>primitive clusterIP ocf:heartbeat:IPaddr2 \</div><div> params ip="10.10.72.3" cidr_netmask="32" iflabel="jbossfailover" \</div><div> op monitor interval="30s"</div><div>primitive routing-jboss lsb:routing-jboss \</div><div> op monitor interval="30s"</div><div>group vifGroup clusterIP routing-jboss</div><div>location prefer-clusterIP clusterIP 50: vif5_7</div><div>property $id="cib-bootstrap-options" \</div><div> dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \</div><div> cluster-infrastructure="openais" \</div><div> expected-quorum-votes="2" \</div><div> stonith-enabled="false" \</div><div> no-quorum-policy="ignore"</div><div>rsc_defaults $id="rsc-options" \</div><div> resource-stickiness="20"</div></div><div><br></div><div><br></div><div>Best regards,</div><div>Philippe</div><div><br></div><div></div><div><div><div class="gmail_signature"><div dir="ltr"><table border="0" style="font-family:'Times New Roman'"><tbody><tr><td><table border="0" cellspacing="0" cellpadding="0" style="font-family:'Times New Roman';width:300px;display:block"><tbody></tbody></table><table border="0" cellspacing="0" cellpadding="0" style="font-family:'Times New Roman';width:300px;height:45px;display:block"><tbody></tbody></table></td><td><br></td><td><br></td></tr></tbody></table></div></div></div>
</div></div>
<br>
<table border="0" cellspacing="0" cellpadding="0" style="color:rgb(34,34,34);font-size:13px;background-color:rgb(255,255,255);font-family:Times;width:300px;display:block"><tbody><tr><td colspan="3" style="font-family:arial,sans-serif;margin:0px"><a href="http://www.vif.fr/" style="color:rgb(17,85,204);float:left" target="_blank"><img src="https://ci3.googleusercontent.com/proxy/jNw3kk6adk01Es5Sohxe1dLczyeO-oDK-x1ilyy9UZKzBvvYS6tLJdWQccesL76IUbn7nqjdykptLsCu6Q=s0-d-e1-ft#http://exper-ia.com/images/logo_vif.png" alt="logoVif"></a></td></tr><tr><td width="148" style="font-family:'Trebuchet MS',Arial,Helvetica,sans-serif;margin:0px;vertical-align:top;line-height:16px;font-size:11px;font-weight:bold"><span style="color:rgb(77,77,77)"><font><font><font><font>L'informatique 100% Agro</font></font></font></font></span></td><td width="81" align="right" style="font-family:'Trebuchet MS',Arial,Helvetica,sans-serif;margin:0px;color:rgb(77,77,77);font-size:11px;line-height:16px;font-weight:bold;vertical-align:top"><a href="http://www.vif.fr/" style="color:rgb(77,77,77);text-decoration:none" target="_blank"><font><font><font><font>www.vif.fr</font></font></font></font></a> <br></td><td width="71" rowspan="2" style="font-family:arial,sans-serif;margin:0px"><a href="http://www.youtube.com/user/Agrovif" style="color:rgb(17,85,204);float:right" target="_blank"><img src="https://ci3.googleusercontent.com/proxy/McHI_KgbDIV1VTZNHms4RfYjSyf9dZilloQOXJFt2R_8k5gZP5ezMt8pL7m7dkrNDrVIYdSrY9JJO8CWk8AArr_x2xs=s0-d-e1-ft#http://exper-ia.com/images/ico_YouTubeMail.png" alt="VifYouTube"></a><a href="https://twitter.com/VIF_agro" style="color:rgb(17,85,204);float:right;margin-right:4px" target="_blank"><img src="https://ci5.googleusercontent.com/proxy/TQSsKHGL--n7vnMbP-E37n9cQ0gF0xKtEyEKLVaDA1XyHPqseT8z2Ocw6PnUxjTfHHg7o8H3fzrNxcs7ZmDoKGzURx4=s0-d-e1-ft#http://exper-ia.com/images/ico_twitterMail.png" alt="VifTwitter"></a></td></tr><tr><td colspan="2" align="right" style="font-family:'Trebuchet MS',Arial,Helvetica,sans-serif;margin:0px;color:rgb(239,124,0);font-size:9px;line-height:16px;width:220px;vertical-align:top"><em><font><font><font><font>Suivez l'actualité VIF sur:</font></font></font></font></em></td></tr></tbody></table><a href="http://www.vif.fr/regilait-devoile-les-dessous-du-processus-sop-et-vif-lance-vif-sop-au-cfia-de-rennes/" target="_blank"><img src="http://www.vif.fr/wp-content/uploads/2014/02/signaturemailregilait.jpg"></a>