[Pacemaker] CLUSTERIP/iptables interaction

Michael Schwartzkopff misch at multinet.de
Mon Dec 14 09:03:34 EST 2009


On Monday, 14 December 2009 13:00:34, Chris Picton wrote:
> Hi all
>
> I am doing some tests with CLUSTERIP and pacemaker/heartbeat on CentOS
> 5.4, using the clusterlabs repo.
>
> My resource looks like:
> primitive CLUSTERIP_21 ocf:heartbeat:IPaddr2 \
> 	op monitor interval="10" timeout="20" start-delay="0" \
> 	params ip="10.202.4.21" nic="eth0" cidr_netmask="24" clusterip_hash="sourceip-sourceport" \
> 	meta resource-stickiness="0"
> clone clone_CLUSTERIP_21 CLUSTERIP_21 \
> 	meta clone-max="2" globally-unique="true" clone-node-max="2"
>
>
> This starts up fine and adds an iptables rule correctly. However, if I
> restart the iptables service, the CLUSTERIP rule gets removed.

Of course. The rule is inserted and managed by the cluster dynamically.
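
For illustration, a rule of this kind has roughly the shape printed by the
sketch below. It only prints the command and does not run it. The hash mode
is taken from your clusterip_hash parameter; the cluster MAC and the node
numbers are made-up example values, and the exact command line that the RA
builds may differ. As far as I know, the iptables init script on CentOS
flushes the running rule set and reloads only the saved rules, which would
explain why a dynamically inserted rule disappears on restart.

```shell
#!/bin/sh
# Sketch only: prints (does not execute) a CLUSTERIP rule of the kind
# discussed here. The clustermac, total_nodes and local_node values used
# in the call below are hypothetical examples, not taken from the thread.
build_clusterip_rule() {
    ip=$1; nic=$2; hashmode=$3; clustermac=$4; total_nodes=$5; local_node=$6
    echo "iptables -I INPUT -d $ip -i $nic -j CLUSTERIP --new" \
         "--hashmode $hashmode --clustermac $clustermac" \
         "--total-nodes $total_nodes --local-node $local_node"
}

build_clusterip_rule 10.202.4.21 eth0 sourceip-sourceport 01:00:5E:00:00:15 2 1
```

Since nothing of this rule is stored in the saved iptables configuration, only
the cluster can put it back.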

> I then get the below errors in my log file.  These continue without
> stopping.
>
> It seems that the IPaddr2 script is not currently capable of recreating
> the iptables rule if it gets inadvertently removed.
>
> Is this known behaviour?  If not, where does my error lie?

I have not verified this behaviour, but it sounds plausible. Perhaps recreating
the iptables rule after its loss could or should be part of the monitor
operation of the IPaddr2 script.


Looking through your logs, it seems the monitor operation detects the problem,
but the correct rule cannot be recreated. This may be an error in the resource
agent (RA).
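
A minimal sketch of what such a check could look like follows. This is not
the actual IPaddr2 code. The function names and the PROC_DIR variable are
made up for the example; only the /proc/net/ipt_CLUSTERIP/<ip> path comes
from your logs. The recreate step is passed in by the caller, so the sketch
does not assume how the RA builds its rule.

```shell
#!/bin/sh
# Sketch only, not the real resource agent. PROC_DIR can be overridden so
# the check can be tried without the ipt_CLUSTERIP module loaded.
PROC_DIR=${PROC_DIR:-/proc/net/ipt_CLUSTERIP}

# Return 0 if the kernel still has a CLUSTERIP entry for the given address.
clusterip_entry_present() {
    [ -e "$PROC_DIR/$1" ]
}

# Hypothetical monitor step: print "ok" if the entry exists; otherwise
# print "missing" and run the recreate command given as the remaining
# arguments (if any), returning that command's exit status.
check_or_recreate() {
    ip=$1; shift
    if clusterip_entry_present "$ip"; then
        echo "ok"
        return 0
    fi
    echo "missing"
    if [ $# -gt 0 ]; then
        "$@"
    fi
}

# Example call (commented out; the recreate command is a placeholder):
# check_or_recreate 10.202.4.21 logger "CLUSTERIP rule for 10.202.4.21 lost"
```

Whether a monitor action should repair the resource or only report the failure
and leave recovery to stop/start is a design question for the RA maintainers.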

Greetings,

Michael.

> Regards
>
> Chris
>
> Dec 14 13:45:45 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:monitor:stderr) egrep:
> Dec 14 13:45:45 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:monitor:stderr) /proc/net/ipt_CLUSTERIP/10.202.4.21
> Dec 14 13:45:45 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:monitor:stderr) : No such file or directory
> Dec 14 13:45:45 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:monitor:stderr)
> Dec 14 13:45:45 slb-test-03 crmd: [8105]: info: process_lrm_event: LRM
> operation CLUSTERIP_21:0_monitor_10000 (call=7, rc=7, cib-update=18,
> confirmed=false) not running
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: attrd_ha_callback:
> Update relayed from slb-test-04.ecntelecoms.za.net
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: find_hash_entry:
> Creating hash entry for fail-count-CLUSTERIP_21:0
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: attrd_local_callback:
> Expanded fail-count-CLUSTERIP_21:0=value++ to 1
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: attrd_trigger_update:
> Sending flush op to all hosts for: fail-count-CLUSTERIP_21:0 (1)
> Dec 14 13:45:46 slb-test-03 crmd: [8105]: info: do_lrm_rsc_op: Performing
> key=2:4:0:976a51f6-6729-48a3-b6cc-20073cbd7fc4 op=CLUSTERIP_21:0_stop_0 )
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: attrd_perform_update:
> Sent update 14: fail-count-CLUSTERIP_21:0=1
> Dec 14 13:45:46 slb-test-03 lrmd: [8102]: info: rsc:CLUSTERIP_21:0:8: stop
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: attrd_ha_callback:
> Update relayed from slb-test-04.ecntelecoms.za.net
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: find_hash_entry:
> Creating hash entry for last-failure-CLUSTERIP_21:0
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: attrd_trigger_update:
> Sending flush op to all hosts for: last-failure-CLUSTERIP_21:0
> (1260791145)
> Dec 14 13:45:46 slb-test-03 attrd: [8104]: info: attrd_perform_update:
> Sent update 17: last-failure-CLUSTERIP_21:0=1260791145
> Dec 14 13:45:46 slb-test-03 crmd: [8105]: info: process_lrm_event: LRM
> operation CLUSTERIP_21:0_monitor_10000 (call=7, status=1, cib-update=0,
> confirmed=true) Cancelled
> Dec 14 13:45:47 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:stop:stderr) egrep: /proc/net/ipt_CLUSTERIP/10.202.4.21:
> No such file or directory
> Dec 14 13:45:47 slb-test-03 lrmd: [8102]: info: Managed
> CLUSTERIP_21:0:stop process 5256 exited with return code 0.
> Dec 14 13:45:47 slb-test-03 crmd: [8105]: info: process_lrm_event: LRM
> operation CLUSTERIP_21:0_stop_0 (call=8, rc=0, cib-update=19,
> confirmed=true) ok
> Dec 14 13:45:48 slb-test-03 crmd: [8105]: info: do_lrm_rsc_op: Performing
> key=8:5:0:976a51f6-6729-48a3-b6cc-20073cbd7fc4 op=CLUSTERIP_21:0_start_0 )
> Dec 14 13:45:48 slb-test-03 lrmd: [8102]: info: rsc:CLUSTERIP_21:0:9:
> start
> Dec 14 13:45:48 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr) egrep:
> Dec 14 13:45:48 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr) /proc/net/ipt_CLUSTERIP/10.202.4.21
> Dec 14 13:45:48 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr) : No such file or directory
> Dec 14 13:45:48 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr)
> Dec 14 13:45:48 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr) /usr/lib/ocf/resource.d//heartbeat/IPaddr2:
> line 638: /proc/net/ipt_CLUSTERIP/10.202.4.21: No such file or directory
> Dec 14 13:45:48 slb-test-03 IPaddr2[5306]: [5357]: INFO: /usr/lib64/
> heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/
> send_arp-10.202.4.21 eth0 10.202.4.21 cb937eba29c1 not_used not_used
> Dec 14 13:45:48 slb-test-03 lrmd: [8102]: info: Managed
> CLUSTERIP_21:0:start process 5306 exited with return code 0.
> Dec 14 13:45:48 slb-test-03 crmd: [8105]: info: process_lrm_event: LRM
> operation CLUSTERIP_21:0_start_0 (call=9, rc=0, cib-update=20,
> confirmed=true) ok
> Dec 14 13:45:49 slb-test-03 crmd: [8105]: info: do_lrm_rsc_op: Performing
> key=9:5:0:976a51f6-6729-48a3-b6cc-20073cbd7fc4
> op=CLUSTERIP_21:0_monitor_10000 )
> Dec 14 13:45:49 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:monitor:stderr) egrep: /proc/net/
> ipt_CLUSTERIP/10.202.4.21: No such file or directory
> Dec 14 13:45:49 slb-test-03 crmd: [8105]: info: process_lrm_event: LRM
> operation CLUSTERIP_21:0_monitor_10000 (call=10, rc=7, cib-update=21,
> confirmed=false) not running
> Dec 14 13:45:50 slb-test-03 attrd: [8104]: info: attrd_ha_callback:
> Update relayed from slb-test-04.ecntelecoms.za.net
> Dec 14 13:45:50 slb-test-03 attrd: [8104]: info: attrd_local_callback:
> Expanded fail-count-CLUSTERIP_21:0=value++ to 2
> Dec 14 13:45:50 slb-test-03 attrd: [8104]: info: attrd_trigger_update:
> Sending flush op to all hosts for: fail-count-CLUSTERIP_21:0 (2)
> Dec 14 13:45:50 slb-test-03 crmd: [8105]: info: do_lrm_rsc_op: Performing
> key=2:6:0:976a51f6-6729-48a3-b6cc-20073cbd7fc4 op=CLUSTERIP_21:0_stop_0 )
> Dec 14 13:45:50 slb-test-03 lrmd: [8102]: info: rsc:CLUSTERIP_21:0:11:
> stop
> Dec 14 13:45:50 slb-test-03 crmd: [8105]: info: process_lrm_event: LRM
> operation CLUSTERIP_21:0_monitor_10000 (call=10, status=1, cib-update=0,
> confirmed=true) Cancelled
> Dec 14 13:45:50 slb-test-03 attrd: [8104]: info: attrd_perform_update:
> Sent update 19: fail-count-CLUSTERIP_21:0=2
> Dec 14 13:45:50 slb-test-03 attrd: [8104]: info: attrd_ha_callback:
> Update relayed from slb-test-04.ecntelecoms.za.net
> Dec 14 13:45:50 slb-test-03 attrd: [8104]: info: attrd_trigger_update:
> Sending flush op to all hosts for: last-failure-CLUSTERIP_21:0
> (1260791149)
> Dec 14 13:45:50 slb-test-03 attrd: [8104]: info: attrd_perform_update:
> Sent update 21: last-failure-CLUSTERIP_21:0=1260791149
> Dec 14 13:45:51 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:stop:stderr) egrep: /proc/net/ipt_CLUSTERIP/10.202.4.21:
> No such file or directory
> Dec 14 13:45:51 slb-test-03 lrmd: [8102]: info: Managed
> CLUSTERIP_21:0:stop process 5411 exited with return code 0.
> Dec 14 13:45:51 slb-test-03 crmd: [8105]: info: process_lrm_event: LRM
> operation CLUSTERIP_21:0_stop_0 (call=11, rc=0, cib-update=22,
> confirmed=true) ok
> Dec 14 13:45:52 slb-test-03 crmd: [8105]: info: do_lrm_rsc_op: Performing
> key=8:7:0:976a51f6-6729-48a3-b6cc-20073cbd7fc4 op=CLUSTERIP_21:0_start_0 )
> Dec 14 13:45:52 slb-test-03 lrmd: [8102]: info: rsc:CLUSTERIP_21:0:12:
> start
> Dec 14 13:45:52 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr) egrep:
> Dec 14 13:45:52 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr) /proc/net/ipt_CLUSTERIP/10.202.4.21
> Dec 14 13:45:52 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr) : No such file or directory
> Dec 14 13:45:52 slb-test-03 lrmd: [8102]: info: RA output:
> (CLUSTERIP_21:0:start:stderr)

-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: misch at multinet.de
web: www.multinet.de

Registered office: 85630 Grasbrunn
Court of registration: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42



