[ClusterLabs] Re: Can't get a group of IP addresses up when moving to a new version of Pacemaker/Corosync

Ken Gaillot kgaillot at redhat.com
Wed Sep 2 19:15:38 UTC 2015


On 09/02/2015 01:44 PM, Carlos Xavier wrote:
> Hi Kristoffer.
> 
> Thank you very much for the fast reply.
> 
> I did a cleanup of the resource and took a look at the log, and I could see that the issue has something to do with the IPaddr2 RA trying to set an iptables rule, although we are not using any iptables rules ourselves.
> 
> crm(live)resource# cleanup c-ip-httpd
> Cleaning up ip_ccardbusiness:0 on apolo
> Cleaning up ip_ccardbusiness:0 on diana
> Cleaning up ip_ccardgift:0 on apolo
> Cleaning up ip_ccardgift:0 on diana
> Cleaning up ip_intranet:0 on apolo
> Cleaning up ip_intranet:0 on diana
> Cleaning up ip_ccardbusiness:1 on apolo
> Cleaning up ip_ccardbusiness:1 on diana
> Cleaning up ip_ccardgift:1 on apolo
> Cleaning up ip_ccardgift:1 on diana
> Cleaning up ip_intranet:1 on apolo
> Cleaning up ip_intranet:1 on diana
> Waiting for 12 replies from the CRMd............ OK
> 
> And on the log we have
> 
> 2015-09-02T14:40:54.074834-03:00 apolo crmd[5185]:  warning: update_failcount: Updating failcount for ip_ccardbusiness on diana after failed start: rc=1 (update=INFINITY, time=1441215654)
> 2015-09-02T14:40:54.075034-03:00 apolo crmd[5185]:  warning: status_from_rc: Action 33 (ip_ccardbusiness:0_start_0) on diana failed (target: 0 vs. rc: 1): Error
> 2015-09-02T14:40:54.075230-03:00 apolo crmd[5185]:  warning: update_failcount: Updating failcount for ip_ccardbusiness on diana after failed start: rc=1 (update=INFINITY, time=1441215654)
> 2015-09-02T14:40:54.075427-03:00 apolo crmd[5185]:  warning: update_failcount: Updating failcount for ip_ccardbusiness on diana after failed start: rc=1 (update=INFINITY, time=1441215654)
> 2015-09-02T14:40:54.078344-03:00 apolo crmd[5185]:   notice: abort_transition_graph: Transition aborted by status-168427778-fail-count-ip_ccardbusiness, fail-count-ip_ccardbusiness=INFINITY: Transient attribute change (create cib=0.378.14, source=te_update_diff:391, path=/cib/status/node_state[@id='168427778']/transient_attributes[@id='168427778']/instance_attributes[@id='status-168427778'], 0)
> 2015-09-02T14:40:54.184995-03:00 apolo IPaddr2(ip_ccardbusiness)[8360]: ERROR: iptables failed
> 2015-09-02T14:40:54.187651-03:00 apolo lrmd[5182]:   notice: operation_finished: ip_ccardbusiness_start_0:8360:stderr [ iptables: No chain/target/match by that name. ]
> 2015-09-02T14:40:54.187978-03:00 apolo lrmd[5182]:   notice: operation_finished: ip_ccardbusiness_start_0:8360:stderr [ ocf-exit-reason:iptables failed ]
> 2015-09-02T14:40:54.203780-03:00 apolo crmd[5185]:   notice: process_lrm_event: Operation ip_ccardbusiness_start_0: unknown error (node=apolo, call=88, rc=1, cib-update=1321, confirmed=true)
> 2015-09-02T14:40:54.204026-03:00 apolo crmd[5185]:   notice: process_lrm_event: apolo-ip_ccardbusiness_start_0:88 [ iptables: No chain/target/match by that name.\nocf-exit-reason:iptables failed\n ]
> 2015-09-02T14:40:54.206111-03:00 apolo crmd[5185]:  warning: status_from_rc: Action 43 (ip_ccardbusiness:1_start_0) on apolo failed (target: 0 vs. rc: 1): Error
> 2015-09-02T14:40:54.206442-03:00 apolo crmd[5185]:  warning: update_failcount: Updating failcount for ip_ccardbusiness on apolo after failed start: rc=1 (update=INFINITY, time=1441215654)
> 2015-09-02T14:40:54.206663-03:00 apolo crmd[5185]:  warning: update_failcount: Updating failcount for ip_ccardbusiness on apolo after failed start: rc=1 (update=INFINITY, time=1441215654)
> 2015-09-02T14:40:54.206859-03:00 apolo crmd[5185]:  warning: status_from_rc: Action 43 (ip_ccardbusiness:1_start_0) on apolo failed (target: 0 vs. rc: 1): Error
> 2015-09-02T14:40:54.207109-03:00 apolo crmd[5185]:  warning: update_failcount: Updating failcount for ip_ccardbusiness on apolo after failed start: rc=1 (update=INFINITY, time=1441215654)
> 2015-09-02T14:40:54.207489-03:00 apolo crmd[5185]:  warning: update_failcount: Updating failcount for ip_ccardbusiness on apolo after failed start: rc=1 (update=INFINITY, time=1441215654)
> 2015-09-02T14:40:54.207829-03:00 apolo crmd[5185]:   notice: run_graph: Transition 1166 (Complete=14, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-95.bz2): Stopped
> 
> Is there a way to stop the IPaddr2 RA from trying to manage the iptables rules?
> 
> Regards,
> Carlos

This is occurring because you cloned the IP group. When IPaddr2 is
cloned, it uses iptables to do a rudimentary form of load-balancing:
the IP address is simultaneously active on all nodes running the
clone, and each node uses the iptables CLUSTERIP target, which relies
on a multicast Ethernet MAC address, to respond to only a subset of
requests.
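
For reference, the rule a globally-unique IPaddr2 clone installs is
roughly of the following shape (a sketch only; the address, interface,
MAC and node numbers below are made up, not taken from your
configuration):

  # Illustrative CLUSTERIP rule of the kind IPaddr2 sets up for a
  # cloned IP; every value here is hypothetical.
  iptables -I INPUT -d 192.168.1.100 -i eth0 -j CLUSTERIP --new \
      --hashmode sourceip --clustermac 01:00:5E:7F:01:64 \
      --total-nodes 2 --local-node 1

"No chain/target/match by that name" is the generic message iptables
prints when a chain, target or match it was asked for does not exist;
in this context that most likely means the CLUSTERIP target itself.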

The error you're getting is at the iptables level rather than the
cluster level, so you'll need to deal with it there.
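
A quick way to confirm that on each node is to check whether the
CLUSTERIP target is usable at all (a sketch, assuming a reasonably
standard iptables setup; run as root):

  # Check kernel and userspace support for CLUSTERIP. The module has
  # historically been named ipt_CLUSTERIP, but the exact name can
  # vary by kernel.
  modprobe ipt_CLUSTERIP && lsmod | grep -i clusterip
  iptables -j CLUSTERIP --help 2>&1 | grep -i "clusterip target"

If CLUSTERIP turns out to be unavailable, or you don't actually need
the addresses active on both nodes at once, the simpler route is not
to clone the IP group at all.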

> 
>> -----Original Message-----
>> From: Kristoffer Grönlund [mailto:kgronlund at suse.com]
>> Sent: Wednesday, 2 September 2015 02:46
>> To: Carlos Xavier; users at clusterlabs.org
>> Subject: Re: [ClusterLabs] Can't get a group of IP addresses up when moving to a new version of
>> Pacemaker/Corosync
>>
>> Carlos Xavier <cbastos at connection.com.br> writes:
>>
>>> Hi.
>>>
>>
>> [snip]
>>
>>>
>>> Failed actions:
>>>     ip_ccardbusiness_start_0 on apolo 'unknown error' (1): call=65,
>>> status=complete, last-rc-change='Fri Aug 21 17:46:32 2015', queued=0ms, exec=337ms
>>>     ip_ccardbusiness_start_0 on diana 'unknown error' (1): call=64,
>>> status=complete, last-rc-change='Fri Aug 21 17:44:21 2015',
>>> queued=1ms, exec=254ms
>>>
>>
>> Hi Carlos,
>>
>> There should be more information in the logs as to what is going wrong. Look in /var/log/messages for
>> the time indicated in the failure and search for ip_ccardbusiness.
>>
>> Cheers,
>> Kristoffer
>>
>>>
>>> I have already tried to set the clone this way, but without any success:
>>>
>>> clone c-ip-httpd ip-httpd \
>>>         meta interleave=true globally-unique=false clone-max=2 \
>>>         clone-node-max=1 target-role=Started
>>>
>>>
>>> Am I missing something?
>>> Please, can someone shed some light on this issue? Any help will be very welcome.
>>>
>>> Best regards,
>>> Carlos
>>>
>>>
>>> _______________________________________________
>>> Users mailing list: Users at clusterlabs.org
>>> http://clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>>
>>
>> --
>> // Kristoffer Grönlund
>> // kgronlund at suse.com
> 
> 
> 
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 




