[ClusterLabs] ClusterIP won't return to recovered node

Dan Ragle daniel at Biblestuph.com
Mon Jun 12 13:24:19 UTC 2017



On 6/12/2017 2:03 AM, Klaus Wenninger wrote:
> On 06/10/2017 05:53 PM, Dan Ragle wrote:
>>
>>
>> On 5/25/2017 5:33 PM, Ken Gaillot wrote:
>>> On 05/24/2017 12:27 PM, Dan Ragle wrote:
>>>> I suspect this has been asked before and apologize if so; a Google
>>>> search didn't turn up anything that was helpful to me ...
>>>>
>>>> I'm setting up an active/active two-node cluster and am having an issue
>>>> where one of my two defined cluster IPs will not return to the other
>>>> node after that node has been recovered.
>>>>
>>>> I'm running on CentOS 7.3. My resource setups look like this:
>>>>
>>>> # cibadmin -Q|grep dc-version
>>>>          <nvpair id="cib-bootstrap-options-dc-version"
>>>> name="dc-version"
>>>> value="1.1.15-11.el7_3.4-e174ec8"/>
>>>>
>>>> # pcs resource show PublicIP-clone
>>>>   Clone: PublicIP-clone
>>>>    Meta Attrs: clone-max=2 clone-node-max=2 globally-unique=true
>>>> interleave=true
>>>>    Resource: PublicIP (class=ocf provider=heartbeat type=IPaddr2)
>>>>     Attributes: ip=75.144.71.38 cidr_netmask=24 nic=bond0
>>>>     Meta Attrs: resource-stickiness=0
>>>>     Operations: start interval=0s timeout=20s
>>>> (PublicIP-start-interval-0s)
>>>>                 stop interval=0s timeout=20s
>>>> (PublicIP-stop-interval-0s)
>>>>                 monitor interval=30s (PublicIP-monitor-interval-30s)
>>>>
>>>> # pcs resource show PrivateIP-clone
>>>>   Clone: PrivateIP-clone
>>>>    Meta Attrs: clone-max=2 clone-node-max=2 globally-unique=true
>>>> interleave=true
>>>>    Resource: PrivateIP (class=ocf provider=heartbeat type=IPaddr2)
>>>>     Attributes: ip=192.168.1.3 nic=bond1 cidr_netmask=24
>>>>     Meta Attrs: resource-stickiness=0
>>>>     Operations: start interval=0s timeout=20s
>>>> (PrivateIP-start-interval-0s)
>>>>                 stop interval=0s timeout=20s
>>>> (PrivateIP-stop-interval-0s)
>>>>                 monitor interval=10s timeout=20s
>>>> (PrivateIP-monitor-interval-10s)
>>>>
>>>> # pcs constraint --full | grep -i publicip
>>>>    start WEB-clone then start PublicIP-clone (kind:Mandatory)
>>>> (id:order-WEB-clone-PublicIP-clone-mandatory)
>>>> # pcs constraint --full | grep -i privateip
>>>>    start WEB-clone then start PrivateIP-clone (kind:Mandatory)
>>>> (id:order-WEB-clone-PrivateIP-clone-mandatory)
>>>
>>> FYI These constraints cover ordering only. If you also want to be sure
>>> that the IPs only start on a node where the web service is functional,
>>> then you also need colocation constraints.
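>>>
>>> As a rough, untested sketch (resource ids taken from your output
>>> above), colocation constraints along these lines would tie each IP
>>> clone to a node where WEB is actually running:
>>>
>>> # pcs constraint colocation add PublicIP-clone with WEB-clone INFINITY
>>> # pcs constraint colocation add PrivateIP-clone with WEB-clone INFINITY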
>>>
>>>>
>>>> When I first create the resources, they split across the two nodes as
>>>> expected/desired:
>>>>
>>>>   Clone Set: PublicIP-clone [PublicIP] (unique)
>>>>       PublicIP:0        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>       PublicIP:1        (ocf::heartbeat:IPaddr2):       Started
>>>> node2-pcs
>>>>   Clone Set: PrivateIP-clone [PrivateIP] (unique)
>>>>       PrivateIP:0        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>       PrivateIP:1        (ocf::heartbeat:IPaddr2):       Started
>>>> node2-pcs
>>>>   Clone Set: WEB-clone [WEB]
>>>>       Started: [ node1-pcs node2-pcs ]
>>>>
>>>> I then put the second node in standby:
>>>>
>>>> # pcs node standby node2-pcs
>>>>
>>>> And the IPs both jump to node1 as expected:
>>>>
>>>>   Clone Set: PublicIP-clone [PublicIP] (unique)
>>>>       PublicIP:0        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>       PublicIP:1        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>   Clone Set: WEB-clone [WEB]
>>>>       Started: [ node1-pcs ]
>>>>       Stopped: [ node2-pcs ]
>>>>   Clone Set: PrivateIP-clone [PrivateIP] (unique)
>>>>       PrivateIP:0        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>       PrivateIP:1        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>
>>>> Then unstandby the second node:
>>>>
>>>> # pcs node unstandby node2-pcs
>>>>
>>>> The PublicIP goes back, but the PrivateIP does not:
>>>>
>>>>   Clone Set: PublicIP-clone [PublicIP] (unique)
>>>>       PublicIP:0        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>       PublicIP:1        (ocf::heartbeat:IPaddr2):       Started
>>>> node2-pcs
>>>>   Clone Set: WEB-clone [WEB]
>>>>       Started: [ node1-pcs node2-pcs ]
>>>>   Clone Set: PrivateIP-clone [PrivateIP] (unique)
>>>>       PrivateIP:0        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>       PrivateIP:1        (ocf::heartbeat:IPaddr2):       Started
>>>> node1-pcs
>>>>
>>>> Can anybody see what I'm doing wrong? I'm not seeing anything in the
>>>> logs to indicate that it tries node2 and then fails, but I'm fairly
>>>> new to the software, so it's possible I'm not looking in the right
>>>> place.
>>>
>>> The pcs status would show any failed actions, and anything important in
>>> the logs would start with "error:" or "warning:".
>>>
>>> At any given time, one of the nodes is the DC, meaning it schedules
>>> actions for the whole cluster. That node will have more "pengine:"
>>> messages in its logs at the time. You can check those logs to see what
>>> decisions were made, as well as a "saving inputs" message to get the
>>> cluster state that was used to make those decisions. There is a
>>> crm_simulate tool that you can run on that file to get more information.
>>>
>>> By default, pacemaker will try to balance the number of resources
>>> running on each node, so I'm not sure why in this case node1 has four
>>> resources and node2 has two. crm_simulate might help explain it.
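>>>
>>> As a rough sketch (the exact pe-input file name will differ on your
>>> system), you could replay one of the saved inputs on the DC with the
>>> allocation scores shown:
>>>
>>> # crm_simulate -S -s -x /var/lib/pacemaker/pengine/pe-input-NNN.bz2
>>>
>>> The scores in that output usually show why a particular node was
>>> chosen for each instance.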
>>>
>>> However, there's nothing here telling pacemaker that the instances of
>>> PrivateIP should run on different nodes when possible. With your
>>> existing constraints, pacemaker would be equally happy to run both
>>> PublicIP instances on one node and both PrivateIP instances on the other
>>> node.
>>
>> Thanks for your reply. Finally getting back to this.
>>
>> Looking back at my config and my notes, I realized I'm guilty of not
>> giving you enough information. There was indeed an additional pair of
>> resources that I left out of my original output because I didn't think
>> they were relevant to the issue--my bad. Reading what you wrote made me
>> realize that it does appear as though Pacemaker is simply trying to
>> balance the overall load of *all* the available resources.
>>
>> But I'm still confused as to how one would definitively correct the
>> issue. I tried this reduced test case this morning, starting from an
>> empty two-node cluster (no resources, no constraints):
>>
>> [root at node1 clustertest]# pcs status
>> Cluster name: MyCluster
>> Stack: corosync
>> Current DC: NONE
>> Last updated: Sat Jun 10 10:58:46 2017          Last change: Sat Jun
>> 10 10:40:23 2017 by root via cibadmin on node1-pcs
>>
>> 2 nodes and 0 resources configured
>>
>> OFFLINE: [ node1-pcs node2-pcs ]
>>
>> No resources
>>
>>
>> Daemon Status:
>>   corosync: active/disabled
>>   pacemaker: active/disabled
>>   pcsd: active/enabled
>>
>> [root at node1 clustertest]# pcs resource create ClusterIP
>> ocf:heartbeat:IPaddr2 ip=1.2.3.4 nic=bond0 cidr_netmask=24
>> [root at node1 clustertest]# pcs resource meta ClusterIP
>> resource-stickiness=0
>> [root at node1 clustertest]# pcs resource clone ClusterIP clone-max=2
>> clone-node-max=2 globally-unique=true interleave=true
>> [root at node1 clustertest]# pcs resource create Test1 systemd:vtest1
>> [root at node1 clustertest]# pcs resource create Test2 systemd:vtest2
>> [root at node1 clustertest]# pcs constraint location Test1 prefers
>> node1-pcs=INFINITY
>> [root at node1 clustertest]# pcs constraint location Test2 prefers
>> node1-pcs=INFINITY
>>
>> [root at node1 clustertest]# pcs node standby node1-pcs
>> [root at node1 clustertest]# pcs status
>> Cluster name: MyCluster
>> Stack: corosync
>> Current DC: node1-pcs (version 1.1.15-11.el7_3.4-e174ec8) - partition
>> with quorum
>> Last updated: Sat Jun 10 11:01:07 2017          Last change: Sat Jun
>> 10 11:00:59 2017 by root via crm_attribute on node1-pcs
>>
>> 2 nodes and 4 resources configured
>>
>> Node node1-pcs: standby
>> Online: [ node2-pcs ]
>>
>> Full list of resources:
>>
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>  Test1  (systemd:vtest1):       Started node2-pcs
>>  Test2  (systemd:vtest2):       Started node2-pcs
>>
>> Daemon Status:
>>   corosync: active/disabled
>>   pacemaker: active/disabled
>>   pcsd: active/enabled
>>
>> [root at node1 clustertest]# pcs node unstandby node1-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node1-pcs
>>  Test1  (systemd:vtest1):       Started node1-pcs
>>  Test2  (systemd:vtest2):       Started node1-pcs
>>
>> [root at node1 clustertest]# pcs node standby node2-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node1-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node1-pcs
>>  Test1  (systemd:vtest1):       Started node1-pcs
>>  Test2  (systemd:vtest2):       Started node1-pcs
>>
>> [root at node1 clustertest]# pcs node unstandby node2-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node1-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>  Test1  (systemd:vtest1):       Started node1-pcs
>>  Test2  (systemd:vtest2):       Started node1-pcs
>>
>> [root at node1 clustertest]# pcs resource delete ClusterIP
>> Attempting to stop: ClusterIP...Stopped
>> [root at node1 clustertest]# pcs resource create ClusterIP
>> ocf:heartbeat:IPaddr2 ip=1.2.3.4 nic=bond0 cidr_netmask=24
>> [root at node1 clustertest]# pcs resource meta ClusterIP
>> resource-stickiness=0
>> [root at node1 clustertest]# pcs resource clone ClusterIP clone-max=2
>> clone-node-max=2 globally-unique=true interleave=true
>>
>> [root at node1 clustertest]# pcs status resources
>>  Test1  (systemd:vtest1):       Started node1-pcs
>>  Test2  (systemd:vtest2):       Started node1-pcs
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>
>> [root at node1 clustertest]# pcs node standby node1-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Test1  (systemd:vtest1):       Started node2-pcs
>>  Test2  (systemd:vtest2):       Started node2-pcs
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>
>> [root at node1 clustertest]# pcs node unstandby node1-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Test1  (systemd:vtest1):       Started node1-pcs
>>  Test2  (systemd:vtest2):       Started node1-pcs
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>
>> [root at node1 clustertest]# pcs node standby node2-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Test1  (systemd:vtest1):       Started node1-pcs
>>  Test2  (systemd:vtest2):       Started node1-pcs
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node1-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node1-pcs
>>
>> [root at node1 clustertest]# pcs node unstandby node2-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Test1  (systemd:vtest1):       Started node1-pcs
>>  Test2  (systemd:vtest2):       Started node1-pcs
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node1-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>
>> [root at node1 clustertest]# pcs node standby node1-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Test1  (systemd:vtest1):       Started node2-pcs
>>  Test2  (systemd:vtest2):       Started node2-pcs
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>
>> [root at node1 clustertest]# pcs node unstandby node1-pcs
>> [root at node1 clustertest]# pcs status resources
>>  Test1  (systemd:vtest1):       Started node1-pcs
>>  Test2  (systemd:vtest2):       Started node1-pcs
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>      ClusterIP:1        (ocf::heartbeat:IPaddr2):       Started node2-pcs
>>
>> So in the initial configuration it works as expected: putting the
>> nodes in standby one at a time (I waited at least 5 seconds between
>> each standby/unstandby operation) and then restoring them shows the
>> ClusterIP instances bouncing back and forth as expected. But after
>> deleting the ClusterIP resource and recreating it exactly as it
>> originally was, both clone instances initially stay on one node (the
>> one the test resources are not on). If I put the node the extra
>> resources are on in standby and then restore it, the IPs stay on the
>> other node. If I put the node the extra resources are *not* on in
>> standby and then restore that node, the IPs split once again.
>>
>> I also ran the test above with full pcs status displays after each
>> standby/unstandby; no errors were displayed at any step.
>>
>> So I guess my bottom-line question is: how does one tell Pacemaker
>> that the individual instances of a globally unique clone should *always*
>> be spread across the available nodes whenever possible, regardless of
>> the number of other resources on any one node? For kicks I did try:
>>
>
> You configured 'clone-node-max=2'. Set that to '1' and the maximum
> number of clone instances per node will be '1' - if that is what you
> intended ...
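>
> Something along these lines (untested) should do it:
>
> # pcs resource meta ClusterIP-clone clone-node-max=1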
>
> Regards,
> Klaus
>

Thanks for the reply, Klaus. My understanding was that with the IPaddr2 agent in an active/active setup it is necessary to set 
'clone-node-max=2' so that, in the event of failover, the traffic that had been targeted at the now-failed node is answered on 
the still-working node. That is, I *want* that unique clone instance to bounce to the working node on failover, but I want it 
to bounce back when the failed node is recovered.

My main reference here is the "Clusters from Scratch" tutorial:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_clone_the_ip_address.html

Cheers,

Dan

>> pcs constraint location ClusterIP:0 prefers node1-pcs=INFINITY
>>
>> but it responded with an error about an invalid character (:).
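>>
>> (Presumably a location constraint would have to reference the clone as
>> a whole, e.g.
>>
>> pcs constraint location ClusterIP-clone prefers node1-pcs=INFINITY
>>
>> but that would pin both instances to one node, which isn't what I'm
>> after.)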
>>
>> Thanks,
>>
>> Dan
>>
>>>
>>> I think you could probably get what you want by putting an optional
>>> (<INFINITY) colocation preference between PrivateIP and PublicIP. The
>>> only way pacemaker could satisfy that would be to run one of each on
>>> each node.
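>>>
>>> Roughly something like this (untested; pick a score well below
>>> INFINITY):
>>>
>>> # pcs constraint colocation add PrivateIP-clone with PublicIP-clone 50
>>>
>>> Because the score is finite, it's a preference rather than a hard
>>> requirement, so the cluster can still run everything on one node when
>>> only one node is available.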
>>>
>>>> Also, I noticed that when putting a node in standby the main NIC
>>>> appears to be interrupted momentarily (long enough for my SSH session,
>>>> which is connected via the permanent IP on the NIC and not the
>>>> ClusterIP, to be dropped). Is there any way to avoid this? I was
>>>> thinking that the cluster operations would only affect the ClusterIP
>>>> and not the other IPs being served on that NIC.
>>>
>>> Nothing in the cluster should cause that behavior. Check all the system
>>> logs around the time to see if anything unusual is reported.
>>>
>>>>
>>>> Thanks!
>>>>
>>>> Dan
>>>
>>
>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



