[ClusterLabs] Colocation of a primitive resource with a clone with limited copies

Sun Apr 23 14:18:18 EDT 2017

On 21.4.2017 14:14, Vladislav Bogdanov wrote:
> 20.04.2017 23:16, Jan Wrona wrote:
>> On 20.4.2017 19:33, Ken Gaillot wrote:
>>> On 04/20/2017 10:52 AM, Jan Wrona wrote:
>>>> Hello,
>>>>
>>>> my problem is closely related to the thread [1], but I didn't find a
>>>> solution there. I have a resource that is set up as a clone C restricted
>>>> to two copies (using the clone-max=2 meta attribute), because the
>>>> resource takes a long time to get ready (it starts immediately though),
>>> A resource agent must not return from "start" until a "monitor"
>>> operation would return success.
>>>
>>> Beyond that, the cluster doesn't care what "ready" means, so it's OK if
>>> it's not fully operational by some measure. However, that raises the
>>> question of what you're accomplishing with your monitor.
>> I know all that and my RA respects that. I didn't want to go into
>> details about the service I'm running, but maybe it will help you
>> understand. It's a data collector which receives and processes data from
>> a UDP stream. To understand these data, it needs templates which
>> periodically occur in the stream (every five minutes or so). After
>> "start" the service is up and running, "monitor" operations are
>> successful, but until the templates arrive the service is not "ready". I
>> basically need to somehow simulate this "ready" state.
>
> If you are able to detect in your RA's monitor that your application is
> ready (it has already received its templates), you may want to use
> transient node attributes to indicate that to the cluster, and tie
> your vip to such an attribute (using a location constraint with rules).
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_using_rules_to_determine_resource_location.html#_location_rules_based_on_other_node_properties
>
> Look at the pacemaker/ping RA for an example of attribute management.
>
> [...]

It looks like transient attributes are what I've been looking for, thank
you! I'm not able to detect the "ready" state, but I can at least store
the time elapsed since the application process was started (capped at
1000 seconds) in a transient attribute. I've tied this attribute to the
score of the IP's location constraint, so the cluster now places the IP
resource on the node with the longest-running application process. Once
both clone instances hit the 1000-second cap, both should be "ready" and
the cluster may safely apply other location preferences. I've also made
the IP resource slightly sticky, so it doesn't oscillate between nodes
when several clone instances are started at the same time.
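
For illustration, the attribute update in the RA's monitor could look
roughly like the sketch below. This is not my actual RA: the attribute
name "collector-uptime", the variable names and both function names are
made up for the example, and it assumes a procps-ng ps (for the
"etimes" column) and Pacemaker's attrd_updater on the node.

```shell
#!/bin/sh
# Sketch only: ATTR_NAME, UPTIME_CAP and the function names are illustrative.
ATTR_NAME="collector-uptime"
UPTIME_CAP=1000

# Print the given elapsed seconds, clamped to UPTIME_CAP.
capped_uptime() {
    elapsed=$1
    if [ "$elapsed" -gt "$UPTIME_CAP" ]; then
        echo "$UPTIME_CAP"
    else
        echo "$elapsed"
    fi
}

# Publish the capped uptime of process $1 as a transient node attribute.
# Assumes "ps -o etimes=" (procps-ng) and attrd_updater are available.
publish_uptime() {
    pid=$1
    elapsed=$(ps -o etimes= -p "$pid" | tr -d ' ')
    # No such process: nothing to publish.
    [ -n "$elapsed" ] || return 1
    attrd_updater --name "$ATTR_NAME" --update "$(capped_uptime "$elapsed")"
}
```

The monitor action would call publish_uptime with the collector's PID
on every run, so the attribute grows until it reaches the cap and then
stays constant.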

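A location constraint that takes its score from a node attribute, as
described in the Pacemaker Explained section linked above, can be
sketched as follows. The resource id "vip", the attribute name
"collector-uptime", the generated ids and the helper name are all
placeholders, not the names from my configuration.

```shell
#!/bin/sh
# Sketch only: resource id, attribute name and generated ids are illustrative.

# Print a location constraint for resource $1 whose score on each node is
# the value of node attribute $2 (applied only where the attribute is defined).
location_rule_xml() {
    rsc=$1
    attr=$2
    cat <<EOF
<rsc_location id="${rsc}-prefer-${attr}" rsc="${rsc}">
  <rule id="${rsc}-prefer-${attr}-rule" score-attribute="${attr}">
    <expression id="${rsc}-prefer-${attr}-expr" attribute="${attr}" operation="defined"/>
  </rule>
</rsc_location>
EOF
}

# Example (not run here): add it to the constraints section of the CIB.
# cibadmin -C -o constraints -X "$(location_rule_xml vip collector-uptime)"
```

Because resource-stickiness is added to the score of the node the IP
currently runs on, the attribute on another node has to exceed the
current node's attribute by more than the stickiness before the IP
moves.
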