[ClusterLabs] Why shouldn't one store resource configuration in the CIB?

Ferenc Wágner wferi at niif.hu
Tue Apr 18 18:46:27 CEST 2017


Ken Gaillot <kgaillot at redhat.com> writes:

> On 04/13/2017 11:11 AM, Ferenc Wágner wrote:
> 
>> I encountered several (old) statements on various forums along the lines
>> of: "the CIB is not a transactional database and shouldn't be used as
>> one" or "resource parameters should only uniquely identify a resource,
>> not configure it" and "the CIB was not designed to be a configuration
>> database but people still use it that way".  Sorry if I misquote these;
>> I'm going from memory, as a quick search failed to turn up the links.
>> 
>> Well, I've been feeling guilty of the above offenses for years, but it
>> worked out pretty well that way, which helped to suppress these warnings
>> in the back of my head.  Still, I'm curious: what's the reason for these
>> warnings, what are the dangers of "abusing" the CIB this way?
>> /var/lib/pacemaker/cib/cib.xml is 336 kB with 6 nodes and 155 resources
>> configured.  Old Pacemaker versions required tuning PCMK_ipc_buffer to
>> handle this, but even the default is big enough nowadays (128 kB after
>> compression, I guess).
>> 
>> Am I walking on thin ice?  What should I look out for?
>
> That's a good question. Certainly, there is some configuration
> information in most resource definitions, so it's more a matter of degree.
>
> The main concerns I can think of are:
>
> 1. Size: Increasing the CIB size increases the I/O, CPU and networking
> overhead of the cluster (and if it crosses the compression threshold,
> significantly). It also marginally increases the time it takes the
> policy engine to calculate a new state, which slows recovery.

Thanks for the input, Ken!  Is this what you mean?

cib: info: crm_compress_string: Compressed 1028972 bytes into 69095 (ratio 14:1) in 138ms

At the same time /var/lib/pacemaker/cib/cib.xml is 336K, and

# cibadmin -Q --scope resources | wc -c
330951
# cibadmin -Q --scope status | wc -c
732820

Even though I consume about 2 kB per resource, the status section
weighs 2.2 times as much as the resources section, so shrinking the
resource definitions wouldn't change the full size significantly.
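
To put numbers on that, here is a minimal Python sketch using only the
figures quoted above.  The variable names are mine, and the "halve every
resource definition" scenario is purely hypothetical:

```python
# Byte counts copied from the cibadmin output above.
resources_bytes = 330951
status_bytes = 732820
resource_count = 155  # resources configured in this cluster

# Average footprint of one resource definition.
per_resource = resources_bytes / resource_count

# How much larger the status section is than the resources section.
status_ratio = status_bytes / resources_bytes

# Hypothetical: halve every resource definition and see how much the
# two sections shrink together (other CIB sections are ignored here).
combined = resources_bytes + status_bytes
saving = (resources_bytes / 2) / combined

print(f"per resource: {per_resource:.0f} bytes")
print(f"status/resources: {status_ratio:.1f}")
print(f"saving from halving resources: {saving:.1%}")
```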

At the same time, we should probably monitor the trends of the cluster
messaging health as we expand it (with nodes and resources).  What would
be some useful indicators to graph?
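
One candidate source, judging only from the log line quoted above, is the
compression message itself: it carries the uncompressed size, the
compressed size and the time spent.  A minimal sketch of extracting those
fields for graphing follows; the regular expression is modelled on that
single line, and the function and field names are my own, not anything
Pacemaker provides:

```python
import re

# Pattern modelled on the single log line quoted above; other Pacemaker
# versions may word this message differently (assumption).
COMPRESS_RE = re.compile(
    r"crm_compress_string: Compressed (?P<raw>\d+) bytes into "
    r"(?P<packed>\d+) \(ratio (?P<ratio>\d+):1\) in (?P<ms>\d+)ms"
)

def parse_compress_line(line):
    """Return the size and timing fields of one log line, or None."""
    match = COMPRESS_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}

sample = ("cib: info: crm_compress_string: Compressed 1028972 bytes "
          "into 69095 (ratio 14:1) in 138ms")
metrics = parse_compress_line(sample)
print(metrics)
```

Whether these are the most telling indicators is exactly what I'm asking.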

> 2. Consistency: Clusters can become partitioned. If changes are made on
> one or more partitions during the separation, the changes won't be
> reflected on all nodes until the partition heals, at which time the
> cluster will reconcile them, potentially losing one side's changes.

Ah, that's a very good point, which I neglected totally: even inquorate
partitions can have configuration changes.  Thanks for bringing this up!
I wonder if there's any practical workaround for that.

> I suppose this isn't qualitatively different from using a separate
> configuration file, but those tend to be more static, and failure to
> modify all copies would be more obvious when doing them individually
> rather than issuing a single cluster command.

From a different angle: if a node is off, you can't modify its
configuration file.  So you need an independent mechanism to do what the
CIB synchronization does anyway, or a shared file system with its added
complexity.  On the other hand, with separate files one needn't guess how
Pacemaker reconciles conflicting resource configuration changes.
Indeed, how does it?
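
My current understanding, from the description of the admin_epoch, epoch
and num_updates attributes in Pacemaker Explained, is that the copy with
the highest version tuple wins and replaces the others wholesale, without
any merge.  A minimal sketch of that comparison, assuming this reading is
right (the node names and version numbers are made up):

```python
def cib_version(cib_attrs):
    """Order CIB copies by (admin_epoch, epoch, num_updates)."""
    return (
        cib_attrs["admin_epoch"],
        cib_attrs["epoch"],
        cib_attrs["num_updates"],
    )

# Hypothetical copies held by two partitions after a split.
partition_a = {"node": "node1", "admin_epoch": 0, "epoch": 412, "num_updates": 3}
partition_b = {"node": "node4", "admin_epoch": 0, "epoch": 410, "num_updates": 57}

# Tuples compare field by field, so epoch outranks num_updates here.
winner = max([partition_a, partition_b], key=cib_version)
print(winner["node"])
```

If that is correct, any change made only on the losing side would simply
vanish, which is the loss Ken describes.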
-- 
Thanks,
Feri


