[ClusterLabs] big trouble with a DRBD resource

Tue Aug 8 05:02:43 EDT 2017

----- On Aug 7, 2017, at 10:43 PM, kgaillot kgaillot at redhat.com wrote:
> 
> The logs are very useful, but not particularly easy to follow. It takes
> some practice and experience, but I think it's worth it if you have to
> troubleshoot cluster events often.

I will do my best.

> 
> It's on the to-do list to create a "Troubleshooting Pacemaker" document
> that helps with this and using tools such as crm_simulate.
> 
> The first step in understanding the logs is to learn what the pacemaker
> daemons are and what they do, and what the DC node is. It starts to make
> more sense from there:
> 
>   pacemakerd: spawns all other daemons and re-spawns them if they crash
>   attrd: manages node attributes
>   cib: manages reading/writing the configuration
>   lrmd: executes resource agents
>   pengine: given a cluster state, determines any actions needed
>   crmd: manages cluster membership and carries out the pengine's
>         decisions by asking the lrmd to perform actions
> 
> At any given time, one node's crmd in the cluster (or partition if there
> is a network split) is elected as the DC (designated controller). The DC
> asks the pengine what needs to be done, then farms out the results to
> all the other crmd's, which (if necessary) call their local lrmd to
> actually execute the actions.
> 
That's very helpful. Thanks.
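
To keep the daemon roles straight while reading the logs, here is a minimal sketch in Python that groups log lines by the daemon that wrote them. The role descriptions are taken from the list quoted above. Everything else is an assumption for illustration: the helper name group_by_daemon is hypothetical, the sample lines are made up, and the sketch assumes only that the daemon name appears in a log line as a whitespace-separated token, optionally followed by a colon. The real log format may differ.

```python
# Daemon roles as summarized in the quoted message above.
DAEMON_ROLES = {
    "pacemakerd": "spawns all other daemons and re-spawns them if they crash",
    "attrd": "manages node attributes",
    "cib": "manages reading/writing the configuration",
    "lrmd": "executes resource agents",
    "pengine": "given a cluster state, determines any actions needed",
    "crmd": "manages cluster membership and carries out the pengine's decisions",
}


def group_by_daemon(lines):
    """Group log lines by the first daemon name found as a token.

    Hypothetical helper: assumes the daemon name appears as a
    whitespace-separated token, optionally followed by a colon.
    Lines that mention no known daemon are skipped.
    """
    groups = {name: [] for name in DAEMON_ROLES}
    for line in lines:
        for token in line.split():
            name = token.rstrip(":")
            if name in DAEMON_ROLES:
                groups[name].append(line)
                break  # count each line once, under the first daemon seen
    return groups


if __name__ == "__main__":
    # Made-up sample lines, not real Pacemaker output.
    sample = [
        "node1 crmd: example message about a state transition",
        "node1 pengine: example message about a planned action",
        "node1 lrmd: example message about a resource agent call",
        "a line that mentions no daemon",
    ]
    for name, matched in group_by_daemon(sample).items():
        if matched:
            print(f"{name} ({DAEMON_ROLES[name]}): {len(matched)} line(s)")
```

Reading the pengine and crmd groups on the DC node first, and the lrmd groups on the other nodes afterwards, follows the order described above: the DC asks the pengine, then the other crmd's call their local lrmd.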

Bernd
