[ClusterLabs] can't start/stop a drbd resource with pacemaker

Fri Jul 15 17:48:45 EDT 2016

On 07/15/2016 03:54 PM, Lentes, Bernd wrote:
> 
> 
> ----- On 13 Jul 2016 at 14:25, Kristoffer Grönlund kgronlund at suse.com wrote:
> 
>> "Lentes, Bernd" <bernd.lentes at helmholtz-muenchen.de> writes:
>>
>>> Starting or stopping drbd with start/stop does not work, neither for the ms
>>> resource nor for the primitive.
>>> If I try to stop it, it keeps running, even if I do a cleanup first (for both
>>> resources).
>>> Which resource should I stop first? The primitive or the ms?
>>
>> You should always act on the container, so the ms in this case (or the
>> clone, or the group).
>>
> 
> Ok.
> 
>>> I tried both, but neither worked. Other resources, like an IP, can be
>>> started/stopped with crm.
>>> When I change the target-role of the primitive via "crm configure edit" and
>>> commit that, it starts/stops immediately.
>>> But that can't be the preferred way to start/stop a drbd resource?

I'm not sure what you're asking. Normally, you leave target-role at its
default, and the cluster starts and stops everything as appropriate. If
you want to forcibly ensure that a resource is down (for example, to
upgrade its software), you can set target-role to Stopped. That is the
preferred way to forcibly stop a resource, but that shouldn't be
necessary in normal operation (the point of HA is to keep the services
up, after all).

Are you saying that the DRBD resource remains active even after
target-role has been set to Stopped? If that's the case, do the status
or logs show any errors for the stop operation?

>> All crm resource stop <resource> does is set the target-role... so there
>> is something else going on.
>>>
>>> If you need more information ask me.
>>
>> You can run crm with the -d argument to get more information about what
>> it does, and -dR to get a full trace of all the commands it executes.
>>
>> Grepping the logs on the DC node (see crm status output) will probably
>> get you more hints as well.
>>
> 
> Hi,
> 
> I found this:
> 
> crm(live)resource# scores
> 
> Current cluster status:
> Online: [ sunhb58820 sunhb65277 ]
> 
>  prim_ip_hawk   (ocf::heartbeat:IPaddr):        (target-role:Stopped) Stopped
>  prim_fs_drbd_r0        (ocf::heartbeat:Filesystem):    (target-role:Stopped) Stopped
>  prim_hawk      (lsb:hawk):     Stopped
>  Master/Slave Set: ms_drbd_r0 [prim_drbd_r0]
>      Stopped: [ sunhb58820 sunhb65277 ]
> 
> Allocation scores:
> native_color: prim_ip_hawk allocation score on sunhb58820: -INFINITY
> native_color: prim_ip_hawk allocation score on sunhb65277: -INFINITY
> native_color: prim_fs_drbd_r0 allocation score on sunhb58820: -INFINITY
> native_color: prim_fs_drbd_r0 allocation score on sunhb65277: -INFINITY
> native_color: prim_hawk allocation score on sunhb58820: -INFINITY
> native_color: prim_hawk allocation score on sunhb65277: -INFINITY
> clone_color: ms_drbd_r0 allocation score on sunhb58820: 0
> clone_color: ms_drbd_r0 allocation score on sunhb65277: 0
> clone_color: prim_drbd_r0:0 allocation score on sunhb58820: 0
> clone_color: prim_drbd_r0:0 allocation score on sunhb65277: 0
> clone_color: prim_drbd_r0:1 allocation score on sunhb58820: 0
> clone_color: prim_drbd_r0:1 allocation score on sunhb65277: 0
> native_color: prim_drbd_r0:0 allocation score on sunhb58820: -INFINITY
> native_color: prim_drbd_r0:0 allocation score on sunhb65277: -INFINITY
> native_color: prim_drbd_r0:1 allocation score on sunhb58820: -INFINITY
> native_color: prim_drbd_r0:1 allocation score on sunhb65277: -INFINITY
> prim_drbd_r0:0 promotion score on none: 0
> prim_drbd_r0:1 promotion score on none: 0
> 
> When the score is -INFINITY on both nodes, the resource can't run on either node, right?

Correct, a score of -INFINITY for a particular resource on a particular
node means that resource can't run there. In this case, the
"target-role:Stopped" explains it -- you've explicitly disabled
prim_ip_hawk and prim_fs_drbd_r0 in the configuration, and the cluster
implements that by setting -INFINITY scores on all nodes.
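
If you want to pick those entries out of a longer score listing, a small
script can help. The following is only a rough sketch: it assumes the
line format shown in your output above, and the names SCORE_LINE,
banned_everywhere and SAMPLE are made up for this example, not part of
any Pacemaker or crmsh tool.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   native_color: prim_ip_hawk allocation score on sunhb58820: -INFINITY
# (format assumed from the listing quoted above)
SCORE_LINE = re.compile(
    r"^(?P<func>\w+): (?P<rsc>\S+) allocation score on "
    r"(?P<node>\S+): (?P<score>\S+)$"
)


def banned_everywhere(text, func="native_color"):
    """Return resources whose score is -INFINITY on every node listed."""
    scores = defaultdict(dict)
    for line in text.splitlines():
        match = SCORE_LINE.match(line.strip())
        if match and match.group("func") == func:
            scores[match.group("rsc")][match.group("node")] = match.group("score")
    return sorted(
        rsc
        for rsc, by_node in scores.items()
        if all(score == "-INFINITY" for score in by_node.values())
    )


# A few lines copied from the listing above, without the quote prefix.
SAMPLE = """\
native_color: prim_ip_hawk allocation score on sunhb58820: -INFINITY
native_color: prim_ip_hawk allocation score on sunhb65277: -INFINITY
clone_color: ms_drbd_r0 allocation score on sunhb58820: 0
clone_color: ms_drbd_r0 allocation score on sunhb65277: 0
"""

print(banned_everywhere(SAMPLE))
```

With the four sample lines this lists only prim_ip_hawk, since the
ms_drbd_r0 lines come from clone_color and carry a score of 0.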

> What do native_color and clone_color mean? I read something about different functions in the allocation.

Right, it's just an internal detail indicating where the score was
calculated. The important information is the resource name, node name,
and score.

> Why are the values different? Does the score change over time?

No, it just means different functions contribute to the score. For
clones, both the clone as a whole and the individual clone instances
have scores. Scores are added together to get a final value.
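
To illustrate the arithmetic only: Pacemaker treats INFINITY as
1,000,000, any value plus -INFINITY is -INFINITY, and -INFINITY wins
over INFINITY. A minimal sketch of that rule follows; add_scores is a
made-up helper name, and which contributions actually get combined for a
given resource is up to the cluster, not this snippet.

```python
# Pacemaker defines INFINITY as 1,000,000 for score arithmetic.
INFINITY = 1_000_000


def add_scores(a, b):
    """Add two scores using Pacemaker-style infinity rules (sketch)."""
    # -INFINITY always wins, even against +INFINITY.
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Finite sums are capped at +/-INFINITY.
    return max(-INFINITY, min(INFINITY, a + b))


print(add_scores(0, -INFINITY))
print(add_scores(INFINITY, -INFINITY))
print(add_scores(100, 50))
```

So a single -INFINITY contribution is enough to keep a resource off a
node, no matter what the other contributions are.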

> And why are there a prim_drbd_r0:0 and a prim_drbd_r0:1?

Those are the individual clone instances. It's possible for individual
clone instances to have different scores. For example, you might have a
constraint saying that the master role prefers a certain node.

> Thanks.

Thanks for taking the time to understand so thoroughly! Feel free to ask
if anything is still unclear.

> 
> Bernd