<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Many thanks for your brilliant answers,</p>

    <p>Ken, regarding your suggestion:<br>

    </p>

    <pre wrap=""><font color="#3333ff">"The second problem is that you have an ordering constraint but no colocation constraint. With your current setup, windows_VM has to start
after the storage, but it doesn't have to start on the same node. You need a colocation constraint as well, to ensure they start on the same
node."</font></pre>

    <b>For the storage I have the following complete steps:</b><br>

    <br>

    pcs resource create ProcDRBD_SigmaVMs ocf:linbit:drbd 

    drbd_resource=sigma_vms drbdconf=/etc/drbd.conf op monitor

    interval=10s<br>

    <br>

    pcs resource master clone_ProcDRBD_SigmaVMs ProcDRBD_SigmaVMs

    master-max=1 master-node-max=1 clone-max=2 clone-node-max=1

    notify=true<br>

    <br>

    pcs resource create StorageDRBD_SigmaVMs Filesystem

    device="/dev/drbd1" directory="/opt/sigma_vms/" fstype="ext4"<br>

    <br>

    <font color="#ff0000">pcs constraint location

      clone_ProcDRBD_SigmaVMs prefers sgw-01</font><br>

    <br>

    pcs constraint colocation add StorageDRBD_SigmaVMs with

    clone_ProcDRBD_SigmaVMs INFINITY with-rsc-role=Master <br>

       <br>

    pcs constraint order promote clone_ProcDRBD_SigmaVMs then start

    StorageDRBD_SigmaVMs<br>

    <b><br>

    </b><b>And when I create the VM:</b><br>

    <br>

    pcs resource create windows_VM_res VirtualDomain

    hypervisor="qemu:///system"

    config="/opt/sigma_vms/xml_definitions/windows_VM.xml" <br>

         <br>

    <font color="#ff0000">pcs constraint colocation add windows_VM_res

      with StorageDRBD_SigmaVMs INFINITY<br>

      <br>

      pcs constraint order start StorageDRBD_SigmaVMs then start
      windows_VM_res</font>  <br>

    <br>

    <br>

    <b>My question, Ken, is: are the steps below (in red) enough to
      ensure that the new VM will be placed on node 1?</b><br>

    <br>

    <font color="#ff0000"><font color="#000000">(The storage process
        prefers node 1 (the DRBD primary) with weight INFINITY, and
        windows_VM must always be placed with StorageDRBD_SigmaVMs.<br>
        <br>
        By transitivity, windows_VM should then be placed on node 1.
        Taking ---&gt; to mean "prefers": storage ---&gt; node1 and
        windows_VM ---&gt; storage, thus windows_VM ---&gt; node1.)</font><br>

      <br>

      pcs constraint location clone_ProcDRBD_SigmaVMs prefers sgw-01<br>

    </font><font color="#ff0000"><br>

      pcs constraint colocation add windows_VM_res with

      StorageDRBD_SigmaVMs INFINITY<br>

      <br>

      pcs constraint order start StorageDRBD_SigmaVMs then start
      windows_VM_res</font> <br>

    <font color="#ff0000"></font>

    <div class="moz-cite-prefix"><br>

      <br>

      <br>

      On 06/26/2018 07:36 PM, Ken Gaillot wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:1530030995.5202.9.camel@redhat.com">

      <pre wrap="">On Tue, 2018-06-26 at 18:24 +0300, Vaggelis Papastavros wrote:

</pre>

      <blockquote type="cite">

        <pre wrap="">Many thanks for the excellent answer,

Ken, after investigating the log files:

In our environment we have two DRBD partitions, one for customer_vms
and one for sigma_vms.

For customer_vms the active node is node2, and for sigma_vms
the active node is node1.

[root@sgw-01 drbd.d]# drbdadm status

customer_vms role:Secondary

  disk:UpToDate

  sgw-02 role:Primary

    peer-disk:UpToDate

sigma_vms role:Primary

  disk:UpToDate

  sgw-02 role:Secondary

    peer-disk:UpToDate

When I create a new VM I can't force the resource creation to take
place on a specific node; the cluster places the resource
arbitrarily on one of the two nodes (if the node happens to be the
DRBD Primary it is OK, otherwise Pacemaker raises a failure for the
node).

My solution is the following  :

pcs resource create windows_VM_res VirtualDomain

hypervisor="qemu:///system"

config="/opt/sigma_vms/xml_definitions/windows_VM.xml" 

(The cluster arbitrarily tries to place the above resource on node 2,
which is currently the Secondary for the corresponding partition.
Personally, I assumed that the VirtualDomain agent would be able to
read the correct disk location from the XML definition and then find
the correct DRBD node.)

pcs constraint colocation add windows_VM_res with

StorageDRBD_SigmaVMs INFINITY

pcs constraint order start StorageDRBD_SigmaVMs_rers then start

windows_VM

</pre>

      </blockquote>

      <pre wrap="">

Two things will help:

One problem is that you are creating the VM, and then later adding

constraints about what the cluster can do with it. Therefore there is a

time in between where the cluster can start it without any constraint.

The solution is to make your changes all at once. Both pcs and crm have

a way to do this; with pcs, it's:

  pcs cluster cib &lt;filename&gt;
  pcs -f &lt;filename&gt; ...whatever command you want...
  ...repeat...
  pcs cluster cib-push --config &lt;filename&gt;

The second problem is that you have an ordering constraint but no

colocation constraint. With your current setup, windows_VM has to start

after the storage, but it doesn't have to start on the same node. You

need a colocation constraint as well, to ensure they start on the same

node.
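
To make both points concrete, here is a rough sketch of how such a batch
might look for the resources named in this thread. It assumes the
filesystem resource is StorageDRBD_SigmaVMs and the VM resource is
windows_VM_res (the ordering command quoted above used slightly different
names). The function name, the PCS variable and the file path are made up
for the example; setting PCS=echo only prints the arguments instead of
touching the cluster.

```shell
# Sketch only: stage every change in an offline CIB file, then push once.
# PCS is a hypothetical knob: leave it unset to call the real pcs binary,
# or set PCS=echo for a dry run that just prints each argument list.
add_vm_in_one_push() {
    cib="$1"
    pcs_cmd="${PCS:-pcs}"
    # Save the current CIB to a file.
    $pcs_cmd cluster cib "$cib" || return 1
    # Queue the VM resource in the file (the live cluster is not touched).
    $pcs_cmd -f "$cib" resource create windows_VM_res VirtualDomain \
        hypervisor="qemu:///system" \
        config="/opt/sigma_vms/xml_definitions/windows_VM.xml" || return 1
    # Colocation: the VM must run where the filesystem is mounted.
    $pcs_cmd -f "$cib" constraint colocation add windows_VM_res \
        with StorageDRBD_SigmaVMs INFINITY || return 1
    # Ordering: mount the filesystem before starting the VM.
    $pcs_cmd -f "$cib" constraint order start StorageDRBD_SigmaVMs \
        then start windows_VM_res || return 1
    # Push the resource and both constraints to the cluster in one step.
    $pcs_cmd cluster cib-push --config "$cib"
}
```

A dry run would be: PCS=echo add_vm_in_one_push /tmp/vm_cib.xml
Because the VM and its constraints arrive together, the cluster never
sees the VM without them.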

</pre>

      <blockquote type="cite">

        <pre wrap="">

pcs resource cleanup windows_VM_res

After the above steps the VM is located on the correct node and
everything is OK.

Is my approach correct?

Your opinion would be valuable.

Sincerely,

On 06/25/2018 07:15 PM, Ken Gaillot wrote:

</pre>

        <blockquote type="cite">

          <pre wrap="">On Mon, 2018-06-25 at 09:47 -0500, Ken Gaillot wrote:

</pre>

          <blockquote type="cite">

            <pre wrap="">On Mon, 2018-06-25 at 11:33 +0300, Vaggelis Papastavros wrote:

</pre>

            <blockquote type="cite">

              <pre wrap="">Dear friends,

We have the following configuration:

CentOS7, pacemaker 0.9.152 and Corosync 2.4.0, storage with DRBD,
and stonith enabled with APC PDU devices.

I have a Windows VM configured as a cluster resource with the
following attributes:

Resource: WindowSentinelOne_res (class=ocf provider=heartbeat type=VirtualDomain)
Attributes: hypervisor=qemu:///system config=/opt/customer_vms/conf/WindowSentinelOne/WindowSentinelOne.xml migration_transport=ssh
Utilization: cpu=8 hv_memory=8192
Operations: start interval=0s timeout=120s (WindowSentinelOne_res-start-interval-0s)
            stop interval=0s timeout=120s (WindowSentinelOne_res-stop-interval-0s)
            monitor interval=10s timeout=30s (WindowSentinelOne_res-monitor-interval-10s)

Under some circumstances (which I am trying to identify) the VM fails
and disappears from virsh list --all, and Pacemaker also reports the VM
as stopped.

If I run pcs resource cleanup windows_wm everything is OK, but I can't
identify the reason for the failure.

For example, when I shut down the VM (with the Windows shutdown), the
cluster reports the following:

WindowSentinelOne_res    (ocf::heartbeat:VirtualDomain): Started sgw-02 (failure ignored)

Failed Actions:
* WindowSentinelOne_res_monitor_10000 on sgw-02 'not running' (7): call=67, status=complete, exitreason='none',
     last-rc-change='Mon Jun 25 07:41:37 2018', queued=0ms, exec=0ms.

My questions are:

1) Why is the VM shutdown reported as a failed action by the cluster?
It is a legitimate operation during the VM life cycle.

</pre>

            </blockquote>

            <pre wrap="">

Pacemaker has no way of knowing that the VM was intentionally shut
down, vs crashed.

When some resource is managed by the cluster, all starts and stops of
the resource have to go through the cluster. You can either set
target-role=Stopped in the resource configuration, or if it's a temporary
issue (e.g. rebooting for some OS updates), you could set
is-managed=false to take it out of cluster control, do the work, then set
is-managed=true again.
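
With pcs, one possible way to do that temporary hand-off is sketched
below, using the resource name from this thread. The function name and
the PCS dry-run variable are made up for illustration; "pcs resource
unmanage" and "pcs resource manage" are the pcs commands that take a
resource out of cluster control and put it back.

```shell
# Sketch only: wrap a maintenance task between unmanage and manage.
# PCS is a hypothetical knob; PCS=echo prints the commands (dry run).
with_unmanaged() {
    rsc="$1"
    shift
    pcs_cmd="${PCS:-pcs}"
    # Take the resource out of cluster control (is-managed=false).
    $pcs_cmd resource unmanage "$rsc" || return 1
    # Run the maintenance command passed as the remaining arguments.
    "$@"
    status=$?
    # Hand the resource back to the cluster even if the task failed.
    $pcs_cmd resource manage "$rsc" || return 1
    return "$status"
}
```

A dry run could look like:
PCS=echo with_unmanaged WindowSentinelOne_res echo "do the OS updates here"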

</pre>

          </blockquote>

          <pre wrap="">

Also, a nice feature is that you can use rules to set a maintenance
window ahead of time (especially helpful if the person who maintains
the cluster isn't the same person who needs to do the VM updates). For
example, you could set a rule that the resource's is-managed option
will be false from 9pm to midnight on Fridays. See:

<a class="moz-txt-link-freetext" href="http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html#idm140583511697312">http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html#idm140583511697312</a>

particularly the parts about time/date expressions and using rules to
control resource options.

</pre>

          <blockquote type="cite">

            <blockquote type="cite">

              <pre wrap="">2) Why is the resource sometimes marked as stopped (while the VM is
healthy) and in need of a cleanup?

</pre>

            </blockquote>

            <pre wrap="">

That's a problem. If the VM is truly healthy, it sounds like there's an
issue with the resource agent. You'd have to look at the logs to see if
it gave any more information (e.g. if it's a timeout, raising the
timeout might be sufficient).

</pre>

            <blockquote type="cite">

              <pre wrap="">3) I can't understand the corosync logs. During the VM shutdown,
the corosync log shows the following:

</pre>

            </blockquote>

            <pre wrap="">

FYI, the system log will have the most important messages. corosync.log
will additionally have info-level messages -- potentially helpful but
definitely difficult to follow.
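
One way to cut through the info-level noise is to keep only the lines
where crmd reports the result of a resource operation. The filter below
is just a sketch: the function name is made up, and the match strings
are taken from the log lines quoted in this thread, so they may differ
on other Pacemaker versions.

```shell
# Sketch: keep only the crmd lines that report an operation result.
# Reads a Pacemaker detail log on standard input.
op_results() {
    grep 'process_lrm_event:' | grep 'Result of'
}
```

For example, assuming the detail log lives at the path below (an
assumption, check your own logging configuration):
cat /var/log/cluster/corosync.log | op_results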

</pre>

            <blockquote type="cite">

              <pre wrap="">Jun 25 07:41:37 [5140] sgw-02       crmd:     info: process_lrm_event:    Result of monitor operation for WindowSentinelOne_res on sgw-02: 7 (not running) | call=67 key=WindowSentinelOne_res_monitor_10000 confirmed=false cib-update=36

</pre>

            </blockquote>

            <pre wrap="">

This is really the only important message. It says that a recurring
monitor on the WindowSentinelOne_res resource on node sgw-02 exited
with status code 7 (which means the resource agent thinks the resource
is not running).

'key=WindowSentinelOne_res_monitor_10000' is how pacemaker identifies
resource agent actions. The format is
&lt;resource-name&gt;_&lt;action-name&gt;_&lt;action-interval-in-milliseconds&gt;

This is the only information Pacemaker will get from the resource
agent. To investigate more deeply, you'll have to check for log
messages from the agent itself.
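
As a rough illustration of that key format, the three fields can be
pulled apart in plain shell by splitting from the right, since resource
names may themselves contain underscores. The helper name is made up and
is not part of Pacemaker, and the sketch assumes the action name has no
underscore (true for monitor, start and stop, but not for migrate_to).

```shell
# Hypothetical helper, not a Pacemaker tool: split an operation key of the
# form resource_action_interval into its three fields.
# Assumes the action name itself contains no underscore.
split_op_key() {
    key="$1"
    interval="${key##*_}"   # text after the last underscore
    rest="${key%_*}"        # everything before the last underscore
    action="${rest##*_}"    # text after the last remaining underscore
    resource="${rest%_*}"   # whatever is left is the resource name
    printf 'resource=%s action=%s interval_ms=%s\n' \
        "$resource" "$action" "$interval"
}

# prints: resource=WindowSentinelOne_res action=monitor interval_ms=10000
split_op_key WindowSentinelOne_res_monitor_10000
```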

</pre>

            <blockquote type="cite">

              <pre wrap="">Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_process_request:    Forwarding cib_modify operation for section status to all (origin=local/crmd/36)
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    Diff: --- 0.4704.67 2
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    Diff: +++ 0.4704.68 (null)
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    +  /cib:  @num_updates=68
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    +  /cib/status/node_state[@id='2']: @crm-debug-origin=do_update_resource
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    ++ /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='WindowSentinelOne_res']: &lt;lrm_rsc_op id="WindowSentinelOne_res_last_failure_0" operation_key="WindowSentinelOne_res_monitor_10000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="84:3:0:f910c793-a714-4e24-80d1-b0ec66275491" transition-magic="0:7;84:3:0:f910c793-a714-4e24-80d1-b0ec66275491" on_node="sgw-02" cal
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_process_request:    Completed cib_modify operation for section status: OK (rc=0, origin=sgw-02/crmd/36, version=0.4704.68)

</pre>

            </blockquote>

            <pre wrap="">

You can usually ignore the 'cib' messages. This just means Pacemaker
recorded the result on disk.

</pre>

            <blockquote type="cite">

              <pre wrap="">Jun 25 07:41:37 [5137] sgw-02      attrd:     info: attrd_peer_update:    Setting fail-count-WindowSentinelOne_res[sgw-02]: (null) -&gt; 1 from sgw-01

</pre>

            </blockquote>

            <pre wrap="">

Since the cluster expected the resource to be running, this result is a
failure. Failures are counted using special node attributes that start
with "fail-count-". This is what Pacemaker uses to determine if a
resource has reached its migration-threshold.

</pre>

            <blockquote type="cite">

              <pre wrap="">Jun 25 07:41:37 [5137] sgw-02      attrd:     info: write_attribute:    Sent update 10 with 1 changes for fail-count-WindowSentinelOne_res, id=&lt;n/a&gt;, set=(null)
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_process_request:    Forwarding cib_modify operation for section status to all (origin=local/attrd/10)
Jun 25 07:41:37 [5137] sgw-02      attrd:     info: attrd_peer_update:    Setting last-failure-WindowSentinelOne_res[sgw-02]: (null) -&gt; 1529912497 from sgw-01

</pre>

            </blockquote>

            <pre wrap="">

Similarly, the time the failure occurred is stored in a 'last-failure-'
node attribute, which Pacemaker uses to determine if a resource has
reached its failure-timeout.
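
How these two attributes interact can be pictured with a small shell
sketch. This is a simplified mental model, not Pacemaker's actual code:
the function name and argument order are invented, the threshold of 1
and timeout of 600 seconds are example values, and a threshold or
timeout of 0 is taken to mean "not set".

```shell
# Simplified model (an assumption for illustration, not Pacemaker source):
# decide whether a node is still eligible to run a resource.
#   $1 fail-count            $2 last-failure (epoch seconds)
#   $3 migration-threshold   $4 failure-timeout in seconds
#   $5 current time (epoch seconds)
# A threshold or timeout of 0 is treated as "not set".
node_eligible() {
    fails="$1"; last="$2"; threshold="$3"; timeout="$4"; now="$5"
    eligible=yes
    # An expired failure no longer counts against the node.
    if [ "$timeout" -gt 0 ]; then
        if [ $((now - last)) -ge "$timeout" ]; then
            fails=0
        fi
    fi
    # Once the count reaches the threshold, the node is ruled out.
    if [ "$threshold" -gt 0 ]; then
        if [ "$fails" -ge "$threshold" ]; then
            eligible=no
        fi
    fi
    echo "$eligible"
}

# Values from the log above: one failure at 1529912497 on sgw-02.
node_eligible 1 1529912497 1 600 1529912557   # 60 s later, prints: no
node_eligible 1 1529912497 1 600 1529913097   # 600 s later, prints: yes
```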

</pre>

            <blockquote type="cite">

              <pre wrap="">Jun 25 07:41:37 [5137] sgw-02      attrd:     info: write_attribute:    Sent update 11 with 1 changes for last-failure-WindowSentinelOne_res, id=&lt;n/a&gt;, set=(null)
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_process_request:    Forwarding cib_modify operation for section status to all (origin=local/attrd/11)
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    Diff: --- 0.4704.68 2
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    Diff: +++ 0.4704.69 (null)
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    +  /cib:  @num_updates=69
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    ++ /cib/status/node_state[@id='2']/transient_attributes[@id='2']/instance_attributes[@id='status-2']: &lt;nvpair id="status-2-fail-count-WindowSentinelOne_res" name="fail-count-WindowSentinelOne_res" value="1"/&gt;
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_process_request:    Completed cib_modify operation for section status: OK (rc=0, origin=sgw-02/attrd/10, version=0.4704.69)
Jun 25 07:41:37 [5137] sgw-02      attrd:     info: attrd_cib_callback:    Update 10 for fail-count-WindowSentinelOne_res: OK (0)
Jun 25 07:41:37 [5137] sgw-02      attrd:     info: attrd_cib_callback:    Update 10 for fail-count-WindowSentinelOne_res[sgw-02]=1: OK (0)
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    Diff: --- 0.4704.69 2
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    Diff: +++ 0.4704.70 (null)
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    +  /cib:  @num_updates=70
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_perform_op:    ++ /cib/status/node_state[@id='2']/transient_attributes[@id='2']/instance_attributes[@id='status-2']: &lt;nvpair id="status-2-last-failure-WindowSentinelOne_res" name="last-failure-WindowSentinelOne_res" value="1529912497"/&gt;
Jun 25 07:41:37 [5130] sgw-02        cib:     info: cib_process_request:    Completed cib_modify operation for section status: OK (rc=0, origin=sgw-02/attrd/11, version=0.4704.70)
Jun 25 07:41:37 [5137] sgw-02      attrd:     info: attrd_cib_callback:    Update 11 for last-failure-WindowSentinelOne_res: OK (0)
Jun 25 07:41:37 [5137] sgw-02      attrd:     info: attrd_cib_callback:    Update 11 for last-failure-WindowSentinelOne_res[sgw-02]=1529912497: OK (0)
Jun 25 07:41:42 [5130] sgw-02        cib:     info: cib_process_ping:    Reporting our current digest to sgw-01: 3e27415fcb003ef3373b47ffa6c5f358 for 0.4704.70 (0x7faac1729720 0)

Sincerely,

Vaggelis Papastavros

</pre>

            </blockquote>

          </blockquote>

        </blockquote>

        <pre wrap=""> 

_______________________________________________

Users mailing list: <a class="moz-txt-link-abbreviated" href="mailto:Users@clusterlabs.org">Users@clusterlabs.org</a>

<a class="moz-txt-link-freetext" href="https://lists.clusterlabs.org/mailman/listinfo/users">https://lists.clusterlabs.org/mailman/listinfo/users</a>

Project Home: <a class="moz-txt-link-freetext" href="http://www.clusterlabs.org">http://www.clusterlabs.org</a>

Getting started: <a class="moz-txt-link-freetext" href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a>

Bugs: <a class="moz-txt-link-freetext" href="http://bugs.clusterlabs.org">http://bugs.clusterlabs.org</a>

</pre>

      </blockquote>

    </blockquote>

    <br>

  </body>

</html>