<div dir="ltr">No problem! That's what we're here for. I'm glad it's sorted out :)<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Aug 28, 2020 at 12:27 AM Citron Vert &lt;<a href="mailto:citron_vert@hotmail.com">citron_vert@hotmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  
  <div>
    <p>Hi,</p>
    <p>You are right, the problem seems to come from some services that
      are started at boot. <br>
      <br>
      My installation script disables the startup options for all the
      services we use, which is why I didn't consider this possibility. <br>
    </p>
    <p>But after a quick investigation, it turns out that a colleague had
      the good idea to write a "security" script that monitors and starts
      certain services.</p>
    <p><br>
    </p>
    <p>Sorry to have bothered you with this little mistake. <br>
    </p>
    <p>Thank you for the help, it was effective.</p>
    <p>Quentin<br>
    </p>
    <p><br>
    </p>
    <p><br>
    </p>
    <div>On 27/08/2020 at 09:56, Reid Wahl
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div>Hi, Quentin. Thanks for the logs!</div>
        <div><br>
        </div>
        <div>I see you highlighted the fact that SERVICE1 was in
          "Stopping" state on both node 1 and node 2 when node 1 was
          rejoining the cluster. I also noted the following later in the
          logs, as well as some similar messages earlier:<br>
        </div>
        <div><br>
        </div>
        <div>
          <pre>Aug 27 08:47:02 [1330] NODE2    pengine:     info: determine_op_status:       Operation monitor found resource SERVICE1 active on NODE1
Aug 27 08:47:02 [1330] NODE2    pengine:     info: determine_op_status:       Operation monitor found resource SERVICE1 active on NODE1
Aug 27 08:47:02 [1330] NODE2    pengine:     info: determine_op_status:       Operation monitor found resource SERVICE4 active on NODE2
Aug 27 08:47:02 [1330] NODE2    pengine:     info: determine_op_status:       Operation monitor found resource SERVICE1 active on NODE2
...
Aug 27 08:47:02 [1330] NODE2    pengine:     info: common_print:              1 : NODE1
Aug 27 08:47:02 [1330] NODE2    pengine:     info: common_print:              2 : NODE2
...
Aug 27 08:47:02 [1330] NODE2    pengine:    error: native_create_actions:     Resource SERVICE1 is active on 2 nodes (attempting recovery)
Aug 27 08:47:02 [1330] NODE2    pengine:   notice: native_create_actions:     See <a href="https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active" target="_blank">https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active</a> for more information

</pre>
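          <div>As a rough sketch (not part of the original logs), a small shell function like the one below could list the resources that Pacemaker reports as active on more than one node. The function name and the log path in the usage comment are assumptions, and the pattern only matches the message format shown in the excerpt above.</div>
          <pre>
```shell
# Hypothetical helper: read Pacemaker log text on stdin and print each
# resource named in an "is active on N nodes" message, once per resource.
multi_active_resources() {
    sed -n 's/.*Resource \([^ ]*\) is active on [0-9][0-9]* nodes.*/\1/p' | sort -u
}

# Example (the log path is an assumption and varies by distribution):
#   cat /var/log/cluster/corosync.log | multi_active_resources
```
</pre>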
          <pre><font face="arial,sans-serif">Can you make sure that all the cluster-managed systemd services are disabled from starting at boot (i.e., `systemctl is-enabled service1`, and the same for all the others) on both nodes? If they are enabled, disable them.</font>
</pre>
        </div>
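        <div>A minimal sketch of that check, to be run on each node. The function name and the unit names in the usage comment are placeholders, so substitute the real systemd units that the cluster manages.</div>
        <pre>
```shell
# Hypothetical helper: print each unit given as an argument whose
# "systemctl is-enabled" state is "enabled" or "enabled-runtime".
enabled_at_boot() {
    for unit in "$@"; do
        # A failing is-enabled call (disabled or unknown unit) is tolerated.
        state=$(systemctl is-enabled "$unit" 2>/dev/null || true)
        case "$state" in
            enabled|enabled-runtime) echo "$unit" ;;
        esac
    done
}

# Example (placeholder unit names; run as root on both nodes):
#   for unit in $(enabled_at_boot service1 service4); do
#       systemctl disable "$unit"
#   done
```
</pre>
        <div>The function only reports; disabling is left as an explicit step so that nothing changes until the list has been reviewed.</div>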
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Thu, Aug 27, 2020 at 12:46
          AM Citron Vert &lt;<a href="mailto:citron_vert@hotmail.com" target="_blank">citron_vert@hotmail.com</a>&gt;
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div>
            <p>Hi,</p>
            <p>Sorry for using this email address, my name is Quentin.
              Thank you for your reply.</p>
            <p>I have already tried the stickiness solution (with the
              deprecated value). I also tried the one you gave me, and it
              does not change anything. <br>
            </p>
            <p>Resources don't seem to move from node to node (I don't
              see any changes with the crm_mon command).</p>
            <p><br>
            </p>
            <p>In the logs I found this line: <i>"error:
                native_create_actions:     Resource SERVICE1 is active
                on 2 nodes"</i></p>
            <p>That is what led me to contact you, to learn a little
              more about this cluster and to understand why there are
              resources running on the passive node.<br>
            </p>
            <p><br>
            </p>
            <p>You will find attached the logs during the reboot of the
              passive node and my cluster configuration.<br>
            </p>
            <p>I think there is something in the configuration or the
              logs that I am missing or don't understand.</p>
            <p><br>
            </p>
            <p>Thank you in advance for your help,</p>
            <p>Quentin<br>
            </p>
            <p><br>
            </p>
            <div>On 26/08/2020 at 20:16, Reid Wahl wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div>Hi, Citron.</div>
                <div><br>
                </div>
                <div>Based on your description, it sounds like some
                  resources **might** be moving from node 1 to node 2,
                  failing on node 2, and then moving back to node 1. If
                  that's what's happening (and even if it's not), then
                  it's probably smart to set some resource stickiness as
                  a resource default. The below command sets a resource
                  stickiness score of 1.<br>
                </div>
                <div><br>
                </div>
                <div>    # pcs resource defaults resource-stickiness=1<br>
                </div>
                <div><br>
                </div>
                <div>Also note that the "default-resource-stickiness"
                  cluster property is deprecated and should not be used.</div>
                <div><br>
                </div>
                <div>Finally, an explicit default resource stickiness
                  score of 0 can interfere with the placement of cloned
                  resource instances. If you don't want any stickiness,
                  then it's better to leave stickiness unset. That way,
                  primitives will have a stickiness of 0, but clone
                  instances will have a stickiness of 1.<br>
                </div>
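                <div>To check whether the deprecated property is still set, something like the sketch below could be used. The function name is hypothetical, and it only inspects CIB XML passed on standard input (for example from `cibadmin --query`); it does not change anything.</div>
                <pre>
```shell
# Hypothetical helper: succeed if the CIB XML on stdin still sets the
# deprecated default-resource-stickiness cluster property.
has_deprecated_stickiness() {
    grep -q 'name="default-resource-stickiness"'
}

# Example (run on a cluster node):
#   if cibadmin --query | has_deprecated_stickiness; then
#       echo "deprecated default-resource-stickiness is still set"
#   fi
```
</pre>
                <div>If it is set, `pcs property unset default-resource-stickiness` should remove it, though it's worth confirming that against the pcs version in use.</div>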
                <div><br>
                </div>
                <div>If adding stickiness does not resolve the issue,
                  can you share your cluster configuration and some logs
                  that show the issue happening? Off the top of my head
                  I'm not sure why resources would start and stop on
                  node 2 without moving away from node 1, unless they're
                  clone instances that are starting and then failing a
                  monitor operation on node 2.</div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Wed, Aug 26, 2020
                  at 8:42 AM Citron Vert &lt;<a href="mailto:citron_vert@hotmail.com" target="_blank">citron_vert@hotmail.com</a>&gt;
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                  <div>
                    <p>Hello,<br>
                      I am contacting you because I have a problem with
                      my cluster and I cannot find (or understand) any
                      information that can help me.</p>
                    <p>I have a 2-node cluster (pacemaker, corosync,
                      pcs) installed on CentOS 7 with a set of
                      configuration options.<br>
                      Everything seems to work fine, but here is what
                      happens:</p>
                    <ul>
                      <li>Node1 and Node2 are running well with Node1 as
                        primary<br>
                      </li>
                      <li>I reboot Node2, which is passive (no changes on
                        Node1)</li>
                      <li>Node2 comes back into the cluster as passive<br>
                      </li>
                      <li>The corosync logs show resources getting started
                        then stopped on Node2</li>
                      <li>The "crm_mon" command shows some resources on
                        Node1 getting restarted <br>
                      </li>
                    </ul>
                    <p>I don't understand how this is supposed to work.<br>
                      If a node comes back and becomes passive (since
                      Node1 is running as primary), surely there is no
                      reason for the resources to be started and then
                      stopped on the new passive node?<br>
                    </p>
                    <p>One of my resources becomes unstable because it
                      gets started and then stopped too quickly on Node2,
                      which seems to make it restart on Node1 without a
                      failover.</p>
                    <p>I tried several things and solutions proposed by
                      different sites and forums, but without success.</p>
                    <p><br>
                    </p>
                    <p>Is there a way to make sure that a node that joins
                      the cluster as passive does not start its own
                      resources?</p>
                    <p><br>
                    </p>
                    <p>Thanks in advance.</p>
                    <p><br>
                    </p>
                    <p>Here is some information, just in case:</p>
                    <div style="color:rgb(212,212,212);background-color:rgb(30,30,30);font-family:Consolas,&quot;Courier New&quot;,monospace;font-weight:normal;font-size:14px;line-height:19px;white-space:pre-wrap">$ rpm -qa | grep -E "corosync|pacemaker|pcs"
   corosync-2.4.5-4.el7.x86_64
   pacemaker-cli-1.1.21-4.el7.x86_64
   pacemaker-1.1.21-4.el7.x86_64
   pcs-0.9.168-4.el7.centos.x86_64
   corosynclib-2.4.5-4.el7.x86_64
   pacemaker-libs-1.1.21-4.el7.x86_64
   pacemaker-cluster-libs-1.1.21-4.el7.x86_64</div>
                    <p><br>
                    </p>
                    <div style="color:rgb(212,212,212);background-color:rgb(30,30,30);font-family:Consolas,&quot;Courier New&quot;,monospace;font-weight:normal;font-size:14px;line-height:19px;white-space:pre-wrap">        &lt;nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/&gt;
        &lt;nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/&gt;
        &lt;nvpair id="cib-bootstrap-options-dc-deadtime" name="dc-deadtime" value="120s"/&gt;
        &lt;nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/&gt;
        &lt;nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.21-4.el7-f14e36fd43"/&gt;
        &lt;nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/&gt;
        &lt;nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="CLUSTER"/&gt;
        &lt;nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1598446314"/&gt;
        &lt;nvpair id="cib-bootstrap-options-default-resource-stickiness" name="default-resource-stickiness" value="0"/&gt;</div>
                    <p><br>
                    </p>
                    <p><br>
                    </p>
                    <p><br>
                    </p>
                  </div>
                </blockquote>
              </div>
              <br clear="all">
              <br>
              -- <br>
              <div dir="ltr">Regards,<br>
                <br>
                Reid Wahl, RHCA<br>
                Software Maintenance Engineer, Red Hat<br>
                CEE - Platform Support Delivery - ClusterHA</div>
            </blockquote>
          </div>
        </blockquote>
      </div>
      <br clear="all">
      <br>
      -- <br>
      <div dir="ltr">Regards,<br>
        <br>
        Reid Wahl, RHCA<br>
        Software Maintenance Engineer, Red Hat<br>
        CEE - Platform Support Delivery - ClusterHA</div>
    </blockquote>
  </div>

</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature">Regards,<br><br>Reid Wahl, RHCA<br>Software Maintenance Engineer, Red Hat<br>CEE - Platform Support Delivery - ClusterHA</div>