[ClusterLabs] Users Digest, Vol 9, Issue 21

TaMen说我挑食 974120182 at qq.com
Thu Oct 8 21:00:35 EDT 2015


Corosync+Pacemaker error during failover




------------------ Original ------------------
From:  "users-request";<users-request at clusterlabs.org>;
Date:  Fri, Oct 9, 2015 02:50 AM
To:  "users"<users at clusterlabs.org>; 

Subject:  Users Digest, Vol 9, Issue 21



Send Users mailing list submissions to
	users at clusterlabs.org

To subscribe or unsubscribe via the World Wide Web, visit
	http://clusterlabs.org/mailman/listinfo/users
or, via email, send a message with subject or body 'help' to
	users-request at clusterlabs.org

You can reach the person managing the list at
	users-owner at clusterlabs.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Users digest..."


Today's Topics:

   1. Re: Current DC becomes None suddenly (Ken Gaillot)
   2. Re: Corosync+Pacemaker error during failover (emmanuel segura)
   3. Re: Corosync+Pacemaker error during failover (Digimer)
   4. Re: Corosync+Pacemaker error during failover (Ken Gaillot)
   5. Re: Xen Migration/resource cleanup problem in SLES11	SP3
      (Cleber Paiva de Souza)
   6. Re: gfs2 crashes when i, e.g., dd to a lvm volume (J. Echter)


----------------------------------------------------------------------

Message: 1
Date: Thu, 8 Oct 2015 10:21:50 -0500
From: Ken Gaillot <kgaillot at redhat.com>
To: Pritam Kharat <pritam.kharat at oneconvergence.com>,	Cluster Labs -
	All topics related to open-source clustering welcomed
	<users at clusterlabs.org>
Subject: Re: [ClusterLabs] Current DC becomes None suddenly
Message-ID: <56168A0E.40405 at redhat.com>
Content-Type: text/plain; charset=utf-8

On 10/08/2015 09:55 AM, Pritam Kharat wrote:
> Hi Ken,
> 
> Thanks for reply.
> 
> On Thu, Oct 8, 2015 at 8:13 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
> 
>> On 10/02/2015 01:47 PM, Pritam Kharat wrote:
>>> Hi,
>>>
>>> I have set up a ACTIVE/PASSIVE HA
>>>
>>> *Issue 1) *
>>>
>>> *corosync.conf*  file is
>>>
>>> # Please read the openais.conf.5 manual page
>>>
>>> totem {
>>>
>>>         version: 2
>>>
>>>         # How long before declaring a token lost (ms)
>>>         token: 10000
>>>
>>>         # How many token retransmits before forming a new configuration
>>>         token_retransmits_before_loss_const: 20
>>>
>>>         # How long to wait for join messages in the membership
>>>         # protocol (ms)
>>>         join: 10000
>>>
>>>         # How long to wait for consensus to be achieved before starting a
>>> new round of membership configuration (ms)
>>>         consensus: 12000
>>>
>>>         # Turn off the virtual synchrony filter
>>>         vsftype: none
>>>
>>>         # Number of messages that may be sent by one processor on receipt
>>> of the token
>>>         max_messages: 20
>>>
>>>         # Limit generated nodeids to 31-bits (positive signed integers)
>>>         clear_node_high_bit: yes
>>>
>>>         # Disable encryption
>>>         secauth: off
>>>
>>>         # How many threads to use for encryption/decryption
>>>         threads: 0
>>>
>>>         # Optionally assign a fixed node id (integer)
>>>         # nodeid: 1234
>>>
>>>         # This specifies the mode of redundant ring, which may be none,
>>> active, or passive.
>>>         rrp_mode: none
>>>         interface {
>>>                 # The following values need to be set based on your
>>> environment
>>>                 ringnumber: 0
>>>                 bindnetaddr: 192.168.101.0
>>>                 mcastport: 5405
>>>         }
>>>
>>>         transport: udpu
>>> }
>>>
>>> amf {
>>>         mode: disabled
>>> }
>>>
>>> quorum {
>>>         # Quorum for the Pacemaker Cluster Resource Manager
>>>         provider: corosync_votequorum
>>>         expected_votes: 1
>>
>> If you're using a recent version of corosync, use "two_node: 1" instead
>> of "expected_votes: 1", and get rid of "no-quorum-policy: ignore" in the
>> pacemaker cluster options.
>>
>    -> We are using corosync version 2.3.3. Do we need to make the
> above-mentioned change for this version?

Yes, you can use two_node.

FYI, two_node automatically enables wait_for_all, which means that when
a node first starts up, it waits until it can see the other node before
forming the cluster. So once the cluster is running, it can handle the
failure of one node, and the other will continue. But to start, both
nodes need to be present.
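
With corosync 2.x, the quorum section would then look roughly like this
(just a sketch; keep the rest of your corosync.conf as it is):

    quorum {
            provider: corosync_votequorum
            two_node: 1
    }

Since your nodelist has two nodes, expected_votes is derived automatically,
so the explicit "expected_votes: 1" can be dropped, along with Pacemaker's
"no-quorum-policy: ignore" as mentioned above.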

>>> }
>>>
>>>
>>> nodelist {
>>>
>>>         node {
>>>                 ring0_addr: 192.168.101.73
>>>         }
>>>
>>>         node {
>>>                 ring0_addr: 192.168.101.74
>>>         }
>>> }
>>>
>>> aisexec {
>>>         user:   root
>>>         group:  root
>>> }
>>>
>>>
>>> logging {
>>>         fileline: off
>>>         to_stderr: yes
>>>         to_logfile: yes
>>>         to_syslog: yes
>>>         syslog_facility: daemon
>>>         logfile: /var/log/corosync/corosync.log
>>>         debug: off
>>>         timestamp: on
>>>         logger_subsys {
>>>                 subsys: AMF
>>>                 debug: off
>>>                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>>>         }
>>> }
>>>
>>> And I have added 5 resources - 1 is VIP and 4 are upstart jobs
>>> Node names are configured as -> sc-node-1(ACTIVE) and sc-node-2(PASSIVE)
>>> Resources are running on ACTIVE node
>>>
>>> Default cluster properties -
>>>
>>>       <cluster_property_set id="cib-bootstrap-options">
>>>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
>>> value="1.1.10-42f2063"/>
>>>         <nvpair id="cib-bootstrap-options-cluster-infrastructure"
>>> name="cluster-infrastructure" value="corosync"/>
>>>         <nvpair name="no-quorum-policy" value="ignore"
>>> id="cib-bootstrap-options-no-quorum-policy"/>
>>>         <nvpair name="stonith-enabled" value="false"
>>> id="cib-bootstrap-options-stonith-enabled"/>
>>>         <nvpair name="cluster-recheck-interval" value="3min"
>>> id="cib-bootstrap-options-cluster-recheck-interval"/>
>>>         <nvpair name="default-action-timeout" value="120s"
>>> id="cib-bootstrap-options-default-action-timeout"/>
>>>       </cluster_property_set>
>>>
>>>
>>> But sometimes after 2-3 migrations from ACTIVE to STANDBY and then from
>>> STANDBY to ACTIVE,
>>> both nodes become OFFLINE and Current DC becomes None. I have disabled
>>> the stonith property and quorum is even ignored.
>>
>> Disabling stonith isn't helping you. The cluster needs stonith to
>> recover from difficult situations, so it's easier to get into weird
>> states like this without it.
>>
>>> root at sc-node-2:/usr/lib/python2.7/dist-packages/sc# crm status
>>> Last updated: Sat Oct  3 00:01:40 2015
>>> Last change: Fri Oct  2 23:38:28 2015 via crm_resource on sc-node-1
>>> Stack: corosync
>>> Current DC: NONE
>>> 2 Nodes configured
>>> 5 Resources configured
>>>
>>> OFFLINE: [ sc-node-1 sc-node-2 ]
>>>
>>> What is going wrong here? What is the reason for the Current DC
>>> becoming None suddenly? Is corosync.conf okay? Are the default cluster
>>> properties fine? Help will be appreciated.
>>
>> I'd recommend seeing how the problem behaves with stonith enabled, but
>> in any case you'll need to dive into the logs to figure what starts the
>> chain of events.
>>
>>
>    -> We are seeing this issue when we try rebooting the VMs.

For VMs, fence_virtd/fence_xvm are relatively easy to set up for
stonith. I'd get that going first, then try to reproduce the problem,
and show the cluster logs from around the time the problem starts.
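
As a rough sketch (not a drop-in config -- the resource names below and the
fence_virtd setup on the VM host are assumptions):

    # one stonith resource per guest, using crmsh as elsewhere in this thread
    crm configure primitive st-sc-node-1 stonith:fence_xvm \
            params port=sc-node-1 pcmk_host_list=sc-node-1 \
            op monitor interval=60s
    crm configure primitive st-sc-node-2 stonith:fence_xvm \
            params port=sc-node-2 pcmk_host_list=sc-node-2 \
            op monitor interval=60s
    crm configure property stonith-enabled=true

fence_virtd has to be running and configured on the hypervisor, with its key
file copied to the guests, before fence_xvm can reach it.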

>>
>>> *Issue 2)*
>>> Command used to add upstart job is
>>>
>>> crm configure primitive service upstart:service meta allow-migrate=true
>>> migration-threshold=5 failure-timeout=30s op monitor interval=15s
>>>  timeout=60s
>>>
>>> But still sometimes I see the fail count going to INFINITY. Why? How can
>>> we avoid it? The resource should have migrated as soon as it reached the
>>> migration threshold.
>>>
>>> * Node sc-node-2:
>>>    service: migration-threshold=5 fail-count=1000000
>>>      last-failure='Fri Oct  2 23:38:53 2015'
>>>    service1: migration-threshold=5 fail-count=1000000
>>>      last-failure='Fri Oct  2 23:38:53 2015'
>>>
>>> Failed actions:
>>>     service_start_0 (node=sc-node-2, call=-1, rc=1, status=Timed Out,
>>>       last-rc-change=Fri Oct  2 23:38:53 2015, queued=0ms, exec=0ms):
>>>       unknown error
>>>     service1_start_0 (node=sc-node-2, call=-1, rc=1, status=Timed Out,
>>>       last-rc-change=Fri Oct  2 23:38:53 2015, queued=0ms, exec=0ms):
>>>       unknown error
>>
>> migration-threshold is used for monitor failures, not (by default) start
>> or stop failures.
>>
>> This is a start failure, which (by default) makes the fail-count go to
>> infinity. The rationale is that a monitor failure indicates some sort of
>> temporary error, but failing to start could well mean that something is
>> wrong with the installation or configuration.
>>
>> You can tell the cluster to apply migration-threshold to start failures
>> too, by setting the start-failure-is-fatal=false cluster option.
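
With crmsh, for example, that is a single property change (shown as a
sketch; the rest of the cluster options stay as they are):

    crm configure property start-failure-is-fatal=false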




------------------------------

Message: 2
Date: Thu, 8 Oct 2015 17:22:16 +0200
From: emmanuel segura <emi2fast at gmail.com>
To: Cluster Labs - All topics related to open-source clustering
	welcomed	<users at clusterlabs.org>
Subject: Re: [ClusterLabs] Corosync+Pacemaker error during failover
Message-ID:
	<CAE7pJ3C5D8gd_mScQ18AgN_tZX9yML9p42P1OVjU_Uvd8JGSbw at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Please check if your DRBD is configured to call a fence handler:
https://drbd.linbit.com/users-guide/s-pacemaker-fencing.html
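
In a Pacemaker cluster that usually means something like this in the DRBD
resource definition (a sketch following the guide above; the script paths
can differ between DRBD versions):

    disk {
            fencing resource-and-stonith;
    }
    handlers {
            fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
            after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }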

2015-10-08 17:16 GMT+02:00 priyanka <priyanka at cse.iitb.ac.in>:
> Hi,
>
> We are trying to build a HA setup for our servers using DRBD + Corosync +
> pacemaker stack.
>
> Attached is the configuration file for corosync/pacemaker and drbd.
>
> We are getting errors while testing this setup.
> 1. When we stop corosync on the master machine, say server1(lock), it is
> STONITHed. In this case the slave, server2(sher), is promoted to master.
>    But when server1(lock) reboots, res_exportfs_export1 is started on both
> servers and that resource goes into a failed state, followed by the
> servers going into an unclean state.
>    Then server1(lock) reboots and server2(sher) is master but in an
> unclean state. After server1(lock) comes up, server2(sher) is STONITHed
> and server1(lock) is slave (the only online node).
>    When server2(sher) comes up, both servers are slaves and the resource
> group (rg_export) is stopped. Then server2(sher) becomes master,
> server1(lock) is slave, and the resource group is started.
>    At this point the configuration becomes stable.
>
>
> PFA the logs (syslog) of server2(sher) from when it is promoted to master
> until it is first rebooted, when the exportfs resource goes into a failed
> state.
>
> Please let us know if the configuration is appropriate. From the logs we
> could not figure out the exact reason for the resource failure.
> Your comments on this scenario will be very helpful.
>
> Thanks,
> Priyanka
>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



-- 
  .~.
  /V\
 //  \\
/(   )\
^`~'^



------------------------------

Message: 3
Date: Thu, 8 Oct 2015 11:35:02 -0400
From: Digimer <lists at alteeve.ca>
To: Cluster Labs - All topics related to open-source clustering
	welcomed	<users at clusterlabs.org>
Subject: Re: [ClusterLabs] Corosync+Pacemaker error during failover
Message-ID: <56168D26.8090500 at alteeve.ca>
Content-Type: text/plain; charset=windows-1252

On 08/10/15 11:16 AM, priyanka wrote:
> 		fencing resource-only;

This needs to be 'fencing resource-and-stonith;'.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



------------------------------

Message: 4
Date: Thu, 8 Oct 2015 10:50:04 -0500
From: Ken Gaillot <kgaillot at redhat.com>
To: users at clusterlabs.org
Subject: Re: [ClusterLabs] Corosync+Pacemaker error during failover
Message-ID: <561690AC.5030607 at redhat.com>
Content-Type: text/plain; charset=windows-1252

On 10/08/2015 10:16 AM, priyanka wrote:
> Hi,
> 
> We are trying to build a HA setup for our servers using DRBD + Corosync
> + pacemaker stack.
> 
> Attached is the configuration file for corosync/pacemaker and drbd.

A few things I noticed:

* Don't set become-primary-on in the DRBD configuration in a Pacemaker
cluster; Pacemaker should handle all promotions to primary.

* I'm no NFS expert, but why is res_exportfs_root cloned? Can both
servers export it at the same time? I would expect it to be in the group
before res_exportfs_export1.

* Your constraints need some adjustment. Partly it depends on the answer
to the previous question, but currently res_fs (via the group) is
ordered after res_exportfs_root, and I don't see how that could work.
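
For example, if res_exportfs_root is meant to run only where DRBD is
primary, something in this direction might be closer (just a sketch in crm
shell syntax; the ms_drbd and res_ip names are guesses, since the attached
configuration isn't reproduced here):

    group rg_export res_fs res_exportfs_root res_exportfs_export1 res_ip
    colocation col_export_on_drbd inf: rg_export ms_drbd:Master
    order ord_export_after_drbd inf: ms_drbd:promote rg_export:start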

> We are getting errors while testing this setup.
> 1. When we stop corosync on the master machine, say server1(lock), it is
> STONITHed. In this case the slave, server2(sher), is promoted to master.
>    But when server1(lock) reboots, res_exportfs_export1 is started on both
> servers and that resource goes into a failed state, followed by the
> servers going into an unclean state.
>    Then server1(lock) reboots and server2(sher) is master but in an
> unclean state. After server1(lock) comes up, server2(sher) is STONITHed
> and server1(lock) is slave (the only online node).
>    When server2(sher) comes up, both servers are slaves and the resource
> group (rg_export) is stopped. Then server2(sher) becomes master,
> server1(lock) is slave, and the resource group is started.
>    At this point the configuration becomes stable.
>
>
> PFA the logs (syslog) of server2(sher) from when it is promoted to master
> until it is first rebooted, when the exportfs resource goes into a failed
> state.
>
> Please let us know if the configuration is appropriate. From the logs we
> could not figure out the exact reason for the resource failure.
> Your comments on this scenario will be very helpful.
> 
> Thanks,
> Priyanka
> 
> 
> 
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 




------------------------------

Message: 5
Date: Thu, 8 Oct 2015 14:20:54 -0300
From: Cleber Paiva de Souza <cleberps at gmail.com>
To: Cluster Labs - All topics related to open-source clustering
	welcomed	<users at clusterlabs.org>
Subject: Re: [ClusterLabs] Xen Migration/resource cleanup problem in
	SLES11	SP3
Message-ID:
	<CAEm4n7ZYv7xb+ZaBHot=ZXA1EA2GJgTsOS7CAS57_XMGrJpo-g at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Are both machines identical in hardware/version/model? We found that
machines with different CPU features crash while migrating from the
machine with more features to one with fewer features.
Also, is your STONITH OK? STONITH protects against that multi-running
behavior.


On Thu, Oct 8, 2015 at 9:29 AM, Ulrich Windl <
Ulrich.Windl at rz.uni-regensburg.de> wrote:

> Hi!
>
> I'd like to report an "interesting problem" with SLES11 SP3+HAE (latest
> updates):
>
> When doing "rcopenais stop" on node "h10" with three Xen-VMs running, the
> cluster tried to migrate those VMs to other nodes (OK).
>
> However, migration failed on the remote nodes, but the cluster thought
> the migration was successful. Later the cluster restarted the VMs (BAD).
>
> Oct  8 13:19:17 h10 Xen(prm_xen_v07)[16537]: INFO: v07: xm migrate to h01
> succeeded.
> Oct  8 13:20:38 h01 Xen(prm_xen_v07)[9027]: ERROR: v07: Not active
> locally, migration failed!
>
> Oct  8 13:44:53 h01 pengine[18985]:  warning: unpack_rsc_op_failure:
> Processing failed op migrate_from for prm_xen_v07 on h01: unknown error (1)
>
> Things got really bad after h10 was eventually rebooted: the cluster
> restarted the three VMs again, because it thought those VMs were still
> running on h10! (VERY BAD)
> During startup, the cluster did not probe the three VMs.
>
> Oct  8 14:14:20 h01 pengine[18985]:  warning: unpack_rsc_op_failure:
> Processing failed op migrate_from for prm_xen_v07 on h01: unknown error (1)
>
> Oct  8 14:14:20 h01 pengine[18985]:   notice: LogActions: Restart
> prm_xen_v07 (Started h10)
>
> Oct  8 14:14:20 h01 crmd[18986]:   notice: te_rsc_command: Initiating
> action 89: stop prm_xen_v07_stop_0 on h01 (local)
>
> ...
>
> Regards,
> Ulrich
>
>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



-- 
Cleber Paiva de Souza

------------------------------

Message: 6
Date: Thu, 8 Oct 2015 20:42:50 +0200
From: "J. Echter" <j.echter at echter-kuechen-elektro.de>
To: users at clusterlabs.org
Subject: Re: [ClusterLabs] gfs2 crashes when i, e.g., dd to a lvm
	volume
Message-ID: <5616B92A.1070108 at echter-kuechen-elektro.de>
Content-Type: text/plain; charset=utf-8

On 08.10.2015 at 16:34, Bob Peterson wrote:
> ----- Original Message -----
>>
>> On 08.10.2015 at 16:15, Digimer wrote:
>>> On 08/10/15 07:50 AM, J. Echter wrote:
>>>> Hi,
>>>>
>>>> I have a strange issue on CentOS 6.5.
>>>>
>>>> If I install a new VM on node1, it works well.
>>>>
>>>> If I install a new VM on node2, it gets stuck.
>>>>
>>>> Same if I do a dd if=/dev/zero of=/dev/DATEN/vm-test (on node2).
>>>>
>>>> On node1 it works:
>>>>
>>>> dd if=/dev/zero of=vm-test
>>>> dd: writing to 'vm-test': No space left on device
>>>> 83886081+0 records in
>>>> 83886080+0 records out
>>>> 42949672960 bytes (43 GB) copied, 2338.15 s, 18.4 MB/s
>>>>
>>>>
>>>> dmesg shows the following (while dd'ing on node2):
>>>>
>>>> INFO: task flush-253:18:9820 blocked for more than 120 seconds.
>>>>        Not tainted 2.6.32-573.7.1.el6.x86_64 #1
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> <snip>
>>>> any hint on fixing that?
>>> Every time I've seen this, it was because dlm was blocked. The most
>>> common cause of DLM blocking is a failed fence call. Do you have fencing
>>> configured *and* tested?
>>>
>>> If I were to guess, given the rather limited information you shared
>>> about your setup, the live migration consumed the network bandwidth,
>>> choking out corosync traffic, which caused the peer to be declared lost,
>>> called a fence which failed, and left locking hung (which is by design;
>>> better to hang than risk corruption).
>>>
>> Hi,
>>
>> fencing is configured and works.
>>
>> I re-checked it by typing
>>
>> echo c > /proc/sysrq-trigger
>>
>> into node2 console.
>>
>> The machine is fenced and comes back up. But the problem persists.
> Hi,
>
> Can you send any more information about the crash? What makes you think
> it's gfs2 and not some other kernel component? Do you get any messages
> on the console? If not, perhaps you can temporarily disable or delay fencing
> long enough to get console messages.
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems
>
> _______________________________________________
>
Hi,

I just realized that gfs2 is probably the wrong candidate.

I use clustered LVM (on DRBD), and I experience this on an LVM volume that
is not formatted with anything.

What logs would you need to identify the cause?



------------------------------

_______________________________________________
Users mailing list
Users at clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users


End of Users Digest, Vol 9, Issue 21
************************************