[ClusterLabs] Cluster active/active

Mon Oct 10 14:04:21 UTC 2016

On 10/09/2016 11:31 PM, Dayvidson Bezerra wrote:
> Analyzing the zabbix application log, I see that zabbix fails to start
> on node02 because the PID file cannot be accessed by two hosts.

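DRBD and GFS2 can run active/active, but I believe zabbix can't.
Pacemaker can manage any resource in active/active mode, but the
individual application must support it as well. Your logs suggest zabbix
doesn't: the second instance can't lock the PID file on the shared
filesystem, and earlier, two servers writing to the same database
produced duplicate-key errors.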
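
If active/passive is acceptable for the zabbix server itself, one option
is to drop the clone so that only one copy runs at a time, while DRBD,
DLM and GFS2 stay active/active. A rough sketch, untested here: the
resource names are taken from your pcs output, and the ordering and
colocation against ZabbixFS-clone are my assumption about what the
server depends on.

```shell
# Remove the clone wrapper so p_zabbix runs on one node at a time
pcs resource unclone p_zabbix

# Assumption: the server needs the GFS2 mount (ZabbixFS-clone) on its node
pcs constraint order start ZabbixFS-clone then start p_zabbix
pcs constraint colocation add p_zabbix with ZabbixFS-clone INFINITY

# Clear the recorded start/monitor failures so the cluster re-evaluates
pcs resource cleanup p_zabbix
```

With something like this in place, pacemaker should start p_zabbix on
the surviving node if the active one fails, rather than trying to run
two copies against the same PID file and database.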

> root@node01:/usr/local/etc/log_zabbix# pcs resource
>  Master/Slave Set: ZabbixDTClone [ZabbixDT]
>      Masters: [ node01 node02 ]
>  Clone Set: dlm-clone [dlm]
>      Started: [ node01 node02 ]
>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>      ClusterIP:0 (ocf::heartbeat:IPaddr2): Started node01
>      ClusterIP:1 (ocf::heartbeat:IPaddr2): Started node02
>  Clone Set: ZabbixFS-clone [ZabbixFS]
>      Started: [ node01 node02 ]
>  Clone Set: p_zabbix-clone [p_zabbix]
>      Started: [ node01 ]
>      Stopped: [ node02 ]
> root@node01:/usr/local/etc/log_zabbix#
> 
> 
> zabbix log:
>
> root@node01:/usr/local/etc# tail -f log_zabbix/zabbix_server.log
>   7360:20161010:011833.254 server #23 started [discoverer #5]
>   7363:20161010:011833.255 server #26 started [history syncer #3]
>   7364:20161010:011833.255 server #27 started [history syncer #4]
>   7365:20161010:011833.257 server #28 started [escalator #1]
>   7366:20161010:011833.257 server #29 started [ipmi poller #1]
>   7367:20161010:011833.258 server #30 started [proxy poller #1]
>   7368:20161010:011833.258 server #31 started [self-monitoring #1]
>   7369:20161010:011833.259 server #32 started [task manager #1]
> zabbix_server [5964]: Is this process already running? Could not lock
> PID file [/usr/local/etc/log_zabbix/zabbix_server.pid]: [11] Resource
> temporarily unavailable
> zabbix_server [6481]: Is this process already running? Could not lock
> PID file [/usr/local/etc/log_zabbix/zabbix_server.pid]: [11] Resource
> temporarily unavailable
> 
> 
> 
> 2016-10-10 1:13 GMT-03:00 Dayvidson Bezerra <dayvidsonbezerra at gmail.com>:
> 
>     I'm getting the error shown below.
> 
>     Can someone help me?
> 
>     root@node01:~# pcs status
>     Cluster name: mycluster
>     WARNING: corosync and pacemaker node names do not match (IPs used in
>     setup?)
>     Last updated: Mon Oct 10 01:11:52 2016
>     Last change: Mon Oct 10 01:04:58 2016 by root via crm_resource on node01
>     Stack: corosync
>     Current DC: node02 (version 1.1.14-70404b0) - partition with quorum
>     2 nodes and 10 resources configured
> 
>     Online: [ node01 node02 ]
> 
>     Full list of resources:
> 
>      Master/Slave Set: ZabbixDTClone [ZabbixDT]
>          Masters: [ node01 node02 ]
>      Clone Set: dlm-clone [dlm]
>          Started: [ node01 node02 ]
>      Clone Set: ClusterIP-clone [ClusterIP] (unique)
>          ClusterIP:0 (ocf::heartbeat:IPaddr2): Started node01
>          ClusterIP:1 (ocf::heartbeat:IPaddr2): Started node02
>      Clone Set: ZabbixFS-clone [ZabbixFS]
>          Started: [ node01 node02 ]
>      Clone Set: p_zabbix-clone [p_zabbix]
>          Started: [ node01 ]
>          Stopped: [ node02 ]
> 
>     Failed Actions:
>     * p_zabbix_start_0 on node02 'unknown error' (1): call=42,
>     status=Timed Out, exitreason='none',
>         last-rc-change='Mon Oct 10 01:11:17 2016', queued=0ms, exec=20005ms
>     * p_zabbix_monitor_10000 on node01 'not running' (7): call=51,
>     status=complete, exitreason='none',
>         last-rc-change='Mon Oct 10 01:11:47 2016', queued=0ms, exec=0ms
> 
> 
>     PCSD Status:
>       node01 (10.10.10.100): Online
>       node02 (10.10.10.200): Online
> 
>     Daemon Status:
>       corosync: active/enabled
>       pacemaker: active/enabled
>       pcsd: active/enabled
>     root@node01:~#
> 
> 
>     2016-10-09 20:50 GMT-03:00 Dayvidson Bezerra <dayvidsonbezerra at gmail.com>:
> 
>         I was able to add the Zabbix service to the cluster with the
>         following command:
> 
>         pcs resource create p_zabbix ocf:heartbeat:zabbixserver params
>         binary="/usr/local/sbin/zabbix_server"
>         pid="/usr/local/etc/log_zabbix/zabbix_server.pid" op monitor
>         interval="10s" timeout="20s" op stop interval="0" timeout="20s"
>         meta target-role="Started"
> 
>         The project is moving along; I will now adjust the application
>         and move on to GFS2 for the active/active cluster.
> 
>         2016-10-09 7:34 GMT-03:00 Dayvidson Bezerra <dayvidsonbezerra at gmail.com>:
> 
>             I added the service with the following line:
> 
>             pcs resource create ZabbixServer lsb:zabbix_server op
>             monitor interval=30s
> 
>             When I look at the status:
> 
>             root@node01:/usr/local/etc# pcs status
>             Cluster name: mycluster
>             WARNING: corosync and pacemaker node names do not match (IPs
>             used in setup?)
>             Last updated: Sun Oct  9 07:33:19 2016
>             Last change: Sun Oct  9 07:27:33 2016 by root via cibadmin on node01
>             Stack: corosync
>             Current DC: node02 (version 1.1.14-70404b0) - partition with
>             quorum
>             2 nodes and 2 resources configured
> 
>             Online: [ node01 node02 ]
> 
>             Full list of resources:
> 
>              ClusterIP (ocf::heartbeat:IPaddr2): Started node01
>              ZabbixServer (lsb:zabbix_server): Started node01
>
>             Failed Actions:
>             * ZabbixServer_start_0 on node02 'unknown error' (1):
>             call=10510, status=Error, exitreason='none',
>                 last-rc-change='Sun Oct  9 07:30:21 2016', queued=0ms,
>             exec=57ms
>             * ZabbixServer_monitor_30000 on node01 'not running' (7):
>             call=9573, status=complete, exitreason='none',
>                 last-rc-change='Sun Oct  9 07:33:19 2016', queued=0ms,
>             exec=14ms
> 
> 
>             PCSD Status:
>               node01 (10.10.10.100): Online
>               node02 (10.10.10.200): Online
> 
>             Daemon Status:
>               corosync: active/enabled
>               pacemaker: active/enabled
>               pcsd: active/enabled
>             root@node01:/usr/local/etc#
> 
> 
>             2016-10-09 5:29 GMT-03:00 Dayvidson Bezerra <dayvidsonbezerra at gmail.com>:
> 
>                 I am following the ClusterLabs documentation, which is
>                 written for Red Hat, and adapting it to Ubuntu.
> 
>                 node01= 10.10.10.100 (zabbix core)
>                 node02= 10.10.10.200 (zabbix core)
>                 VIP= 10.10.10.250
>                 zabbixweb= 10.10.10.2
>                 zabbixbd= 10.10.10.1
> 
>                 I am having a problem with the connection between the
>                 zabbix cores (node01 and node02) and zabbixbd: duplicate
>                 data is arriving at the database server because both
>                 cores send the same information to the database.
>
>                 Since the cluster is active/active, I want the
>                 information the nodes produce to be unique, rather than
>                 each node generating the same information and trying to
>                 add it to the database, which is causing the errors below.
> 
>                 2016-10-08 12:40:19 BRT [23651-1] zabbix@zabbix ERROR:
>                  duplicate key value violates unique constraint
>                 "events_pkey"
>                 2016-10-08 12:40:19 BRT [23651-2] zabbix@zabbix DETAIL:
>                  Key (eventid)=(83) already exists.
> 
>                 And the zabbix server running on both nodes (01 and 02)
>                 is giving me the errors below.
> 
>                 root@node02:~# tail -f /var/log/zabbix/zabbix_server.log
>                 ]
>                   1276:20161008:123901.007 [Z3005] query failed: [0]
>                 PGRES_FATAL_ERROR:ERROR:  duplicate key value violates
>                 unique constraint "events_pkey"
>                 DETAIL:  Key (eventid)=(80) already exists.
>                  [insert into events
>                 (eventid,source,object,objectid,clock,ns,value) values
>                 (80,0,0,13491,1475941141,832194,1),(81,3,0,13491,1475941141,832194,0),(82,3,0,13574,1475941141,836284,0);
>                 ]
>                   1267:20161008:124019.389 enabling Zabbix agent checks
>                 on host "Zabbix server": host became available
>                   1285:20161008:124019.764 [Z3005] query failed: [0]
>                 PGRES_FATAL_ERROR:ERROR:  duplicate key value violates
>                 unique constraint "events_pkey"
>                 DETAIL:  Key (eventid)=(83) already exists.
>                  [insert into events
>                 (eventid,source,object,objectid,clock,ns,value) values
>                 (83,0,0,13491,1475941219,390629436,0);
>                 ]
> 
> 
>                 2016-10-08 22:26 GMT-03:00 Digimer <lists at alteeve.ca>:
> 
>                     Can you please share the requested information?
> 
>                     digimer
> 
>                     On 08/10/16 07:51 PM, Dayvidson Bezerra wrote:
>                     > My project is to build an active/active zabbix cluster.
>                     >
>                     > I'm having trouble declaring the zabbix service in the cluster.
>                     >
>                     > 2016-10-08 13:04 GMT-03:00 Digimer <lists at alteeve.ca>:
>                     >
>                     >     Can you share your current full configuration please?
>                     >
>                     >     If you're hitting errors, please also share the relevant log entries
>                     >     from the nodes.
>                     >
>                     >     digimer
>                     >
>                     >     On 07/10/16 09:06 PM, Dayvidson Bezerra wrote:
>                     >     > The company only uses Ubuntu and does not want another distro in its
>                     >     > environment.
>                     >     >
>                     >     > I'm struggling to solve this. I've built an active/passive cluster with
>                     >     > DRBD, but not active/active.
>                     >     >
>                     >     > What I have read covers using pacemaker + corosync together with GFS2.
>                     >     >
>                     >     > Has anyone already succeeded in building active/active on Ubuntu Linux?
>                     >     >
>                     >     >
>                     >     > 2016-10-07 20:50 GMT-03:00 Digimer <lists at alteeve.ca>:
>                     >     >
>                     >     >     On 07/10/16 07:46 PM, Dayvidson Bezerra wrote:
>                     >     >     > Hello.
>                     >     >     > I want to set up an active/active cluster on Ubuntu 16.04 with
>                     >     >     > pacemaker and corosync, but following the ClusterLabs documentation
>                     >     >     > I'm not getting it to work.
>                     >     >     > Does anyone have documentation that might help?
>                     >     >
>                     >     >     Last I checked (and it's not been recently), ubuntu's support for HA is
>                     >     >     still lacking. It's recommended for people new to HA to use either RHEL
>                     >     >     (CentOS) or SUSE. Red Hat and SUSE both have paid staff who make sure
>                     >     >     that HA works well.
>                     >     >
>                     >     >     If you want to use Ubuntu, after you get a working config in either EL
>                     >     >     or SUSE, then you can port. That way, if you run into issues, you will
>                     >     >     know your config is good and that you're dealing with an OS issue. Keeps
>                     >     >     the fewest variables in play at a time.
>                     >     >
>                     >     >     Also, I don't know of any good docs for HA on ubuntu, for the same
>                     >     >     reason.
>                     >     >
>                     >     >     --
>                     >     >     Digimer
>                     >     >     Papers and Projects: https://alteeve.ca/w/
>                     >     >     What if the cure for cancer is trapped in the mind of a person without
>                     >     >     access to education?
>                     >     >
>                     >     >     _______________________________________________
>                     >     >     Users mailing list: Users at clusterlabs.org
>                     >     >     http://clusterlabs.org/mailman/listinfo/users
>                     >     >
>                     >     >     Project Home: http://www.clusterlabs.org
>                     >     >     Getting started:
>                     >     >     http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>                     >     >     Bugs: http://bugs.clusterlabs.org
>                     >     >
>                     >     >
>                     >     >
>                     >     >
>                     >     > --
>                     >     > Dayvidson Bezerra
>                     >     > Postgraduate in Network Management - FIR-PE
>                     >     > Graduate in Computer Networks - FMN
>                     >     > F: +55 81 9877-5127
>                     >     > Skype: dayvidson.bezerra
>                     >     > Lattes: http://lattes.cnpq.br/3299061783823913/
>                     >     > Linked In: http://br.linkedin.com/pub/dayvidson-bezerra/2a/772/bb7/