[Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric Server

Thu Nov 24 16:59:09 EST 2011

Hello,

On 11/24/2011 10:12 PM, Vadim Bulst wrote:
> Hi Andreas,
> 
> I changed my cib:
> 
> node bbzclnode04
> node bbzclnode06
> node bbzclnode07
> primitive clvm ocf:lvm2:clvmd \
>         params daemon_timeout="30"
> primitive dlm ocf:pacemaker:controld
> group g_lock dlm clvm
> clone g_lock-clone g_lock \
>         meta interleave="true"
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="3" \
>         no-quorum-policy="ignore" \
>         stonith-enabled="false" \
>         last-lrm-refresh="1322049979"
> 
> but no luck at all.

I assume you at least ran a cleanup on clvm and it still does not work
... the next step would be to grep for ERROR in your cluster log and look
for other suspicious messages to find out why clvm is not that motivated
to start.
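
A minimal sketch of that log check, assuming syslog is where the cluster logs end up (the corosync.conf quoted below has to_syslog: yes). The function name scan_cluster_log and the choice of search patterns are illustrative, not anything Pacemaker ships:

```shell
# Print cluster log lines that are errors or that mention the
# lock-manager daemons (clvmd, dlm).
# Usage: scan_cluster_log /path/to/logfile
scan_cluster_log() {
    logfile="$1"
    # -E enables extended regular expressions so the alternation works
    grep -E 'ERROR|clvmd|dlm' "$logfile"
}
```

On Ubuntu this would typically be called as scan_cluster_log /var/log/syslog (the path is an assumption; it depends on the syslog setup). After fixing whatever the log points to, "crm resource cleanup clvm" clears the failed start actions so the cluster retries.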

> 
> "And use Corosync 1.4.x with redundant rings and automatic ring recovery
> feature enabled."
> 
> I have two interfaces per server; they are bonded together and bridged
> for virtualization. Only one VLAN is untagged. I tried to give a tagged
> VLAN bridge an address, but that didn't work. My network configuration
> looks like this:

One or two extra NICs are quite affordable today, e.g. to build a direct
connection between the nodes (if possible).
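
With a second network in place, a redundant-ring totem section for Corosync 1.4.x could look roughly like the sketch below. Only the ring-related keys are shown; the other totem settings would stay as they are. The second ring's network (192.168.129.0), multicast address, and port are assumptions for illustration and need to match the real setup:

```
totem {
    version: 2
    secauth: on
    # passive: traffic uses one ring at a time and fails over to the other
    rrp_mode: passive
    interface {
        # existing ring on the management network
        ringnumber: 0
        bindnetaddr: 192.168.128.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
    interface {
        # hypothetical second ring on a separate network
        ringnumber: 1
        bindnetaddr: 192.168.129.0
        mcastaddr: 226.94.1.2
        mcastport: 5407
    }
}
```

The ports are kept two apart because Corosync also uses the port directly below mcastport. Running corosync-cfgtool -s on a node shows the status of each ring.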

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> 
> auto bond0
> iface bond0 inet manual
>         post-up ifconfig $IFACE up
>         pre-down ifconfig $IFACE down
>         bond-slaves none
>         bond-mode 802.3ad
>         bond-miimon 100
>         bond-downdelay 200
>         bond-updelay 100
> 
> auto eth0
> 
> allow-bond0 eth0
> 
> iface eth0 inet manual
>     bond-master bond0
> 
> auto eth1
> allow-bond0 eth1
> 
> iface eth1 inet manual
>     bond-master bond0
> 
> 
> auto bond0.223
> iface bond0.223 inet manual
>         post-up ifconfig $IFACE up
>         pre-down ifconfig $IFACE down
>         vlan-raw-device bond0
> 
> auto bond0.982
> iface bond0.982 inet manual
>         post-up ifconfig $IFACE up
>         pre-down ifconfig $IFACE down
>         vlan-raw-device bond0
> 
> auto br0
> iface br0 inet static
>     # Static assign the IP, netmask, default gateway.
>     address 192.168.128.53
>     netmask 255.255.255.0
>     gateway 192.168.128.254
>     dns-nameservers 192.168.129.4
>     dns-search bbz.uni-leipzig.de
>     # Bind one or more interfaces to the bridge.
>     bridge_ports bond0
>     # Tune the bridge for a single interface.
>     bridge_stp off
>     bridge_fd 0
>     bridge_maxwait 0
> 
> auto br0-223
> iface br0-223 inet manual
>     # Force the interface to up/down automatically without an IP.
>     post-up ifconfig $IFACE up
>     pre-down ifconfig $IFACE down
>     # Bind one or more interfaces to the bridge.
>     bridge_ports bond0.223
>     # Tune the bridge for a single interface.
>     bridge_stp off
>     bridge_fd 0
>     bridge_maxwait 0
> 
> auto br0-982
> iface br0-982 inet manual
> #iface br0-982 inet static
>     #address 192.168.129.23
>     #netmask 255.255.255.0
>     # Force the interface to up/down automatically without an IP.
>     post-up ifconfig $IFACE up
>     pre-down ifconfig $IFACE down
>     # Bind one or more interfaces to the bridge.
>     bridge_ports bond0.982
>     # Tune the bridge for a single interface.
>     bridge_stp off
>     bridge_fd 0
>     bridge_maxwait 0
> 
> 
> 
> So I guess there is no way to get it to work.
> 
> 
> ----- Original Message -----
> From: "Andreas Kurz" <andreas at hastexo.com>
> To: pacemaker at oss.clusterlabs.org
> Sent: Thursday, 24 November 2011 15:23:30
> Subject: Re: [Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric
> Server
> 
> Hello,
> 
> On 11/23/2011 03:30 PM, Vadim Bulst wrote:
>> Hi list,
>>
>> I'm trying to bring up a 3-node cluster running Ubuntu Oneiric. The
>> packages I used are all from the Ubuntu repo: pacemaker 1.1.5,
>> corosync 1.3.0, clvm 2.02.66.
>>
>> I'm using teamed and bridged interfaces for networking. Every node has
>> only one address for cluster management.
>> All nodes are connected to an FC SAN and see the same volumes.
>> iptables --list shows no rules.
>>
>> My problem:
>>
>> The clvm resource does not come up, and when I commit changes to
>> resources, in most cases one node dies.
> 
> There are some constraints missing in your config ... or better, use a
> cloned group ... see below ...
> 
> And use Corosync 1.4.x with redundant rings and automatic ring recovery
> feature enabled.
> 
>>
>> crm_mon:
>>
>> ============
>> Last updated: Wed Nov 23 14:47:58 2011
>> Stack: openais
>> Current DC: bbzclnode07 - partition with quorum
>> Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
>> 3 Nodes configured, 3 expected votes
>> 2 Resources configured.
>> ============
>>
>> Online: [ bbzclnode06 bbzclnode04 bbzclnode07 ]
>>
>>  Clone Set: dlm-clone [dlm]
>>      Started: [ bbzclnode06 bbzclnode07 bbzclnode04 ]
>>
>> Failed actions:
>>     clvm:0_start_0 (node=bbzclnode06, call=31, rc=1, status=complete):
>> unknown error
>>     clvm:1_start_0 (node=bbzclnode07, call=5, rc=1, status=complete):
>> unknown error
>>     clvm:0_start_0 (node=bbzclnode04, call=29, rc=1, status=complete):
>> unknown error
>>
>>
>>
>>
>> I configured corosync like this:
>>
>> totem {
>>     version: 2
>>     token: 3000
>>     token_retransmits_before_loss_const: 10
>>     join: 60
>>     consensus: 3600
>>     vsftype: none
>>     max_messages: 20
>>     clear_node_high_bit: yes
>>      secauth: on
>>      threads: 8
>>      rrp_mode: none
>>      interface {
>>         ringnumber: 0
>>         bindnetaddr: 192.168.128.0
>>         mcastaddr: 226.94.1.1
>>         mcastport: 5405
>>     }
>> }
>> amf {
>>     mode: disabled
>> }
>> service {
>>      ver:       0
>>      name:      pacemaker
>> }
>> aisexec {
>>         user:   root
>>         group:  root
>> }
>> logging {
>>         fileline: off
>>         to_stderr: yes
>>         to_logfile: no
>>         to_syslog: yes
>>     syslog_facility: daemon
>>         debug: off
>>         timestamp: on
>>         logger_subsys {
>>                 subsys: AMF
>>                 debug: off
>>                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>>         }
>> }
>>
>> and my cib looks like this:
>>
>> node bbzclnode04
>> node bbzclnode06
>> node bbzclnode07
>> primitive clvm ocf:lvm2:clvmd \
>>     params daemon_timeout="30" \
>>     meta target-role="started"
>> primitive dlm ocf:pacemaker:controld \
>>     meta target-role="started"
>> clone clvm-clone clvm \
>>     meta clone-max="3" clone-node-max="1"
>> clone dlm-clone dlm \
>>     meta clone-max="3" clone-node-max="1"
> 
> Omit those two clones and use a cloned group instead:
> 
> group g_lock dlm clvm
> clone g_lock-clone g_lock \
>       meta interleave="true"
> 
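
For comparison, the other option mentioned earlier, keeping the two separate clones and adding the missing constraints, might look like the following crm configure sketch. The constraint IDs (clvm-with-dlm, dlm-before-clvm) are made-up names; the clone names are the ones from the quoted cib:

```
# clvmd may only run on a node where the DLM control daemon runs
colocation clvm-with-dlm inf: clvm-clone dlm-clone
# and it must be started after the DLM on that node
order dlm-before-clvm inf: dlm-clone clvm-clone
```

Both clones would also want meta interleave="true" so that each clvm instance only waits for the dlm instance on its own node. The cloned group expresses the same ordering and colocation in fewer lines, which is why it is the simpler choice here.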
>> property $id="cib-bootstrap-options" \
>>     dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>>     cluster-infrastructure="openais" \
>>     expected-quorum-votes="3" \
>>     no-quorum-policy="ignore" \
>>     stonith-enabled="false" \
>>     last-lrm-refresh="1322049979"
> 
> Don't forget to set up stonith in a production system when using shared
> storage.
> 
> Regards,
> Andreas
> 
