[Pacemaker] CLVM & Pacemaker & Corosync on Ubuntu Oneiric Server
Vladislav Bogdanov
bubble at hoster-ok.com
Wed Nov 30 11:22:23 UTC 2011
30.11.2011 14:08, Vadim Bulst wrote:
> Hello,
>
> first of all I'd like to ask you a general question:
>
> Has somebody successfully set up a clvm cluster with pacemaker and run
> it in production mode?
I will say yes after I finally resolve the remaining dlm & fencing issues.
>
> Now back to the concrete problem:
>
> I configured two interfaces for corosync:
>
> root at bbzclnode04:~# corosync-cfgtool -s
> Printing ring status.
> Local node ID 897624256
> RING ID 0
> id = 192.168.128.53
> status = ring 0 active with no faults
> RING ID 1
> id = 192.168.129.23
> status = ring 1 active with no faults
>
> RRP set to passive
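
(For anyone following along: a two-ring passive RRP setup in corosync.conf
looks roughly like the sketch below; the multicast addresses and ports are
placeholders, only the bindnetaddr values correspond to the networks shown
above.)

totem {
        version: 2
        rrp_mode: passive
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.128.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
        interface {
                ringnumber: 1
                bindnetaddr: 192.168.129.0
                mcastaddr: 226.94.1.2
                mcastport: 5407
        }
}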
>
> I also made some changes to my cib:
>
> node bbzclnode04
> node bbzclnode06
> node bbzclnode07
> primitive clvm ocf:lvm2:clvmd \
> params daemon_timeout="30" \
> meta target-role="Started"
Please instruct clvmd to use the corosync stack instead of openais (-I
corosync); otherwise it uses the LCK service, which is not mature and
with which I have observed major problems.
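
For reference, a minimal sketch of how that could look in the resource
definition, assuming your copy of ocf:lvm2:clvmd exposes a parameter for
extra daemon arguments (the daemon_options name below is an assumption -
check "crm ra info ocf:lvm2:clvmd" for the real parameter list):

primitive clvm ocf:lvm2:clvmd \
        params daemon_timeout="30" daemon_options="-I corosync" \
        meta target-role="Started"

If the agent has no such parameter, the -I corosync switch can also be
added to the clvmd startup options of your distribution, or the agent
can be patched to pass it.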
> primitive dlm ocf:pacemaker:controld \
> meta target-role="Started"
> group dlm-clvm dlm clvm
> clone dlm-clvm-clone dlm-clvm \
> meta interleave="true" ordered="true"
> property $id="cib-bootstrap-options" \
> dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="3" \
> no-quorum-policy="ignore" \
> stonith-enabled="false" \
> last-lrm-refresh="1322643084"
>
> I cleaned and restarted the resources - nothing! :
>
> crm(live)resource# cleanup dlm-clvm-clone
> Cleaning up dlm:0 on bbzclnode04
> Cleaning up dlm:0 on bbzclnode06
> Cleaning up dlm:0 on bbzclnode07
> Cleaning up clvm:0 on bbzclnode04
> Cleaning up clvm:0 on bbzclnode06
> Cleaning up clvm:0 on bbzclnode07
> Cleaning up dlm:1 on bbzclnode04
> Cleaning up dlm:1 on bbzclnode06
> Cleaning up dlm:1 on bbzclnode07
> Cleaning up clvm:1 on bbzclnode04
> Cleaning up clvm:1 on bbzclnode06
> Cleaning up clvm:1 on bbzclnode07
> Cleaning up dlm:2 on bbzclnode04
> Cleaning up dlm:2 on bbzclnode06
> Cleaning up dlm:2 on bbzclnode07
> Cleaning up clvm:2 on bbzclnode04
> Cleaning up clvm:2 on bbzclnode06
> Cleaning up clvm:2 on bbzclnode07
> Waiting for 19 replies from the CRMd................... OK
>
> crm_mon:
>
> ============
> Last updated: Wed Nov 30 10:15:09 2011
> Stack: openais
> Current DC: bbzclnode04 - partition with quorum
> Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 3 Nodes configured, 3 expected votes
> 1 Resources configured.
> ============
>
> Online: [ bbzclnode04 bbzclnode06 bbzclnode07 ]
>
>
> Failed actions:
> clvm:1_start_0 (node=bbzclnode06, call=11, rc=1, status=complete):
> unknown error
> clvm:0_start_0 (node=bbzclnode04, call=11, rc=1, status=complete):
> unknown error
> clvm:2_start_0 (node=bbzclnode07, call=11, rc=1, status=complete):
> unknown error
>
>
> When I look in the log, there is a message which tells me that maybe
> another clvmd process is already running - but that is not the case.
>
> "clvmd could not create local socket Another clvmd is probably already
> running"
>
> Or is it a permission problem - writing to the filesystem? Is there a
> way to get rid of it?
You can try to run it manually under strace. It will show you what happens.
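
Something along these lines (just a sketch; the local socket path differs
between builds, so check both common locations):

# is another clvmd really running?
pgrep -fl clvmd
# stale socket file left over from a previous run?
ls -l /var/run/clvmd.sock /var/run/lvm/clvmd.sock 2>/dev/null
# trace clvmd and its children, then look at the failing system call
strace -f -o /tmp/clvmd.trace clvmd -I corosync

If pgrep shows nothing but a stale socket file is there, removing it
before retrying may already tell you more.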
>
> Shall I use a different distro - or install from source?
>
>
> Am 24.11.2011 22:59, schrieb Andreas Kurz:
>> Hello,
>>
>> On 11/24/2011 10:12 PM, Vadim Bulst wrote:
>>> Hi Andreas,
>>>
>>> I changed my cib:
>>>
>>> node bbzclnode04
>>> node bbzclnode06
>>> node bbzclnode07
>>> primitive clvm ocf:lvm2:clvmd \
>>> params daemon_timeout="30"
>>> primitive dlm ocf:pacemaker:controld
>>> group g_lock dlm clvm
>>> clone g_lock-clone g_lock \
>>> meta interleave="true"
>>> property $id="cib-bootstrap-options" \
>>> dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>>> cluster-infrastructure="openais" \
>>> expected-quorum-votes="3" \
>>> no-quorum-policy="ignore" \
>>> stonith-enabled="false" \
>>> last-lrm-refresh="1322049979"
>>>
>>> but no luck at all.
>> I assume you did at least a cleanup on clvm and it still does not work
>> ... next step would be to grep for ERROR in your cluster log and look
>> for other suspicious messages to find out why clvm is not that motivated
>> to start.
>>
>>> "And use Corosync 1.4.x with redundant rings and automatic ring recovery
>>> feature enabled."
>>>
>>> I have two interfaces per server - they are bonded together and bridged
>>> for virtualization. Only one untagged VLAN. I tried to give a tagged
>>> VLAN bridge an address, but it didn't work. My network conf looks like this:
>> One or two extra NICs are quite affordable today, to build e.g. a direct
>> connection between the nodes (if possible).
>>
>> Regards,
>> Andreas
>>
>
>
> --
> Best regards
>
> Vadim Bulst
> System Administrator, BBZ
>
> Biotechnologisch-Biomedizinisches Zentrum
> Universität Leipzig
> Deutscher Platz 5, 04103 Leipzig
> Tel.: 0341 97 - 31 307
> Fax : 0341 97 - 31 309
>
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org