[ClusterLabs] Antw: Re: Antw: Re: single node fails to start the ocfs2 resource

Tue Mar 13 15:34:04 UTC 2018

>>> Klaus Wenninger <kwenning at redhat.com> wrote on 13.03.2018 at 16:18 in
message <f83ddccc-5ef0-5d97-5121-e3f317863edb at redhat.com>:
> On 03/13/2018 03:43 PM, Muhammad Sharfuddin wrote:
>> Thanks a lot for the explanation. But other than the ocfs2 resource
>> group, this cluster starts all other resources on a single node
>> without any issue, just because of the "no-quorum-policy=ignore"
>> option.
> 
> Yes I know. And what I tried to point out is that "no-quorum-policy=ignore"
> is dangerous for services that do require a resource-manager. If you don't
> have any of those go with a systemd startup.

But will a two-node cluster ever have a quorum if one node has failed?
(Apart from that, I still think Muhammad did something wrong.)
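
To make the quorum arithmetic concrete, here is a minimal Python sketch of the
rules Klaus describes further down (classical majority, the 2-node special
case, and wait-for-all). It is a toy model under those stated rules, not
corosync's implementation; the names QuorumModel, update and restart are made
up for illustration.

```python
from dataclasses import dataclass


@dataclass
class QuorumModel:
    """Toy model of the quorum rules discussed in this thread (not corosync code)."""

    expected_votes: int
    two_node: bool = False       # assumed: the 2-node special case is enabled
    wait_for_all: bool = False   # assumed: all nodes must be seen once first
    seen_all: bool = False       # set once every expected vote was visible

    def update(self, online_votes: int) -> bool:
        """Record the votes currently visible and return whether we are quorate."""
        if online_votes >= self.expected_votes:
            self.seen_all = True
        if self.wait_for_all and not self.seen_all:
            # Never saw the full membership since start: stay inquorate.
            return False
        if self.two_node:
            # 2-node special case: one surviving vote is enough.
            return online_votes >= 1
        # Classical rule: strictly more than half of the expected votes.
        return online_votes > self.expected_votes / 2


def restart(model: QuorumModel) -> QuorumModel:
    """Model a restart of the membership layer: the 'seen all' memory is lost."""
    return QuorumModel(model.expected_votes, model.two_node, model.wait_for_all)
```

In this model, QuorumModel(2).update(1) is not quorate, because one vote is
not more than half of two. With two_node and wait_for_all both set, a single
node stays quorate after it has seen its peer once, but restart(...) followed
by update(1) is inquorate again, which matches the "wait for quorum" messages
in the logs. QuorumModel(3).update(2) is quorate, which is why a third vote
(a full node or, as suggested below, a qdevice) avoids the problem.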

> 
> Regards,
> Klaus
> 
>>
>> -- 
>> Regards,
>> Muhammad Sharfuddin
>>
>> On 3/13/2018 7:32 PM, Klaus Wenninger wrote:
>>> On 03/13/2018 02:30 PM, Muhammad Sharfuddin wrote:
>>>> Yes, by saying pacemaker, I meant to say corosync as well.
>>>>
>>>> Is there any fix? Or can a two-node cluster not run ocfs2 resources
>>>> when one node is offline?
>>> Actually there can't be a "fix" as 2 nodes are just not enough
>>> for a partial-cluster to be quorate in the classical sense
>>> (more votes than half of the cluster nodes).
>>>
>>> So to still be able to use it we have this 2-node config that
>>> permanently sets quorum. But not to run into issues on
>>> startup we need it to require both nodes seeing each
>>> other once.
>>>
>>> So this is definitely nothing that is specific to ocfs2.
>>> It just looks specific to ocfs2 because you've disabled
>>> quorum for pacemaker.
>>> To be honest, doing this you wouldn't need a resource-manager
>>> at all and could just start up your services using systemd.
>>>
>>> If you don't want a full 3rd node, and still want to handle cases
>>> where one node doesn't come up after a full shutdown of
>>> all nodes, you probably could go for a setup with qdevice.
>>>
>>> Regards,
>>> Klaus
>>>
>>>> -- 
>>>> Regards,
>>>> Muhammad Sharfuddin
>>>>
>>>> On 3/13/2018 6:16 PM, Klaus Wenninger wrote:
>>>>> On 03/13/2018 02:03 PM, Muhammad Sharfuddin wrote:
>>>>>> Hi,
>>>>>>
>>>>>> 1 - if I put a node (node2) offline, ocfs2 resources keep running on
>>>>>> the online node (node1).
>>>>>>
>>>>>> 2 - while node2 was offline, I stopped/started the ocfs2 resource
>>>>>> group via the cluster successfully many times in a row.
>>>>>>
>>>>>> 3 - while node2 was offline, I restarted the pacemaker service on
>>>>>> node1 and then tried to start the ocfs2 resource group; dlm started
>>>>>> but the ocfs2 file system resource did not start.
>>>>>>
>>>>>> Nutshell:
>>>>>>
>>>>>> a - both nodes must be online to start the ocfs2 resource.
>>>>>>
>>>>>> b - if one node crashes or goes offline (gracefully), the ocfs2
>>>>>> resource keeps running on the other/surviving node.
>>>>>>
>>>>>> c - while one node is offline, we can stop/start the ocfs2 resource
>>>>>> group on the surviving node, but if we stop the pacemaker service,
>>>>>> then the ocfs2 file system resource does not start, with the
>>>>>> following info in the logs:
>>>>> From the logs I would say startup of dlm_controld times out because it
>>>>> is waiting for quorum - which doesn't happen because of wait-for-all.
>>>>> Question is if you really just stopped pacemaker or if you stopped
>>>>> corosync as well.
>>>>> In the latter case I would say it is the expected behavior.
>>>>>
>>>>> Regards,
>>>>> Klaus
>>>>>  
>>>>>> lrmd[4317]:   notice: executing - rsc:p-fssapmnt action:start call_id:53
>>>>>> Filesystem(p-fssapmnt)[5139]: INFO: Running start for /dev/mapper/sapmnt on /sapmnt
>>>>>> kernel: [  706.162676] dlm: Using TCP for communications
>>>>>> kernel: [  706.162916] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group...
>>>>>> dlm_controld[5105]: 759 fence work wait for quorum
>>>>>> dlm_controld[5105]: 764 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>>>>> lrmd[4317]:  warning: p-fssapmnt_start_0 process (PID 5139) timed out
>>>>>> lrmd[4317]:  warning: p-fssapmnt_start_0:5139 - timed out after 60000ms
>>>>>> lrmd[4317]:   notice: finished - rsc:p-fssapmnt action:start call_id:53 pid:5139 exit-code:1 exec-time:60002ms queue-time:0ms
>>>>>> kernel: [  766.056514] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group event done -512 0
>>>>>> kernel: [  766.056528] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join failed -512 0
>>>>>> crmd[4320]:   notice: Result of stop operation for p-fssapmnt on pipci001: 0 (ok)
>>>>>> crmd[4320]:   notice: Initiating stop operation dlm_stop_0 locally on pipci001
>>>>>> lrmd[4317]:   notice: executing - rsc:dlm action:stop call_id:56
>>>>>> dlm_controld[5105]: 766 shutdown ignored, active lockspaces
>>>>>> lrmd[4317]:  warning: dlm_stop_0 process (PID 5326) timed out
>>>>>> lrmd[4317]:  warning: dlm_stop_0:5326 - timed out after 100000ms
>>>>>> lrmd[4317]:   notice: finished - rsc:dlm action:stop call_id:56 pid:5326 exit-code:1 exec-time:100003ms queue-time:0ms
>>>>>> crmd[4320]:    error: Result of stop operation for dlm on pipci001: Timed Out
>>>>>> crmd[4320]:  warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>>>> crmd[4320]:   notice: Transition aborted by operation dlm_stop_0 'modify' on pipci001: Event failed
>>>>>> crmd[4320]:  warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>>>> pengine[4319]:   notice: Watchdog will be used via SBD if fencing is required
>>>>>> pengine[4319]:   notice: On loss of CCM Quorum: Ignore
>>>>>> pengine[4319]:  warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
>>>>>> pengine[4319]:  warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
>>>>>> pengine[4319]:  warning: Cluster node pipci001 will be fenced: dlm:0 failed there
>>>>>> pengine[4319]:  warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
>>>>>> pengine[4319]:   notice: Stop of failed resource dlm:0 is implicit after pipci001 is fenced
>>>>>> pengine[4319]:   notice:  * Fence pipci001
>>>>>> pengine[4319]:   notice: Stop    sbd-stonith#011(pipci001)
>>>>>> pengine[4319]:   notice: Stop    dlm:0#011(pipci001)
>>>>>> crmd[4320]:   notice: Requesting fencing (reboot) of node pipci001
>>>>>> stonith-ng[4316]:   notice: Client crmd.4320.4c2f757b wants to fence (reboot) 'pipci001' with device '(any)'
>>>>>> stonith-ng[4316]:   notice: Requesting peer fencing (reboot) of pipci001
>>>>>> stonith-ng[4316]:   notice: sbd-stonith can fence (reboot) pipci001: dynamic-list
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Regards,
>>>>>> Muhammad Sharfuddin | +923332144823 | nds.com.pk
>>>>>>
>>>>>> On 3/13/2018 1:04 PM, Ulrich Windl wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> I'd recommend this:
>>>>>>> Cleanly boot your nodes, avoiding any manual operation with cluster
>>>>>>> resources. Keep the logs.
>>>>>>> Then start your tests, keeping the logs for each.
>>>>>>> Try to fix issues by reading the logs and adjusting the cluster
>>>>>>> configuration, and not by starting commands that the cluster should
>>>>>>> start.
>>>>>>>
>>>>>>> We had a 2-node OCFS2 cluster running for quite some time with
>>>>>>> SLES11, but now the cluster is three nodes. To me the output of
>>>>>>> "crm_mon -1Arfj" combined with having set record-pending=true was
>>>>>>> very valuable for finding problems.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ulrich
>>>>>>>
>>>>>>>
>>>>>>>>>> Muhammad Sharfuddin <M.Sharfuddin at nds.com.pk> wrote on
>>>>>>>>>> 13.03.2018 at 08:43 in
>>>>>>> message <7b773ae9-4209-d246-b5c0-2c8b67e623b3 at nds.com.pk>:
>>>>>>>> Dear Klaus,
>>>>>>>>
>>>>>>>> If I understand you properly, then it's a fencing issue, and
>>>>>>>> whatever I am facing is "natural" or "by design" in a two-node
>>>>>>>> cluster where quorum is incomplete.
>>>>>>>>
>>>>>>>> I am quite convinced that you have pointed this out correctly
>>>>>>>> because, when I start the dlm resource via the cluster and then try
>>>>>>>> to start the ocfs2 file system manually from the command line, the
>>>>>>>> mount command hangs and the following events are reported in the
>>>>>>>> logs:
>>>>>>>>
>>>>>>>>         kernel: [62622.864828] ocfs2: Registered cluster interface user
>>>>>>>>         kernel: [62622.884427] dlm: Using TCP for communications
>>>>>>>>         kernel: [62622.884750] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group...
>>>>>>>>         dlm_controld[17655]: 62627 fence work wait for quorum
>>>>>>>>         dlm_controld[17655]: 62680 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>>>>>>>
>>>>>>>> and then the following messages keep being reported every 5-10
>>>>>>>> minutes, until I kill the mount.ocfs2 process:
>>>>>>>>
>>>>>>>>         dlm_controld[17655]: 62627 fence work wait for quorum
>>>>>>>>         dlm_controld[17655]: 62680 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>>>>>>>
>>>>>>>> I am also very much confused, because yesterday I did the same and
>>>>>>>> was able to mount the ocfs2 file system manually from the command
>>>>>>>> line (at least once), then unmount the file system manually and stop
>>>>>>>> the dlm resource from the cluster; after that the complete ocfs2
>>>>>>>> resource stack (dlm, file systems) started/stopped successfully via
>>>>>>>> the cluster even when only one machine was online.
>>>>>>>>
>>>>>>>> In a two-node cluster that has ocfs2 resources, can't we run the
>>>>>>>> ocfs2 resources when quorum is incomplete (one node is offline)?
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Regards,
>>>>>>>> Muhammad Sharfuddin
>>>>>>>>
>>>>>>>> On 3/12/2018 5:58 PM, Klaus Wenninger wrote:
>>>>>>>>> On 03/12/2018 01:44 PM, Muhammad Sharfuddin wrote:
>>>>>>>>>> Hi Klaus,
>>>>>>>>>>
>>>>>>>>>> primitive sbd-stonith stonith:external/sbd \
>>>>>>>>>>             op monitor interval=3000 timeout=20 \
>>>>>>>>>>             op start interval=0 timeout=240 \
>>>>>>>>>>             op stop interval=0 timeout=100 \
>>>>>>>>>>             params sbd_device="/dev/mapper/sbd" \
>>>>>>>>>>             meta target-role=Started
>>>>>>>>> Makes more sense now.
>>>>>>>>> Using pcmk_delay_max would probably be useful here
>>>>>>>>> to prevent a fence-race.
>>>>>>>>> That stonith-resource was not in your resource-list below ...
>>>>>>>>>
>>>>>>>>>> property cib-bootstrap-options: \
>>>>>>>>>>             have-watchdog=true \
>>>>>>>>>>             stonith-enabled=true \
>>>>>>>>>>             no-quorum-policy=ignore \
>>>>>>>>>>             stonith-timeout=90 \
>>>>>>>>>>             startup-fencing=true
>>>>>>>>> You've set no-quorum-policy=ignore for pacemaker.
>>>>>>>>> Whether this is a good idea or not in your setup is
>>>>>>>>> written on another page.
>>>>>>>>> But isn't dlm directly interfacing with corosync so
>>>>>>>>> that it would get the quorum state from there?
>>>>>>>>> As you probably have 2-node set on a 2-node cluster,
>>>>>>>>> this would - after both nodes were down - wait for all
>>>>>>>>> nodes to be up first.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Klaus
>>>>>>>>>
>>>>>>>>>> # ps -eaf |grep sbd
>>>>>>>>>> root      6129     1  0 17:35 ?        00:00:00 sbd: inquisitor
>>>>>>>>>> root      6133  6129  0 17:35 ?        00:00:00 sbd: watcher: /dev/mapper/sbd - slot: 1 - uuid: 6e80a337-95db-4608-bd62-d59517f39103
>>>>>>>>>> root      6134  6129  0 17:35 ?        00:00:00 sbd: watcher: Pacemaker
>>>>>>>>>> root      6135  6129  0 17:35 ?        00:00:00 sbd: watcher: Cluster
>>>>>>>>>>
>>>>>>>>>> This cluster does not start ocfs2 resources when I first
>>>>>>>>>> intentionally crash (reboot) both nodes and then try to start
>>>>>>>>>> the ocfs2 resource while one node is offline.
>>>>>>>>>>
>>>>>>>>>> To fix the issue, I have one permanent solution: bring the other
>>>>>>>>>> (offline) node online and things get fixed automatically, i.e.
>>>>>>>>>> the ocfs2 resources mount.
>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>>>> Regards,
>>>>>>>>>> Muhammad Sharfuddin
>>>>>>>>>>
>>>>>>>>>> On 3/12/2018 5:25 PM, Klaus Wenninger wrote:
>>>>>>>>>>> Hi Muhammad!
>>>>>>>>>>>
>>>>>>>>>>> Could you elaborate a little more on your fencing setup?
>>>>>>>>>>> I read about you using SBD but I don't see any
>>>>>>>>>>> sbd-fencing-resource.
>>>>>>>>>>> In case you wanted to use watchdog-fencing with SBD, this
>>>>>>>>>>> would require the stonith-watchdog-timeout property to be set.
>>>>>>>>>>> But watchdog-fencing relies on quorum (without 2-node trickery)
>>>>>>>>>>> and thus wouldn't work on a 2-node cluster anyway.
>>>>>>>>>>>
>>>>>>>>>>> Didn't read through the whole thread - so I might be missing
>>>>>>>>>>> something ...
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Klaus
>>>>>>>>>>>
>>>>>>>>>>> On 03/12/2018 12:51 PM, Muhammad Sharfuddin wrote:
>>>>>>>>>>>> Hello Gang,
>>>>>>>>>>>>
>>>>>>>>>>>> as informed, the cluster was previously fixed to start the
>>>>>>>>>>>> ocfs2 resources by
>>>>>>>>>>>>
>>>>>>>>>>>> a) crm resource start dlm
>>>>>>>>>>>>
>>>>>>>>>>>> b) mounting/unmounting the ocfs2 file system manually (this
>>>>>>>>>>>> step was the fix)
>>>>>>>>>>>>
>>>>>>>>>>>> and then starting the clone group (which includes dlm and the
>>>>>>>>>>>> ocfs2 file systems) worked fine:
>>>>>>>>>>>>
>>>>>>>>>>>> c) crm resource start base-clone.
>>>>>>>>>>>>
>>>>>>>>>>>> Now I crashed the nodes intentionally and then kept only one
>>>>>>>>>>>> node online; again the cluster stopped starting the ocfs2
>>>>>>>>>>>> resources. I again tried to follow your instructions, i.e.
>>>>>>>>>>>>
>>>>>>>>>>>> i) crm resource start dlm
>>>>>>>>>>>>
>>>>>>>>>>>> then tried to mount the ocfs2 file system manually, which hung
>>>>>>>>>>>> this time (previously, mounting manually helped me):
>>>>>>>>>>>>
>>>>>>>>>>>> # cat /proc/3966/stack
>>>>>>>>>>>> [<ffffffffa039f18e>] do_uevent+0x7e/0x200 [dlm]
>>>>>>>>>>>> [<ffffffffa039fe0a>] new_lockspace+0x80a/0xa70 [dlm]
>>>>>>>>>>>> [<ffffffffa03a02d9>] dlm_new_lockspace+0x69/0x160 [dlm]
>>>>>>>>>>>> [<ffffffffa038e758>] user_cluster_connect+0xc8/0x350 [ocfs2_stack_user]
>>>>>>>>>>>> [<ffffffffa03c2872>] ocfs2_cluster_connect+0x192/0x240 [ocfs2_stackglue]
>>>>>>>>>>>> [<ffffffffa045eefc>] ocfs2_dlm_init+0x31c/0x570 [ocfs2]
>>>>>>>>>>>> [<ffffffffa04a9983>] ocfs2_fill_super+0xb33/0x1200 [ocfs2]
>>>>>>>>>>>> [<ffffffff8120e130>] mount_bdev+0x1a0/0x1e0
>>>>>>>>>>>> [<ffffffff8120ea1a>] mount_fs+0x3a/0x170
>>>>>>>>>>>> [<ffffffff81228bf2>] vfs_kern_mount+0x62/0x110
>>>>>>>>>>>> [<ffffffff8122b123>] do_mount+0x213/0xcd0
>>>>>>>>>>>> [<ffffffff8122bed5>] SyS_mount+0x85/0xd0
>>>>>>>>>>>> [<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
>>>>>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>>>>>>
>>>>>>>>>>>> I killed the mount.ocfs2 process, stopped (crm resource stop
>>>>>>>>>>>> dlm) the dlm process, and then tried to start (crm resource
>>>>>>>>>>>> start dlm) dlm, which previously always started successfully.
>>>>>>>>>>>> This time dlm didn't start, and I checked the dlm_controld
>>>>>>>>>>>> process:
>>>>>>>>>>>>
>>>>>>>>>>>> cat /proc/3754/stack
>>>>>>>>>>>> [<ffffffff8121dc55>] poll_schedule_timeout+0x45/0x60
>>>>>>>>>>>> [<ffffffff8121f0bc>] do_sys_poll+0x38c/0x4f0
>>>>>>>>>>>> [<ffffffff8121f2dd>] SyS_poll+0x5d/0xe0
>>>>>>>>>>>> [<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
>>>>>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>>>>>>
>>>>>>>>>>>> Nutshell:
>>>>>>>>>>>>
>>>>>>>>>>>> 1 - this cluster is configured to run when a single node is
>>>>>>>>>>>> online
>>>>>>>>>>>>
>>>>>>>>>>>> 2 - this cluster does not start the ocfs2 resources after a
>>>>>>>>>>>> crash when only one node is online.
>>>>>>>>>>>>
>>>>>>>>>>>> -- 
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Muhammad Sharfuddin | +923332144823 | nds.com.pk
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/12/2018 12:41 PM, Gang He wrote:
>>>>>>>>>>>>>> Hello Gang,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> to follow your instructions, I started the dlm resource via:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>            crm resource start dlm
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> then mounted/unmounted the ocfs2 file system manually (which
>>>>>>>>>>>>>> seems to be the fix for the situation).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now resources are getting started properly on a single node.
>>>>>>>>>>>>>> I am happy as the issue is fixed, but at the same time I am
>>>>>>>>>>>>>> lost because I have no idea how things got fixed here (merely
>>>>>>>>>>>>>> by mounting/unmounting the ocfs2 file systems).
>>>>>>>>>>>>> From your description, I just wonder whether the DLM resource
>>>>>>>>>>>>> does not work normally in that situation.
>>>>>>>>>>>>> Yan/Bin, do you have any comments about the two-node cluster?
>>>>>>>>>>>>> Which configuration settings will affect corosync quorum/DLM?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Gang
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Muhammad Sharfuddin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/12/2018 10:59 AM, Gang He wrote:
>>>>>>>>>>>>>>> Hello Muhammad,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Usually, an ocfs2 resource startup failure is caused by the
>>>>>>>>>>>>>>> mount command timing out (or hanging).
>>>>>>>>>>>>>>> A simple debugging method is:
>>>>>>>>>>>>>>> remove the ocfs2 resource from crm first,
>>>>>>>>>>>>>>> then mount this file system manually and see if the mount
>>>>>>>>>>>>>>> command times out or hangs.
>>>>>>>>>>>>>>> If this command hangs, please check where the mount.ocfs2
>>>>>>>>>>>>>>> process is hanging via the "cat /proc/xxx/stack" command.
>>>>>>>>>>>>>>> If the back trace stops in the DLM kernel module, usually
>>>>>>>>>>>>>>> the root cause is a cluster configuration problem.
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> Gang
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 3/12/2018 7:32 AM, Gang He wrote:
>>>>>>>>>>>>>>>>> Hello Muhammad,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think this problem is not in ocfs2; the cause looks
>>>>>>>>>>>>>>>>> like missing cluster quorum.
>>>>>>>>>>>>>>>>> For a two-node cluster (unlike a three-node cluster), if
>>>>>>>>>>>>>>>>> one node is offline, the quorum will be lost by default.
>>>>>>>>>>>>>>>>> So, you should configure the two-node related quorum
>>>>>>>>>>>>>>>>> setting according to the pacemaker manual.
>>>>>>>>>>>>>>>>> Then DLM can work normally, and the ocfs2 resource can
>>>>>>>>>>>>>>>>> start up.
>>>>>>>>>>>>>>>> Yes, it's configured accordingly; no-quorum-policy is set
>>>>>>>>>>>>>>>> to "ignore".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> property cib-bootstrap-options: \
>>>>>>>>>>>>>>>>                  have-watchdog=true \
>>>>>>>>>>>>>>>>                  stonith-enabled=true \
>>>>>>>>>>>>>>>>                  stonith-timeout=80 \
>>>>>>>>>>>>>>>>                  startup-fencing=true \
>>>>>>>>>>>>>>>>                  no-quorum-policy=ignore
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>> Gang
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This two-node cluster starts resources when both nodes
>>>>>>>>>>>>>>>>>> are online but does not start the ocfs2 resources when
>>>>>>>>>>>>>>>>>> one node is offline. E.g. if I gracefully stop the
>>>>>>>>>>>>>>>>>> cluster resources, then stop the pacemaker service on
>>>>>>>>>>>>>>>>>> either node, and try to start the ocfs2 resource on the
>>>>>>>>>>>>>>>>>> online node, it fails.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> logs:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> pipci001 pengine[17732]:   notice: Start dlm:0#011(pipci001)
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Start p-fssapmnt:0#011(pipci001)
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Start p-fsusrsap:0#011(pipci001)
>>>>>>>>>>>>>>>>>> pipci001 pengine[17732]:   notice: Calculated transition 2, saving inputs in /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>>>>>>>>>>>>>>>> pipci001 crmd[17733]:   notice: Processing graph 2 (ref=pe_calc-dc-1520613202-31) derived from /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Initiating start operation dlm_start_0 locally on pipci001
>>>>>>>>>>>>>>>>>> lrmd[17730]:   notice: executing - rsc:dlm action:start call_id:69
>>>>>>>>>>>>>>>>>> dlm_controld[19019]: 4575 dlm_controld 4.0.7 started
>>>>>>>>>>>>>>>>>> lrmd[17730]:   notice: finished - rsc:dlm action:start call_id:69 pid:18999 exit-code:0 exec-time:1082ms queue-time:1ms
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Result of start operation for dlm on pipci001: 0 (ok)
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Initiating monitor operation dlm_monitor_60000 locally on pipci001
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Initiating start operation p-fssapmnt_start_0 locally on pipci001
>>>>>>>>>>>>>>>>>> lrmd[17730]:   notice: executing - rsc:p-fssapmnt action:start call_id:71
>>>>>>>>>>>>>>>>>> Filesystem(p-fssapmnt)[19052]: INFO: Running start for /dev/mapper/sapmnt on /sapmnt
>>>>>>>>>>>>>>>>>> kernel: [ 4576.529938] dlm: Using TCP for communications
>>>>>>>>>>>>>>>>>> kernel: [ 4576.530233] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group.
>>>>>>>>>>>>>>>>>> dlm_controld[19019]: 4629 fence work wait for quorum
>>>>>>>>>>>>>>>>>> dlm_controld[19019]: 4634 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>>>>>>>>>>>>>>>>> lrmd[17730]:  warning: p-fssapmnt_start_0 process (PID 19052) timed out
>>>>>>>>>>>>>>>>>> kernel: [ 4636.418223] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group event done -512 0
>>>>>>>>>>>>>>>>>> kernel: [ 4636.418227] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join failed -512 0
>>>>>>>>>>>>>>>>>> lrmd[17730]:  warning: p-fssapmnt_start_0:19052 - timed out after 60000ms
>>>>>>>>>>>>>>>>>> lrmd[17730]:   notice: finished - rsc:p-fssapmnt action:start call_id:71 pid:19052 exit-code:1 exec-time:60002ms queue-time:0ms
>>>>>>>>>>>>>>>>>> kernel: [ 4636.420628] ocfs2: Unmounting device (254,1) on (node 0)
>>>>>>>>>>>>>>>>>> crmd[17733]:    error: Result of start operation for p-fssapmnt on pipci001: Timed Out
>>>>>>>>>>>>>>>>>> crmd[17733]:  warning: Action 11 (p-fssapmnt_start_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Transition aborted by operation p-fssapmnt_start_0 'modify' on pipci001: Event failed
>>>>>>>>>>>>>>>>>> crmd[17733]:  warning: Action 11 (p-fssapmnt_start_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Transition 2 (Complete=5, Pending=0, Fired=0, Skipped=0, Incomplete=6, Source=/var/lib/pacemaker/pengine/pe-input-339.bz2): Complete
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Watchdog will be used via SBD if fencing is required
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: On loss of CCM Quorum: Ignore
>>>>>>>>>>>>>>>>>> pengine[17732]:  warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
>>>>>>>>>>>>>>>>>> pengine[17732]:  warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
>>>>>>>>>>>>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001 after 1000000 failures (max=2)
>>>>>>>>>>>>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001 after 1000000 failures (max=2)
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Stop    dlm:0#011(pipci001)
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Stop p-fssapmnt:0#011(pipci001)
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Calculated transition 3, saving inputs in /var/lib/pacemaker/pengine/pe-input-340.bz2
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Watchdog will be used via SBD if fencing is required
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: On loss of CCM Quorum: Ignore
>>>>>>>>>>>>>>>>>> pengine[17732]:  warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
>>>>>>>>>>>>>>>>>> pengine[17732]:  warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
>>>>>>>>>>>>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001 after 1000000 failures (max=2)
>>>>>>>>>>>>>>>>>> pipci001 pengine[17732]:  warning: Forcing base-clone away from pipci001 after 1000000 failures (max=2)
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Stop    dlm:0#011(pipci001)
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Stop p-fssapmnt:0#011(pipci001)
>>>>>>>>>>>>>>>>>> pengine[17732]:   notice: Calculated transition 4, saving inputs in /var/lib/pacemaker/pengine/pe-input-341.bz2
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Processing graph 4 (ref=pe_calc-dc-1520613263-36) derived from /var/lib/pacemaker/pengine/pe-input-341.bz2
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Initiating stop operation p-fssapmnt_stop_0 locally on pipci001
>>>>>>>>>>>>>>>>>> lrmd[17730]:   notice: executing - rsc:p-fssapmnt action:stop call_id:72
>>>>>>>>>>>>>>>>>> Filesystem(p-fssapmnt)[19189]: INFO: Running stop for /dev/mapper/sapmnt on /sapmnt
>>>>>>>>>>>>>>>>>> pipci001 lrmd[17730]:   notice: finished - rsc:p-fssapmnt action:stop call_id:72 pid:19189 exit-code:0 exec-time:83ms queue-time:0ms
>>>>>>>>>>>>>>>>>> pipci001 crmd[17733]:   notice: Result of stop operation for p-fssapmnt on pipci001: 0 (ok)
>>>>>>>>>>>>>>>>>> crmd[17733]:   notice: Initiating stop operation dlm_stop_0 locally on pipci001
>>>>>>>>>>>>>>>>>> pipci001 lrmd[17730]:   notice: executing - rsc:dlm action:stop call_id:74
>>>>>>>>>>>>>>>>>> pipci001 dlm_controld[19019]: 4636 shutdown ignored, active lockspaces
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> resource configuration:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> primitive p-fssapmnt Filesystem \
>>>>>>>>>>>>>>>>>>                  params device="/dev/mapper/sapmnt" directory="/sapmnt" fstype=ocfs2 \
>>>>>>>>>>>>>>>>>>                  op monitor interval=20 timeout=40 \
>>>>>>>>>>>>>>>>>>                  op start timeout=60 interval=0 \
>>>>>>>>>>>>>>>>>>                  op stop timeout=60 interval=0
>>>>>>>>>>>>>>>>>> primitive dlm ocf:pacemaker:controld \
>>>>>>>>>>>>>>>>>>                  op monitor interval=60 timeout=60 \
>>>>>>>>>>>>>>>>>>                  op start interval=0 timeout=90 \
>>>>>>>>>>>>>>>>>>                  op stop interval=0 timeout=100
>>>>>>>>>>>>>>>>>> clone base-clone base-group \
>>>>>>>>>>>>>>>>>>                  meta interleave=true target-role=Started
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> cluster properties:
>>>>>>>>>>>>>>>>>> property cib-bootstrap-options: \
>>>>>>>>>>>>>>>>>>                  have-watchdog=true \
>>>>>>>>>>>>>>>>>>                  stonith-enabled=true \
>>>>>>>>>>>>>>>>>>                  stonith-timeout=80 \
>>>>>>>>>>>>>>>>>>                  startup-fencing=true \
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Software versions:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> kernel version: 4.4.114-94.11-default
>>>>>>>>>>>>>>>>>> pacemaker-1.1.16-4.8.x86_64
>>>>>>>>>>>>>>>>>> corosync-2.3.6-9.5.1.x86_64
>>>>>>>>>>>>>>>>>> ocfs2-kmp-default-4.4.114-94.11.3.x86_64
>>>>>>>>>>>>>>>>>> ocfs2-tools-1.8.5-1.35.x86_64
>>>>>>>>>>>>>>>>>> dlm-kmp-default-4.4.114-94.11.3.x86_64
>>>>>>>>>>>>>>>>>> libdlm3-4.0.7-1.28.x86_64
>>>>>>>>>>>>>>>>>> libdlm-4.0.7-1.28.x86_64
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>> Muhammad Sharfuddin
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>> This email has been checked for viruses by Avast
>>>>>>>>>>>>>>>>>> antivirus
>>>>>>>>>>>>>>>>>> software.
>>>>>>>>>>>>>>>>>> https://www.avast.com/antivirus 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>> Users mailing list: Users at clusterlabs.org 
>>>>>>>>>>>>>>>>>> https://lists.clusterlabs.org/mailman/listinfo/users 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Project Home: http://www.clusterlabs.org 
>>>>>>>>>>>>>>>>>> Getting started:
>>>>>>>>>>>>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>>>>>>>>>>>>>>>>>> Bugs: http://bugs.clusterlabs.org 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Muhammad Sharfuddin
>>>>>>>>>>>>>>>>
>>>  
>>>
>>
>>
> 
> -- 
> Klaus Wenninger
> 
> Senior Software Engineer, EMEA ENG Base Operating Systems
> 
> Red Hat
> 
> kwenning at redhat.com   
> 