[ClusterLabs] [Linux-HA] Antw: Re: file system resource becomes inaccesible when any of the node goes down

Tue Jul 7 09:15:14 UTC 2015

On 07/07/2015 12:14 PM, Ulrich Windl wrote:
>>>> Muhammad Sharfuddin <M.Sharfuddin at nds.com.pk> wrote on 06.07.2015 at 12:14 in
> message <559A550A.8010906 at nds.com.pk>:
> [...]
>> Ok, so would reducing the sbd timeout (or msgwait) provide
>> uninterrupted access to the ocfs2 file system on the surviving/online node,
>> or would it just minimize the downtime?
> It will reduce the time between "writing the reset message for a node" and "the cluster believes the node is down". So you can guess what happens if you set it to some very short time like 1 second...
>
> Regards,
> Ulrich
>
Now the msgwait timeout is set to 10s, and a delay (inaccessibility) of 15 
seconds was observed. If a service (app, DB, file server) is installed on 
and running from the ocfs2 file system on the surviving/online node, 
wouldn't that service crash or go offline because the file system is 
inaccessible (even though it is ocfs2) while a member node goes down?
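
For reference, a stall like this could be measured from the surviving node 
with a small probe along the following lines. This is only a rough sketch: 
the function names and the probe file are hypothetical, and it simply times 
synced writes to one file, so it shows how long a write blocks rather than 
anything about what sbd or the cluster is doing.

```python
import os
import time


def timed_write(path, payload=b"x"):
    """Write and fsync one small file; return the elapsed seconds."""
    start = time.monotonic()
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # force the write down to the file system
    return time.monotonic() - start


def longest_stall(probe_path, samples, interval=1.0):
    """Run `samples` synced writes and return the slowest one in seconds."""
    worst = 0.0
    for _ in range(samples):
        worst = max(worst, timed_write(probe_path))
        time.sleep(interval)
    return worst
```

Running something like longest_stall("/sharedata/.probe", samples=60) on the 
online node while the other node is taken down should report roughly how 
long a write to the shared file system was blocked (the probe file name is 
made up for illustration).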

If the cluster is configured to run two independent services, starting 
one on node1 and the other on node2, with both services sharing the same 
file system, /sharedata (ocfs2), then if one node fails, the other (online) 
node won't be able to keep running its service, because the file system 
holding the binaries/configuration/service is unavailable for at least 
around 15 seconds.

I don't understand the advantage of the ocfs2 file system in such a setup.

-- 
Regards,

Muhammad Sharfuddin
<http://www.nds.com.pk>