[ClusterLabs] File System does not do a recovery on fail over

Gang He ghe at suse.com
Wed Jun 12 03:46:49 EDT 2019


CCing Jeff, one of our file system people, on this thread.

From my point of view, file system recovery time usually depends on the journal size, not the file system size.
Hello Jeff, do you think XFS would take 5 to 10 minutes to mount after an unclean switchover?

Thanks
Gang
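
As a rough illustration of the "set the timeout value longer" suggestion quoted below, here is a minimal Python sketch that picks a start timeout with headroom over the slowest mount observed so far and builds the matching update command. The resource name fs_data, the 2x headroom factor and the 60 s floor are assumptions for illustration, not values from this thread. The command follows pcs syntax as commonly documented; a crmsh-based cluster would need the equivalent crm command instead.

```python
import math


def start_timeout_seconds(observed_mount_seconds, headroom=2.0, floor=60):
    """Pick a start timeout with headroom over the slowest observed mount.

    headroom and floor are illustrative assumptions, not recommended values.
    """
    if not observed_mount_seconds:
        raise ValueError("need at least one observed mount duration")
    worst = max(observed_mount_seconds)
    return max(floor, math.ceil(worst * headroom))


def pcs_update_command(resource_id, timeout_s):
    """Build (but do not run) a pcs command that raises the start timeout.

    resource_id is whatever the Filesystem resource is called in the cluster;
    "fs_data" below is a hypothetical name.
    """
    return ["pcs", "resource", "update", resource_id,
            "op", "start", f"timeout={timeout_s}s"]


if __name__ == "__main__":
    # 5 and 10 minutes, the mount times reported in this thread.
    timeout = start_timeout_seconds([300, 600])
    print(" ".join(pcs_update_command("fs_data", timeout)))
```

With the 5 to 10 minute mounts reported in this thread, the sketch arrives at a 1200 s start timeout. The command is only printed, not executed, so it can be reviewed before it is applied to a live cluster.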

>>> On 6/12/2019 at  1:29 pm, in message
<CALuPYL3XutG18+NuC6-L0hevG_X97BD3fgWh5=Xw6cHHZ6kLuw at mail.gmail.com>, Indivar
Nair <indivar.nair at techterra.in> wrote:
> Thanks, Gang
> 
> It is a very large file system - around 600TB.
> Could this be why it takes around 5 - 10mins to do journal recovery?
> 
> What we do as a workaround is -
> - Disable the filesystem resource on startup
> - Manually mount it (wait for as long as it takes)
> - Then umount it
> - Enable filesystem resource
> 
> But this doesn't seem like the right approach.
> 
> We have tried repairing the Filesystem when a failover happens, but it
> has never shown any major corruption.
> 
> Regards,
> 
> 
> Indivar Nair
> 
> 
> 
> On Tue, Jun 11, 2019 at 10:18 AM Gang He <ghe at suse.com> wrote:
>>
>> Hi Indivar,
>>
>> See my comments inline.
>>
>> >>> On 6/11/2019 at 12:10 pm, in message
>> <CALuPYL3-+8DBTyd8rRONjY=y8aa64y7W+V=EaNfjb+ez4rg6DQ at mail.gmail.com>, Indivar
>> Nair <indivar.nair at techterra.in> wrote:
>> > Hello ...,
>> >
>> > I have an Active-Passive cluster with two nodes hosting an XFS
>> > Filesystem over a CLVM Volume.
>> >
>> > If a failover happens, the volume is mounted on the other node without
>> > a recovery that usually happens to a volume that has not been cleanly
>> > unmounted.
>> > The FS journal is on the same volume.
>> >
>> > Now, when we fail it back (with a complete cluster shutdown and
>> > restart) on to its original node, it undergoes the automatic recovery.
>> >
>> > 1.
>> > Shouldn't it do an FS recovery during the failover to the other node?
>> > Note: The FS journal is on the same volume.
>> Usually, the file system must do log recovery when it is mounted.
>>
>> >
>> > 2.
>> > Also, the failback usually fails because the FS check takes a
>> > considerable amount of time. How do I configure the mount not to fail
>> > when an automatic FS check is going on?
>> File systems use a journal precisely to avoid a long file system recovery.
>> If recovery takes too long, this may indicate a file system problem, e.g. the
>> file system is damaged.
>> Secondly, you can set a longer timeout value.
>>
>> Thanks
>> Gang
>>
>> >
>> > Any help/pointers would be highly appreciated.
>> >
>> > Thanks.
>> >
>> > Regards,
>> >
>> >
>> > Indivar Nair
>> > _______________________________________________
>> > Manage your subscription:
>> > https://lists.clusterlabs.org/mailman/listinfo/users 
>> >
>> > ClusterLabs home: https://www.clusterlabs.org/ 
>>


