[ClusterLabs] Poor performance of mirrored cLVM (with monitoring results)

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed May 25 03:12:19 EDT 2016


Hello!

For background: we run a mirrored cLVM2 logical volume (LV) that is available on three nodes, with the mirror legs hosted on FC-SAN storage. While the SAN storage itself performs rather well, the LV is at times quite slow, and so is the OCFS2 filesystem on top of it. For one application we see periodic disk timeouts.

As reported before, I started taking measurements at various points using direct-I/O reads of 512 bytes ("IOTW"). Very recently I also began recording the values from /sys/block/*/stat as rates, since the original numbers are mostly cumulative ("blockstats").
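For readers who want to reproduce the "blockstats" idea, here is a minimal sketch of turning the cumulative counters into per-second rates. It is not the script used for the graphs. The field names follow the kernel's block stat documentation, while the function names and the 120-second default are only illustrative assumptions.

```python
import time

# Field order of /sys/block/<dev>/stat as described in the kernel's
# Documentation/block/stat.txt (first 11 fields; newer kernels append more).
FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
)
# in_flight is a momentary value, not a cumulative counter.
GAUGES = {"in_flight"}


def parse_stat(text):
    """Parse one line of a block device 'stat' file into a dict."""
    values = [int(v) for v in text.split()]
    # zip() silently drops extra fields that newer kernels may append.
    return dict(zip(FIELDS, values))


def rates(prev, cur, interval_s):
    """Per-second rates between two samples taken interval_s seconds apart."""
    out = {}
    for name in FIELDS:
        if name in GAUGES:
            out[name] = cur[name]          # report the gauge as-is
        else:
            out[name] = (cur[name] - prev[name]) / interval_s
    return out


def read_stat(dev):
    """Read the current counters for a device name such as 'dm-3' (hypothetical)."""
    with open(f"/sys/block/{dev}/stat") as f:
        return parse_stat(f.read())


def sample_rates(dev, interval_s=120):
    """Take two samples interval_s apart and return the averaged rates."""
    prev = read_stat(dev)
    time.sleep(interval_s)
    return rates(prev, read_stat(dev), interval_s)
```

The tick fields are in milliseconds, so a write_ticks rate of 28000 corresponds to 28 seconds of write wait per second. According to the kernel documentation these counters sum the waiting time of all requests, so with several requests queued the rate can exceed one second per second.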

My findings are these: while the SAN storage responds in under 5 ms around the time of the problem, the LV shows read delays of up to 30 seconds at the same time, and OCFS2 follows suit. The block stats indicate that during the period of bad read performance, writes are executing, with the write wait time accumulating at up to 28 seconds per second. At the peak, about 32 I/Os per second were active.

The performance is equally bad on every node at the time of problems.

I was only monitoring the LV itself, not its components (*_mimage_0, *_mimage_1, *_mlog_mimage_0, *_mlog_mimage_1, *_mlog).

I'm afraid mirrored cLVM does not scale well for writes. (We also run an NFS server with an MD-RAID1 on the same SAN storage, and its performance seems much better: <5 ms at the time of the problem, and <50 ms over the last 24 hours.)

Note that the "per second" values of the block stats are averages over a 120-second polling interval, so the actual peaks may be much higher. See /usr/src/linux/Documentation/block/stat.txt for an explanation of the values.

Another note on "emax", the "exponential maximum" in the I/O wait graphs: the values are sampled every 3 seconds. A new maximum is accepted immediately, and the stored value then decays exponentially with every sample (alpha=0.008). This was chosen to capture the peaks, but not to hold them for too long. The values are queried and plotted at a significantly longer interval, so you might still miss the actual peak.
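To make the "emax" idea concrete, here is a small sketch of one way such a decaying maximum could be computed. It assumes that the stored value shrinks by a factor of (1 - alpha) per sample (decaying towards zero) unless a larger sample arrives. The actual monitoring code may decay differently, and the class and function names are made up for illustration.

```python
import math


class ExpMax:
    """Peak detector: jumps up immediately, decays exponentially afterwards."""

    def __init__(self, alpha=0.008):
        self.alpha = alpha      # per-sample decay factor (assumed semantics)
        self.value = 0.0

    def update(self, sample):
        # Let the remembered peak fade a little, then accept the new
        # sample at once if it is larger than the faded peak.
        decayed = self.value * (1.0 - self.alpha)
        self.value = max(sample, decayed)
        return self.value


def half_life_samples(alpha):
    """Number of samples until an unchallenged peak has decayed to half."""
    return math.log(0.5) / math.log(1.0 - alpha)
```

Under this assumption, alpha=0.008 gives a half-life of roughly 86 samples, which at one sample every 3 seconds is a bit more than four minutes.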

Anyway, for those interested, I'll attach the graphs from last night. The problem occurs shortly before 6 o'clock.

>>> Lars Marowsky-Bree <lmb at suse.com> wrote on 25.04.2016 at 12:12 in message
<20160425101236.GD10040 at suse.de>:
> On 2016-04-25T10:10:38, Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> 
> Hi Ulrich,
> 
> I can't really comment on why the cLVM2 is slow (somewhat surprisingly,
> because flock is meta-data only and thus shouldn't even be affected by
> cLVM2, anyway ...).
> 
> But on the subject of performance, you're quite right - we know that
> cLVM2 is not fast enough, thus there has been an effort to make md raid
> cluster aware (especially RAID1). cluster-md is almost completely
> merged upstream and coming to your favorite enterprise distribution very
> soon too ;-)
> 
> 
> Regards,
>     Lars



-------------- next part --------------
A non-text attachment was scrubbed...
Name: h10-IOTW-cLVM-Leg1.png
Type: image/png
Size: 20506 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/users/attachments/20160525/7eab6693/attachment-0010.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: h10-blockstat-cLVM-LV.png
Type: image/png
Size: 49610 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/users/attachments/20160525/7eab6693/attachment-0011.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: h10-IOTW-OCFS2.png
Type: image/png
Size: 18791 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/users/attachments/20160525/7eab6693/attachment-0012.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: h10-IOTW-cLVM-LV.png
Type: image/png
Size: 18169 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/users/attachments/20160525/7eab6693/attachment-0013.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: h10-IOTW-cLVM-Leg2.png
Type: image/png
Size: 27610 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/users/attachments/20160525/7eab6693/attachment-0014.png>

