[ClusterLabs] DRBD or SAN ?
lists at alteeve.ca
Tue Jul 18 16:15:13 UTC 2017
On 2017-07-18 04:08 AM, Dmitri Maziuk wrote:
> On 7/17/2017 2:07 PM, Chris Adams wrote:
>> However, just like RAID is not a replacement for backups, DRBD is IMHO
>> not a replacement for database replication. DRBD just replicates the
>> database files, so file corruption, for example, would be copied from
>> host to host. When something provides a native replication system, it
>> is probably better to use that (or at least use it at one level).
> Since DRBD is RAID-1, you need double the drives either way, no
> advantage over two independent copies -- only the potential for
> replicating errors. You probably need a 10G pipe, with associated costs,
> for "no performance penalty" DRBD while native replication tends to work
> OK over slower links.
That's... an interesting take. I strongly disagree.
We've deployed dozens and dozens of DRBD-backed clusters, and only one
client has ever needed 10 Gbps. In our experience, most users need low
latency, not high throughput, and a good 1 Gbps network has
sub-millisecond latency, faster than even 15k RPM SAS drives.
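To put rough numbers on that (back-of-the-envelope figures, not measurements; the LAN round-trip time is an assumption for a good switched gigabit network):

```python
# Compare the average rotational latency of a 15k RPM disk
# with a typical round trip on a switched 1 Gbps LAN.
rpm = 15000
ms_per_revolution = 60_000 / rpm            # 4.0 ms for one full revolution
avg_rotational_ms = ms_per_revolution / 2   # head waits half a turn on average: 2.0 ms
typical_lan_rtt_ms = 0.2                    # assumed RTT on a good switched 1 Gbps LAN

print(avg_rotational_ms)    # 2.0
print(typical_lan_rtt_ms)   # 0.2
```

So even before seek time, the disk's rotational delay alone is roughly an order of magnitude larger than the network round trip, which is why the replication link's latency, not its bandwidth, is usually what matters.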
As for replication errors, well, I have to conclude you're judging
something without having used it. In all our years using DRBD, we have
never had a data corruption issue or any other problem induced by DRBD.
We certainly have been saved by it on several occasions.
Having data synchronously replicated between two mechanically and
electrically isolated systems is fantastic protection.
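That synchronous behaviour is what DRBD's protocol C gives you: a write is not acknowledged until it has reached both nodes. As a minimal sketch, a two-node resource in DRBD 8-style configuration syntax might look like this (the hostnames, addresses, and block devices are placeholders, not from the thread):

```
resource r0 {
    protocol C;               # fully synchronous replication
    on alpha {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on beta {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}
```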
> At this point a 2U SuperMicro chassis gives you 2 SSD slots for system
> and ZiL/L2ARC plus 12 spinning rust slots for a pretty large database...
Now, speaking of trouble, I've been let down by Supermicro equipment
numerous times, and won't touch them with a ten-foot pole anymore.
> That won't work for VM images, for that you'll need NAS or DRBD but IMO
> NAS wins. Realistically, a hard drive failure is the most likely kind of
> failure you're looking at, and initiating a full storage cluster
> failover for that is probably not a good idea. So you might want a
> drive-level redundancy on at least the primary node, at which point
> dual-ported SAS drives in external shelves become economical, even with
> a couple of dual-ported SAS SSDs for caches. So ZFS setup I linked to
> above actually comes with fewer moving parts and all the handy features
> absent from previous-gen filesystems.
A NAS is a single point of failure, and after years of managing dozens
of clusters, I could rattle off quite a number of failure scenarios
we've seen in the field. Here are a few:
* Failed voltage regulators taking a node offline without warning.
* Failed backplanes causing multiple disks to be lost.
* User error destroying RAID arrays.
* Bad components used during upgrades taking a node offline until a new
part is delivered.
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould