Introduction to the comparison of array RAID in SQLServer database server read and write performance

1: Introduction to RAID

RAID (Redundant Array of Independent Disk) is a data protection policy.

Two: Several commonly used levels of RAID

1. RAID 0

Improve data I/O through parallel reading, and the read operation efficiency is very high, but it does not provide data fault tolerance and protection. It is not recommended to use as a SQL Server.

2. RAID 1

Mirror protection, there are two drives, one for the main drive and the other for mirroring, so it actually requires twice the drive, and the second is used for redundant use. Using RAID 1, our storage capacity should be (n*s/2).

Write to two disks at a time, so although the write speed will have a slight impact, the read speed is almost twice that of most cases. Because the drive can be accessed in parallel during the read operation, thereby increasing throughput. RAID 1 is limited to two drives.

3. RAID 5

Disk strip with verification. In this type of RAID, data is written in the form of complex stripes to all drives in the array, while all drives have distributed number check blocks. In this way, RAID 5 can use an array of any size consisting of three or more disks, sacrificing only the storage capacity equivalent to one disk for verification. However, this verification is distributed and does not exist alone on any physical disk.

RAID 5 is cost-effective because it sacrifices less storage capacity in large arrays and is widely used. Unlike mirroring, stripes with checks require that each write strip must be calculated between disks, which creates a portion of the overhead. Therefore, throughput is not always an easy-to-calculate project, it depends to a large extent on the computing power of the system when doing verification calculations.

Calculating the capacity of RAID 5 is very simple: that is ((n-1)*s). The RAID 5 array can avoid the loss of any single disk in this column.

Each write operation to RAID5 involves multiple reads for calculation and storage. RAID 5 is not a good choice when there are many write operations for SQL Server and requires high efficiency.

4. RAID 6

Disk strip with double verification. RAID 6 is very similar to RAID 5, but its stripe uses two check blocks instead of one, which enhances protection against disk failures.

RAID 6 is a new member of the RAID family. RAID 6 is an increase in several other RAID types after a few years of standardization. RAID 6 is special because it can withstand any two drives in the array while preventing data loss. But to match the extra redundancy, the RAID 6 array needs to sacrifice the capacity equivalent to two drives in the array and require at least four drives in the true column. The capacity of RAID 6 can be calculated using ((n-2)*s).

5. RAID 10

Mirror with stripes. Technically, RAID 10 is a hybrid RAID, including a pair of RAID images present in a non-check band (RAID 0).

When there are only two drives in an array, many manufacturers will call it RAID 10 (or RAID 10+), but technically this should be RAID 1 because at least four drives in the array will be striped. For RAID 10, the drives must be added one-to-one, so the number of drives in the array can only be even.

RAID 10 can operate normally with nearly half of the drive sets lost, while at most one drive fails or loses at most one drive. RAID 10 does not include verification calculations, which makes it have certain performance advantages over RAID 5 and RAID 6, and the array also requires less computing power. RAID 10 provides read performance over any common type of RAID, because all drives in the array can be used simultaneously during a read operation. However, the write performance of RAID 10 is much lower. The capacity calculation method of RAID 10 is the same as RAID 1, both (n*s/2).

RAID performance comparison

Reading efficiency: Because it is parallel reading, the reading efficiency is very high.

Write efficiency: RAID 0 > RAID 1 > RAID 10 > RAID 5

Disk Utilization: RAID 0 > RAID 5 > RAID 1 = RAID 10

Fault tolerance: RAID 10 = RAID 1 > RAID 5 > RAID 0

DB Server as SQL Server recommends using RAID 1 or RAID10.

Three: RAID and SQL Server

DB server physical disk design with separate RAID volumes for data, log, tempdb and backup files.

Disk architecture:

C: OS [requires good data reading and writing efficiency, and has strong fault tolerance, providing data protection]

D: Do RAID10, store DB data files [requires high reading efficiency, lower writing efficiency, and strong fault tolerance. If the data files are large, it is required to save disk space]

E: Do RAID1, store DB log files [During the DB operation, log reading and writing are frequently read and written, and requires high data reading and writing efficiency]

F: Do RAID10, tempdb for storing database [storing temporary database]

G (optional): Do RAID1, data backup, it is recommended to backup data at the remote end

SSD hard disk: The full English name of SSD is Solid State Disk/Drive, which is translated as solid state hard disk in Chinese. It can be widely used in servers, desktops, notebooks, mobile devices, game consoles, etc., to accelerate startup, improve performance, and reduce power consumption.

advantage:

Fast speed (high I/O).

Durable and shock-resistant

No noise

Light weight

SQL Server can achieve good I/O performance with SSD hard drive

shortcoming:

The price is high and the capacity is small. RAID will further lose capacity, so the cost-effectiveness is not high. It is a different matter for companies that do not consider the price.

The technology is not very mature yet

Four: Summary

Establishing different RAID architectures for different functions can improve data efficiency and utilization.