Protection - RAID Systems

December 2016

Presentation of RAID Technology

RAID technology (acronym for Redundant Array of Inexpensive Disks, or sometimes Redundant Array of Independent Disks) allows users to form one storage unit from several hard drives. Depending on how it is configured, the resulting unit (called a cluster) is highly fault-tolerant (high availability), has a higher I/O capacity, or both. Distributing data over several hard drives makes the data safer and the services that depend on them more reliable.

This technology was developed in 1987 by three researchers (Patterson, Gibson and Katz) at the University of California (Berkeley). Since 1992, the RAID Advisory Board has managed these specifications. The idea is to build the equivalent of a large-capacity (and therefore expensive) drive out of smaller, cheaper drives, each of which has a lower MTBF (Mean Time Between Failures).

In RAID technology, the assembled drives can be used in different ways, which are called RAID Levels. The University of California defined 5 levels (1 to 5); levels 0 and 6 were added later, giving the levels 0 to 6 listed below. Each one of these levels describes the manner in which the data are distributed over the drives:

  • Level 0: called striping
  • Level 1: called mirroring, shadowing or duplexing
  • Level 2: called striping with error-correcting code (obsolete)
  • Level 3: called disk array with bit-interleaved data
  • Level 4: called disk array with block-interleaved data
  • Level 5: called disk array with block-interleaved distributed parity
  • Level 6: called disk array with block-interleaved distributed double parity

Each of these levels constitutes a way of using the cluster, according to:

  • performance
  • cost
  • disk access

Level 0

The RAID-0 level, called striping (which is sometimes mistakenly called stripping), consists in storing data by spreading them out over all of the cluster's drives. This level has no redundancy and is therefore not fault-tolerant. Indeed, if one of the drives fails, all of the data spread over the drives will be lost.

However, given that each drive of the cluster has its own controller, this solution offers a higher data rate.

RAID-0 consists of the logical juxtaposition (aggregation) of several physical hard drives. In RAID-0 mode, data are written in stripes:

Drive 1     Drive 2     Drive 3
Stripe 1    Stripe 2    Stripe 3
Stripe 4    Stripe 5    Stripe 6
Stripe 7    Stripe 8    Stripe 9
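
The round-robin placement shown in this table can be sketched in a few lines of Python. This is only an illustration: the function name locate_stripe and the 1-based numbering are assumptions chosen to match the table, not part of any RAID specification.

```python
def locate_stripe(stripe_number, drive_count):
    """Return (drive, row), both 1-based, for a 1-based stripe number.

    Hypothetical helper: stripes are dealt to the drives in turn,
    like cards, which is the layout shown in the RAID-0 table.
    """
    if stripe_number < 1 or drive_count < 1:
        raise ValueError("stripe_number and drive_count must be >= 1")
    index = stripe_number - 1
    drive = index % drive_count + 1   # which drive receives the stripe
    row = index // drive_count + 1    # position of the stripe on that drive
    return drive, row


# Rebuild the three-drive table: each drive lists the stripes it holds.
layout = {drive: [] for drive in range(1, 4)}
for stripe in range(1, 10):
    drive, _ = locate_stripe(stripe, 3)
    layout[drive].append(stripe)

print(layout)  # {1: [1, 4, 7], 2: [2, 5, 8], 3: [3, 6, 9]}
```

Because consecutive stripes land on different drives, a request covering several stripes can be served by several drives at once.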

The term "striping factor" is used to characterize the relative size of the fragments (stripes) stored on each physical unit. The average throughput depends on this factor (smaller stripes spread each request over more drives, which generally improves the throughput of large transfers).

If one of the elements of the cluster is bigger than the others, striping stops as soon as the smallest drive is full. The final size is therefore equal to the capacity of the smallest drive multiplied by the number of drives (double the smaller capacity for two drives):

  • two 20 GB drives give a logical drive of 40 GB
  • a 10 GB drive used together with a 27 GB drive gives a logical drive of 20 GB (17 GB of the second drive will remain unused)
N.B. It is recommended that drives of identical size be used for RAID-0 because otherwise, the drive with the larger capacity will not be fully exploited.
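
The capacity rule can be checked with a short calculation. The sketch below is a minimal illustration, assuming sizes expressed in GB; the function name raid0_capacity is hypothetical.

```python
def raid0_capacity(drive_sizes_gb):
    """Return (usable, unused) capacity in GB for a RAID-0 cluster.

    Hypothetical helper: every drive contributes only as much space
    as the smallest drive of the cluster.
    """
    if not drive_sizes_gb:
        raise ValueError("at least one drive is required")
    usable = min(drive_sizes_gb) * len(drive_sizes_gb)
    unused = sum(drive_sizes_gb) - usable
    return usable, unused


print(raid0_capacity([20, 20]))  # (40, 0)
print(raid0_capacity([10, 27]))  # (20, 17)
```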

Level 1

The goal of level 1 is to duplicate the information and store it on several drives. The terms mirroring or shadowing are used to describe this procedure.

Drive 1     Drive 2     Drive 3
Stripe 1    Stripe 1    Stripe 1
Stripe 2    Stripe 2    Stripe 2
Stripe 3    Stripe 3    Stripe 3

Level 1 provides greater data security because if one of the drives fails, the data remain available on the others. In addition, reading the data can be much quicker when all of the drives are operating, since reads can be shared between them. Finally, given that each drive has its own controller, the server can continue to operate even if one of the drives fails, in the same way that a semi-truck can continue to drive if one of its tires bursts because it has several tires on each axle.

Conversely, RAID-1 technology is very expensive given that at most half of the storage capacity is in fact usable (half with two drives, a third with three mirrored drives).
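
The mirroring behaviour described in this section can be sketched as a toy model. It is only an illustration under simple assumptions: drives are modelled as Python dictionaries, and the class name Mirror and its methods are hypothetical, not a real RAID API.

```python
class Mirror:
    """Toy RAID-1 model: every write goes to all working drives."""

    def __init__(self, drive_count=2):
        self.drives = [{} for _ in range(drive_count)]
        self.failed = set()

    def write(self, block, data):
        # Duplicate the block on every drive that is still working.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def read(self, block):
        # Any working drive can answer, since all hold the same data.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]
        raise IOError("all mirrored drives have failed")

    def fail(self, index):
        # Simulate the loss of one drive and of everything stored on it.
        self.failed.add(index)
        self.drives[index] = {}


mirror = Mirror(drive_count=2)
mirror.write(1, b"payroll")
mirror.fail(0)
print(mirror.read(1))  # b'payroll'
```

The price of this safety is visible in the model: two dictionaries hold the same content, so only half of the raw space stores distinct data.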

Level 2

RAID-2 is now obsolete. It relies on Hamming code for error correction (ECC, Error Correction Code), and this kind of error correction is now directly integrated in hard drive controllers.

This technology consists in storing data according to the same principle as in RAID-0 but by writing the ECC check bits on separate drives (normally 3 ECC drives are used for 4 drives of data).

RAID 2 technology offers mediocre performance but a high level of security.

Level 3

Level 3 RAID technology stripes data byte by byte across the drives and devotes one of the drives to storing the parity.

Drive 1    Drive 2    Drive 3    Drive 4
Byte 1     Byte 2     Byte 3     Parity 1+2+3
Byte 4     Byte 5     Byte 6     Parity 4+5+6
Byte 7     Byte 8     Byte 9     Parity 7+8+9

In this way, if one of the drives were to fail, it would be possible to piece together the information from the other drives. After piecing the information together, the content of the faulty drive would again be complete. On the other hand, if two of the drives were to fail simultaneously, it would be impossible to recover any lost data.
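
The rebuilding step can be illustrated with exclusive-or (XOR) parity, the operation commonly used for this kind of single-parity scheme. The sketch below is a minimal example with made-up data; the helper name xor_bytes is hypothetical.

```python
from functools import reduce


def xor_bytes(*chunks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*chunks)
    )


# Content of the three data drives (illustrative values only).
data = [b"\x0f\xf0", b"\x33\xcc", b"\x55\xaa"]

# The parity drive stores the XOR of the data drives.
parity = xor_bytes(*data)

# Drive 2 fails: XOR the surviving drives with the parity to rebuild it.
rebuilt = xor_bytes(data[0], data[2], parity)
print(rebuilt == data[1])  # True
```

With two drives missing, the same equation has two unknowns, which is why a single parity drive cannot recover from a double failure.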

Level 4

Level 4 RAID technology is very similar to level 3. The difference is in the striping granularity: level 4 uses block-level striping with a dedicated parity disk, whereas level 3 uses byte-level striping.

Drive 1    Drive 2    Drive 3    Drive 4
Block 1    Block 2    Block 3    Parity 1+2+3
Block 4    Block 5    Block 6    Parity 4+5+6
Block 7    Block 8    Block 9    Parity 7+8+9

In order to read a small number of blocks, the system does not have to access every physical drive but only those on which the data are actually stored. Conversely, every write must also update the drive hosting the parity data, so that drive has to keep up with the combined write load of all the other drives so as to not limit the performance of the whole.
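
The load on the parity drive can be made concrete with a sketch of a single-block update. It assumes XOR parity, as is usual for this level; the function names are hypothetical. Changing one block requires the old block, the new block and the old parity, so the parity drive takes part in every write whichever data drive is targeted.

```python
def xor(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity, old_block, new_block):
    """Parity after one block changes, without reading the other drives."""
    return xor(xor(old_parity, old_block), new_block)


blocks = [b"AAAA", b"BBBB", b"CCCC"]          # one stripe on three data drives
parity = xor(xor(blocks[0], blocks[1]), blocks[2])

# Overwrite the block held by drive 2.
new_block = b"ZZZZ"
parity_after_write = updated_parity(parity, blocks[1], new_block)
blocks[1] = new_block

# Same result as recomputing the parity from the whole stripe.
full_recompute = xor(xor(blocks[0], blocks[1]), blocks[2])
print(parity_after_write == full_recompute)  # True
```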

Level 5

Level 5 is similar to level 4, i.e. parity is calculated at the block level but is spread over all of the cluster's drives.

Drive 1         Drive 2         Drive 3    Drive 4
Block 1         Block 2         Block 3    Parity 1+2+3
Block 4         Parity 4+5+6    Block 5    Block 6
Parity 7+8+9    Block 7         Block 8    Block 9

That way, RAID 5 greatly improves access to data compared with level 4 (both in writing and reading) because access to the parity blocks is spread over the cluster's different drives instead of being concentrated on a single one.
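
A rotating parity placement can be generated programmatically. The sketch below is one possible rotation (parity starts on the last drive and moves one drive to the left on each row); this order is an assumption, implementations differ, and the illustration above uses a slightly different order. The function name raid5_layout is hypothetical.

```python
def raid5_layout(drive_count, row_count):
    """Return one list of cell labels per row, one cell per drive."""
    if drive_count < 3:
        raise ValueError("this sketch expects at least 3 drives")
    layout = []
    next_block = 1
    for row in range(row_count):
        # Assumed rotation: last drive first, then one step left per row.
        parity_drive = (drive_count - 1 - row) % drive_count
        data_blocks = list(range(next_block, next_block + drive_count - 1))
        next_block += drive_count - 1
        parity_label = "P(" + "+".join(str(b) for b in data_blocks) + ")"
        remaining = iter(data_blocks)
        layout.append([
            parity_label if drive == parity_drive else "B%d" % next(remaining)
            for drive in range(drive_count)
        ])
    return layout


for row in raid5_layout(drive_count=4, row_count=3):
    print(row)
# ['B1', 'B2', 'B3', 'P(1+2+3)']
# ['B4', 'B5', 'P(4+5+6)', 'B6']
# ['B7', 'P(7+8+9)', 'B8', 'B9']
```

Whatever the rotation order, the point is the same: no single drive holds all of the parity, so parity updates do not queue up on one drive.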

RAID-5 provides performance close to that obtained in RAID-0 (particularly for reading) while ensuring high fault tolerance. This is why it is one of the best RAID modes in terms of performance and reliability.

N.B. Given that the usable drive space in a cluster of n drives is equal to n-1 drives, it is best to have a large number of drives in order to make RAID-5 "cost-effective".

Level 6

Level 6 was added to the levels defined by the Berkeley researchers. It defines the use of two parity functions, whose results take up the equivalent of two drives (in common implementations, both are distributed over all of the drives, as in level 5). This level maintains redundancy even if two drives fail simultaneously. This means that at least 4 drives are needed to implement a RAID-6 system.

Comparison

The RAID solutions that are generally used are levels 1 and 5.

Choosing a RAID solution depends on three criteria:

  • Security: RAID 1 and 5 both offer a high level of security but the drive rebuilding method is different in each solution. In case of drive failure, RAID 5 rebuilds the missing drive using the information stored on the other drives, whereas RAID 1 has a complete copy on each drive.
  • Performance: RAID 1 offers better read performance than RAID 5 but is weaker in terms of writing.
  • Cost: cost is directly linked to the raw storage capacity that must be installed to obtain a given usable capacity. The RAID 5 solution offers a usable volume that represents 80 to 90% of the allotted volume (with the rest being used for error correction). The available volume of the RAID 1 solution, on the other hand, is only 50% of the total volume (given that the information is duplicated).
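
The cost figures can be reproduced with a short calculation. The sketch below assumes identical drives and uses the rules given in this document: no overhead for level 0, one copy per drive for level 1, one drive's worth of parity for level 5 and two for level 6. The function name usable_fraction and the minimum drive counts used for validation are assumptions of this example.

```python
def usable_fraction(level, drive_count):
    """Fraction of the raw capacity that can hold user data."""
    parity_drives = {0: 0, 5: 1, 6: 2}
    minimum_drives = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum_drives:
        raise ValueError("level not covered by this sketch")
    if drive_count < minimum_drives[level]:
        raise ValueError("not enough drives for this level")
    if level == 1:
        # Every drive holds the same data, so only one drive's worth is usable.
        return 1 / drive_count
    return (drive_count - parity_drives[level]) / drive_count


print(usable_fraction(1, 2))   # 0.5
print(usable_fraction(5, 5))   # 0.8
print(usable_fraction(5, 10))  # 0.9
print(usable_fraction(6, 4))   # 0.5
```

The 80 to 90% range quoted for RAID 5 therefore corresponds to clusters of 5 to 10 drives.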

Implementing a RAID Solution

There are several different ways to implement a RAID solution on a server:

  • software-based RAID: this generally involves a driver at the operating system level of the computer that is capable of creating one logical volume with several drives (SCSI or IDE).
  • hardware-based RAID:
    • with DASDs (Direct Access Storage Devices): external storage units with their own power supply. What is more, these devices have connectors that make it possible to switch drives while they are on (such drives are "hot swappable"). These devices manage their drives themselves, so that the host sees them as standard SCSI drives.
    • with RAID controllers: cards that fit in PCI or ISA expansion slots and that allow the control of several hard drives.

This document entitled « Protection - RAID Systems » from CCM (ccm.net) is made available under the Creative Commons license. You can copy and modify copies of this page, under the conditions stipulated by the license, as long as this note appears clearly.