Our business and home storage devices contain some of our most important and irreplaceable information. In this video, you’ll learn how RAID can be used to provide data redundancy when a drive failure occurs.
Our hard drives store an amazing amount of information. And this is usually really important information. It’s our pictures. It’s our movies. It’s our financial data and so much more.
But one of the challenges we have with hard drives is that they contain moving components. And just like anything else that’s moving, eventually, this device is going to break. So what happens to our data if this drive breaks? We need some way to still access our data even if a drive happens to fail.
And we can prepare for that by creating an array of drives and sharing the information over that array. This array of drives is called RAID. RAID stands for Redundant Array of Independent Disks. You might also see it expanded as Redundant Array of Inexpensive Disks.
But what this really is referring to is the redundancy that we can get by putting many drives together and sharing information across those drives. There are some RAID levels that are very redundant and provide us with access to our data even if a drive happens to fail. But there are some types of RAID that provide no redundancy.
So we want to be sure, if we’re configuring a RAID array, that we’re using the right type for what we’re trying to do. The types of RAID arrays that we will describe start with RAID 0, which you may also hear referred to as striping. We’ll talk about RAID 1, which is also referred to as mirroring. We’ll talk about RAID 5, which is striping with parity.
And lastly, we’ll discuss nested RAID, which is RAID 1 plus 0. Sometimes you’ll see this referred to as RAID 10. This is a stripe of mirrors, which combines two RAID types to provide both redundancy and performance.
There are two major types of RAID configurations you might run into. One is software-based RAID. And the other one is hardware-based RAID. Software-based RAID may already be built into the operating system that you’re using.
You can put multiple hard drives into your computer, and then tell the operating system that you would like to use these drives as a RAID array. You don’t need any special hardware. You don’t need any special controllers. And the operating system takes care of all of the RAID configurations.
The other type of RAID you might use is hardware-based RAID, which means there is a separate piece of hardware or a hard drive controller inside of your computer that’s handling all of the RAID functions. This is a configuration that’s usually done outside of the operating system. So you would set up the RAID array on the controller. And the operating system sees all of this as one, single drive.
When you’re deciding which one of these types of configurations to use, it usually comes down to cost and performance. Obviously, if it’s built into the operating system, there’s usually little or no cost associated with configuring software-based RAID. But if you need higher performance and want the RAID to run as fast as it possibly can, or if you’re using an operating system that doesn’t support any type of RAID, you may want to install a RAID controller and use hardware-based RAID.
One common characteristic of RAID arrays is that they use drives that are hot swappable. This means that we’re able to remove and insert drives as the system continues to run. It’s common to use this hot-swappable drive function in a chassis such as this one, where you can put multiple drives into the chassis, and then remove or add drives at any time. So if we do have a drive outage, we can remove the failed drive from the chassis, put a new drive in its place, and still maintain that 100% uptime.
The first type of RAID format we’ll look at is RAID 0 or striping. Striping means that we’re going to split a file between two or more different drives. So we would put one block of a file on the first disk. We put another block of the file on another disk. And then we would alternate back and forth until we put the entire file across both of those drives.
This, theoretically, means that we have higher performance, because we can write a little bit of information to multiple drives simultaneously. But as you’ve probably already guessed, if we’re splitting the file across these multiple drives and one of these drives happens to fail, then we no longer have access to any of these files.
That means there’s no redundancy with RAID 0. If you have a single-drive failure, you lose access to all of your data. One easy way to remember this is that RAID 0 means that you have 0 redundancy.
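To make the striping idea concrete, here is a minimal Python sketch. It is only an illustration, not how a real RAID driver works: the function names `stripe` and `read_striped`, the 4-byte block size, and the use of `None` to stand for a failed drive are all assumptions made for this example.

```python
def stripe(data: bytes, num_drives: int = 2, block_size: int = 4) -> list:
    """Split data into blocks and deal them out to the drives in turn."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), block_size):
        block_number = offset // block_size
        drives[block_number % num_drives].append(data[offset:offset + block_size])
    return drives


def read_striped(drives: list) -> bytes:
    """Reassemble the data; a failed drive is represented as None (an assumption)."""
    if any(drive is None for drive in drives):
        raise IOError("a drive has failed, and RAID 0 keeps no second copy")
    blocks = []
    rows = max(len(drive) for drive in drives)
    for row in range(rows):
        for drive in drives:
            if row < len(drive):
                blocks.append(drive[row])
    return b"".join(blocks)


drives = stripe(b"ABCDEFGHIJKL", num_drives=2, block_size=4)
print(drives)                # [[b'ABCD', b'IJKL'], [b'EFGH']]
print(read_striped(drives))  # b'ABCDEFGHIJKL'
```

If you set `drives[1] = None` and call `read_striped(drives)` again, the sketch raises an error, which mirrors the point that a single-drive failure takes the whole RAID 0 array with it.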
RAID 1 is a type of RAID that mirrors information across different drives. We’ll take this same scenario of having two different drives. But instead of splitting a file across the drives, we will write exactly the same information to both drives, effectively making each drive a mirror of the other.
This means that we’re going to be using a lot of drive space. We’ll effectively be doubling the amount of drive space required to store a single file. But this also means that, if we lose one of these drives, we still have 100% uptime. Because we have another working drive where an exact duplicate of that information happens to reside.
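Mirroring can be sketched in a few lines of Python as well. This is a toy model under stated assumptions: the `Mirror` class, its method names, and the idea of representing each drive as a dictionary (or `None` once it has failed) are invented for this example.

```python
class Mirror:
    """Toy RAID 1: every write goes to every working drive."""

    def __init__(self, num_drives: int = 2):
        # Each drive is modeled as a dict of file name -> contents.
        self.drives = [dict() for _ in range(num_drives)]

    def write(self, name: str, data: bytes) -> None:
        for drive in self.drives:
            if drive is not None:
                drive[name] = data

    def fail(self, index: int) -> None:
        """Simulate a drive failure by discarding that drive."""
        self.drives[index] = None

    def read(self, name: str) -> bytes:
        # Any surviving drive holds a full copy.
        for drive in self.drives:
            if drive is not None:
                return drive[name]
        raise IOError("all mirrored drives have failed")


array = Mirror()
array.write("photo.jpg", b"vacation")
array.fail(0)
print(array.read("photo.jpg"))  # b'vacation'
```

The read still succeeds after the first drive fails because the second drive holds an exact duplicate, at the cost of storing everything twice.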
For a more efficient use of drive space, we may want to use RAID 5, which is striping with parity. If you remember from RAID 0, we were performing striping, which means we put one block of a file on one drive, one block of a file on another drive, and so on. But the difference with RAID 5, which needs at least three drives, is that each stripe also includes a parity block, and the drive that holds the parity rotates from one stripe to the next. That means that, if we lose any one of these disks, we’re still able to reconstruct the data, because we have the parity.
This makes for a very efficient use of drive space. We don’t have to have an exact duplicate of every file. We can simply reconstruct what’s missing if we happen to lose a drive.
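The space difference is easy to put into numbers. The sketch below assumes every drive in the array is the same size, and the function name `usable_capacity` is made up for this example. It uses the usual rules of thumb: striping uses every drive for data, a mirror stores one drive's worth, and RAID 5 gives up one drive's worth of space to parity.

```python
def usable_capacity(level: int, num_drives: int, drive_size_tb: float) -> float:
    """Usable space in TB, assuming identical drives (a simplification)."""
    if level == 0:
        # Striping: every drive holds unique data.
        return num_drives * drive_size_tb
    if level == 1:
        # Mirroring: every drive holds the same data.
        return drive_size_tb
    if level == 5:
        if num_drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space is consumed by parity.
        return (num_drives - 1) * drive_size_tb
    raise ValueError("this sketch only covers RAID 0, 1, and 5")


print(usable_capacity(1, 2, 4.0))  # 4.0  (half of the 8 TB installed)
print(usable_capacity(5, 4, 4.0))  # 12.0 (three quarters of the 16 TB installed)
```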
If we do happen to lose any one of these drives, all of our data is still available. We still have to reconstruct what’s missing. So there might be a small performance hit as we’re trying to reconstruct this information on the fly. But all of the data remains available, and the array keeps running through a single-drive failure.
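The video doesn’t go into how the parity is calculated, so the following sketch assumes a byte-wise XOR, which is the common way RAID 5 parity is computed. The helper name `xor_blocks` and the sample blocks are invented for this example, and the blocks are assumed to be the same length.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity computed from them.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Pretend the drive holding the second block has failed.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'BBBB'
```

XOR-ing the surviving blocks with the parity cancels out everything except the missing block, which is why the array can rebuild one lost drive but not two.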
With RAID 10, or RAID 1 plus 0 as it’s sometimes called, we have a stripe of mirrors. We’re combining the performance that comes from RAID 0 with the mirroring and redundancy that we have with RAID 1. To configure RAID 1 plus 0, we would start with a set of drives and stripe information across them. So we would take our file, put a block of our file on one drive, a block of our file on the second drive, and a block on the third drive.
But of course, if we lose any one of these drives, we would lose access to all of the data. So to fix that, we would add a mirror of each one of these sets of blocks. So we would have a mirror, or RAID 1, of the first block.
We’d have a RAID 1 of the second block and a RAID 1 of the third block. And we would still be striping across all of those mirrors. So now we’ve combined the performance of RAID 0 with the redundancy of RAID 1 and created RAID 1 plus 0.
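One way to see what a stripe of mirrors buys us is to check which drive failures the array can survive. The sketch below is an assumption-laden model: the function name `raid10_survives` is invented, and drives are numbered so that drives 0 and 1 form the first mirror, drives 2 and 3 the second, and so on.

```python
def raid10_survives(failed_drives: set, num_mirrors: int) -> bool:
    """Return True if every mirrored pair still has at least one working drive."""
    for pair in range(num_mirrors):
        first, second = 2 * pair, 2 * pair + 1
        if first in failed_drives and second in failed_drives:
            # Both copies of this part of the stripe are gone.
            return False
    return True


# Six drives arranged as three mirrored pairs.
print(raid10_survives({0}, 3))        # True: drive 1 still holds the copy
print(raid10_survives({0, 2, 4}, 3))  # True: one drive lost from each pair
print(raid10_survives({0, 1}, 3))     # False: a whole mirror is gone
```

So the array can lose a drive from every mirror and keep running, but losing both drives in the same mirror breaks the stripe, just as a single failure would in plain RAID 0.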