We rely on our storage devices for access to our most important documents. In this video, you’ll learn how RAID can be used to maintain the uptime and availability of our data when a drive fails.
It’s not uncommon to use hard drives to store large amounts of information. Terabytes upon terabytes of your data can be stored on a single spinning drive. And of course, because there are moving parts inside of these drives, they will eventually fail. Fortunately, there are things you can do so that when a drive fails, your data remains available. And in this video, we’ll look at how RAID arrays can be configured to provide this type of data redundancy.
As we go through this video, it may seem that RAID would be a very good way to back up your data. But the data in a RAID array is not a copy of your data. It is the data that you are actively retrieving or storing. And because of that, you have to keep in mind that RAID is not a method of backup. It’s simply a way to maintain uptime and availability if one of those drives happens to fail.
The term RAID is an acronym for Redundant Array of Independent Disks. Earlier versions of this acronym used the word Inexpensive instead of Independent, but it’s effectively the same idea. We can use multiple storage drives in a system that work together to maintain this uptime and availability. There are different types of RAID. We refer to these different types as different RAID levels. And although the name is Redundant Array of Independent Disks, there are some RAID levels that do not provide redundancy. So you have to be very careful when you’re initially setting up your RAID array that you’re choosing the right RAID level.
Let’s step through each one of these RAID levels to see how they might be used. And we’ll start with RAID 0 or striping. We refer to this as striping because we have at least two drives in a RAID 0 array. And instead of writing everything to one drive or everything to the other drive, we take all of our data and we split it evenly between these two drives.
So let’s take the example of a file that is made up of eight different blocks. And we have a disk 0 and a disk 1. That single file will be split evenly between these two disks. So block 1 is on disk 0. Block 2 is on disk 1. Block 3 is back to disk 0. Block 4 is on disk 1. So you can see that distributing this across multiple drives provides a performance increase because, with two separate drives, each drive only has to write or read half of the data.
But one of the problems with this is that if we lose one of these drives, none of this data will now be available because you’ve effectively lost half of the data that you’ve stored. You should always think that RAID 0 is the same as having 0 redundancy.
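To make that block layout concrete, here is a minimal Python sketch of the eight-block example. The function names `stripe_blocks` and `read_file` and the block labels are made up for illustration; a real RAID controller works on raw sectors, not Python lists.

```python
def stripe_blocks(blocks, num_disks=2):
    """Distribute blocks round-robin across disks, as in RAID 0."""
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].append(block)
    return disks


def read_file(disks, failed=None):
    """Reassemble the file; raise if any disk in the stripe has failed."""
    if failed is not None:
        # Half of the blocks lived on the failed disk, so the file is gone.
        raise IOError(f"disk {failed} failed: RAID 0 has no redundancy")
    total = sum(len(disk) for disk in disks)
    return [disks[i % len(disks)][i // len(disks)] for i in range(total)]


blocks = [f"block {n}" for n in range(1, 9)]  # the eight-block file
disks = stripe_blocks(blocks)
print(disks[0])  # ['block 1', 'block 3', 'block 5', 'block 7']
print(disks[1])  # ['block 2', 'block 4', 'block 6', 'block 8']
```

Calling `read_file(disks)` returns the original eight blocks in order, while `read_file(disks, failed=1)` raises an error, which mirrors the point that losing one drive makes the whole file unavailable.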
RAID 1 looks very similar to RAID 0 in its structure. You can see there is a disk 0 and a disk 1. But with RAID 1, we are duplicating data between both of these drives. So each drive effectively is a mirror image of the other. That’s why we refer to RAID 1 as mirroring. You can obviously see in this scenario that we are using twice as much drive space as we would use with RAID 0.
Because we are duplicating everything onto a second drive, we will effectively need twice as much storage to be able to store this same amount of information. But on the plus side, if we lose either of these physical disks, our data remains available because we have an exact duplicate of the data on the other disk. It’s not uncommon when working with a RAID array that you would lose a drive and not even realize it until you receive an alert or message that the drive has failed.
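Here is a small Python sketch of that mirroring behavior. The `MirroredPair` class and its methods are hypothetical names used only for this illustration, and each disk is modeled as a simple dictionary of block numbers to data.

```python
class MirroredPair:
    """Toy RAID 1 model: every write goes to both disks."""

    def __init__(self):
        self.disks = [{}, {}]          # block number -> data, one dict per disk
        self.failed = [False, False]

    def write(self, block_number, data):
        # Write the same data to every disk that is still working.
        for disk, is_failed in zip(self.disks, self.failed):
            if not is_failed:
                disk[block_number] = data

    def fail(self, disk_index):
        # Simulate a drive failure: that disk's contents are gone.
        self.failed[disk_index] = True
        self.disks[disk_index] = {}

    def read(self, block_number):
        # Read from the first disk that is still working.
        for disk, is_failed in zip(self.disks, self.failed):
            if not is_failed:
                return disk[block_number]
        raise IOError("both mirrors have failed")


pair = MirroredPair()
pair.write(1, "payroll")
pair.fail(0)
print(pair.read(1))  # payroll
```

The read still succeeds after disk 0 fails because disk 1 holds an exact duplicate. The cost is visible in the model too: two disks are used to hold one disk's worth of data.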
Instead of duplicating every bit of data that you’re storing on these drives, there is a more efficient method. That method is RAID 5, which we call striping with parity. This is similar to RAID 0, where we take all of the information that we’re storing and we put pieces of that data across different drives. But unlike RAID 0, which, of course, has 0 redundancy, we have an additional drive where we store some parity information. That parity information allows us to rebuild this data if we happen to lose any one of these physical drives.
This is a much more efficient use of drive space because you have all of the data being spread across drives and then one additional drive’s worth of parity information. So if you’re storing this on four separate drives, the capacity of three of those drives is used for data and the capacity of one drive is used for parity. As you can see from our example, we don’t keep all of the parity on one drive. We distribute the parity to different drives, which helps during the recovery process. And that recovery process is important when a single drive fails in that RAID 5 array.
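To picture how the parity moves from drive to drive, here is a short Python sketch that prints one possible layout for four drives. The rotation pattern used here is an assumption for illustration, since real controllers use several different layouts, and the names `raid5_layout` and `usable_fraction` are hypothetical.

```python
def raid5_layout(num_disks, num_stripes):
    """Return a grid of labels: one row per stripe, one column per disk."""
    layout = []
    block = 1
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last disk and moves left.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"B{block}")
                block += 1
        layout.append(row)
    return layout


def usable_fraction(num_disks):
    """Fraction of total capacity left for data with one drive's worth of parity."""
    return (num_disks - 1) / num_disks


for row in raid5_layout(num_disks=4, num_stripes=4):
    print(row)
# ['B1', 'B2', 'B3', 'P']
# ['B4', 'B5', 'P', 'B6']
# ['B7', 'P', 'B8', 'B9']
# ['P', 'B10', 'B11', 'B12']

print(usable_fraction(4))  # 0.75
```

With four drives, three quarters of the total space holds data, compared with one half for a two-drive mirror.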
It may seem that losing a single drive will also cause some of our data to be lost. And although that’s true, we’re able to rebuild this data in real time by taking advantage of the parity information that we stored separately. Having to recalculate this data in real time based on the parity that’s left over could cause a performance hit because there are CPU cycles required to do that. But with the proper processing on the RAID controller or the proper CPU inside of your system, you may not even see that a performance issue is occurring.
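The video doesn’t describe how the parity is calculated, but RAID 5 parity is commonly an XOR of the data blocks in each stripe. The following Python sketch assumes XOR parity and uses made-up four-byte blocks to show how a missing block can be recalculated from the surviving blocks and the parity. The helper name `xor_blocks` is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three data drives (sample data, for illustration only).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # stored on the fourth drive for this stripe

# Simulate losing the second drive: only two data blocks and the parity remain.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])  # True
```

XOR-ing the survivors cancels out everything except the missing block, which is why the array can keep serving data after a single drive fails. That same calculation is the extra processing work mentioned above.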
And the last RAID type we’ll look at is RAID 10. Some people refer to this as RAID 1 plus 0 or a stripe of mirrors. Let’s go back to that original RAID 0 configuration. And in this example, we have RAID 0 with three separate physical drives. And you can see that we are evenly distributing our files across all three of those individual drives. As with RAID 0, if we lose any one of these physical drives, then all of our data is inaccessible.
So with RAID 1 plus 0, we add on the RAID 1 part, or the mirroring aspect, and we mirror each of the drives in that RAID 0 stripe. So now we’re still storing three separate stripes of data, but instead of only having one copy of each stripe, we are mirroring each of those stripes of data. That way if we lose any one of these drives, we’re still up and running because we have an exact duplicate of that stripe. And in our example, we could even lose three separate drives and still have access to all of our data, as long as each of those three drives belongs to a different RAID 1 mirror and the other drive in each pair is still working.
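That rule about which drives can fail is easy to check in code. Here is a minimal Python sketch, where the drive labels and the function name `raid10_survives` are hypothetical: the array stays available as long as no mirrored pair has lost both of its drives.

```python
def raid10_survives(mirror_pairs, failed_drives):
    """Return True if every mirrored pair still has at least one working drive."""
    failed = set(failed_drives)
    return all(not (set(pair) <= failed) for pair in mirror_pairs)


# Three stripes, each mirrored across two drives (labels are illustrative).
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]

print(raid10_survives(pairs, {"A1"}))              # True
print(raid10_survives(pairs, {"A1", "B2", "C1"}))  # True
print(raid10_survives(pairs, {"A1", "A2"}))        # False
```

Three failures spread across three different pairs are survivable, but just two failures inside the same pair take out that stripe, and with it the whole array.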