RAID (Redundant Array of Independent Disks) refers to distributing data across multiple disks (SSDs or HDDs, so long as they’re the same type). We use multiple disks either for redundancy, to prevent data loss, or to increase throughput.
Single-number configurations
RAID 0: striped volume. Stores half of the data on one disk and half on another (odd blocks on one, even blocks on the other). Increases throughput since we can pull from both disks at once, but there is no redundancy: any disk failing means we lose data.
RAID 1: mirrors all data across all disks. Read performance can still be sped up, since a read can be served from any copy. More redundancy for good reliability: one disk failing means we’re okay as long as we still have other working disks. Write performance is the same as a single disk, assuming parallel writes.
RAID 4: striping with “parity”. Stripes data across 3 disks, like RAID 0, and uses one extra disk for parity: some bit of information such that if we lose data on one disk, we can recalculate what is lost. Each parity bit is the XOR of the corresponding bits on all 3 data disks; if we lose data on more than one disk we can’t recalculate it. But: writing to any data disk means recalculating the parity bit, so every write also hits the parity disk.
RAID 5: same idea as RAID 4, but distributes the parity blocks across all disks instead of keeping them on a single disk, so that writes to multiple disks don’t all need a write to the same parity disk.
RAID 6: one more parity block per stripe (the second is typically computed with Galois field arithmetic rather than plain XOR). Can recover from 2 simultaneous drive failures.
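A minimal Python sketch of the XOR parity idea behind RAID 4 and RAID 5. The block contents and the helper names (xor_blocks, raid5_parity_disk) are made up for illustration, and the rotation rule is just one simple possibility; real RAID 5 layouts vary.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on three data disks (hypothetical contents).
d0, d1, d2 = b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"
parity = xor_blocks(d0, d1, d2)

# Disk 1 fails: XOR the surviving data blocks with parity to rebuild it.
rebuilt_d1 = xor_blocks(d0, d2, parity)

# Small write to disk 0: parity can be updated from the old parity,
# the old data, and the new data, without reading the other disks.
new_d0 = b"\xff\x00"
new_parity = xor_blocks(parity, d0, new_d0)

def raid5_parity_disk(stripe: int, n_disks: int) -> int:
    """Which disk holds parity for a stripe, under an assumed simple rotation."""
    return (n_disks - 1 - stripe) % n_disks
```

With a single parity disk (RAID 4) every small write touches that one disk; rotating the parity position per stripe, as in raid5_parity_disk, spreads those writes out.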
RAID 0 + 1: a RAID 1 of two RAID 0s (a mirror of two striped sets).
RAID 1 + 0: a RAID 0 of 3 RAID 1s (a stripe across three mirrored sets).
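A rough Python sketch of how the nesting order changes which failures are survivable. The disk numbering and group sizes (2 disks per inner array) are assumptions for illustration, as are the function names.

```python
from __future__ import annotations

def raid01_survives(failed: set[int]) -> bool:
    """RAID 0+1: mirror of two 2-disk stripes, disks {0,1} and {2,3} (assumed)."""
    stripes = [{0, 1}, {2, 3}]
    # A stripe is lost if any member fails; the mirror needs one intact stripe.
    return any(not (stripe & failed) for stripe in stripes)

def raid10_survives(failed: set[int]) -> bool:
    """RAID 1+0: stripe over three 2-disk mirrors {0,1}, {2,3}, {4,5} (assumed)."""
    mirrors = [{0, 1}, {2, 3}, {4, 5}]
    # A mirror is lost only if all members fail; the stripe needs every mirror.
    return all(bool(mirror - failed) for mirror in mirrors)
```

For example, raid01_survives({0, 2}) is False because each stripe has lost a member, while raid10_survives({0, 2, 4}) is True because every mirror still has one working disk.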