RAID (Redundant Array of Independent Disks) refers to storing data distributed across multiple disks (SSDs or HDDs, normally all of the same type). We use multiple disks for redundancy, to prevent data loss, or to increase throughput.

Single-number configurations

RAID 0: striped volume. Stores half the data on one disk and half on another (odd blocks on one, even blocks on the other). Increases throughput, since we can read from both disks at once. But there is no redundancy: any one disk failing means data is lost, since every disk holds part of the data.
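A minimal sketch of the odd/even striping idea, assuming block-level round-robin placement. The names `stripe` and `unstripe` are made up for illustration; a real controller works on fixed-size chunks rather than Python lists.

```python
def stripe(blocks, num_disks=2):
    """Distribute blocks round-robin: block i goes to disk i % num_disks."""
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks

def unstripe(disks):
    """Reassemble the original block order by interleaving the disks."""
    total = sum(len(d) for d in disks)
    n = len(disks)
    return [disks[i % n][i // n] for i in range(total)]

data = ["b0", "b1", "b2", "b3", "b4"]
disks = stripe(data)
print(disks)            # [['b0', 'b2', 'b4'], ['b1', 'b3']]
print(unstripe(disks))  # ['b0', 'b1', 'b2', 'b3', 'b4']
```

Reading needs every disk: if either list is lost, `unstripe` cannot rebuild the original sequence.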

RAID 1: mirrors all data across all disks. Read performance can still be sped up, since any disk can serve a read. More redundancy gives good reliability: if one disk fails, we're okay as long as at least one other disk still works. Write performance is the same as a single disk, assuming the writes to each disk happen in parallel.

RAID 4: striping with “parity”. Stripes data across 3 disks, like RAID 0, and uses one extra disk for parity: extra information such that if we lose the data on one disk, we can recalculate what was lost. Each parity bit is the XOR of the corresponding bits on all 3 data disks. If we lose more than one disk, we can't recalculate. But: writing to any data disk also means recalculating and rewriting the parity bit, so every write hits the single parity disk.
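The XOR parity idea can be sketched in a few lines. This assumes 3 data disks holding equal-sized byte blocks; `xor_blocks` and `update_parity` are illustrative names, not part of any RAID tool.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def update_parity(old_parity, old_data, new_data):
    """Parity after overwriting one data block: cancel the old data, add the new."""
    return xor_blocks([old_parity, old_data, new_data])

data_disks = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]
parity = xor_blocks(data_disks)

# Disk 1 fails: XOR the surviving data disks with the parity to rebuild it.
rebuilt = xor_blocks([data_disks[0], data_disks[2], parity])
assert rebuilt == data_disks[1]

# Overwriting disk 0 also forces a write to the parity disk.
new_block = b"\xaa\xbb"
new_parity = update_parity(parity, data_disks[0], new_block)
assert new_parity == xor_blocks([new_block, data_disks[1], data_disks[2]])
```

The second half shows why the parity disk becomes a bottleneck: every data write, whichever disk it targets, ends with a write to the same parity disk.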

RAID 5: same idea as RAID 4, but distributes the parity blocks across all disks instead of keeping them on a single parity disk, so that writes to multiple disks don't all need a write to the same disk.
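One way to picture distributed parity is a layout table where the parity block moves to a different disk on each stripe. The rotation rule below is an assumption chosen for illustration; real implementations use several different layouts.

```python
def parity_disk(stripe_index, num_disks):
    """Hypothetical rotation: parity moves one disk to the left each stripe."""
    return (num_disks - 1 - stripe_index) % num_disks

def layout(num_stripes, num_disks):
    """Return one row per stripe, marking data blocks D0, D1, ... and parity P."""
    rows = []
    next_block = 0
    for s in range(num_stripes):
        p = parity_disk(s, num_disks)
        row = []
        for d in range(num_disks):
            if d == p:
                row.append("P")
            else:
                row.append(f"D{next_block}")
                next_block += 1
        rows.append(row)
    return rows

for row in layout(4, 4):
    print(row)
# ['D0', 'D1', 'D2', 'P']
# ['D3', 'D4', 'P', 'D5']
# ['D6', 'P', 'D7', 'D8']
# ['P', 'D9', 'D10', 'D11']
```

Writes to D0 and D3 now update parity on different disks, so they can proceed in parallel instead of queueing on one parity disk.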

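RAID 6: like RAID 5, with one more parity block per stripe. The second parity block is typically computed with Galois field arithmetic (a Reed-Solomon style code) rather than plain XOR. Can recover from 2 simultaneous drive failures.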
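To make the Galois field remark concrete, here is a sketch of one common construction: P is the plain XOR parity, and Q is an XOR of the data bytes after each is multiplied by a different power of 2 in GF(2^8). The reducing polynomial 0x11D, the generator 2, and all function names are assumptions of this sketch, and it works on one byte per disk for brevity.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by 0x11D (assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    """Inverse of a nonzero byte: a^254, since a^255 == 1 in GF(2^8)."""
    return gf_pow(a, 254)

def compute_pq(data):
    """P is plain XOR; Q weights data disk i by 2^i before XORing."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q

def recover_two(data, x, y, p, q):
    """Rebuild data[x] and data[y] (treated as lost) from the rest plus P and Q."""
    p_rest, q_rest = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            p_rest ^= d                       # ends as D_x ^ D_y
            q_rest ^= gf_mul(gf_pow(2, i), d)  # ends as 2^x*D_x ^ 2^y*D_y
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(q_rest ^ gf_mul(gy, p_rest), gf_inv(gx ^ gy))
    dy = p_rest ^ dx
    return dx, dy

data = [0x41, 0x7E, 0xC3, 0x08]  # one byte per data disk
p, q = compute_pq(data)
assert recover_two(data, 1, 3, p, q) == (data[1], data[3])
```

Two unknowns need two independent equations, which is why a second plain XOR would not help: P and Q together form a solvable 2x2 linear system over the field.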

RAID 0 + 1: a RAID 1 (mirror) of two RAID 0s (stripes).

RAID 1 + 0: a RAID 0 (stripe) of 3 RAID 1s (mirrors).
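The two nestings behave differently under failure, which a small enumeration can show. This sketch assumes 6 disks in both cases (two 3-disk stripes mirrored for RAID 0 + 1, three 2-disk mirrors striped for RAID 1 + 0) and counts which two-disk failures each arrangement survives. The disk numbering and function names are illustrative.

```python
from itertools import combinations

def raid10_survives(failed, mirrors):
    """RAID 0 of RAID 1s: every mirror needs at least one working disk."""
    return all(any(d not in failed for d in mirror) for mirror in mirrors)

def raid01_survives(failed, stripes):
    """RAID 1 of RAID 0s: at least one stripe set must be fully intact."""
    return any(all(d not in failed for d in stripe) for stripe in stripes)

disks = range(6)
mirrors = [(0, 1), (2, 3), (4, 5)]  # RAID 1 + 0: three mirrored pairs, striped
stripes = [(0, 1, 2), (3, 4, 5)]    # RAID 0 + 1: two 3-disk stripes, mirrored

pairs = [set(p) for p in combinations(disks, 2)]
raid10_ok = sum(raid10_survives(f, mirrors) for f in pairs)
raid01_ok = sum(raid01_survives(f, stripes) for f in pairs)
print(f"RAID 1 + 0 survives {raid10_ok}/{len(pairs)} two-disk failures")  # 12/15
print(f"RAID 0 + 1 survives {raid01_ok}/{len(pairs)} two-disk failures")  # 6/15
```

Under these assumptions RAID 1 + 0 only fails when both disks of the same mirror die, while RAID 0 + 1 fails whenever the two failures land in different stripe sets.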