Disk can lie to you when you write to it

https://news.ycombinator.com/rss Hits: 2
Summary

A write-ahead log (WAL) is one of those database concepts that sounds deceptively simple. You write a record to disk before applying it to your in-memory state. If you crash, you replay the log and recover. Done.

Except your disk is lying to you.

PostgreSQL, SQLite, RocksDB, Cassandra... every production system that claims to be durable relies on a WAL. It's the fundamental contract: "Write here, and I promise your data survives." But making that promise actually stick requires understanding all the ways disks fail silently.

The Naive Approach vs Reality

Let's say you implement a WAL like this:

    write(fd, record, sizeof(record)); // Done, right... RIGHT?

In a test environment on your laptop, this works great. But when you handle millions of writes a day, those 1 in a million errors happen multiple times a day. Some of these systems will fail in ways your tests never catch:

The page cache problem: That write() just copied your data into the kernel's buffer. It hasn't touched the disk yet. Crash now, and it's gone.

The disk that lies about success: Your write() returns success. The kernel tells you it's synced. The disk firmware tells you it's on stable storage. Then a latent sector error silently corrupts it anyway.

The ordering chaos: Write operation A starts. Write operation B starts. B completes first. Your recovery code sees B without A and has no idea what happened.

The single point of failure: One bad sector on your only copy of the WAL, and you lose everything.

This is why people who've lost data in production are paranoid about durability. And rightfully so.

Building the Better Mousetrap

There are five layers of defense that we can use to build a better mousetrap. Think of them as increasingly specific answers to the question: "How can this fail?"

Layer 1: Checksums (CRC32C)

Every record includes a checksum of its contents. After writing, we verify the checksum hasn't changed.
Simple, right?

Record Header (20 bytes):

    [magic_num: 4][sequence_num: 8][checksum: 4] [payload: variabl...
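
To make the first layer concrete, here is a minimal sketch in C of a checksummed record append, combining the checksum idea with the page cache point from earlier (loop until write() has taken every byte, then fsync()). It is not the article's implementation: the header layout shown above is truncated, so the 4-byte length field, the WAL_MAGIC value, host byte order, and every function name below are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define WAL_MAGIC 0x57414C31u /* hypothetical magic value */
#define WAL_HDR   20u         /* 4 magic + 8 seq + 4 crc + 4 length (length field is assumed) */

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. Slow but short. */
uint32_t crc32c_update(uint32_t crc, const uint8_t *p, size_t len) {
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

uint32_t crc32c(const void *data, size_t len) {
    return ~crc32c_update(0xFFFFFFFFu, (const uint8_t *)data, len);
}

/* Checksum over everything except the magic and the checksum field itself. */
uint32_t record_crc(const uint8_t *rec, size_t total) {
    uint32_t c = crc32c_update(0xFFFFFFFFu, rec + 4, 8); /* sequence number */
    c = crc32c_update(c, rec + 16, total - 16);          /* length + payload */
    return ~c;
}

/* Fills buf with one record; returns its total size, or 0 if it does not fit. */
size_t wal_encode(uint8_t *buf, size_t cap, uint64_t seq,
                  const void *payload, uint32_t len) {
    size_t total = (size_t)WAL_HDR + len;
    uint32_t magic = WAL_MAGIC, crc;
    if (total > cap) return 0;
    memcpy(buf, &magic, 4);       /* host byte order, for brevity */
    memcpy(buf + 4, &seq, 8);
    memcpy(buf + 16, &len, 4);
    if (len) memcpy(buf + 20, payload, len);
    crc = record_crc(buf, total);
    memcpy(buf + 12, &crc, 4);
    return total;
}

/* Returns 1 if buf holds a complete record whose checksum matches, else 0. */
int wal_verify(const uint8_t *buf, size_t avail) {
    uint32_t magic, crc, len;
    if (avail < WAL_HDR) return 0;
    memcpy(&magic, buf, 4);
    memcpy(&crc, buf + 12, 4);
    memcpy(&len, buf + 16, 4);
    if (magic != WAL_MAGIC || len > avail - WAL_HDR) return 0;
    return record_crc(buf, (size_t)WAL_HDR + len) == crc;
}

/* Writes the whole buffer (write() may be short), then asks the kernel to
   flush it to the device. Returns 0 on success, -1 on error. */
int wal_append(int fd, const uint8_t *buf, size_t n) {
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0) {
            if (errno == EINTR) continue; /* interrupted: retry */
            return -1;
        }
        buf += w;
        n -= (size_t)w;
    }
    return fsync(fd);
}
```

A caller would encode a record with wal_encode(), pass the buffer to wal_append() on a descriptor opened for appending, and run wal_verify() on each record during replay, stopping at the first one that fails. A successful fsync() is necessary but not sufficient: as the failure modes above point out, a record can still be corrupted after the fact, which is what the read-side checksum is meant to catch.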

First seen: 2025-12-14 21:55

Last seen: 2025-12-14 22:55