Michael Elgart

Book Review: Designing Data-Intensive Applications

This textbook is excellent and I wish I had read it sooner.

In particular, I have spent most of my career so far focused on gaining experience at work and on projects. That hands-on experience is key, but getting a larger overall picture of the world is also really important. Designing Data-Intensive Applications came out in 2017, by which point I had been out of school for a while; presumably the book is better known there.

Many STEM careers can place you in a more academic position, where I think keeping up with the latest textbooks and papers is simply expected. But engineers in the private sector can obviously benefit from some textbook exposure as well, to learn current best practices and accumulated knowledge. What’s more, there are clear benefits to the private firms whose employees read this book! I’m fairly certain my old job had O’Reilly subscriptions, but I just didn’t know this was the thing to read.

As for content, this book covers just about everything you need to know about data backends. I’ve done a fair amount of work on front-end systems and building microservices, but ultimately my data systems exposure has been limited to fairly straightforward implementations. This book teaches you about replication, sharding, consensus, and reliability, and also how to think about batch processing vs. stream processing. The O’Reilly site for the book contains the entire (extensive!) table of contents, and I honestly recommend reading that if you have even the slightest interest in software, computer hardware, data systems, or the modern internet.

The detail is tremendous. At the risk of repeating myself, I really didn’t understand what I was missing or what I was getting into. I had thought of this as just another nonfiction book, but it really is a textbook, which means it’s large (not just 500 pages, but textbook-sized pages!) and dense. While taking notes, it was hard not to write something down for every paragraph, so even my notes are extremely long.

The book is also quite clear, explaining everything in a way that’s understandable. The only tiny quibble I had in the 550+ pages was that the author, Martin Kleppmann, has obviously not read Paul Sztorc’s “Nothing is Cheaper Than Proof of Work,” despite it being written in 2015. But at least this was restricted to the speculative and opinionated final chapter, so I can’t fault it too much.

So of course, I highly recommend it. I’m also somewhat excited, since I now wonder what other important texts I’ve been missing. I also think I’ve got a slightly better system for finding out what people at my current employer (Facebook) think are good programming/software books, so I’m hopeful it won’t be another 7 years before I read another important text!