Then there are the Jepsen tests, for humbling you into the right frame of mind for building these systems. Learn how to break things so you can build better things. jepsen.io/analyses
Jack Vanlightly
Likes breaking ideas and systems, writing, picking systems apart @confluentinc
Ex @Splunk, @VMware
jack-vanlightly.com
- Redpanda bring out benchmark after benchmark claiming performance superiority over Apache Kafka. I decided to run my own tests to see if any of it was true.
- Chapter 4 of The Architecture of Serverless Data Systems: CockroachDB (serverless).
- Introducing "The Architecture of Serverless Data Systems". An ongoing review of real-world serverless, multi-tenant data systems.
- If anyone is interested in the details of the Kafka Replication Protocol, I wrote a Raft paper-style description of the protocol last year.
  Replying to @eatonphil:
  Consensus for data: spanner, mongo, dynamodb, cosmosdb (surprised by these three), cockroach, tidb, yugabyte, redpanda.
  Consensus for metadata: kafka, foundationdb, memsql, clickhouse, elasticsearch, planetscale, aurora, (edb postgres distributed; product I work on), etc.
- Queue semantics are coming to Apache Kafka (KIP-932) and in fact there are many advantages to building queues on top of logs rather than opting for a more queue-native design.
- I sometimes get asked for advice about how to learn complex distributed systems. I thought about it and wrote this piece.
- I'm digging into Apache Iceberg internals for the final table format consistency model blog post. Part of my process of understanding a project from its code is making a (throwaway) map of the important classes and functions. This is especially important in the early hours of
- BYOC is something I’ve been thinking about recently so I decided to write down the thoughts I have on it and where I think cloud services are going in general.
- The Apache Iceberg post on change query support (including CDC) is out. Delta is next.
- As promised, I have written a complete Kafka replication protocol description (with KIP-966 changes applied) which is inspired by the precise but accessible style and language of the Raft paper.
- Chapter 6 of The Architecture of Serverless Data Systems is out. This chapter focuses on commonalities in how these systems scale according to tenant load. Despite the varied workloads, patterns emerge that we can learn from.
- I've written 18 posts (and counting) on table format internals. I've created a page that contains the list of my writings on the subject, including my formal verification work. Any suggestions on further table format analysis?




