Posts

Cloudspecs: Cloud Hardware Evolution Through the Looking Glass

This paper (CIDR'26) presents a comprehensive analysis of cloud hardware trends from 2015 to 2025, focusing on AWS and comparing it with other clouds and with on-premise hardware. TL;DR: While network bandwidth per dollar improved by one order of magnitude (10x), CPU and DRAM gains (again in performance-per-dollar terms) have been much more modest. Most surprisingly, NVMe storage performance in the cloud has stagnated since 2016. Check out the NVMe SSD discussion below for data on this anomaly.

CPU Trends
Multi-core parallelism has skyrocketed in the cloud. Maximum core counts have increased by an order of magnitude over the last decade. The largest AWS instance, u7in, now boasts 448 cores. However, simply adding cores has not translated linearly into value. To measure real evolution, the authors normalized benchmarks (SPECint, TPC-H, TPC-C) by instance cost. SPECint benchmarking shows that cost-performance improved roughly 3x over ten years. A huge chunk of that gain comes from AWS G...

The Sauna Algorithm: Surviving Asynchrony Without a Clock

While sweating it out in my gym's sauna recently, I found a neat way to illustrate the happened-before relationship in distributed systems. Imagine I suffer from a medical condition called dyschronometria, which makes me unable to perceive time reliably, such that 10 seconds and 10 minutes feel exactly the same to me. In this scenario, the sauna lacks a visible clock. I'm flying blind here, yet I want to leave after a healthy session. If I stay too short, I get no health benefits. If I stay too long, I risk passing out on the floor. The question becomes: How do I, a distributed node with no local clock, ensure I operate within a safety window in an asynchronous environment? Thankfully, the sauna has a steady arrival of people. Every couple of minutes, a new person walks in. These people don't suffer from dyschronometria, and they stay for a healthy session, roughly 10 minutes. My solution is simple: I identify the first person to enter after me, and I leave when he leaves....
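The rule can be sketched as a tiny simulation. This is my own toy rendering, not code from the post; the arrival times are made up, and only the steady arrivals and the roughly 10-minute healthy session come from the scenario:

```python
# Hedged sketch of the sauna algorithm: I cannot read a clock, but I can
# observe arrivals and departures. I leave when the first person who
# arrived after me leaves. All timings here are hypothetical.

def sauna_exit_time(my_entry: float, arrivals: list[float], session: float = 10.0) -> float:
    """Return the wall-clock time I leave: when the first person to
    arrive after me finishes their `session`-minute stay."""
    first_after_me = min(t for t in arrivals if t > my_entry)
    return first_after_me + session

# People arrive every 2 minutes; I walk in at minute 3.
arrivals = [0, 2, 4, 6, 8, 10, 12]
exit_t = sauna_exit_time(my_entry=3, arrivals=arrivals)
print(exit_t - 3)  # my stay: 11 minutes
```

Note the safety window: my stay is always between one session (10 minutes) and one session plus one arrival gap, without my ever consulting a clock.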

Are Database System Researchers Making Correct Assumptions about Transaction Workloads?

In this blog, we have reviewed quite a number of deterministic database papers, including Calvin, SLOG, and Detock, which aim to achieve higher throughput and lower latency. The downside of these systems is that they sacrifice transaction expressivity. They rely on two critical assumptions: first, that transactions are "non-interactive", meaning they are sent as a single request (one-shot) rather than engaging in a multi-round-trip conversation with the application; and second, that the database can know a transaction's read/write set before execution begins (to lock data deterministically). So when these same deterministic database researchers write a paper to validate how those assumptions hold in the real world, we should read it with skepticism and caution. Don't get me wrong: this is a great and valuable paper. We just need to stay critical as we read it.

Summary
The study employed a semi-automated annotation tool to analyze 111 popular open-source web applications...
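To make the two assumptions concrete, here is a hedged toy sketch contrasting the two transaction shapes. MiniDB and the transfer functions are invented for illustration; they are not from the paper or from any of these systems:

```python
# Hedged illustration of the two assumptions deterministic databases make.
# MiniDB is an invented in-memory stand-in, not a real database API.

class MiniDB:
    def __init__(self, data):
        self.data = dict(data)

# One-shot: the whole transaction ships as a single request, and its
# read/write set ({src, dst}) is declared before execution begins, so the
# database can acquire locks deterministically up front.
def transfer_one_shot(db, src, dst, amount):
    read_write_set = {src, dst}
    assert read_write_set <= db.data.keys()   # known before execution
    if db.data[src] >= amount:                # logic runs server-side, one round trip
        db.data[src] -= amount
        db.data[dst] += amount

# Interactive: multiple round trips, and whether any write happens depends
# on a value read mid-transaction -- the write set is unknown in advance.
def transfer_interactive(db, src, dst, amount):
    balance = db.data[src]                    # round trip 1: read
    if balance >= amount:                     # application-side decision
        db.data[src] = balance - amount       # round trip 2: write
        db.data[dst] = db.data[dst] + amount  # round trip 3: write

db = MiniDB({"alice": 100, "bob": 20})
transfer_one_shot(db, "alice", "bob", 30)
print(db.data)  # {'alice': 70, 'bob': 50}
```

The paper's question is essentially: how often do real applications fit the first shape rather than the second?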

Too Close to Our Own Image?

Recent work suggests we may be projecting ourselves onto LLMs more than we admit. A paper in Nature reports that GPT-4 exhibits "state anxiety". When exposed to traumatic narratives (such as descriptions of accidents or violence), the model's responses score much higher on a standard psychological anxiety inventory. The jump is large, from "low anxiety" to levels comparable to highly anxious humans. The same study finds that therapy works: mindfulness-style relaxation prompts reduce these scores by about a third, though not back to baseline. The authors argue that managing an LLM's emotional state may be important for safe deployment, especially in mental health settings and perhaps in other mission-critical domains. Another recent paper argues that LLMs can develop a form of brain rot. Continual training on what the authors call junk data (short, viral, sensationalist content typical of social media) leads to models developing weaker reasoning, poorer lon...

The Agentic Self: Parallels Between AI and Self-Improvement

2025 was the year of the agent. The goalposts for AGI shifted; we stopped asking AI to merely "talk" and demanded that it "act". As an outsider looking at the architecture of these new agents and agentic systems, I noticed something strange. The engineering tricks used to make AI smarter felt oddly familiar. They read less like computer science and more like… self-help advice. The secret to agentic intelligence seems to lie in three very human habits: writing things down, talking to yourself, and pretending to be someone else. They are almost too simple.

The Unreasonable Effectiveness of Writing
One of the most profound pieces of advice I ever read as a PhD student came from Prof. Manuel Blum, a Turing Award winner. In his essay "Advice to a Beginning Graduate Student", he wrote: "Without writing, you are reduced to a finite automaton. With writing you have the extraordinary power of a Turing machine." If you try to hold a complex argument enti...

Rethinking the Cost of Distributed Caches for Datacenter Services

This paper (HOTNETS'25) re-teaches a familiar systems lesson: caching is not just about reducing latency, it is also about saving CPU! The paper makes this point concrete by focusing on the second-order effect that often dominates in practice: the monetary cost of computation. The paper shows that caching --even after accounting for the cost of the DRAM you use for caching-- still yields 3-4x better cost efficiency, thanks to the reduction in CPU usage. In today's cloud pricing model, that CPU cost dominates. DRAM is cheap. Well, was cheap... The irony is that after this paper got presented, DRAM prices jumped by 3-4x! Damn Machine Learning, ruining everything since 2018! Anyway, let's conveniently ignore that point and get back to the paper. OK, so caches do help, but when do they help the most? Many database-centric or storage-side cache designs miss this point. Even when data is cached at the storage/database cache, an application read still needs to travel there, pay fo...
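The arithmetic behind the claim is easy to sketch. Everything below (request volume, per-request CPU costs, DRAM rent) is a made-up toy model, not the paper's numbers; it only illustrates how a cache hit converts an expensive CPU-bound read into a cheap DRAM lookup:

```python
# Hedged back-of-envelope model: caching saves money mostly by saving CPU.
# All numbers are invented for illustration; none come from the paper.

def monthly_cost(requests, hit_rate, cpu_cost_miss, cpu_cost_hit, dram_rent):
    """Total dollar cost: CPU cost of hits + CPU cost of misses + DRAM rent."""
    hits = requests * hit_rate
    misses = requests * (1 - hit_rate)
    return hits * cpu_cost_hit + misses * cpu_cost_miss + dram_rent

REQS = 1_000_000_000                    # requests per month (hypothetical)
no_cache   = monthly_cost(REQS, 0.0, cpu_cost_miss=2e-6, cpu_cost_hit=1e-7, dram_rent=0)
with_cache = monthly_cost(REQS, 0.8, cpu_cost_miss=2e-6, cpu_cost_hit=1e-7, dram_rent=50)

print(f"no cache: ${no_cache:.0f}/mo, with cache: ${with_cache:.0f}/mo, "
      f"{no_cache / with_cache:.1f}x cheaper")   # ~3.8x with these toy numbers
```

The DRAM rent shows up as a small additive term, while the CPU savings scale with every request, which is why the cost win survives even after paying for the cache.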

Randomer Things

I aspire to get bored in the new year
I've realized that chess has been eating my downtime. Because it lives on my phone (Lichess), it is frictionless to start a bullet game and get a quick dopamine hit. The problem is that I no longer get bored. That is bad. I need to get bored so I can start to imagine, daydream, think, self-reflect, plan, or even get mentally prepared for things (like the Stoics talked about). I badly need that empty space back. So bye, chess. Nothing personal. I will play only when teaching or playing with my daughters. I may occasionally cheat and play a bullet game on my wife's phone. But no more chess apps on my phone. While I was at it, I installed the Website Blocker extension for Chrome. I noticed my hands typing reddit or twitter at the first hint of boredom. The blocker is easy to disable, but that is fine. I only need that slight friction to catch myself before opening the site on autopilot.

I am disappointed by online discourse
In 2008, Reddit had a...

LeaseGuard: Raft Leases Done Right!

Many distributed systems have a leader-based consensus protocol at their heart. The protocol elects one server as the "leader", who receives all writes. The other servers are "followers", hot standbys who replicate the leader’s data changes. Paxos and Raft are the most famous leader-based consensus protocols. These protocols ensure consistent state machine replication, but reads are still tricky. Imagine a new leader L1 is elected while the previous leader L0 thinks it's still in charge. A client might write to L1, then read stale data from L0, violating Read Your Writes. How can we prevent stale reads? The original Raft paper recommended that the leader communicate with a majority of followers before each read, to confirm it is still the real leader. This guarantees Read Your Writes, but it is slow and expensive. A leader lease is an agreement among a majority of servers that one server will be the only leader for a certain time. This means the leader can run...
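The lease fast path can be sketched as below. This is a generic illustration, not LeaseGuard's actual code: the names are invented, the lease grant is simulated locally, and clock-skew handling (the hard part leases must get right) is omitted:

```python
# Hedged sketch of lease-based reads: a leader may serve reads locally only
# while its lease is unexpired; otherwise it must fall back to confirming
# leadership with a majority. Names and structure are illustrative only.
import time

class LeaderLease:
    def __init__(self, duration_s: float, clock=time.monotonic):
        self.duration_s = duration_s
        self.clock = clock
        self.expires_at = 0.0

    def renew(self):
        # In a real protocol the lease is granted by a majority of servers;
        # here we only record the local expiry time.
        self.expires_at = self.clock() + self.duration_s

    def can_serve_local_read(self) -> bool:
        return self.clock() < self.expires_at

# Simulated clock so the example is deterministic.
now = [0.0]
lease = LeaderLease(duration_s=5.0, clock=lambda: now[0])
lease.renew()
print(lease.can_serve_local_read())   # True: lease held, fast local read
now[0] = 6.0
print(lease.can_serve_local_read())   # False: must fall back to a quorum check
```

A deposed leader like L0 fails the `can_serve_local_read` check once its lease lapses, which is exactly what closes the stale-read window.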

TLA+ modeling tips

Model minimalistically
Start from a tiny core, and always keep a working model as you extend. Your default should be omission. Add a component only when you can explain why leaving it out would not work. Most models are about a slice of behavior, not the whole system in full glory: e.g., leader election, repair, reconfiguration. Cut entire layers and components if they do not affect that slice. Abstraction is the art of knowing what to cut. Deleting should spark joy.

Model specification, not implementation
Write declaratively. State what must hold, not how it is achieved. If your spec mirrors control flow, loops, or helper functions, you are simulating code. Cut it out. Every variable must earn its keep. Extra variables multiply the state space (and model checking time) and hide bugs. Ask yourself repeatedly: can I derive this instead of storing it? For example, you do not need to maintain a WholeSet variable if you can define it as a state function of existing variables: WholeSet =...
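The derived-variable advice looks like this in TLA+. A minimal hedged sketch with invented names (Nodes, joined); the point is that WholeSet is a definition, not a VARIABLE, so it adds nothing to the state space:

```tla
---- MODULE DerivedState ----
\* Hedged sketch with invented names: only `joined` is a state variable.
CONSTANT Nodes
VARIABLE joined          \* joined \in [Nodes -> BOOLEAN]

\* Defined as a state function of existing variables, not stored:
WholeSet == {n \in Nodes : joined[n]}
====
```

Because WholeSet is derived, it can never drift out of sync with `joined`, which is exactly the class of bug that redundant variables hide.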

Brainrot

I drive my daughter to school as part of a carpool. Along the way, I am learning a new language, Brainrot. So what is brainrot? It is what you get when you marinate your brain in silly TikTok, YouTube Shorts, and Reddit memes. It is slang for "my attention span is fried and I like it". Brainrot is a self-deprecating language. Teens are basically saying: I know this is dumb, but I am choosing to speak it anyway. What makes brainrot different from old-school slang is its speed and scale. When we were teenagers, slang spread by word of mouth. It mostly stayed local in our school hallways or neighborhood. Now memes go global in hours. A meme is born in Seoul at breakfast and is widespread in Ohio by six seven pm. The language mutates at escape velocity and gets weird fast. Someone even built a brainrot programming language. The joke runs deep, and is getting some infrastructure. Here are a few basic brainrot terms you will hear right away. He is cooked: it means he is finis...

Best of metadata in 2025

It is that time of year again to look back on a year of posts. I average about sixty posts annually. I don't explicitly plan for the number, and I sometimes skip weeks for travel or work, yet I somehow hit the number by December. Looking back, I always feel a bit proud. The posts make past Murat look sharp and sensible, and I will not argue with that. Here are some of the more interesting pieces from the roughly sixty posts of 2025.

Advice
Looks like I wrote several advice posts this year. I must be getting old.
The Invisible Curriculum of Research
Academic chat: On PhD
What I'd do as a College Freshman in 2025
My Time at MIT
What makes entrepreneurs entrepreneurial?
Publish and Perish: Why Ponder Stibbons Left the Ivory Tower

Databases
Concurrency Control book reading was fun. Also the series on use of time in distributed databases. And it seems like I got hyperfocused on transaction isolation this year.
Concurrency Control and Recovery in Database Systems Book reading series...

Optimize for momentum

Progress comes from motion. Momentum is the invisible engine of any significant work. A project feels daunting when you face it as a blank page. It feels easier once you have built some momentum with a few next steps. So, momentum makes the difference between blocked and flowing. Think of a stalled truck on a desert road. You can't lift it with superhuman strength. But by rocking it with small periodic pushes at the right rhythm (matching its natural frequency), you can get it rolling. Each tiny push adds to the previous one because the timing aligns with the system's response. The truck starts to move, and then the engine catches. Projects behave the same way. A big project has its own rhythm. If you revisit it daily, even briefly, your pushes line up. Your brain stays warm. Context stays loaded. Ideas from yesterday are still alive today. Each session amplifies the last because you are operating in phase with your own momentum. When you produce something every day, you never feel...

Mitigating Application Resource Overload with Targeted Task Cancellation

The Atropos paper (SOSP'25) argues that overload-control systems are built on a flawed assumption. They monitor global signals (like queue length or tail latency) to adjust admission control (throttling new arrivals or dropping random requests). This works when the bottleneck is CPU or network, but it fails when the real problem is inside the application. Such monitoring sees only the symptoms, not the source. As a result, it drops the victims rather than the culprits. Real systems often run into overload because one or two unluckily timed requests monopolize an internal logical resource (like buffer pools, locks, and thread-pool queues). These few rogue whales have nonlinear effects. A single ill-timed dump query can thrash the buffer pool and cut throughput in half. A single backup thread combined with a heavy table scan can stall writes in MySQL, as seen in Figure 3. The CPU metrics will not show this. Atropos proposes a simple fix to this problem. Rather than throttling or dropping ...
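The contrast between the two policies can be shown in a toy model. This is my own illustration, not Atropos's mechanism or code; the task records and resource accounting are invented:

```python
# Hedged toy model of the paper's point: under overload, admission control
# sheds the victims (random newcomers), while targeted cancellation removes
# the culprit monopolizing the internal resource. Fields are invented.

tasks = [
    {"id": "dump-query", "buffer_pool_pages": 9000, "is_new": False},  # the whale
    {"id": "point-read-1", "buffer_pool_pages": 10, "is_new": True},
    {"id": "point-read-2", "buffer_pool_pages": 12, "is_new": True},
]

def admission_control(tasks):
    """Symptom-driven: shed load by rejecting new arrivals (the victims)."""
    return [t["id"] for t in tasks if t["is_new"]]

def targeted_cancellation(tasks):
    """Source-driven: cancel the top consumer of the contended resource."""
    culprit = max(tasks, key=lambda t: t["buffer_pool_pages"])
    return [culprit["id"]]

print(admission_control(tasks))      # ['point-read-1', 'point-read-2']
print(targeted_cancellation(tasks))  # ['dump-query']
```

The first policy kills two cheap requests and leaves the buffer pool thrashing; the second frees almost all of the contended resource with a single cancellation.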
