Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 1 point (0 children)

Definitely stoked too. I initially liked the idea because of startup speed, but after this analysis, it should have some knock-on memory benefits as well.

Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 1 point (0 children)

No memory profiling, though that would have been interesting. Just process monitoring tools like btop and docker stats.

Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 2 points (0 children)

I don't know if I'd worry too much about it unless slow queries are an active problem. I was solving a different one. My ODM/ORM did not support async, and I was tired of that. Plus, the library was falling badly out of maintenance (the last real release was a few years ago). I wanted to replace mongoengine with *something*, and this raw pattern seemed like a good fit to try.

The speed-up and lower memory usage were a sweet bonus.

Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 1 point (0 children)

I'm not sure exactly what the entire CPU usage change would be. I think things are better in general. The raw+dc vs ODM/ORM change almost doubled the requests per second for the same CPU, so that probably dwarfs any other change. Moving in-memory caches -> diskcache means the workers share the cache across processes and across restarts, so that is a bonus. Probably a bit slower at runtime, I would guess, but very minor.

Less memory used means Python's cycle GC is much more efficient. When enough container types (classes, lists, dicts, etc.) get created, that triggers a GC, and the GC now has much less memory to scan. Python is mega aggressive about this: if 700 more container objects have been allocated than freed by ref-counting, that triggers a collection. That could easily happen with just a couple of big queries, so this might be a real boost too.
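You can watch that threshold in action with the stdlib `gc` module. A quick sketch (exact counts vary by interpreter state, but the generation-0 threshold defaults to 700 net container allocations in CPython):

```python
import gc

# Default thresholds: gen0 fires after ~700 net container
# allocations (allocations minus deallocations); the other two
# numbers govern how often gen1/gen2 run.
print(gc.get_threshold())  # typically (700, 10, 10)

before = gc.get_stats()[0]["collections"]

# Allocate a pile of container objects; each list counts
# toward the gen0 threshold, so collections fire repeatedly.
junk = [[i] for i in range(10_000)]

after = gc.get_stats()[0]["collections"]
print(f"gen0 collections triggered: {after - before}")
```

Fewer live containers means each of those collections has less to scan, which is where the efficiency win comes from.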

I posted graphs for the DB change here: https://mkennedy.codes/posts/raw-dc-a-retrospective/

Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 1 point (0 children)

Since you all are discussing the caching part specifically. There was not much complexity change before or after.

We were already using caching, just in-memory caching. What I moved to was a diskcache-backed cache rather than in-memory caching.

It's not "there was no caching" now "there is caching", it's just in-mem caching via either functools.lru_cache or dict() -> diskcache.

Given we already had diskcache in play, that's low effort, low risk.

Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 3 points (0 children)

Awesome, great to hear u/bladeofwinds :) Datastar is neat for sure.

Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 18 points (0 children)

> I’m not understanding the problem here we’re trying to solve.

I think we just have different views on running in prod. It took me 3 hours to reduce the running memory of my apps by 3.2GB. In my world, that is time well spent. Just because the server isn't crashing with out-of-memory errors doesn't mean a little attention to efficiency is wasted.

Again, different strokes.

Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 8 points (0 children)

Like u/Substantial-Bed8167 said, diskcache is VERY fast. It uses SQLite and pretty much keeps the data cached in memory, with the disk backing it on flush. Just a quick test: on my Mac, diskcache does

writes: ~14,000/sec (40us/op)
reads: ~160,000/sec (6us/op)

That's 0.00625 milliseconds per read. That is not perceivable as far as I'm concerned. Even if you read a bunch of items on a request, say 100, you're still only around 0.6ms in total. And that is instead of recomputing, or hashing and reading 100 items out of a dict, which is fast but not insanely faster.
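If those numbers seem surprising, here's a hypothetical micro-benchmark against a bare SQLite key/value table, roughly the storage shape diskcache uses under the hood (not diskcache itself; absolute numbers will vary by machine, the point is the order of magnitude):

```python
import sqlite3
import time

# A bare key/value table, the rough shape of a SQLite-backed cache.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v BLOB)")
db.executemany(
    "INSERT INTO kv VALUES (?, ?)",
    [(f"key:{i}", b"x" * 100) for i in range(10_000)],
)

# Time a burst of point reads by primary key.
n = 50_000
start = time.perf_counter()
for i in range(n):
    db.execute("SELECT v FROM kv WHERE k = ?", (f"key:{i % 10_000}",)).fetchone()
elapsed = time.perf_counter() - start

print(f"reads: {n / elapsed:,.0f}/sec, {elapsed / n * 1e6:.1f}us/op")
```

Indexed point lookups in SQLite land in the low microseconds, which is why a disk-backed cache can stay imperceptible per request.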

Cutting Python Web App Memory Over 31% by mikeckennedy in Python

[–]mikeckennedy[S] 10 points (0 children)

Hey, yes, improvements were maybe 75-100MB in total. If you read the article, it talks about the nuance.

The part of the app that uses the imports only runs maybe a couple of times a month. The worker processes recycle every 6 hours. So there is a period where the extra 100MB is used, for that 6-hour time frame. Then the worker processes recycle, that code is NOT called again, and the memory stays lower almost all the time.

I'm not messing with pruning modules. It's just the way the web processes are managed by Granian.

anyone self hosting their podcasts? by benbyford in podcasting

[–]mikeckennedy 1 point (0 children)

I'm self hosting Talk Python To Me over at https://talkpython.fm It's a custom web app and the files are served out of the bunny.net CDN. We do about 1TB of traffic on episode release day. I have the podcast listed almost everywhere that takes podcasts, Apple, Google, Spotify, YouTube, etc.

Podcasting Tech and Tool Megathread by AutoModerator in podcasting

[–]mikeckennedy -1 points (0 children)

Hey everyone. I've been podcasting for over 10 years @ the Talk Python To Me and Python Bytes shows. I just launched a tool to help prepare + record + produce podcast episodes that lives alongside recording tools like Streamyard/Riverside/Zencastr/etc.

It's called InterviewCue. Check it out at https://interviewcue.com It has a 100% free tier with free transcripts, file conversions, and more.

I'd love your thoughts. I have been refining this for 6 months behind the scenes on my own shows. It's now ready for fellow podcasters to try.

After 25+ years using ORMs, I switched to raw queries + dataclasses. I think it's the move. by mikeckennedy in Python

[–]mikeckennedy[S] 0 points (0 children)

For this pattern, you'd have to write the migrations yourself for RDBMSes. But that would probably fall to Claude as well, if you buy the argument for adopting the pattern because AI knows it better.

That said, I primarily use MongoDB, and Mongo rarely needs migrations. If something is additive, the models usually adapt. I think I've only had to do a handful of migrations over the last 10 years. Makes things operationally easier. One of the reasons I prefer MongoDB over Postgres.

After 25+ years using ORMs, I switched to raw queries + dataclasses. I think it's the move. by mikeckennedy in Python

[–]mikeckennedy[S] 0 points (0 children)

I’m sure they can be and I hope that they would be. However, my belief is that ORMs provide better abstractions for developers most of the time, whereas raw query syntax provides better source material for agentic coding tools.

After 25+ years using ORMs, I switched to raw queries + dataclasses. I think it's the move. by mikeckennedy in Python

[–]mikeckennedy[S] 0 points (0 children)

I'm sure you read the article before posting this, right? So just as a reminder, here is the section of why:

Claude knows the Beanie ODM great (which I use for some of my apps). But do you know what it knows better? Pure, native MongoDB query syntax. Vanilla.

Look at these stats:

PyMongo is 53x more popular. The syntax it needs is actually identical across the Node/Deno/Bun ecosystem, the PHP ecosystem, and many others. That makes examples of pymongo (aka native MongoDB) queries likely 1,000x more common. Agentic AI will likely do much better with this query foundation.
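To make the "vanilla syntax" point concrete, here's a minimal sketch of the raw-query + dataclass pattern with a hypothetical `Episode` model (not the article's actual code, and no live MongoDB; `doc` stands in for what pymongo's `find` would return):

```python
from dataclasses import dataclass, fields

# Hypothetical entity for illustration.
@dataclass
class Episode:
    _id: str
    title: str
    views: int = 0

# Native MongoDB query syntax -- identical across Python, Node,
# PHP, mongosh, etc. With pymongo this would be something like
# db.episodes.find(query, sort=[("views", -1)]).
query = {"views": {"$gte": 1_000}, "title": {"$regex": "^Talk"}}

def to_model(doc: dict, cls=Episode) -> Episode:
    # Keep only the keys the dataclass declares; extra DB fields
    # are ignored rather than raising.
    names = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in doc.items() if k in names})

# A document shaped like pymongo would hand it back:
doc = {"_id": "abc123", "title": "Talk Python #400", "views": 52_000, "legacy": True}
ep = to_model(doc)
print(ep.title, ep.views)
```

The query dict is exactly what an AI has seen a thousand times in every driver's docs, and the dataclass mapping is a dozen lines of plain Python.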

After 25+ years using ORMs, I switched to raw queries + dataclasses. I think it's the move. by mikeckennedy in Python

[–]mikeckennedy[S] 0 points (0 children)

Hey Maven. Did you read the article? The entire premise is that given that Claude Code is optimized for native query syntax, if we are going to use it on that layer, we should choose a pattern best suited for it that also works well for devs.

I don't think Claude will get tired of writing queries. That guy is a beast.

After 25+ years using ORMs, I switched to raw queries + dataclasses. I think it's the move. by mikeckennedy in Python

[–]mikeckennedy[S] 0 points (0 children)

That is an interesting take. It would depend on whether the other dev was a true expert in raw queries and kinda bad at ORMs and Python. If that were the case, then actually, I'd feel the same.

All things being equal, I think ORMs are better abstractions for devs. But because of the training AIs get on raw queries, I think they are kind of like this hypothetical dev here.

After 25+ years using ORMs, I switched to raw queries + dataclasses. I think it's the move. by mikeckennedy in Python

[–]mikeckennedy[S] 0 points (0 children)

Yes, you definitely do. But take my Python courses website, https://training.talkpython.fm, as an example. I don't have concrete numbers exactly, but there have to be at least 50x more reads than writes. For the podcast site, https://talkpython.fm, which has way less user-generated content/actions, it's probably 500 to 1.

That means 500 reads that would not need to "revalidate" the data on its way out of the DB, against 1 write where we do.

I guess the irony of this example is that those sites are actually built with Beanie (Pydantic + pymongo), so they are doing validation everywhere. But the pattern in the article is something I would apply to them. I'm just using it on newer projects until I get it dialed in further first.

After 25+ years using ORMs, I switched to raw queries + dataclasses. I think it's the move. by mikeckennedy in Python

[–]mikeckennedy[S] 0 points (0 children)

> The point about wanting to be not so strict about typing is well taken though, there are ways to do this in Pydantic, but I can see why you'd rather just go with loose typing than work around Pydantic's strong typing.

Half the time I totally want the super strict behavior of Pydantic, others I don't. :)

> If Pydantic is slower than dataclasses it's because it's doing validation and dataclasses do not, but you have to put that validation into your transformer functions

Here's the thing with Pydantic-as-database-entities. This is true when you load it with inbound data to be saved in the DB. But you're also validating data in Pydantic when you *read* from the DB. For read-heavy apps, you pay that over and over. But if bad data is in the DB, it's too late already.

So yeah, you do have to validate it either way on the incoming side. But why pay over and over, on every query, for already-validated data coming out of the DB?
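That write-side-only validation is easy to sketch. A minimal example with a hypothetical `User` entity and a dict standing in for the database:

```python
from dataclasses import dataclass

# Hypothetical entity for illustration.
@dataclass
class User:
    email: str
    age: int

def validate(user: User) -> User:
    # Pay the validation cost once, on the write path only.
    if "@" not in user.email:
        raise ValueError(f"bad email: {user.email}")
    if not 0 <= user.age <= 150:
        raise ValueError(f"bad age: {user.age}")
    return user

def save(db: dict, user: User) -> None:
    validate(user)  # inbound data IS checked before it hits the DB
    db[user.email] = {"email": user.email, "age": user.age}

def load(db: dict, email: str) -> User:
    # Read path: trust what was validated on the way in.
    # No per-query re-validation like Pydantic would do.
    return User(**db[email])

db = {}
save(db, User("ada@example.com", 36))
print(load(db, "ada@example.com"))
```

For a 500:1 read/write ratio, the validation cost is paid once per write instead of 500 times per read.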

After 25+ years using ORMs, I switched to raw queries + dataclasses. I think it's the move. by mikeckennedy in Python

[–]mikeckennedy[S] 0 points (0 children)

I know, right? What changed you might ask? Honestly, it's the whole "what if you didn't write the data access code" angle that I talk about in the article.

DTOs and repositories: those are patterns I haven't used in a while, but for sure did back then as well.

Thanks for the kind words on the show. ;)