<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://nesbitt.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://nesbitt.io/" rel="alternate" type="text/html" /><updated>2026-04-24T10:40:24+00:00</updated><id>https://nesbitt.io/feed.xml</id><title type="html">Andrew Nesbitt</title><subtitle>Package management and open source metadata expert. Building Ecosyste.ms, open datasets and tools for critical open source infrastructure.</subtitle><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><entry><title type="html">brief</title><link href="https://nesbitt.io/2026/04/21/brief.html" rel="alternate" type="text/html" title="brief" /><published>2026-04-21T10:00:00+00:00</published><updated>2026-04-21T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/21/brief</id><content type="html" xml:base="https://nesbitt.io/2026/04/21/brief.html"><![CDATA[<p>Anyone landing in an unfamiliar repo, whether that’s a new contributor, a security scanner, or an AI coding agent, has to answer the same handful of questions before doing anything useful: what language is this, how do I install dependencies, what’s the test command, which linter do I run before committing, and for a security review, which functions in this stack are the dangerous ones.</p>

<p>The agent case just makes the cost of getting it wrong the most visible, because you can watch Claude grep for <code class="language-plaintext highlighter-rouge">package.json</code>, read the Gemfile, try <code class="language-plaintext highlighter-rouge">npm test</code>, get told there’s no test script, try <code class="language-plaintext highlighter-rouge">yarn test</code>, discover it’s actually <code class="language-plaintext highlighter-rouge">pnpm</code>, and only then get to the work you asked for. The answers are identical for every Rails project or every Go module that has ever existed, and rediscovering them from scratch every time is wasted effort.</p>

<p><a href="https://github.com/git-pkgs/brief">brief</a> is a knowledge base of 516 tools across 54 language ecosystems, with a single Go binary in front of it that does the lookup and prints JSON when piped or a human summary on a TTY. The dataset is the part that doesn’t exist anywhere else: invocation commands, config-file locations, and taxonomy for five hundred tools under one machine-readable schema. CI templates, devcontainer generators, and editor onboarding flows were the closest I found, each carrying a slice of it with no shared upstream. I think of the CLI as one view onto that data and expect there to be others.</p>

<p>Point it at a directory, a git URL, or a registry coordinate like <code class="language-plaintext highlighter-rouge">gem:rails</code> or <code class="language-plaintext highlighter-rouge">npm:express</code> and it reports the toolchain across twenty categories, each with the command to run and the config files that drive it, plus whatever governance and community files (license with SPDX identifier, security policy, <code class="language-plaintext highlighter-rouge">CODEOWNERS</code>, <code class="language-plaintext highlighter-rouge">FUNDING.yml</code>, and so on) it finds in the usual places.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brief <span class="nb">.</span>                       <span class="c"># local directory</span>
brief gem:rails               <span class="c"># registry package, resolved to source repo</span>
brief diff                    <span class="c"># only tools touched by changed files</span>
brief missing                 <span class="c"># baseline categories with no tool configured</span>
brief threat-model            <span class="c"># CWE/OWASP categories implied by the stack</span>
brief sinks                   <span class="c"># dangerous functions in detected tools</span>
</code></pre></div></div>

<p>Checking all 516 definitions finishes in under 250ms, since anything that runs at the front of every session or pipeline step can’t afford to be the slow part; on this blog’s own repo it picks out Jekyll, Bundler, Rake, Dependabot, and GitHub Actions in around 220ms, and on a Go project the output looks like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ brief .
Language:        Go
Package Manager: Go Modules (go mod download)
Test:            go test (go test ./...)
Lint:            golangci-lint (golangci-lint run)  [.golangci.yml]
Format:          gofmt (gofmt -w .)
Build:           GoReleaser (goreleaser release --clean)
Security:        govulncheck (govulncheck ./...)
CI:              GitHub Actions  [.github/workflows/]
</code></pre></div></div>

<p>I run it myself as the first thing after cloning anything, and I have it wired into my global agent instructions so every Claude session opens with <code class="language-plaintext highlighter-rouge">brief .</code> before anything else. That onboards the agent to the repo in one tool call and saves the tokens it would otherwise burn on exploratory greps and wrong guesses. On a feature branch <code class="language-plaintext highlighter-rouge">brief diff</code> narrows the report to just the tools touched by the changed files, so whoever is reading it knows to run <code class="language-plaintext highlighter-rouge">golangci-lint</code> because a <code class="language-plaintext highlighter-rouge">.go</code> file changed without also being told about the Python linter in the monorepo’s other half.</p>

<p>Because the JSON output follows a <a href="https://github.com/git-pkgs/brief#library-usage">published schema</a>, it also works as a building block for other tooling: <code class="language-plaintext highlighter-rouge">brief --json . | jq -r '.tools.test[0].command.run'</code> gives a polyglot CI job the project’s test command without anyone writing per-language cases, that lookup can drive a devcontainer or onboarding script, and the plan is to run it across every repo <a href="https://ecosyste.ms">ecosyste.ms</a> indexes so that stack metadata is available for every package.</p>
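<p>For instance, a polyglot CI workflow could wrap that one lookup in a single step instead of maintaining per-language cases; the step below is a hypothetical sketch, with only the <code class="language-plaintext highlighter-rouge">brief --json</code> invocation taken from above:</p>

```yaml
# Hypothetical GitHub Actions step: run whatever test command
# brief detects for the checked-out repo, whatever the language.
- name: Run detected test suite
  run: |
    TEST_CMD="$(brief --json . | jq -r '.tools.test[0].command.run')"
    echo "Running: $TEST_CMD"
    eval "$TEST_CMD"
```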

<p>The detection rules are TOML rather than Go, which means adding a tool is a single file under <code class="language-plaintext highlighter-rouge">knowledge/</code> with no code changes: a name, a category, the files or dependency names that signal its presence, the command to run it, and optionally a set of <a href="https://github.com/ecosyste-ms/oss-taxonomy">oss-taxonomy</a> tags describing what kind of thing it is. That taxonomy is <a href="/2025/11/29/oss-taxonomy.html">a sibling project</a>: it builds the vocabulary for what a tool <em>is</em>; brief detects which tools a project <em>uses</em>.</p>
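<p>As a sketch, a definition for RSpec might look something like the following; the key names here are my guesses rather than the actual schema, but the fields the post lists (name, category, detection signals, command, tags) map onto it directly:</p>

```toml
# Hypothetical knowledge/ definition; key names are illustrative,
# not the real brief schema.
name = "RSpec"
category = "test"
tags = ["testing-framework"]               # oss-taxonomy terms

[detect]
files = [".rspec", "spec/spec_helper.rb"]  # config files that signal RSpec
dependencies = ["rspec-core"]              # matched via the manifest parser

[command]
run = "bundle exec rspec"
```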

<p>The dependency-name matching is driven by the same manifest parser as <a href="https://github.com/git-pkgs/git-pkgs">git-pkgs</a>, so a tool definition can say “present if <code class="language-plaintext highlighter-rouge">rspec-core</code> is in the bundle” and brief already knows how to read Gemfiles, <code class="language-plaintext highlighter-rouge">package.json</code>, <code class="language-plaintext highlighter-rouge">go.mod</code>, <code class="language-plaintext highlighter-rouge">Cargo.toml</code>, and the other supported lockfile formats without any of that being reimplemented.</p>

<p>Those tags were originally there so the JSON output could say “web framework” rather than just “build tool”, but once a few hundred definitions carried them they mapped cleanly onto CWE and OWASP categories, and <code class="language-plaintext highlighter-rouge">brief threat-model</code> on a Rails project produces SQL injection, mass assignment, XSS, CSRF, and SSTI without scanning a line of code, because that’s what Rails and ActiveRecord are <em>for</em>. The definitions also carry the specific dangerous functions each tool exposes, around 700 across the dataset, which is a reasonable starting grep list for a security review of a stack you’ve never worked in:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ brief sinks .
ActiveRecord:
  Arel.sql            sql_injection      CWE-89
  find_by_sql         sql_injection      CWE-89
  where               sql_injection      CWE-89   string interpolation only
Rails:
  html_safe           xss                CWE-79
  redirect_to         open_redirect      CWE-601  when target is from params
  render inline:      ssti               CWE-1336
Ruby:
  eval                code_injection     CWE-95
  Marshal.load        deserialization    CWE-502
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">brief missing</code> inverts the check and reports which of five baseline categories (test, lint, format, typecheck, docs) have no tool configured for the detected ecosystems, naming the canonical choice for each gap. The detection engine is also importable as a Go library if you’d rather not shell out.</p>

<p>Tool definitions live in <a href="https://github.com/git-pkgs/brief/tree/main/knowledge">the <code class="language-plaintext highlighter-rouge">knowledge/</code> directory</a> and PRs adding new ones are the contributions I’m most interested in, particularly for ecosystems I don’t write every day. If you point it at a project and it gets something wrong, <a href="https://github.com/git-pkgs/brief/issues">open an issue</a> or find me on <a href="https://mastodon.social/@andrewnez">Mastodon</a>.</p>

<p><code class="language-plaintext highlighter-rouge">brew install git-pkgs/git-pkgs/brief</code> / <a href="https://github.com/git-pkgs/brief">github.com/git-pkgs/brief</a></p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="open-source" /><category term="tools" /><category term="git-pkgs" /><category term="ai" /><category term="security" /><summary type="html"><![CDATA[A knowledge base of project conventions, exposed as a CLI.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Features everyone should steal from npmx</title><link href="https://nesbitt.io/2026/04/16/features-everyone-should-steal-from-npmx.html" rel="alternate" type="text/html" title="Features everyone should steal from npmx" /><published>2026-04-16T10:00:00+00:00</published><updated>2026-04-16T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/16/features-everyone-should-steal-from-npmx</id><content type="html" xml:base="https://nesbitt.io/2026/04/16/features-everyone-should-steal-from-npmx.html"><![CDATA[<p>For most of the time GitHub has owned npm, the public-facing website at npmjs.com has been effectively frozen, with the issue tracker accumulating years of requests that nobody on the inside seemed to be reading. In January <a href="https://roe.dev">Daniel Roe</a> started <a href="https://npmx.dev">npmx.dev</a> as an alternative web frontend over the same registry data, posted about it on Bluesky, and within a fortnight years of pent-up demand had turned into a thousand issues and pull requests on a repo that would actually merge them, with the contributor count passing a hundred a couple of days after that. 
It helps that every npmjs.com URL works with the hostname swapped to <code class="language-plaintext highlighter-rouge">npmx.dev</code> or <code class="language-plaintext highlighter-rouge">xnpmjs.com</code>, the same trick Invidious and Nitter used, so browser extensions and muscle memory carry straight over. The competitive pressure appears to have worked: npmjs.com shipped dark mode last month, the single most upvoted feature request on the tracker for something like five years, and there are signs of other long-dormant tickets being picked up.</p>

<p>Whether or not that continues, npmx has turned into a useful catalogue of ideas for anyone building a package registry website, and the <a href="https://github.com/npmx-dev/npmx.dev">whole thing is MIT licensed</a> where the npm registry and website remain closed source, so every feature below comes with a working reference implementation rather than just screenshots. Prior art from other ecosystems is noted where it exists.</p>

<ul>
  <li>
    <p><strong>Transitive install size.</strong> The number shown is the unpacked size of the package plus every dependency it pulls in, which is what actually lands on disk, rather than the single tarball size that crates.io and PyPI show. JavaScript developers have been getting this from <a href="https://bundlephobia.com">bundlephobia</a> and <a href="https://packagephobia.com">packagephobia</a> for years.</p>
  </li>
  <li>
    <p><strong>Install script disclosure.</strong> Any <code class="language-plaintext highlighter-rouge">preinstall</code>, <code class="language-plaintext highlighter-rouge">install</code>, or <code class="language-plaintext highlighter-rouge">postinstall</code> script in the manifest is rendered on the package page along with the <code class="language-plaintext highlighter-rouge">npx</code> packages those scripts would fetch, with links into the code browser so you can read what runs. Worth having in front of you given how many supply-chain incidents start with a postinstall hook.</p>
  </li>
  <li>
    <p><strong>Outdated and vulnerable dependency trees.</strong> Rather than a flat list of declared dependencies, you get an expandable tree where each node is annotated with how far behind latest it is and whether it appears in <a href="https://osv.dev">OSV</a>, recursively through transitives. Google’s <a href="https://deps.dev">deps.dev</a> does something similar across ecosystems.</p>
  </li>
  <li>
    <p><strong>Version range resolution.</strong> Wherever a semver range like <code class="language-plaintext highlighter-rouge">^1.0.0</code> appears it is shown alongside the concrete version it currently resolves to, which saves a round trip to the CLI when you are trying to work out what you would actually get.</p>
  </li>
  <li>
    <p><strong>Module replacement suggestions.</strong> Packages that appear in the <a href="https://github.com/es-tooling/module-replacements">e18e module-replacements dataset</a> get a banner pointing at the native API or lighter alternative, with MDN links for the native cases.</p>
  </li>
  <li>
    <p><strong>Module format and types badges.</strong> ESM, CJS, or dual is shown next to the package name, as is whether TypeScript types are bundled or need a separate <code class="language-plaintext highlighter-rouge">@types/*</code> install, plus the declared Node engine range. JavaScript-specific in the details but the general idea of “will this work with my toolchain” badges travels; crates.io’s MSRV field and edition badge are in the same spirit.</p>
  </li>
  <li>
    <p><strong>Multi-forge repository stats.</strong> Star and fork counts are fetched from <a href="https://docs.npmx.dev/guide/features#supported-git-providers">GitHub, GitLab, Bitbucket, Codeberg, Gitee, Sourcehut, Forgejo, Gitea, Radicle, and Tangled</a>, depending on where the <code class="language-plaintext highlighter-rouge">repository</code> field points, rather than special-casing GitHub.</p>
  </li>
  <li>
    <p><strong>Cross-registry availability.</strong> Scoped packages that also exist on <a href="https://jsr.io">JSR</a> are flagged as such. The npm/JSR pairing is particular to JavaScript but “this is also on registry X” applies anywhere ecosystems overlap, like Maven and Clojars or the various Linux distro repos, and the same lookup doubles as a dependency-confusion check when the name exists elsewhere but the publisher does not match.</p>
  </li>
  <li>
    <p><strong>Side-by-side package comparison.</strong> Up to ten packages can be loaded into a <a href="https://npmx.dev/compare">compare view</a> that lays out all the metrics above in a table, plus a scatter plot with aggregate “traction” on one axis and “ergonomics” on the other so the popular-but-heavy and small-but-unknown options separate visually.</p>
  </li>
  <li>
    <p><strong>Version diffing.</strong> Any two published versions can be diffed file-by-file in the browser, which Hex has had for years via <a href="https://diff.hex.pm">diff.hex.pm</a> and which exists in the Rust world through <code class="language-plaintext highlighter-rouge">cargo-vet</code> tooling.</p>
  </li>
  <li>
    <p><strong>Release timeline with size annotations.</strong> Every version of a package is plotted on a timeline with markers where install size jumped by a meaningful percentage, which is a neat way to spot the release where someone accidentally started shipping their test fixtures.</p>
  </li>
  <li>
    <p><strong>Download distribution by version.</strong> The weekly download chart can be broken down by major or minor line so you can see how much of the ecosystem is still on v2 of something now on v5, similar to <a href="https://rubygems.org/gems/rails/versions">RubyGems’ per-version download counts</a> but rendered as a distribution rather than a table.</p>
  </li>
  <li>
    <p><strong>Command palette.</strong> <code class="language-plaintext highlighter-rouge">⌘K</code> opens a palette with every action available on the current page plus global navigation, and on a package page typing a semver range filters the version list to matches. Borrowed from editors and from GitHub itself rather than from any registry.</p>
  </li>
  <li>
    <p><strong>Internationalisation.</strong> The interface ships in <a href="https://github.com/npmx-dev/npmx.dev/tree/main/i18n/locales">over thirty locales</a> including RTL languages, with <a href="https://hosted.weblate.org/projects/pypa/warehouse/">PyPI’s Warehouse</a> being the other registry that has invested in this.</p>
  </li>
  <li>
    <p><strong>Accessibility as a default.</strong> Charts and demo videos in the release notes carry long-form <code class="language-plaintext highlighter-rouge">aria-label</code> and <code class="language-plaintext highlighter-rouge">figcaption</code> text, the command palette works with screen readers, and there is a dedicated <a href="https://npmx.dev/accessibility">accessibility statement</a>.</p>
  </li>
  <li>
    <p><strong>Playground link extraction.</strong> StackBlitz, CodeSandbox, CodePen, JSFiddle, and Replit links found in a package’s README are pulled out into a dedicated panel so you can try the thing without cloning it.</p>
  </li>
  <li>
    <p><strong>Agent skill detection.</strong> Packages that contain <a href="https://www.anthropic.com/news/skills">Agent Skills</a> manifests have them listed with declared tool compatibility, which is very 2026, though detecting non-code payloads in published packages is useful.</p>
  </li>
  <li>
    <p><strong>Social features on AT Protocol.</strong> Package “likes” are <a href="https://atproto.com">atproto</a> records rather than rows in a private database, blog comment threads are Bluesky threads, and the custom record types are <a href="https://github.com/npmx-dev/npmx.dev/tree/main/lexicons">public lexicons in the repo</a> so other tools can read and write the same data without talking to npmx. If you have ever wanted to add reviews or comments to a registry and balked at the moderation burden of running another silo, borrowing an existing network’s identity and content layer is a defensible answer, and while I am personally sceptical that leaning on Bluesky’s infrastructure will work out long term, npmx at least runs <a href="https://npmx.dev/pds">its own PDS at npmx.social</a> so the records stay under their control either way.</p>
  </li>
  <li>
    <p><strong>Local-CLI admin connector.</strong> Management actions like claiming a package name or editing access are proxied through your local <code class="language-plaintext highlighter-rouge">npm</code> CLI rather than requiring you to log into the site, which sidesteps the need for npmx to hold credentials for a registry it does not own.</p>
  </li>
  <li>
    <p><strong>Dark mode and custom palettes.</strong> Listed last because this is the one npm has now copied, joining pkg.go.dev, crates.io, and PyPI which already had it.</p>
  </li>
</ul>

<hr />

<p>Someone in the .NET world has already built an equivalent: <a href="https://nugx.org/">nugx.org</a>, in its own words “inspired by npmx”, is doing the same thing for NuGet.</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="package-managers" /><category term="npm" /><summary type="html"><![CDATA[What happens when users design their own package registry frontend]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The Tuesday Test</title><link href="https://nesbitt.io/2026/04/15/the-tuesday-test.html" rel="alternate" type="text/html" title="The Tuesday Test" /><published>2026-04-15T10:00:00+00:00</published><updated>2026-04-15T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/15/the-tuesday-test</id><content type="html" xml:base="https://nesbitt.io/2026/04/15/the-tuesday-test.html"><![CDATA[<p><a href="/2026/04/14/standing-on-the-shoulders-of-homebrew.html">Yesterday</a> I wrote about the fast Homebrew rewrites and ended on the line that the bottleneck for that whole class of project is not Rust or Ruby, it is the absence of a stable declarative package schema. Someone on Mastodon picked up that thread and asked the obvious follow-on: which package managers actually have one? Going through the list, the honest answer is hardly any of them, and there is a quick test that makes the answer easy to check.</p>

<p>Ask this of any package manager: if I install this package on a Tuesday, could it do something different than if I install it on a Wednesday? If the answer is yes, the package manager is not really declarative, no matter what the manifest file looks like on the surface.</p>

<p>Somewhere in the install pipeline there is a place where arbitrary code runs, and that code can read the clock, check an environment variable, look at the hostname, phone a server, or do anything else a program can do. The Tuesday test is a quick way to separate the declarative tools from the ones that have a programming language hiding underneath a declarative-looking file format.</p>

<p>The test is not about whether the code is malicious, or whether it is a supply chain risk, or whether it could in principle do something terrible. Those are all separate questions with their own answers.</p>

<p>It is also not about the registry changing under you between the two days: new versions, yanks and the like are all real concerns, but they are concerns about the data the package manager is fetching rather than about the package manager itself. Pretend the registry is frozen and the lockfile is pinned.</p>

<p>The question here is narrower. Given the same manifest, the same lockfile and the same registry contents, is the install allowed to read anything the manifest does not declare as an input? The day of the week is the simplest example of such a hidden input, but the real point is that a package that passes the Tuesday test has no way to reach outside its declared inputs at all. A package that fails it can, and once it can, the manifest is no longer the whole story. Going through the list of well known package managers from <a href="/2026/01/03/the-package-management-landscape.html">the landscape post</a>, it turns out that almost none of them pass.</p>

<h3 id="homebrew">Homebrew</h3>

<p>Start with the one that started this. A <a href="https://docs.brew.sh/Formula-Cookbook">Homebrew formula</a> is a Ruby class with an <code class="language-plaintext highlighter-rouge">install</code> method and a <code class="language-plaintext highlighter-rouge">post_install</code> hook, and the entire class body is evaluated by the Homebrew client every time it touches the formula. Even the parts that look like data, such as <code class="language-plaintext highlighter-rouge">url</code>, <code class="language-plaintext highlighter-rouge">sha256</code>, <code class="language-plaintext highlighter-rouge">version</code>, and <code class="language-plaintext highlighter-rouge">depends_on</code>, are method calls on the formula class, evaluated in a Ruby context that can <code class="language-plaintext highlighter-rouge">require</code> anything, shell out to anything, and read the clock from any method.</p>

<p>The <a href="https://docs.brew.sh/Cask-Cookbook">cask format</a> is a Ruby DSL with the same property. So is the <a href="https://docs.brew.sh/Manpage#bundle-subcommand"><code class="language-plaintext highlighter-rouge">Brewfile</code></a> consumed by <code class="language-plaintext highlighter-rouge">brew bundle</code>, which was invented as a Homebrew analogue of Bundler’s <code class="language-plaintext highlighter-rouge">Gemfile</code> and inherits the same “executable Ruby file posing as a manifest” shape: a Brewfile can call <code class="language-plaintext highlighter-rouge">brew</code>, <code class="language-plaintext highlighter-rouge">cask</code>, <code class="language-plaintext highlighter-rouge">tap</code>, <code class="language-plaintext highlighter-rouge">mas</code>, <code class="language-plaintext highlighter-rouge">vscode</code> and friends, but it can also <code class="language-plaintext highlighter-rouge">if Time.now.wday == 2</code> in between.</p>

<p>Homebrew fails the Tuesday test by design, which is the whole reason the <a href="https://formulae.brew.sh/api/formula.json"><code class="language-plaintext highlighter-rouge">formula.json</code></a> API had to exist as a separate thing for fast clients to consume: there is no other way to extract package metadata without running the package definition. The JSON file is what passes the Tuesday test, and it only exists because the formula format does not.</p>

<p>Generating it is not free either. Homebrew’s own <a href="https://github.com/Homebrew/brew/blob/master/Library/Homebrew/dev-cmd/generate-formula-api.rb"><code class="language-plaintext highlighter-rouge">brew generate-formula-api</code></a> command has to flip the <code class="language-plaintext highlighter-rouge">Formula</code> class into a special <code class="language-plaintext highlighter-rouge">generating_hash!</code> mode and wrap the run in a <a href="https://github.com/Homebrew/brew/blob/master/Library/Homebrew/simulate_system.rb"><code class="language-plaintext highlighter-rouge">SimulateSystem</code></a> block that lies to every formula about the host OS and architecture, so that calls which would normally branch on the real system instead return a stable answer. It is in-process monkey patching to stop the formula from noticing where it is, in order to coax a declarative-looking file out of a format that is anything but.</p>

<h3 id="ruby">Ruby</h3>

<p>The <a href="https://bundler.io/man/gemfile.5.html"><code class="language-plaintext highlighter-rouge">Gemfile</code></a> is a Ruby file. The first line is often <code class="language-plaintext highlighter-rouge">source "https://rubygems.org"</code>, which looks like configuration, but <code class="language-plaintext highlighter-rouge">source</code> is a method call on an implicit DSL object, and anything else you can write in Ruby is valid above, below, or inside it. You can open a socket in your Gemfile. You can check <code class="language-plaintext highlighter-rouge">Time.now.wday</code> and add a different gem on Tuesdays.</p>
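<p>The failure is easy to demonstrate without Bundler at all. The sketch below uses a toy stand-in for Bundler’s DSL object, and <code class="language-plaintext highlighter-rouge">tuesday_gem</code> is a made-up name, but the <code class="language-plaintext highlighter-rouge">instance_eval</code> step is the same shape as what really happens on <code class="language-plaintext highlighter-rouge">bundle install</code>:</p>

```ruby
# Toy sketch of why a Gemfile fails the Tuesday test. ToyDsl is a
# stand-in for Bundler's real DSL object, and "tuesday_gem" is a
# made-up name; evaluating the file as Ruby is the part that is real.
gemfile = <<~RUBY
  source "https://rubygems.org"
  gem "rails"
  gem "tuesday_gem" if Time.now.wday == 2
RUBY

class ToyDsl
  attr_reader :gems

  def initialize
    @gems = []
  end

  # Gemfile "configuration" lines are just method calls on this object.
  def source(url); end

  def gem(name)
    @gems << name
  end
end

dsl = ToyDsl.new
dsl.instance_eval(gemfile)   # evaluate the manifest as plain Ruby
puts dsl.gems.inspect        # dependency list depends on the day you run it
```

<p>Run it on a Tuesday and the dependency list is different, with nothing in the file marked as dynamic.</p>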

<p>The <a href="https://guides.rubygems.org/specification-reference/"><code class="language-plaintext highlighter-rouge">.gemspec</code></a> file that ships inside every gem is also Ruby, and it is evaluated every time someone installs the gem, which means a gem author can put arbitrary code in the specification itself and have it run on the installer’s machine before anything has been built. Native extensions run <a href="https://guides.rubygems.org/gems-with-extensions/"><code class="language-plaintext highlighter-rouge">extconf.rb</code></a>, which is yet more Ruby, and post-install messages are generated at install time.</p>

<p>CocoaPods is the same story in a different namespace. The CocoaPods client is itself a Ruby program, a <a href="https://guides.cocoapods.org/syntax/podfile.html"><code class="language-plaintext highlighter-rouge">Podfile</code></a> is a direct descendant of a <code class="language-plaintext highlighter-rouge">Gemfile</code>, and a <a href="https://guides.cocoapods.org/syntax/podspec.html"><code class="language-plaintext highlighter-rouge">.podspec</code></a> is a direct descendant of a <code class="language-plaintext highlighter-rouge">.gemspec</code>, right down to the DSL, the block syntax, and the fact that both files are evaluated as Ruby every time you install. Everything said about Ruby above applies to CocoaPods without a single change.</p>

<h3 id="python">Python</h3>

<p>Python is the same story with different file names. A <a href="https://setuptools.pypa.io/en/latest/userguide/index.html"><code class="language-plaintext highlighter-rouge">setup.py</code></a> is a Python script that runs at install time, and <code class="language-plaintext highlighter-rouge">setup.py</code> is where Python packaging started, so an enormous amount of the existing ecosystem still goes through it.</p>

<p>The move to <a href="https://packaging.python.org/en/latest/specifications/pyproject-toml/"><code class="language-plaintext highlighter-rouge">pyproject.toml</code></a> looks like a shift to a declarative manifest, and in the limited sense that the file itself is TOML it is, but the whole job of that TOML file is to nominate a program to run. The <code class="language-plaintext highlighter-rouge">[build-system]</code> table points at a <a href="https://peps.python.org/pep-0517/">build backend</a>, and the build backend is a Python package that executes arbitrary Python to produce a wheel. <a href="https://setuptools.pypa.io/">Setuptools</a>, <a href="https://hatch.pypa.io/latest/">Hatchling</a>, <a href="https://python-poetry.org/docs/pyproject/">Poetry-core</a>, <a href="https://backend.pdm-project.org/">PDM-backend</a>, <a href="https://flit.pypa.io/en/stable/">Flit</a>, <a href="https://www.maturin.rs/">Maturin</a> and <a href="https://scikit-build-core.readthedocs.io/">scikit-build-core</a> are all real programs, all capable of reading the date.</p>
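<p>The shape of the file makes the point: a minimal <code class="language-plaintext highlighter-rouge">[build-system]</code> table is pure data, and the data is the name of a program.</p>

```toml
# Inert TOML, but its one job is to nominate Python code to run:
# the installer imports setuptools.build_meta and executes it at build time.
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"
```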

<p>Wheels themselves are the one part of the Python pipeline that does pass the test: <a href="https://peps.python.org/pep-0427/">PEP 427</a> deliberately has no pre or post install hooks, and installing a wheel is meant to be a pure file-unpacking step. If a wheel does not already exist for your platform, pip and uv and Poetry and pdm will transparently build one from the sdist by invoking the build backend, which puts you back in arbitrary-Python territory.</p>

<h3 id="javascript">JavaScript</h3>

<p>JavaScript is the canonical example people reach for, because <code class="language-plaintext highlighter-rouge">package.json</code> is famously JSON, which is as declarative a format as you can get, and yet npm install runs arbitrary code through the <code class="language-plaintext highlighter-rouge">preinstall</code>, <code class="language-plaintext highlighter-rouge">install</code>, and <code class="language-plaintext highlighter-rouge">postinstall</code> <a href="https://docs.npmjs.com/cli/v10/using-npm/scripts">lifecycle scripts</a>. Those scripts are shell commands that run in the package directory, and nothing stops them from checking <code class="language-plaintext highlighter-rouge">date +%u</code> and branching on the result.</p>
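<p>A minimal manifest showing the failure mode; the package is hypothetical and the script is contrived, but the hook is the standard lifecycle mechanism, and since <code class="language-plaintext highlighter-rouge">date +%u</code> prints the ISO weekday, the branch fires only on Tuesdays:</p>

```json
{
  "name": "tuesday-demo",
  "version": "1.0.0",
  "scripts": {
    "postinstall": "test \"$(date +%u)\" = 2 && echo 'Tuesday install path' || true"
  }
}
```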

<p>Yarn, pnpm, and Bun all inherit the same lifecycle script contract for compatibility with the existing ecosystem, though recent <a href="https://pnpm.io/settings">pnpm</a> and <a href="https://bun.com/docs/cli/install">Bun</a> releases have started refusing to run scripts for dependencies that are not on an explicit allowlist. The contract is still there, the defaults have just got more cautious.</p>

<h3 id="deno-">Deno 🌮</h3>

<p><a href="https://deno.com/">Deno</a> fetches and caches modules on demand, either at import time or up front with <code class="language-plaintext highlighter-rouge">deno install</code>, and no code the package author supplies runs against the installer’s machine before the module itself is imported. Deno 2 added first-class <code class="language-plaintext highlighter-rouge">package.json</code> and <code class="language-plaintext highlighter-rouge">node_modules</code> support on top of the existing <code class="language-plaintext highlighter-rouge">npm:</code> specifiers, but even then <a href="https://docs.deno.com/runtime/reference/cli/install/">it refuses to run the npm lifecycle scripts</a> by default and requires an explicit <code class="language-plaintext highlighter-rouge">--allow-scripts=&lt;pkg&gt;</code> opt-in for any package that wants them.</p>

<h3 id="rust">Rust</h3>

<p>Rust looks declarative at a glance. <code class="language-plaintext highlighter-rouge">Cargo.toml</code> is TOML, Cargo resolves everything from the lockfile, and the whole ecosystem leans heavily on the idea that a crate is a well-defined thing.</p>

<p>Then you notice <a href="https://doc.rust-lang.org/cargo/reference/build-scripts.html"><code class="language-plaintext highlighter-rouge">build.rs</code></a>, which is a Rust file that Cargo compiles and runs before building the crate proper, so it can generate source code, link against system libraries, probe the host, and, yes, check the date. <a href="https://doc.rust-lang.org/reference/procedural-macros.html">Procedural macros</a> are the same story from a different angle: they are Rust code that runs at compile time in the compiler’s own process, and they can do anything a Rust program can do. Both mechanisms are considered normal and widely used.</p>

<h3 id="go-">Go 🌮</h3>

<p><a href="https://go.dev/ref/mod">Go modules</a> come closer to passing than almost anything else in this list. The <code class="language-plaintext highlighter-rouge">go.mod</code> file is a small declarative format with no scripting in it, <code class="language-plaintext highlighter-rouge">go get</code> does not run post-install hooks, and the module proxy and checksum database make the fetch step reproducible and auditable in a way that most other ecosystems are not.</p>

<p>The escape hatch is <a href="https://pkg.go.dev/cmd/cgo">cgo</a>, which invokes the system C compiler with arguments specified by <code class="language-plaintext highlighter-rouge">#cgo</code> directives in source files, and those directives can include whatever paths and flags the package author wants. The core dependency resolution and fetching pipeline is declarative. The build pipeline is not, as soon as C is involved.</p>

<h3 id="jvm-languages">JVM languages</h3>

<p>The JVM ecosystem is split between the declarative-looking and the openly imperative. Maven’s <a href="https://maven.apache.org/pom.html"><code class="language-plaintext highlighter-rouge">pom.xml</code></a> is XML and describes the project as data, but a pom can include plugin executions, and Maven plugins are Java code that runs during the build.</p>
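<p>A sketch of what that looks like in a pom, using the real exec-maven-plugin (the phase binding and command here are purely illustrative): the XML stays data, but the data schedules an arbitrary program to run during every build.</p>

```xml
<!-- pom.xml fragment: declarative XML that nominates code to run -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>3.1.0</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals><goal>exec</goal></goals>
      <configuration>
        <executable>date</executable>
        <arguments><argument>+%u</argument></arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
```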

<p><a href="https://docs.gradle.org/current/userguide/userguide.html">Gradle</a> does not even pretend: <code class="language-plaintext highlighter-rouge">build.gradle</code> is a Groovy script, and <code class="language-plaintext highlighter-rouge">build.gradle.kts</code> is a Kotlin script, and both are full programming languages with access to the filesystem, the network, and the clock. <a href="https://www.scala-sbt.org/">sbt</a>’s build definition is Scala. <a href="https://leiningen.org/">Leiningen</a>’s <code class="language-plaintext highlighter-rouge">project.clj</code> is Clojure. <a href="https://mill-build.org/">Mill</a> is Scala again.</p>

<p>The JVM world has spent twenty years treating the build file as a program, and the package management step is a side effect of running that program.</p>

<h3 id="swift">Swift</h3>

<p>Swift Package Manager is in the same category. <a href="https://developer.apple.com/documentation/packagedescription"><code class="language-plaintext highlighter-rouge">Package.swift</code></a> is a Swift file that is compiled and run to produce the package description, which means every resolve of a Swift package involves executing Swift code from the package author. Apple added a manifest API version comment at the top of the file so that the compiler knows which stable API to expose, but the underlying mechanism is still “run the author’s Swift program.”</p>

<h3 id="zig">Zig</h3>

<p>Zig is worth pulling out because it is a modern language that looked at all of the above and decided, deliberately, that the build file should be a real program. <a href="https://ziglang.org/learn/build-system/"><code class="language-plaintext highlighter-rouge">build.zig</code></a> is Zig source compiled and run by the Zig toolchain, and the package manager is a set of APIs exposed to that program. The rationale is that builds in C-adjacent languages are already programs in disguise (makefiles, shell, CMake), and making the language of the build the same as the language of the project is more honest than pretending otherwise. It is a defensible position, and it fails the test completely.</p>

<h3 id="bazel-">Bazel 🌮</h3>

<p>Bazel is the one entry on this list that tries to pass the Tuesday test at the language design level. <code class="language-plaintext highlighter-rouge">BUILD</code> files and <code class="language-plaintext highlighter-rouge">.bzl</code> extensions are written in <a href="https://bazel.build/rules/language">Starlark</a>, a dialect of Python that Google stripped back on purpose: no <code class="language-plaintext highlighter-rouge">while</code> loops, no recursion, no mutable global state, no way to read the clock, the filesystem outside declared inputs, or the network. Evaluation is guaranteed to terminate, and two evaluations of the same inputs are guaranteed to produce the same output. It is the only manifest language on this page that cannot observe what day it is even if the author wants it to.</p>

<p>The execution side is hedged the same way. Actions run inside a sandbox with only their declared inputs visible, and Bazel’s remote execution and remote cache assume that identical inputs produce identical outputs, so any non-determinism shows up as a cache miss and gets investigated.</p>

<p>The usual escape hatches are still there if you want them: <code class="language-plaintext highlighter-rouge">repository_rule</code> can call out to the host to fetch code, <code class="language-plaintext highlighter-rouge">genrule</code> runs shell, and custom toolchains can shell out to anything the sandbox allows, so a sufficiently motivated <code class="language-plaintext highlighter-rouge">BUILD</code> author can still reach the system <code class="language-plaintext highlighter-rouge">date</code> command. But the default posture is the opposite of everywhere else on this list, and the design is organised around passing the Tuesday test as an explicit goal.</p>

<h3 id="haskell">Haskell</h3>

<p>A Haskell package is described by a <code class="language-plaintext highlighter-rouge">.cabal</code> file, which is a custom declarative format, not Haskell source, so the metadata layer on its own passes the Tuesday test. Tools can parse a <code class="language-plaintext highlighter-rouge">.cabal</code> file and extract dependencies, versions and compiler flags without running any of the package author’s code.</p>

<p>The escape hatch is the <code class="language-plaintext highlighter-rouge">build-type</code> field. <code class="language-plaintext highlighter-rouge">build-type: Simple</code> uses a stock Setup script and is fine. <code class="language-plaintext highlighter-rouge">build-type: Custom</code> (and the newer <code class="language-plaintext highlighter-rouge">Hooks</code>) tells Cabal to compile and run the package’s own <a href="https://cabal.readthedocs.io/en/stable/cabal-package.html"><code class="language-plaintext highlighter-rouge">Setup.hs</code></a>, which is a real Haskell program with <code class="language-plaintext highlighter-rouge">preBuild</code>, <code class="language-plaintext highlighter-rouge">postBuild</code>, <code class="language-plaintext highlighter-rouge">preInst</code> and <code class="language-plaintext highlighter-rouge">postInst</code> hooks that can do anything Haskell can do, including read the clock.</p>
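<p>The two layers are visible in a hypothetical <code class="language-plaintext highlighter-rouge">.cabal</code> file: everything here is inert data that any tool can parse, right up until the <code class="language-plaintext highlighter-rouge">build-type</code> line hands control to the package’s own <code class="language-plaintext highlighter-rouge">Setup.hs</code>.</p>

```
-- example.cabal (hypothetical): declarative metadata that
-- nominates a program via build-type: Custom
cabal-version: 2.4
name:          example
version:       0.1.0.0
build-type:    Custom

custom-setup
  setup-depends: base, Cabal
```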

<p>Because <code class="language-plaintext highlighter-rouge">.cabal</code> is declarative metadata, it can also be mechanically translated into something else, which is a large part of why Haskell has such a big footprint in the Nix ecosystem. <a href="https://github.com/NixOS/cabal2nix"><code class="language-plaintext highlighter-rouge">cabal2nix</code></a> reads a <code class="language-plaintext highlighter-rouge">.cabal</code> file and emits a Nix expression, Nixpkgs ships a Haskell package set regenerated from Hackage and Stackage through that pipeline, and <a href="https://input-output-hk.github.io/haskell.nix/"><code class="language-plaintext highlighter-rouge">haskell.nix</code></a> is an alternative infrastructure built around the same idea.</p>

<h3 id="everything-else-with-a-manifest-thats-a-program">Everything else with a manifest that’s a program</h3>

<p>The rest of the list is short because the pattern is by now predictable.</p>

<ul>
  <li><strong>PHP / Composer:</strong> <code class="language-plaintext highlighter-rouge">composer.json</code> is JSON, but a <a href="https://getcomposer.org/doc/articles/scripts.md"><code class="language-plaintext highlighter-rouge">scripts</code> section</a> hooks events like <code class="language-plaintext highlighter-rouge">post-install-cmd</code> and <code class="language-plaintext highlighter-rouge">post-update-cmd</code> with shell commands or PHP callables.</li>
  <li><strong>Elixir / Mix:</strong> <a href="https://hexdocs.pm/mix/Mix.html"><code class="language-plaintext highlighter-rouge">mix.exs</code></a> is Elixir.</li>
  <li><strong>Dart / pub:</strong> <code class="language-plaintext highlighter-rouge">pubspec.yaml</code> is declarative, but pub supports <a href="https://dart.dev/tools/pub/hooks">hook scripts</a> for native and data assets, written in Dart and run at build time.</li>
  <li><strong>Perl / CPAN:</strong> <a href="https://metacpan.org/pod/ExtUtils::MakeMaker"><code class="language-plaintext highlighter-rouge">Makefile.PL</code></a> and <a href="https://metacpan.org/pod/Module::Build"><code class="language-plaintext highlighter-rouge">Build.PL</code></a> are Perl programs, and have been since the nineties.</li>
  <li><strong>Lua / LuaRocks:</strong> <a href="https://github.com/luarocks/luarocks/wiki/Rockspec-format">rockspecs</a> are Lua tables, but the build section can include a <code class="language-plaintext highlighter-rouge">build_command</code> that runs shell.</li>
  <li><strong>Nim / Nimble:</strong> <a href="https://github.com/nim-lang/nimble#nimble-reference">nimble files</a> support <code class="language-plaintext highlighter-rouge">before install</code> and <code class="language-plaintext highlighter-rouge">after install</code> hooks written in NimScript.</li>
  <li><strong>Julia / Pkg:</strong> packages run <a href="https://pkgdocs.julialang.org/v1/creating-packages/"><code class="language-plaintext highlighter-rouge">deps/build.jl</code></a> at install time, which is a Julia program.</li>
  <li><strong>Raku / zef:</strong> runs Perl or Raku build scripts.</li>
</ul>
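<p>The Composer entry at the top of that list is representative of the whole shape; a hypothetical <code class="language-plaintext highlighter-rouge">composer.json</code> shows JSON data nominating shell to run after install:</p>

```json
{
  "name": "acme/example",
  "scripts": {
    "post-install-cmd": [
      "sh -c 'date +%u'"
    ]
  }
}
```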

<h3 id="opam-and-portage">opam and Portage</h3>

<p>OCaml’s <a href="https://opam.ocaml.org/doc/Manual.html#opam">opam</a> is unusually honest: the opam file is a declarative-looking format, but the <code class="language-plaintext highlighter-rouge">build</code> and <code class="language-plaintext highlighter-rouge">install</code> fields contain explicit lists of shell commands to run, and everyone knows what they are and where they live. The same is true, in a different flavour, of Gentoo’s Portage: an <a href="https://devmanual.gentoo.org/ebuild-writing/">ebuild</a> is a bash script that sources a set of library functions and defines phases like <code class="language-plaintext highlighter-rouge">src_compile</code> and <code class="language-plaintext highlighter-rouge">src_install</code>, so the package is a program and no one pretends otherwise.</p>

<h3 id="system-package-managers">System package managers</h3>

<p>System package managers all fail, and most of them fail in several places at once. Debian packages carry <a href="https://www.debian.org/doc/debian-policy/ch-maintainerscripts.html"><code class="language-plaintext highlighter-rouge">preinst</code>, <code class="language-plaintext highlighter-rouge">postinst</code>, <code class="language-plaintext highlighter-rouge">prerm</code>, and <code class="language-plaintext highlighter-rouge">postrm</code> maintainer scripts</a> that dpkg runs around the unpack step, and they are shell by default. RPM packages embed <a href="https://rpm-software-management.github.io/rpm/manual/spec.html"><code class="language-plaintext highlighter-rouge">%pre</code>, <code class="language-plaintext highlighter-rouge">%post</code>, <code class="language-plaintext highlighter-rouge">%preun</code>, and <code class="language-plaintext highlighter-rouge">%postun</code> scriptlets</a>, plus file triggers, which are shell scripts.</p>
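<p>A hypothetical maintainer script shows how little ceremony is involved; this is ordinary shell of the kind dpkg runs as root around the unpack step, and branching on the date bends no rules at all.</p>

```shell
#!/bin/sh
# Hypothetical postinst sketch: dpkg invokes a script like this after
# unpacking, and it can do one thing on Monday and another on Tuesday.
set -e
if [ "$(date +%u)" = "2" ]; then
    msg="configuring the Tuesday way"
else
    msg="configuring the usual way"
fi
echo "$msg"
```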

<p>Arch’s pacman runs <code class="language-plaintext highlighter-rouge">.INSTALL</code> scripts from inside the package tarball, which are again shell, and <a href="https://wiki.archlinux.org/title/PKGBUILD">PKGBUILDs</a> themselves are shell programs evaluated at build time. Alpine’s apk has pre and post install scripts, plus <a href="https://wiki.alpinelinux.org/wiki/APKBUILD_Reference">APKBUILDs</a> that are shell scripts.</p>

<p><a href="https://guide.macports.org/chunked/reference.html">MacPorts Portfiles</a> are Tcl. <a href="https://docs.chocolatey.org/en-us/create/create-packages">Chocolatey packages</a> are PowerShell. Conda belongs on this list too, even though it is often filed next to Python: it is a cross-language binary package manager that happens to have grown up in the scientific Python community, and it ships explicit <a href="https://docs.conda.io/projects/conda-build/en/latest/resources/link-scripts.html"><code class="language-plaintext highlighter-rouge">pre-link</code> and <code class="language-plaintext highlighter-rouge">post-link</code></a> shell scripts that run when a package is linked into an environment.</p>

<p>Every one of these can look at the clock and do one thing on Monday and a different thing on Tuesday without bending any rules, and Homebrew at the top of the post is the same shape as all of them.</p>

<h3 id="nix-and-guix-">Nix and Guix 🌮</h3>

<p><a href="https://nixos.org/">Nix</a> is the interesting case, because it is the one package manager on the list that has been designed from the start around the idea that the install step should not be allowed to notice what day it is. A Nix expression is a program in the Nix language, but it is a pure lazy functional language with no I/O primitives of the sort you would need to read a clock, so the evaluation step that produces a derivation cannot observe the day of the week at all.</p>

<p>The derivation is then realised by running a builder inside a sandbox that has no network, a scrubbed environment, and its own view of the filesystem. The sandbox itself does not pin the clock, so a determined builder can still call <code class="language-plaintext highlighter-rouge">date</code> and get a real answer. In practice Nixpkgs and the wider <a href="https://reproducible-builds.org/">reproducible-builds.org</a> project paper over this with <a href="https://reproducible-builds.org/docs/source-date-epoch/"><code class="language-plaintext highlighter-rouge">SOURCE_DATE_EPOCH</code></a>, an environment variable that well-behaved build tools read instead of the real clock when stamping timestamps into their output, often set to the Unix epoch or the commit time of the source. The Tuesday test passes cleanly at the evaluation layer and passes in most cases at the realisation layer, with the remaining gaps treated as bugs rather than features.</p>
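<p>The convention is small enough to show in a couple of lines of shell; this is a sketch of the pattern a well-behaved build tool follows, not any particular tool’s code.</p>

```shell
# A well-behaved tool stamps its output with SOURCE_DATE_EPOCH when the
# environment provides it, instead of asking the real clock.
stamp="${SOURCE_DATE_EPOCH:-$(date +%s)}"

# When a reproducible builder pins the variable, the stamp is fixed:
SOURCE_DATE_EPOCH=315532800   # 1980-01-01 UTC, a common choice
pinned="${SOURCE_DATE_EPOCH:-$(date +%s)}"
echo "$pinned"
```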

<p><a href="https://guix.gnu.org/">Guix</a> tells the same story with different syntax. The package definitions are written in Guile Scheme, which is a full language in the way that the Nix language deliberately is not, but package records are a restricted form and the build is run inside the same kind of sandbox, inherited from the same <a href="https://edolstra.github.io/pubs/phd-thesis.pdf">derivation model</a> that Eelco Dolstra wrote up in his thesis. Guix ships with a <code class="language-plaintext highlighter-rouge">--check</code> mode that rebuilds a package and compares the output to the previous build, and the whole project treats a mismatch as something to fix. Guix passes the Tuesday test about as well as anything on this list does.</p>

<hr />

<p>The common thread in the failing cases is that building a package and installing a package are the same step. A gemspec is Ruby because gems get built from it on the installer’s machine. System package managers are the opposite shape of the same problem: installing a package means dropping files into a live filesystem and reconciling them with whatever was already there.</p>

<p>Happy Taco Tuesday to Deno, Go, Bazel, Nix and Guix. 🌮</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="package-managers" /><summary type="html"><![CDATA[Like the Turing test but with more tacos.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Standing on the shoulders of Homebrew</title><link href="https://nesbitt.io/2026/04/14/standing-on-the-shoulders-of-homebrew.html" rel="alternate" type="text/html" title="Standing on the shoulders of Homebrew" /><published>2026-04-14T10:00:00+00:00</published><updated>2026-04-14T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/14/standing-on-the-shoulders-of-homebrew</id><content type="html" xml:base="https://nesbitt.io/2026/04/14/standing-on-the-shoulders-of-homebrew.html"><![CDATA[<p><a href="https://github.com/lucasgelfond/zerobrew">zerobrew</a> and <a href="https://github.com/justrach/nanobrew">nanobrew</a> have been doing the rounds as fast alternatives to Homebrew, one written in Rust with the tagline “uv-style architecture for Homebrew packages” and the other in Zig with a 1.2 MB static binary and a benchmark table comparing itself favourably against the first. Both are upfront, once you scroll past the speedup numbers, that they resolve dependencies against homebrew-core, download the bottles that Homebrew’s CI built and Homebrew’s bandwidth bill serves, and parse the cask definitions that Homebrew contributors maintain.</p>

<p>They’re alternative clients for someone else’s registry, which is a perfectly reasonable thing to build, but the framing as a replacement glosses over what running a system package manager actually involves.</p>

<p>nanobrew’s README has a “what doesn’t work yet” section listing Ruby <code class="language-plaintext highlighter-rouge">post_install</code> hooks, build-from-source with custom options, conditional blocks in Brewfiles, and any complex Ruby DSL, while zerobrew handles source builds by falling back to “Homebrew’s Ruby DSL”, which I read as shelling out to the thing it’s meant to be replacing.</p>

<p>The parts of Homebrew they skip are the parts that are slow for a reason: evaluating arbitrary Ruby to discover what a package needs, running post-install hooks that touch the filesystem in package-specific ways, and handling the long tail of formulae that don’t reduce to “download this tarball and symlink it into a prefix”. Implementing only the bottle path and declaring the rest out of scope covers the easy 80% of packages and most of the benchmark wins.</p>

<p>zerobrew’s table reports a 4.4x speedup installing ffmpeg from a warm cache, nanobrew gets the same operation down to 287 milliseconds, and I keep trying to picture the developer who installs ffmpeg, uninstalls it, and installs it again on the same machine often enough for warm-cache reinstall time to be the number they care about.</p>

<p>A warm install is measuring how quickly you can clonefile a directory out of a content-addressable store, which is a fine thing to optimise but says almost nothing about the experience of setting up a new laptop or adding a tool you didn’t have yesterday. The cold-cache numbers are much closer together, occasionally slower than Homebrew when the bottle is large, because at that point everyone is waiting on the same CDN and there’s no clever data structure that makes bytes arrive faster.</p>

<p>I wrote about <a href="/2025/12/26/how-uv-got-so-fast.html">why uv is fast</a> a few months ago. The language rewrite was the least interesting part of that story. uv is fast because PEP 658 finally let Python resolvers fetch package metadata without executing <code class="language-plaintext highlighter-rouge">setup.py</code>, and because uv dropped eggs and <code class="language-plaintext highlighter-rouge">pip.conf</code> and a dozen other legacy paths that pip still carries. Homebrew already shipped its equivalent of PEP 658 in the <code class="language-plaintext highlighter-rouge">formula.json</code> API, and that’s the thing that made zerobrew and nanobrew possible in the first place, neither of them is solving the metadata-without-Ruby-evaluation problem because Homebrew already solved it for them.</p>

<p>zerobrew’s content-addressable store and APFS clonefile tricks would work equally well from Ruby, and nanobrew’s parallel downloads have been on by default in Homebrew since <a href="https://github.com/Homebrew/brew/pull/20975">4.7.0 last November</a>. The architectural choices are real improvements but they aren’t “we rewrote it in Zig” improvements, and a zero-startup-time binary matters a lot less when the operation behind it is a 40 MB download either way.</p>

<p>Most of the work in a package manager is the long tail: formulae that want a specific libiconv on an old macOS release, casks with notarisation quirks, post-install scripts that edit config files in ways you can’t predict in advance. None of it benchmarks. Whether either project still has a maintainer paying attention a year from now, once those issues start piling up in the tracker, is an open question. Both also chose Apache-2.0 rather than inheriting Homebrew’s BSD-2-Clause, which is legally fine and suggests the authors see themselves as building independent projects rather than contributing to the ecosystem they depend on.</p>

<p>The formula format is Turing-complete Ruby, which means the package definition and the client that interprets it are effectively the same artefact, and any move toward declarative package data has to either break the existing formulae or ship a Ruby evaluator as part of every client forever.</p>

<p>The <a href="https://formulae.brew.sh/api/formula.json">formula API</a> currently lists 8,308 formulae in homebrew-core and the <a href="https://formulae.brew.sh/api/cask.json">cask API</a> another 7,617 casks, plus <a href="https://github.com/search?q=homebrew-+in%3Aname+language%3Aruby&amp;type=repositories">roughly 34,000 <code class="language-plaintext highlighter-rouge">homebrew-*</code> Ruby repositories on GitHub</a> that look like third-party taps, all written against an internal DSL that was never meant to be a stable interchange format. The fast clients get to sidestep that problem by declaring it out of scope, which is a freedom the project they depend on doesn’t have.</p>

<p>The bottleneck isn’t Rust or Ruby, it’s the absence of a stable declarative package schema. Until that exists, every fast client is fast because Homebrew already did the slow work.</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="package-managers" /><category term="homebrew" /><summary type="html"><![CDATA[Rewriting the easy parts of Homebrew.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Common Package Specification</title><link href="https://nesbitt.io/2026/04/13/common-package-specification.html" rel="alternate" type="text/html" title="Common Package Specification" /><published>2026-04-13T10:00:00+00:00</published><updated>2026-04-13T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/13/common-package-specification</id><content type="html" xml:base="https://nesbitt.io/2026/04/13/common-package-specification.html"><![CDATA[<p>The <a href="https://cps-org.github.io/cps/">Common Package Specification</a> went stable in CMake 4.3 last year and the name caught my attention because it sounds like it might be addressing the cross-ecosystem dependency problem I’ve <a href="/2026/01/27/the-c-shaped-hole-in-package-management.html">written about before</a>. Reading the spec, the “common” turns out to mean common across build systems rather than common across language ecosystems: it’s a JSON format that CMake and Meson and autotools can all read to find out where an installed library lives and how to link against it, replacing the mix of <code class="language-plaintext highlighter-rouge">.pc</code> files and <code class="language-plaintext highlighter-rouge">*Config.cmake</code> scripts that currently fill that role.</p>

<p>The schema is full of include paths, preprocessor defines, link flags, component types like <code class="language-plaintext highlighter-rouge">dylib</code> and <code class="language-plaintext highlighter-rouge">archive</code> and <code class="language-plaintext highlighter-rouge">interface</code> for header-only libraries, and feature strings like <code class="language-plaintext highlighter-rouge">c++11</code> and <code class="language-plaintext highlighter-rouge">gnu</code>, which makes sense given it came out of Kitware and the C++ tooling study group and is being driven by people building large C++ applications who are tired of every build system having its own incompatible way of describing the same installed library.</p>

<p>Conan can already <a href="https://github.com/conan-io/conan/blob/develop/conan/cps/cps.py">generate CPS files</a> for everything in ConanCenter, and CMake’s <code class="language-plaintext highlighter-rouge">find_package()</code> reads them with fallback to the older formats, so libraries built through that toolchain will start leaving <code class="language-plaintext highlighter-rouge">.cps</code> files in install prefixes whether anyone outside the C++ world notices or not. Each one is a small structured record of an installed binary: its location on disk, its version, what other components it requires, what platform it was built for.</p>
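<p>A sketch of the shape of a <code class="language-plaintext highlighter-rouge">.cps</code> file, with field names approximated from the spec, so treat the details as illustrative rather than authoritative:</p>

```json
{
  "cps_version": "0.13.0",
  "name": "foo",
  "version": "1.2.3",
  "components": {
    "foo": {
      "type": "dylib",
      "location": "@prefix@/lib/libfoo.so"
    }
  }
}
```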

<p>For something like the <a href="https://github.com/ecosyste-ms/packages/issues/1261">binary dependency tracing</a> Vlad and I have been looking at, that’s a useful data source sitting alongside the symbol tables we’d be extracting anyway, particularly for the version field, which is the thing you can’t reliably recover from <code class="language-plaintext highlighter-rouge">nm</code> output and currently have to guess from filenames or distro package databases.</p>

<p>The closer fit is native extension builds in language package managers. Ruby’s mkmf has <a href="https://github.com/ruby/ruby/blob/master/lib/mkmf.rb"><code class="language-plaintext highlighter-rouge">pkg_config()</code> baked into it</a> and the <a href="https://github.com/ruby-gnome/pkg-config">pkg-config gem</a> reimplementing the format in pure Ruby has tens of millions of downloads, while node-gyp users shell out to <code class="language-plaintext highlighter-rouge">pkg-config</code> from <code class="language-plaintext highlighter-rouge">binding.gyp</code> action blocks to find headers and libraries at install time. These are doing exactly what CPS is designed to replace, and a CPS reader for mkmf would be a small piece of code, but the libraries that gems actually build against (libxml2, libpq, libsqlite3, openssl) ship <code class="language-plaintext highlighter-rouge">.pc</code> files because pkg-config has been around since 2000 and don’t yet ship <code class="language-plaintext highlighter-rouge">.cps</code> files because almost nothing outside CMake produces them.</p>

<p>There’s an <a href="https://github.com/cps-org/cps/issues/97">open proposal</a> to add a <code class="language-plaintext highlighter-rouge">package_url</code> field using purl identifiers so a CPS file could record which conan or vcpkg or distro package it came from, which would close a loop between the build-system world’s description format and the identifier scheme everything else has converged on.</p>

<p>Python has been moving on the adjacent problems independently, with <a href="https://peps.python.org/pep-0770/">PEP 770</a> reserving <code class="language-plaintext highlighter-rouge">.dist-info/sboms/</code> inside wheels for CycloneDX or SPDX documents describing bundled libraries, and auditwheel <a href="https://github.com/pypa/auditwheel/blob/main/src/auditwheel/sboms.py">already implementing it</a> by querying <code class="language-plaintext highlighter-rouge">dpkg</code> or <code class="language-plaintext highlighter-rouge">rpm</code> or <code class="language-plaintext highlighter-rouge">apk</code> at repair time to find which system package each grafted <code class="language-plaintext highlighter-rouge">.so</code> came from before writing the result as purls. CPS wouldn’t help here. Wheel consumers never compile anything, so what they need is provenance for what got bundled, and Python correctly reached for SBOM formats. The numpy 2.2.6 wheel I pulled to check still doesn’t have an SBOM in it despite the spec being accepted a year ago, which mostly tells you how long the tail is on rebuilding the world, and is part of why reconstructing this data from binaries after the fact stays useful even as the metadata standards land.</p>

<p><a href="https://peps.python.org/pep-0725/">PEP 725</a> declares <code class="language-plaintext highlighter-rouge">dep:generic/openssl</code> style requirements in <code class="language-plaintext highlighter-rouge">pyproject.toml</code> so build tools know what needs to be present before they start, using a purl-derived scheme that again has no relationship to CPS despite covering ground that pkg-config users would recognise.</p>

<p>None of these efforts reference each other much, which is roughly what you’d expect when the C dependency problem gets solved piecewise by whichever community hits it hardest, but the pieces are at least using compatible identifiers now, and a CPS file with a purl in it is something you could trace through to a PEP 770 SBOM entry without anyone having planned for that to work.</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="package-managers" /><summary type="html"><![CDATA[Not the cross-ecosystem format the name suggests.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Package Registries and Pagination</title><link href="https://nesbitt.io/2026/04/10/package-registries-and-pagination.html" rel="alternate" type="text/html" title="Package Registries and Pagination" /><published>2026-04-10T10:00:00+00:00</published><updated>2026-04-10T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/10/package-registries-and-pagination</id><content type="html" xml:base="https://nesbitt.io/2026/04/10/package-registries-and-pagination.html"><![CDATA[<p>Package registries return every version a package has ever published in a single response, with no way to ask for less. The API formats were designed ten to twenty years ago when packages had tens of versions, not thousands, and they haven’t changed even as the ecosystems grew by orders of magnitude around them.</p>

<p>npm’s registry API dates to 2010 when there were a few hundred packages on the registry. <code class="language-plaintext highlighter-rouge">registry.npmjs.org/vite</code> now returns 37MB of JSON for 725 versions (gzip brings that to 4.4MB over the wire, but it’s still 37MB to parse) because each version entry includes the full README (up to 64KB), every dependency, every maintainer, the full <code class="language-plaintext highlighter-rouge">package.json</code> as published, and CouchDB revision metadata. <code class="language-plaintext highlighter-rouge">typescript</code> is 15MB for 3,758 versions, and even <code class="language-plaintext highlighter-rouge">express</code> is 800KB. None of these responses carry pagination headers of any kind, no <code class="language-plaintext highlighter-rouge">Link</code>, no <code class="language-plaintext highlighter-rouge">X-Total-Count</code>, no <code class="language-plaintext highlighter-rouge">X-Per-Page</code>, just <code class="language-plaintext highlighter-rouge">Content-Type: application/json</code> and standard cache controls.</p>

<p>npm offers an abbreviated metadata format through an <code class="language-plaintext highlighter-rouge">Accept: application/vnd.npm.install-v1+json</code> header that strips READMEs and most metadata, shrinking vite from 37MB to about 2MB, but it’s still unpaginated and the slimmed-down response drops fields like publication timestamps that tools need for <a href="/2026/03/04/package-managers-need-to-cool-down">dependency cooldown periods</a>, forcing anything that implements cooldown back onto the full 37MB document.</p>
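<p>The switch between the two documents is pure content negotiation: same URL, different <code class="language-plaintext highlighter-rouge">Accept</code> header. A minimal Python sketch that only builds the request (no network call):</p>

```python
from urllib.request import Request

# Selecting the abbreviated document is pure content negotiation:
# same URL as the full metadata, different Accept header.
ABBREVIATED = "application/vnd.npm.install-v1+json"

def metadata_request(package: str, abbreviated: bool = True) -> Request:
    """Build a request for npm package metadata, optionally asking
    for the slimmed-down install document."""
    req = Request(f"https://registry.npmjs.org/{package}")
    if abbreviated:
        req.add_header("Accept", ABBREVIATED)
    return req

req = metadata_request("vite")
print(req.get_header("Accept"))  # application/vnd.npm.install-v1+json
```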

<p>The <a href="https://github.com/renovatebot/renovate/discussions/38341">Renovate project</a> found the hard ceiling when, at 10,451 versions, their package metadata exceeded 100MB and <code class="language-plaintext highlighter-rouge">npm publish</code> started returning <code class="language-plaintext highlighter-rouge">E406 Not Acceptable: Your package metadata is too large (100.01 MB &gt; 100 MB)</code>. The only fix was unpublishing old versions, which also broke their Docker image builds since those depended on the npm package being publishable.</p>

<p>PyPI’s Simple API has roots going back to 2003 with setuptools, and PEP 503 formalized it in 2015 when there were about 70,000 packages. <code class="language-plaintext highlighter-rouge">pypi.org/pypi/boto3/json</code> returns all 2,011 releases in a single 2.8MB JSON response, and the Simple API that pip actually uses for resolution (<code class="language-plaintext highlighter-rouge">/simple/boto3/</code>) lists every file for every version as HTML anchor elements on one page. PEP 691 modernized the format to JSON in 2022 but didn’t add pagination, and the discussion thread shows nobody even raised it as a possibility. The PEP explicitly rules out any change that would increase the number of HTTP requests an installer has to make.</p>

<p>Packagist returns all 1,261 versions of <code class="language-plaintext highlighter-rouge">laravel/framework</code> inline and has since 2012. RubyGems’ JSON API sends all 516 versions of <code class="language-plaintext highlighter-rouge">rails</code> in 465KB, a format largely unchanged since 2009. Hex, pub.dev, Maven Central’s <code class="language-plaintext highlighter-rouge">maven-metadata.xml</code>, and Hackage all work the same way, each dating to between 2005 and 2014.</p>

<p>Go’s module proxy, designed in 2019 with the benefit of hindsight, keeps its <code class="language-plaintext highlighter-rouge">/@v/list</code> endpoint as plain text with one version string per line, so 1,865 versions of <code class="language-plaintext highlighter-rouge">aws-sdk-go</code> come to just 16KB. Maven’s metadata XML is similarly minimal at 12KB for spring-core. When the format only stores version strings, the responses stay small regardless of how many versions accumulate.</p>
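<p>That minimalism shows up in client code too: parsing a <code class="language-plaintext highlighter-rouge">/@v/list</code> response is a couple of lines. A sketch against a canned response (the version strings are illustrative, and the numeric sort ignores Go’s pseudo-version suffixes):</p>

```python
# A /@v/list response is newline-separated version strings and
# nothing else; these values are illustrative.
sample = "v1.44.0\nv1.45.0\nv1.44.2\nv1.44.1\n"

def parse_version_list(body: str) -> list:
    """Parse a Go module proxy /@v/list body. The proxy doesn't
    promise any ordering, so sort by numeric components (naive:
    pre-release and pseudo-version suffixes aren't handled)."""
    versions = [line.strip() for line in body.splitlines() if line.strip()]
    return sorted(versions, key=lambda v: tuple(int(p) for p in v.lstrip("v").split(".")))

print(parse_version_list(sample)[-1])  # v1.45.0
```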

<p>NuGet’s V3 API, redesigned in 2015, is the only major registry that paginates version metadata on the server side, splitting versions into pages of 64 in its registration endpoint. Small packages get versions inlined in the index response while larger packages like <code class="language-plaintext highlighter-rouge">Microsoft.Extensions.DependencyInjection</code> (159 versions across 3 pages) return page pointers the client fetches separately. <a href="/2026/02/18/what-package-registries-could-borrow-from-oci">Docker Hub</a> also paginates tags at 100 per page with <code class="language-plaintext highlighter-rouge">next</code>/<code class="language-plaintext highlighter-rouge">previous</code> URLs in the response body. crates.io is halfway there: its versions API has a <code class="language-plaintext highlighter-rouge">meta</code> field with <code class="language-plaintext highlighter-rouge">total</code> and <code class="language-plaintext highlighter-rouge">next_page</code>, but for serde’s 315 versions it returns everything at once with <code class="language-plaintext highlighter-rouge">next_page: null</code>, and I haven’t found a crate large enough to trigger the second page.</p>
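<p>A client for that style of API is a small loop that follows <code class="language-plaintext highlighter-rouge">next_page</code> until it is null. A Python sketch against a hypothetical two-page response, since no real crate currently produces one:</p>

```python
def all_versions(fetch, path):
    """Drain a paginated versions endpoint by following meta.next_page
    until it's null, the shape crates.io's versions API describes.
    `fetch` is any callable mapping a path to decoded JSON."""
    versions = []
    while path:
        page = fetch(path)
        versions.extend(page["versions"])
        path = page["meta"].get("next_page")  # null on the last page
    return versions

# Hypothetical two-page response; real crates today fit in one page.
pages = {
    "/versions": {"versions": [{"num": "1.0.1"}, {"num": "1.0.0"}],
                  "meta": {"total": 3, "next_page": "/versions?page=2"}},
    "/versions?page=2": {"versions": [{"num": "0.9.0"}],
                         "meta": {"total": 3, "next_page": None}},
}
print([v["num"] for v in all_versions(pages.get, "/versions")])
# ['1.0.1', '1.0.0', '0.9.0']
```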

<p>The reason none of these registries paginate is that package managers need all versions visible at once to resolve dependency constraints. If <code class="language-plaintext highlighter-rouge">npm install</code> had to make ten round trips for every transitive dependency, installs would be painfully slow, so registries optimized for CDN cacheability instead: one canonical URL per package, one response, cache it at the edge. That trade-off made sense when the largest packages had a few dozen versions.</p>
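<p>The resolver’s need for the full list is easy to see in miniature: picking the winner for a caret range is a filter-and-max over every published version. A simplified sketch (real semver resolution also handles pre-releases and the <code class="language-plaintext highlighter-rouge">^0.x</code> special cases):</p>

```python
def parse(v):
    """Naive numeric parse of X.Y.Z; ignores pre-release tags."""
    return tuple(int(p) for p in v.split("."))

def resolve_caret(spec, versions):
    """Pick the highest version compatible with a ^X.Y.Z range
    (same major, >= the stated version), npm's default constraint."""
    base = parse(spec.lstrip("^"))
    candidates = [v for v in versions if parse(v)[0] == base[0] and parse(v) >= base]
    return max(candidates, key=parse) if candidates else None

published = ["5.4.11", "6.0.0", "6.0.1", "6.1.0", "7.0.0"]
print(resolve_caret("^6.0.0", published))  # 6.1.0
```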

<p>RubyGems’ Compact Index, Cargo’s sparse index, and Go’s <code class="language-plaintext highlighter-rouge">/@v/list</code> found a better path by stripping the response down to just what a resolver needs, serving it as a static file, and letting CDNs and HTTP range requests handle the rest. RubyGems’ compact index reduced dependency data from 202MB to 2.7MB compressed, and the responses stay small because they contain dependency metadata rather than everything a human might want to browse. npm and PyPI never made that split. When <code class="language-plaintext highlighter-rouge">npm install</code> fetches vite, it parses 37MB of READMEs, maintainer lists, and CouchDB revision history just to find out which version satisfies <code class="language-plaintext highlighter-rouge">^6.0.0</code>. Even gzipped, that metadata is eight times the size of the 522KB tarball it points to.</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="package-managers" /><category term="registries" /><summary type="html"><![CDATA[100MB of metadata for 10,451 versions.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Package Security Defenses for AI Agents</title><link href="https://nesbitt.io/2026/04/09/package-security-defenses-for-ai-agents.html" rel="alternate" type="text/html" title="Package Security Defenses for AI Agents" /><published>2026-04-09T10:00:00+00:00</published><updated>2026-04-09T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/09/package-security-defenses-for-ai-agents</id><content type="html" xml:base="https://nesbitt.io/2026/04/09/package-security-defenses-for-ai-agents.html"><![CDATA[<p>Yesterday I wrote about <a href="/2026/04/08/package-security-problems-for-ai-agents">the package 
security problems AI agents face</a>: typosquatting, registry poisoning, lockfile manipulation, install-time code execution, credential theft, and cascading failures through the dependency graph. Agents inherit all the old package security problems but resolve, install, and propagate faster than any human can review.</p>

<p>There’s no silver bullet for securing agent coding workflows because LLMs can’t reliably distinguish safe packages and metadata from malicious ones, but these defenses can reduce the blast radius when something gets through. Some of them introduce friction, but agents can absorb that friction better than humans.</p>

<h2 id="for-people-using-ai-coding-platforms">For people using AI coding platforms</h2>

<h3 id="disable-install-scripts-by-default">Disable install scripts by default</h3>

<p>npm has <code class="language-plaintext highlighter-rouge">--ignore-scripts</code>, pip has <code class="language-plaintext highlighter-rouge">--only-binary :all:</code> to refuse sdists and force wheels, but neither tool applies these flags by default. Agent platforms should ship with install scripts disabled and require explicit opt-in per package. The <code class="language-plaintext highlighter-rouge">postinstall</code> script is the single most common vector for malicious packages, and agents have no way to evaluate whether a script is legitimate. Bun already defaults to not running lifecycle scripts for installed dependencies.</p>

<h3 id="dependency-cooldown-periods">Dependency cooldown periods</h3>

<p>New package versions shouldn’t be installable by agents for some window after publication, maybe 24-72 hours. I wrote about <a href="/2026/03/04/package-managers-need-to-cool-down">cooldown support across package managers</a> in more detail last month. Most malicious packages are detected and removed within days of upload. A cooldown means agents only resolve versions that have survived initial community and automated review. npm’s provenance attestations help here but aren’t sufficient alone. This could be enforced at the registry level, the resolver level, or the AI coding platform level.</p>
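<p>At the resolver level, a cooldown is just a filter over publication timestamps, which npm exposes in the <code class="language-plaintext highlighter-rouge">time</code> field of the full metadata document. A minimal sketch with illustrative data:</p>

```python
from datetime import datetime, timedelta, timezone

def cooled_down(versions, min_age=timedelta(hours=72), now=None):
    """Filter a {version: ISO-8601 publish time} mapping down to
    versions that have survived the cooldown window."""
    now = now or datetime.now(timezone.utc)
    return [v for v, published in versions.items()
            if now - datetime.fromisoformat(published) >= min_age]

times = {
    "2.0.0": "2026-04-20T09:00:00+00:00",  # an hour old: still in cooldown
    "1.9.3": "2026-04-01T09:00:00+00:00",  # nineteen days old: installable
}
now = datetime(2026, 4, 20, 10, 0, tzinfo=timezone.utc)
print(cooled_down(times, now=now))  # ['1.9.3']
```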

<h3 id="sandbox-package-installation">Sandbox package installation</h3>

<p>Agents should install packages in isolated environments with no network access after the download phase and no access to credentials, SSH keys, or environment variables. Container-based sandboxes or something like Landlock on Linux would work here, where the install step gets network access to fetch packages but everything after that runs without it. Even if a malicious install script executes, it can’t reach anything worth stealing.</p>
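<p>The environment-variable leg of that isolation is cheap to sketch: run the install step with an explicit allowlist of variables instead of the inherited shell environment. The allowlist below is illustrative, and network and filesystem isolation still need a container or Landlock on top:</p>

```python
import os
import subprocess
import sys

SAFE_VARS = {"PATH", "HOME", "LANG"}  # whatever the installer minimally needs

def run_scrubbed(cmd):
    """Run an install command with an allowlisted environment so a
    malicious install script can't read tokens or cloud credentials.
    This covers env vars only; network/filesystem isolation is separate."""
    env = {k: v for k, v in os.environ.items() if k in SAFE_VARS}
    return subprocess.run(cmd, env=env, capture_output=True, text=True)

# Demonstrate with the current interpreter standing in for an installer:
os.environ["FAKE_NPM_TOKEN"] = "hunter2"  # hypothetical leaked credential
probe = run_scrubbed([sys.executable, "-c",
                      "import os; print('FAKE_NPM_TOKEN' in os.environ)"])
print(probe.stdout.strip())  # False
```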

<h3 id="limit-which-registries-agents-can-resolve-from">Limit which registries agents can resolve from</h3>

<p>Agent configurations should support an allowlist of registries and scopes. An agent that only needs packages from your company’s private registry and a handful of vetted public packages shouldn’t be able to resolve arbitrary names from npm or PyPI. Companies already do this in their CI pipelines to prevent dependency confusion, and agents need the same treatment.</p>
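<p>One way to sketch such a gate in Python, with a hypothetical private registry host and a hypothetical vetted npm scope standing in for real configuration:</p>

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"registry.internal.example", "registry.npmjs.org"}  # hypothetical
VETTED_SCOPES = {"@mycorp"}  # hypothetical vetted scope on the public registry

def resolution_allowed(registry_url, package):
    """Allow resolution only from allowlisted registries; names on the
    public registry must additionally sit inside a vetted scope."""
    host = urlparse(registry_url).hostname
    if host not in ALLOWED_HOSTS:
        return False
    if host == "registry.npmjs.org":
        return package.split("/")[0] in VETTED_SCOPES
    return True

print(resolution_allowed("https://registry.npmjs.org", "left-pad"))  # False
```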

<h3 id="pin-and-verify-lockfiles">Pin and verify lockfiles</h3>

<p>Agents should never regenerate a lockfile unless explicitly asked to. If a lockfile exists, the agent should install from it exactly. If the agent’s task requires adding a new dependency, it should produce the lockfile diff for review rather than installing and continuing. Tools like lockfile-lint should run as a gate before any agent-modified lockfile is accepted.</p>
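<p>The check itself is a dictionary diff: anything resolved that isn’t pinned, or is pinned at a different version, becomes a review item instead of an install. A minimal sketch with illustrative data:</p>

```python
def lockfile_violations(lock, resolved):
    """Compare what the agent actually resolved against the lockfile:
    any new name or changed version is surfaced for human review
    rather than installed silently."""
    problems = []
    for name, version in resolved.items():
        pinned = lock.get(name)
        if pinned is None:
            problems.append(f"new dependency: {name}@{version}")
        elif pinned != version:
            problems.append(f"{name}: lockfile pins {pinned}, resolved {version}")
    return problems

lock = {"express": "4.18.2", "lodash": "4.17.21"}
resolved = {"express": "4.18.3", "lodash": "4.17.21", "crypto-utils": "1.0.0"}
for problem in lockfile_violations(lock, resolved):
    print(problem)
```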

<h3 id="require-package-provenance">Require package provenance</h3>

<p>Where registries support it (npm with sigstore, PyPI with Trusted Publishers), AI coding platforms should default to requiring provenance attestation. Packages without provenance get flagged or blocked. This doesn’t prevent all supply chain attacks but it makes account takeover and registry compromise harder.</p>

<h3 id="scope-agent-permissions-to-the-task">Scope agent permissions to the task</h3>

<p>An agent updating a README doesn’t need <code class="language-plaintext highlighter-rouge">npm install</code> permissions, and one running tests doesn’t need network access. Agent platforms should support task-scoped permission profiles rather than giving every agent the same broad access, covering both what packages an agent can install and what those packages can do once installed.</p>

<h3 id="treat-agent-tool-metadata-as-untrusted-input">Treat agent tool metadata as untrusted input</h3>

<p>MCP server descriptions, agent cards, skill descriptors, and plugin manifests should be treated as untrusted input, not as instructions. Agent platforms should parse metadata into structured fields and reject or sanitize freeform text before it reaches the LLM context.</p>

<h3 id="monitor-agent-dependency-behavior">Monitor agent dependency behavior</h3>

<p>Log every package install, version resolution, and registry query an agent makes, and diff these against expected behavior for the task. If an agent asked to fix a CSS bug runs <code class="language-plaintext highlighter-rouge">npm install crypto-utils</code>, that should page someone the same way an unexpected outbound network connection would in production. If an agent resolves a package version different from what’s in the lockfile, the task should halt and wait for human approval. Traditional package security tooling already surfaces these signals but most AI coding platforms don’t wire them into their agent workflows.</p>
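<p>The core of that monitoring is comparing observed install events against what the task should plausibly need. A toy sketch (a real platform would page someone rather than log):</p>

```python
import logging

logging.basicConfig(level=logging.WARNING, format="%(levelname)s %(message)s")

def audit_installs(task, expected, observed):
    """Flag any install event not implied by the task's expected
    package set; returns the unexpected names for the caller to halt on."""
    unexpected = [pkg for pkg in observed if pkg not in expected]
    for pkg in unexpected:
        logging.warning("task %r installed unexpected package %r", task, pkg)
    return unexpected

print(audit_installs("fix CSS bug", {"postcss"}, ["postcss", "crypto-utils"]))
# ['crypto-utils']
```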

<p>Failed installs matter too. When an agent tries to install a package that doesn’t exist, that’s likely a hallucinated name, and those names are <a href="/2025/12/10/slopsquatting-meets-dependency-confusion">slopsquatting</a> targets. Registries and AI coding platforms that log failed resolution attempts have an early warning system for which package names attackers should be racing to register.</p>

<h3 id="namespace-reservation-for-agent-ecosystems">Namespace reservation for agent ecosystems</h3>

<p>MCP server registries, A2A discovery services, and skill marketplaces should implement namespace reservation and verification, the way npm has org scopes and PyPI has verified publishers. Unverified packages in agent-specific namespaces should carry visible warnings, and agents should be configurable to reject unverified sources entirely.</p>

<h2 id="for-people-designing-ai-coding-platforms">For people designing AI coding platforms</h2>

<h3 id="your-agents-dependency-resolver-is-a-security-boundary">Your agent’s dependency resolver is a security boundary</h3>

<p>Every time your agent runs a package install, it’s making a trust decision. Treat the resolver the same way you’d treat an authentication system: define what it’s allowed to do, log what it actually does, and fail closed when something unexpected happens. If your agent can install arbitrary packages from public registries without approval, you’ve given the internet write access to your execution environment.</p>

<h3 id="separate-the-package-installation-phase-from-the-execution-phase">Separate the package installation phase from the execution phase</h3>

<p>Don’t let agents install and run in a single step. The install phase should fetch and verify packages against an allowlist or lockfile, and the execution phase should run in a sandboxed environment built from what was installed. You don’t <code class="language-plaintext highlighter-rouge">npm install</code> at runtime in production, and your agent shouldn’t either.</p>

<h3 id="design-for-the-agent-not-knowing-what-it-doesnt-know">Design for the agent not knowing what it doesn’t know</h3>

<p>A human developer might hesitate before installing a package they’ve never heard of, but an agent will install whatever it thinks solves the task. Require packages to come from a vetted list, flag new dependencies for human review, and reject packages below a popularity or age threshold.</p>

<h3 id="treat-every-mcp-server-and-plugin-as-a-dependency">Treat every MCP server and plugin as a dependency</h3>

<p>If your system connects to MCP servers, installs skills, or loads plugins, those are dependencies with the same risk profile as npm packages. Pin versions, verify provenance where possible, and audit what they do at install and runtime. Calling them “tools” or “skills” instead of “packages” doesn’t change the threat model.</p>

<h3 id="dont-give-agents-ambient-credentials">Don’t give agents ambient credentials</h3>

<p>Agents that inherit the developer’s shell environment get their SSH keys, API tokens, cloud credentials, and registry auth tokens, and a malicious package installed by the agent can read all of it. Provision agents with scoped, short-lived credentials that only cover what the current task requires. If your agent doesn’t need to push to a registry, it shouldn’t have a registry auth token in its environment.</p>

<h3 id="assume-your-agent-will-be-prompted-to-install-something-malicious">Assume your agent will be prompted to install something malicious</h3>

<p>Attackers will try to get your agent to install a bad package, and sometimes they’ll succeed. Design your system so that a single malicious install can’t exfiltrate credentials, can’t persist across tasks, can’t modify other agents’ environments, and can’t propagate to downstream systems. The blast radius of a compromised dependency should be one sandboxed task.</p>

<h3 id="build-a-dependency-audit-trail">Build a dependency audit trail</h3>

<p>Every package your agent installs, every version it resolves, every registry it queries should be logged and attributable to a specific task. When something goes wrong, you need to answer: which agent installed this, when, why, and what else did it touch? Traditional SCA tools can scan the result, but you also need the provenance of how that result was assembled, the same way you’d want reproducible builds.</p>

<h3 id="dont-forget-about-dependencies-after-installation">Don’t forget about dependencies after installation</h3>

<p>Agents are good at installing packages and bad at revisiting them. A dependency an agent pulled in six months ago to fix a one-off task is still in the tree, still getting loaded, and nobody has checked whether it’s been flagged since. Human developers at least occasionally see Dependabot PRs or hear about compromised packages through the grapevine. Agents don’t have a grapevine. If your platform lets agents add dependencies, it also needs a mechanism for surfacing when those dependencies go stale, get deprecated, or turn up in vulnerability databases.</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="security" /><category term="package-managers" /><category term="ai" /><summary type="html"><![CDATA[Lockfiles, sandboxes, and cooldown timers.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Package Security Problems for AI Agents</title><link href="https://nesbitt.io/2026/04/08/package-security-problems-for-ai-agents.html" rel="alternate" type="text/html" title="Package Security Problems for AI Agents" /><published>2026-04-08T10:00:00+00:00</published><updated>2026-04-08T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/08/package-security-problems-for-ai-agents</id><content type="html" xml:base="https://nesbitt.io/2026/04/08/package-security-problems-for-ai-agents.html"><![CDATA[<p>I went through the recent <a href="https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/">OWASP Top 10 for Agentic Applications</a> and pulled out the scenarios related to package management, which turn up in all ten categories and don’t sort neatly into any one of them, since a typosquatted MCP server is simultaneously a name attack, a registry attack, and a metadata poisoning 
vector.</p>

<h3 id="package-name-attacks">Package name attacks</h3>

<p>Typosquatting and namespace confusion are some of the oldest problems in package security. Agents make them worse because they resolve packages programmatically, without a human glancing at the name and noticing something is off.</p>

<ul>
  <li>An attacker registers an MCP server package on npm or PyPI with a name one character off from a popular one, and when an agent dynamically discovers and installs tools, it resolves the typosquatted package instead, treating it as legitimate.</li>
  <li>A malicious tool package named <code class="language-plaintext highlighter-rouge">report</code> gets resolved before the legitimate <code class="language-plaintext highlighter-rouge">report_finance</code> because of how the agent’s tool registry handles namespace collisions, causing misrouted queries and unintended data disclosure.</li>
  <li>LLMs hallucinating package names during code generation create install targets that don’t exist yet, and attackers can register those names on PyPI or npm with malicious payloads. I wrote about <a href="/2025/12/10/slopsquatting-meets-dependency-confusion">slopsquatting</a> in more detail last year.</li>
</ul>
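<p>The first of these is the most mechanical to screen for: a name very close to, but not equal to, a popular package is suspicious on its face. A toy sketch using edit similarity, with a stand-in popularity list:</p>

```python
from difflib import SequenceMatcher

POPULAR = {"requests", "numpy", "lodash", "express"}  # stand-in popularity list

def typosquat_suspects(name, threshold=0.85):
    """Return popular packages suspiciously similar to (but not equal
    to) the given name, the basic shape of a typosquat check."""
    return [p for p in POPULAR
            if p != name and SequenceMatcher(None, name, p).ratio() >= threshold]

print(typosquat_suspects("reqeusts"))  # ['requests']
```

<p>The threshold is a tuning knob: too low and common short names collide, too high and transposition typos slip through, which is why real registries pair this with download-count and age signals.</p>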

<h3 id="registry-and-repository-attacks">Registry and repository attacks</h3>

<p>MCP servers, agent skills, and plugins are distributed through the same registries as traditional packages: npm, PyPI, crates.io, and platform-specific marketplaces. The registry trust problems that package managers have dealt with for years (compromised maintainer accounts, malicious uploads, manifest confusion) apply directly.</p>

<ul>
  <li>A compromised package registry serves signed-looking manifests, plugins, or agent descriptors containing tampered components, and because orchestration systems trust the registry, the poisoned artifacts distribute widely before anyone notices.</li>
  <li>The first <a href="https://snyk.io/blog/malicious-mcp-server-on-npm-postmark-mcp-harvests-emails/">in-the-wild malicious MCP server</a> was published as an npm package impersonating Postmark’s email service, secretly BCC’ing all emails to the attacker while agents that installed it had no indication anything was wrong.</li>
  <li>Agent discovery services like A2A function as new package registries, and they inherit the same problems: an attacker can register a fake peer using a cloned schema to intercept coordination traffic between legitimate agents, the same way you’d squat a package name on a public registry.</li>
  <li>Agent cards (the <code class="language-plaintext highlighter-rouge">/.well-known/agent.json</code> file) are package metadata by another name. A rogue peer can advertise exaggerated capabilities in its card, causing host agents to route sensitive requests through an attacker-controlled endpoint, analogous to a package claiming false capabilities in its manifest.</li>
</ul>

<h3 id="metadata-and-descriptor-poisoning">Metadata and descriptor poisoning</h3>

<p>Package metadata has always been a trust boundary: manifest confusion (where published metadata doesn’t match actual package contents) and starjacking (where a package claims association with a popular repo through its metadata) are established attacks. Agent tooling adds a new dimension because agents interpret metadata as instructions, not just data.</p>

<ul>
  <li>Hidden instructions embedded in an MCP server’s published package metadata get interpreted by the host agent as trusted guidance. In one <a href="https://invariantlabs.ai/blog/mcp-security-notification-tool-poisoning-attacks">demonstrated case</a>, a malicious MCP tool package hid commands in its descriptor that caused the assistant to exfiltrate private repo data when invoked.</li>
  <li>Package READMEs processed through RAG can contain hidden instruction payloads that silently redirect an agent to misuse connected tools or send data to external endpoints. The README is package metadata that traditional security tooling rarely inspects for malicious content.</li>
  <li>A popular RAG plugin distributed as a package and fetching context from a third-party indexer can be gradually poisoned by seeding the indexer with crafted entries, biasing the agent over time until it starts exfiltrating sensitive information during normal use.</li>
</ul>

<h3 id="dependency-resolution-and-lockfile-attacks">Dependency resolution and lockfile attacks</h3>

<p>Lockfile manipulation and pinning evasion are well-understood supply chain attacks. Agents amplify them because they routinely regenerate lockfiles, install fresh dependencies, and resolve versions without comparing against a known-good baseline.</p>

<ul>
  <li>An agent regenerating a lockfile from unpinned dependency specs during a “fix build” task in an ephemeral sandbox will resolve fresh versions, potentially pulling in a backdoored minor release that wasn’t in the original lockfile.</li>
  <li>Agents running automated dependency updates or vibe-coding sessions install packages without verifying them against a known-good lockfile. A coding agent with auto-approved tools that runs <code class="language-plaintext highlighter-rouge">npm install</code> or <code class="language-plaintext highlighter-rouge">pip install</code> can be manipulated into resolving a different version than a human developer would have chosen, or into installing an entirely new dependency that runs hostile code at install time.</li>
</ul>

<h3 id="install-time-and-import-time-code-execution">Install-time and import-time code execution</h3>

<p>Install scripts (<code class="language-plaintext highlighter-rouge">postinstall</code> in npm, <code class="language-plaintext highlighter-rouge">setup.py</code> in pip) have been the primary vector for malicious packages for years. The OpenSSF Package Analysis project exists largely to detect this pattern. Agents make it worse because they run installs with broader permissions and less scrutiny than a developer at a terminal.</p>

<ul>
  <li>Malicious package installs escalate beyond a supply-chain compromise when hostile code executes during installation or import with whatever permissions the agent has, which are often broad because the agent needs filesystem and network access to do its job. A developer running <code class="language-plaintext highlighter-rouge">npm install</code> might notice a suspicious <code class="language-plaintext highlighter-rouge">postinstall</code> script in their terminal output. An agent running the same command as part of a “fix build” or “patch server” task won’t.</li>
  <li>During automated dependency updates or self-repair tasks, agents run unreviewed <code class="language-plaintext highlighter-rouge">npm install</code> or <code class="language-plaintext highlighter-rouge">pip install</code> commands, and any package with a malicious install script executes with the agent’s full permissions before any human sees what happened. The attack surface here is identical to traditional install-script malware, but the window between install and detection is wider because no one is watching.</li>
</ul>

<h3 id="credential-and-secret-leakage-through-packages">Credential and secret leakage through packages</h3>

<p>Malicious packages exfiltrating credentials at install time is a well-documented pattern across npm, PyPI, and RubyGems. Agents widen the blast radius because they often hold more credentials than a typical developer environment and install packages without human review.</p>

<ul>
  <li>The <a href="https://www.stepsecurity.io/blog/supply-chain-security-alert-popular-nx-build-system-package-compromised-with-data-stealing-malware">poisoned nx/debug release</a> on npm was automatically installed by coding agents, enabling a hidden backdoor that exfiltrated SSH keys and API tokens. The compromise propagated across agentic workflows because no human reviewed the install, turning a single malicious package release into a supply-chain breach that moved faster than traditional incident response could track.</li>
  <li>Agents that install MCP server packages or plugins grant those packages access to environment variables, API keys, and filesystem paths. A malicious package published under a plausible name can harvest credentials the same way traditional supply chain attacks do, but with access to whatever the agent is authorized to use.</li>
</ul>

<h3 id="cascading-failures-through-the-dependency-graph">Cascading failures through the dependency graph</h3>

<p>Cascading breakage from a single bad release is a familiar problem in package management. When left-pad was unpublished from npm in 2016, thousands of builds broke within hours. When colors.js shipped a sabotaged release in 2022, projects that pinned loosely picked it up automatically. In agent systems the dependency graph includes not just code packages but MCP servers, plugins, and peer agents, and the propagation is faster because agents resolve, install, and deploy without waiting for a human to notice something is wrong.</p>

<ul>
  <li>A poisoned or faulty package release pulled by an orchestrator agent propagates automatically to all connected agents, amplifying the breach beyond its origin. In traditional package management a developer might notice a broken build and pin a version. An agent with auto-approved installs just keeps going, and every downstream agent that depends on the orchestrator’s output inherits the compromised dependency.</li>
  <li>When two or more agents rely on each other’s outputs they create a feedback loop that magnifies initial errors. A bad dependency update in one agent’s package tree compounds through the loop: agent A installs a corrupted package, produces bad output, agent B consumes that output and makes decisions based on it, and the error amplifies with each cycle until the system is producing nonsense at scale.</li>
</ul>

<h3 id="skill-and-plugin-installation">Skill and plugin installation</h3>

<p>Agent coding platforms have their own packaging systems for skills, plugins, hooks, and extensions, and these turn out to have the same vulnerabilities that traditional package managers spent years learning about. OpenClaw, which has accumulated <a href="https://days-since-openclaw-cve.com/">238 CVEs since February 2026</a>, provides the perfect case study. Malicious skill archives can use path traversal sequences to write files outside the intended installation directory during <code class="language-plaintext highlighter-rouge">skills install</code> or <code class="language-plaintext highlighter-rouge">hooks install</code> (<a href="https://nvd.nist.gov/vuln/detail/CVE-2026-28486">CVE-2026-28486</a>, <a href="https://nvd.nist.gov/vuln/detail/CVE-2026-28453">CVE-2026-28453</a>), and the skill frontmatter <code class="language-plaintext highlighter-rouge">name</code> field gets interpolated into file paths unsanitized during sandbox mirroring (<a href="https://nvd.nist.gov/vuln/detail/CVE-2026-28457">CVE-2026-28457</a>). Scoped plugin package names containing <code class="language-plaintext highlighter-rouge">..</code> can escape the extensions directory entirely (<a href="https://nvd.nist.gov/vuln/detail/CVE-2026-28447">CVE-2026-28447</a>).</p>

<p>OpenClaw also auto-discovers and loads plugins from <code class="language-plaintext highlighter-rouge">.OpenClaw/extensions/</code> without verifying trust, so cloning a repository that includes a crafted workspace plugin runs arbitrary code the moment the agent starts (<a href="https://nvd.nist.gov/vuln/detail/CVE-2026-32920">CVE-2026-32920</a>). Hook module paths passed to dynamic <code class="language-plaintext highlighter-rouge">import()</code> aren’t constrained, giving anyone with config access a code execution primitive (<a href="https://nvd.nist.gov/vuln/detail/CVE-2026-28456">CVE-2026-28456</a>). The exec allowlist trusts writable package-manager directories like <code class="language-plaintext highlighter-rouge">/opt/homebrew/bin</code> and <code class="language-plaintext highlighter-rouge">/usr/local/bin</code> by default, so an attacker who can write to those paths (which is anyone who can run <code class="language-plaintext highlighter-rouge">brew install</code> or <code class="language-plaintext highlighter-rouge">pip install --user</code>) can plant a trojan binary that the allowlist treats as safe (<a href="https://nvd.nist.gov/vuln/detail/CVE-2026-32009">CVE-2026-32009</a>). Environment variables like <code class="language-plaintext highlighter-rouge">NODE_OPTIONS</code> or <code class="language-plaintext highlighter-rouge">LD_PRELOAD</code> injected through config execute arbitrary code at gateway startup (<a href="https://nvd.nist.gov/vuln/detail/CVE-2026-22177">CVE-2026-22177</a>).</p>
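<p>The archive path-traversal class at least has a well-known defense: resolve every member path against the destination before extracting anything. A sketch for zip archives in Python:</p>

```python
import os
import zipfile

def safe_extract(archive: zipfile.ZipFile, dest: str) -> None:
    """Reject any archive member that would resolve outside the
    destination directory, then extract. This is the guard that
    path-traversal bugs in skill/plugin installers omit."""
    dest = os.path.realpath(dest)
    for member in archive.namelist():
        target = os.path.realpath(os.path.join(dest, member))
        if os.path.commonpath([dest, target]) != dest:
            raise ValueError(f"blocked path traversal: {member!r}")
    archive.extractall(dest)
```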

<p>These are familiar problems if you’ve worked on package manager security: path traversal in archives, untrusted input in file paths, auto-loading from working directories, trusting mutable filesystem locations. Agent coding platforms are rebuilding package management from scratch and rediscovering the same bugs. The difference is that the old bugs played out over hours or days, gated by humans reviewing installs, noticing broken builds, and pinning versions. Agents compress that timeline. They resolve, install, execute, and propagate before anyone is in the loop, with broader permissions than a developer typically has.</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="security" /><category term="package-managers" /><category term="ai" /><category term="reference" /><summary type="html"><![CDATA[Packages all the way down, agents all the way up.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Who Built This?</title><link href="https://nesbitt.io/2026/04/07/who-built-this.html" rel="alternate" type="text/html" title="Who Built This?" 
/><published>2026-04-07T10:00:00+00:00</published><updated>2026-04-07T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/07/who-built-this</id><content type="html" xml:base="https://nesbitt.io/2026/04/07/who-built-this.html"><![CDATA[<p>Michael Stapelberg <a href="https://michael.stapelberg.ch/posts/2026-04-05-stamp-it-all-programs-must-report-their-version/">wrote last week</a> about Go’s automatic VCS stamping: since Go 1.18, every binary built from a git checkout embeds the commit hash, timestamp, and dirty flag, queryable with <code class="language-plaintext highlighter-rouge">go version -m</code> or <code class="language-plaintext highlighter-rouge">runtime/debug.ReadBuildInfo()</code> at runtime. His argument is that every program should do this, so you can always answer “what version is running in production?” without guessing. Go is unusual in doing this by default, and the rest of the <a href="/2026/01/03/the-package-management-landscape.html">package management landscape</a> varies wildly in how it handles this, if it handles it at all.</p>

<h2 id="compiled-languages">Compiled languages</h2>

<p>Rust’s Cargo has <a href="https://github.com/rust-lang/cargo/issues/5629">an open issue</a> proposing that <code class="language-plaintext highlighter-rouge">cargo package</code> record the git commit hash in published crates, but nothing has been accepted beyond a <code class="language-plaintext highlighter-rouge">.cargo_vcs_info.json</code> file in the packaged crate. The conventional approach is therefore a <code class="language-plaintext highlighter-rouge">build.rs</code> script that uses crates like <a href="https://github.com/rustyhorde/vergen">vergen</a> or <a href="https://github.com/baoyachi/shadow-rs">shadow-rs</a> to emit <code class="language-plaintext highlighter-rouge">cargo:rustc-env</code> directives, which become compile-time environment variables readable with <code class="language-plaintext highlighter-rouge">env!()</code>. You get the SHA, branch, timestamp, and dirty flag, but you have to opt in, wire it up, and expose it through a <code class="language-plaintext highlighter-rouge">--version</code> flag or similar, and there’s no way to inspect an arbitrary Rust binary externally.</p>
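<p>You don’t strictly need a crate for the simple case. A hand-rolled <code>build.rs</code> along these lines (assuming <code>git</code> is on the PATH; <code>GIT_SHA</code> is a name I’ve made up) does the minimum that vergen automates:</p>

```rust
// build.rs — stamp the crate with its git commit SHA at compile time.
use std::process::Command;

/// The current commit SHA, or "unknown" when built outside a checkout.
fn git_sha() -> String {
    Command::new("git")
        .args(["rev-parse", "HEAD"])
        .output()
        .ok()
        .filter(|o| o.status.success())
        .map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
        .unwrap_or_else(|| "unknown".into())
}

fn main() {
    // Becomes a compile-time env var in the crate, readable with env!("GIT_SHA").
    println!("cargo:rustc-env=GIT_SHA={}", git_sha());
    // Re-run the build script when HEAD moves.
    println!("cargo:rerun-if-changed=.git/HEAD");
}
```

<p>Inside the crate, something like <code>concat!(env!("CARGO_PKG_VERSION"), "+", env!("GIT_SHA"))</code> then gives a <code>--version</code> string in the same spirit as SourceLink’s <code>1.0.0+60002d50a...</code>.</p>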

<p><a href="https://github.com/dotnet/sourcelink">SourceLink</a>, now built into the .NET SDK, makes .NET the closest to Go’s approach. It sets <code class="language-plaintext highlighter-rouge">AssemblyInformationalVersion</code> to something like <code class="language-plaintext highlighter-rouge">1.0.0+60002d50a...</code>, embedding the full commit SHA alongside the repository URL for debugger source fetching. <a href="https://github.com/adamralph/minver">MinVer</a> derives the version entirely from git tags with no configuration file, and <a href="https://github.com/GitTools/GitVersion">GitVersion</a> computes semver from branch topology. It’s opt-in, but the tooling is mature enough that a .NET developer who wants stamping can get it with a single package reference and no build script.</p>

<p>Java’s ecosystem relies on <a href="https://github.com/git-commit-id/git-commit-id-maven-plugin">git-commit-id-maven-plugin</a>, which generates a <code class="language-plaintext highlighter-rouge">git.properties</code> file and can inject metadata into <code class="language-plaintext highlighter-rouge">META-INF/MANIFEST.MF</code>. Spring Boot’s actuator <code class="language-plaintext highlighter-rouge">/info</code> endpoint reads <code class="language-plaintext highlighter-rouge">git.properties</code> automatically, which means a lot of Spring Boot applications in production actually do have VCS info available, even if the developers who configured it don’t think of it as “stamping.” You can inspect a JAR’s manifest with <code class="language-plaintext highlighter-rouge">unzip -p foo.jar META-INF/MANIFEST.MF</code>, and <code class="language-plaintext highlighter-rouge">Package.getImplementationVersion()</code> reads it at runtime, though without the plugin you get whatever the maintainer put in the POM version field and nothing else. Gradle has equivalents, and sbt needs two plugins (<a href="https://github.com/sbt/sbt-buildinfo">sbt-buildinfo</a> plus <a href="https://github.com/sbt/sbt-git">sbt-git</a>) to get the same result.</p>

<p>Swift Package Manager has no stamping mechanism at all, and a third-party <a href="https://github.com/DimaRU/PackageBuildInfo">PackageBuildInfo</a> plugin that shells out to git during the build is about all that exists. SwiftPM has a registry protocol (SE-0292, SE-0391) and private registries exist, but there’s no public centralized registry and most packages still resolve directly from git repositories, so the VCS metadata is right there at build time. It clones the repo, checks out the tagged commit, and then throws away everything except the source files. Of all the compiled language toolchains, SwiftPM would have the easiest time stamping and yet doesn’t.</p>

<p>Bazel’s <code class="language-plaintext highlighter-rouge">--workspace_status_command</code> flag runs a user-provided script that prints key-value pairs. Keys prefixed <code class="language-plaintext highlighter-rouge">STABLE_</code> invalidate the build cache when they change; others are “volatile” and stale values may be used without triggering a rebuild. The mechanism is powerful and built-in, but the documentation is notoriously confusing and the stable-vs-volatile distinction trips people up regularly.</p>
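<p>The script itself is ordinary; something like this sketch (the path and key names are whatever you choose), wired up with <code>bazel build --workspace_status_command=tools/status.sh --stamp</code>:</p>

```shell
#!/bin/sh
# Bazel workspace status script: prints "KEY value" pairs for stamped rules.
# STABLE_ keys invalidate stamped outputs when they change...
echo "STABLE_GIT_COMMIT $(git rev-parse HEAD)"
echo "STABLE_GIT_TREE_STATE $(git diff --quiet && echo clean || echo dirty)"
# ...while volatile keys like this one may be reused stale without a rebuild.
echo "BUILD_TIMESTAMP $(date +%s)"
```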

<h2 id="interpreted-languages">Interpreted languages</h2>

<p>For interpreted languages, “stamping” means something slightly different, since there’s no compiled binary to embed data in: can you determine what version or commit an installed package came from at runtime?</p>

<p>Composer’s <code class="language-plaintext highlighter-rouge">InstalledVersions::getReference('vendor/pkg')</code>, available since version 2.0, returns the git commit SHA of every installed PHP package, backed by <code class="language-plaintext highlighter-rouge">vendor/composer/installed.json</code>. This works for both source and dist installs because Packagist records the commit SHA that each tag points to in its API metadata, and Composer preserves it through the lock file into runtime. No other interpreted language package manager preserves this much VCS metadata with this little configuration.</p>

<p>Python’s <code class="language-plaintext highlighter-rouge">importlib.metadata.version('pkg')</code> gives you the version string but no VCS revision unless you use <a href="https://github.com/pypa/setuptools-scm">setuptools-scm</a> or similar to bake the commit hash in at build time. PEP 610 specifies a <code class="language-plaintext highlighter-rouge">direct_url.json</code> for packages installed directly from VCS, which records the commit hash, but anything installed from PyPI lost its git SHA when the sdist or wheel was built. npm, pnpm, and Yarn make <code class="language-plaintext highlighter-rouge">package.json</code> version available at runtime but nothing more; npm provenance attestations link published packages to specific commits via Sigstore, though that’s registry metadata rather than something embedded in the package itself. RubyGems exposes version at runtime through <code class="language-plaintext highlighter-rouge">Gem::Specification</code> and the gemspec <code class="language-plaintext highlighter-rouge">metadata</code> hash allows arbitrary keys, but there’s no standard field for git SHA and no convention for using one.</p>
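<p>As a sketch of what Python’s runtime story looks like in practice (the helper name is mine; <code>direct_url.json</code> is the PEP 610 file, present only for direct and VCS installs):</p>

```python
import json
from importlib import metadata

def package_origin(name: str) -> dict:
    """Version always; the VCS commit only if PEP 610 recorded one."""
    dist = metadata.distribution(name)
    origin = {"version": dist.version, "vcs_revision": None}
    raw = dist.read_text("direct_url.json")  # None for normal PyPI installs
    if raw:
        origin["vcs_revision"] = json.loads(raw).get("vcs_info", {}).get("commit_id")
    return origin
```

<p>For anything installed from PyPI, <code>vcs_revision</code> comes back <code>None</code>, which is exactly the gap setuptools-scm fills by baking the SHA into the version string at build time.</p>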

<h2 id="system-package-managers">System package managers</h2>

<p>dpkg stores package version (e.g. <code class="language-plaintext highlighter-rouge">1.2.3-1</code>) queryable with <code class="language-plaintext highlighter-rouge">dpkg -s</code>, and a <code class="language-plaintext highlighter-rouge">Vcs-Git</code> field exists in source package metadata, but that field never propagates to installed binary packages. RPM actually has a dedicated <code class="language-plaintext highlighter-rouge">VCS</code> tag (tag 5034) that can store the upstream repository URL and potentially a commit SHA, but most Fedora RPMs don’t bother setting it.</p>

<p>Arch’s pacman has a clever approach for VCS packages: packages suffixed <code class="language-plaintext highlighter-rouge">-git</code> run a <code class="language-plaintext highlighter-rouge">pkgver()</code> function in the PKGBUILD that encodes the commits-since-last-tag and short hash into the version string itself, like <code class="language-plaintext highlighter-rouge">1.0.3.r12.ga1b2c3d</code>, so the version you see in <code class="language-plaintext highlighter-rouge">pacman -Qi</code> actually contains the commit info. Regular packages built from release tarballs just carry the upstream version number, though.</p>
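<p>The whole mechanism is a few lines in the PKGBUILD; roughly this (the common pattern from the Arch wiki, lightly abridged, with illustrative variable names):</p>

```shell
# pkgver() runs after the sources are checked out; its stdout becomes
# the package version, e.g. 1.0.3.r12.ga1b2c3d.
pkgver() {
  cd "$srcdir/$pkgname"
  # <last tag>.r<commits since tag>.g<short hash>
  git describe --long --tags | sed 's/\([^-]*-g\)/r\1/;s/-/./g'
}
```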

<p>Homebrew records the formula URL (typically a tarball) and its SHA256, plus a Homebrew-specific <code class="language-plaintext highlighter-rouge">revision</code> field for rebuilds, but no upstream git commit survives installation. Flatpak and Snap both have version metadata in their app manifests but no VCS revision field in either format.</p>

<p>Nix is where Stapelberg’s post originates, and it’s a good illustration of the problem: store paths encode a content hash, not a VCS revision, and fetchers like <code class="language-plaintext highlighter-rouge">fetchFromGitHub</code> download a tarball with no <code class="language-plaintext highlighter-rouge">.git</code> directory. Even <code class="language-plaintext highlighter-rouge">builtins.fetchGit</code> strips <code class="language-plaintext highlighter-rouge">.git</code> for reproducibility. The <code class="language-plaintext highlighter-rouge">.rev</code> attribute exists during Nix evaluation but isn’t written to the store, so Stapelberg’s <a href="https://github.com/stapelberg/nix">go-vcs-stamping.nix</a> overlay has to bridge that gap for Go specifically, and the underlying problem affects every language built through Nix.</p>

<h2 id="container-images">Container images</h2>

<p>OCI images have their own annotation spec for this: the <a href="https://github.com/opencontainers/image-spec/blob/main/annotations.md"><code class="language-plaintext highlighter-rouge">org.opencontainers.image.revision</code></a> label carries the VCS commit hash, and <code class="language-plaintext highlighter-rouge">org.opencontainers.image.source</code> points to the repository URL. <code class="language-plaintext highlighter-rouge">docker buildx</code> can set these automatically from git context, and GitHub Actions’ <code class="language-plaintext highlighter-rouge">docker/metadata-action</code> populates them from the workflow environment, so a CI-built image can carry its commit SHA and repo URL without any manual wiring.</p>
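<p>Done by hand, the equivalent of what those tools automate is two <code>--label</code> flags at build time (the image name here is made up):</p>

```shell
# Stamp the image with its source commit and repository at build time.
docker build \
  --label "org.opencontainers.image.revision=$(git rev-parse HEAD)" \
  --label "org.opencontainers.image.source=https://github.com/example/app" \
  -t example/app:dev .

# Recover the commit later from image metadata alone:
docker inspect \
  --format '{{ index .Config.Labels "org.opencontainers.image.revision" }}' \
  example/app:dev
```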

<p>Plenty of Dockerfiles don’t set these labels in practice, and even when they’re present they describe the image build, not necessarily the application inside it. An image built from a Go binary that was itself built without VCS stamping will have the commit that changed the Dockerfile, which may or may not be the commit that changed the application code, so image-level and application-level stamping end up being two separate problems.</p>

<h2 id="source-archives">Source archives</h2>

<p>Git’s own <a href="https://git-scm.com/docs/git-archive"><code class="language-plaintext highlighter-rouge">git archive</code></a> command supports <a href="https://git-scm.com/docs/gitattributes#_creating_an_archive"><code class="language-plaintext highlighter-rouge">export-subst</code></a> in <code class="language-plaintext highlighter-rouge">.gitattributes</code>, expanding placeholders like <code class="language-plaintext highlighter-rouge">$Format:%H$</code> into the full commit hash, which is the intended mechanism for embedding commit info in archives without <code class="language-plaintext highlighter-rouge">.git</code>. GitHub, GitLab, Gitea, and Forgejo all use <code class="language-plaintext highlighter-rouge">git archive</code> internally for their downloadable tarballs and zipballs, so <code class="language-plaintext highlighter-rouge">export-subst</code> works on all of them. If you add <code class="language-plaintext highlighter-rouge">version.txt export-subst</code> to your <code class="language-plaintext highlighter-rouge">.gitattributes</code> and put <code class="language-plaintext highlighter-rouge">$Format:%H$</code> in that file, the tarball will contain the full commit hash.</p>
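<p>End to end, in a throwaway repo, the mechanism looks like this: mark a file <code>export-subst</code>, commit, and archive it, and the placeholder expands even though <code>.git</code> is gone.</p>

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
printf '%s\n' 'version.txt export-subst' > .gitattributes
printf '%s\n' '$Format:%H$' > version.txt
git add .gitattributes version.txt
git -c user.email=demo@example.com -c user.name=demo commit -qm 'stamped version file'
# Extract version.txt straight out of the archive: it now holds the full SHA.
git archive HEAD | tar -xOf - version.txt
```

<p>Anyone downloading a forge tarball of that commit gets the same expansion, since the forges build those tarballs with <code>git archive</code>.</p>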

<p>The catch is reproducibility. Abbreviated hash placeholders like <code class="language-plaintext highlighter-rouge">$Format:%h$</code> produce different-length output depending on the number of objects in the repository, and GitHub’s servers don’t always agree on object counts. The same tarball URL can return different contents at different times, which breaks checksum verification. <a href="https://github.com/NixOS/nixpkgs/issues/84312">NixOS/nixpkgs#84312</a> documents this problem in detail. Full hashes (<code class="language-plaintext highlighter-rouge">%H</code>) are stable, but ref-dependent placeholders like <code class="language-plaintext highlighter-rouge">%d</code> change as branches move. The mechanism works, but anyone who checksums tarballs, which is most package managers, has to treat <code class="language-plaintext highlighter-rouge">export-subst</code> repos as a source of non-determinism.</p>

<p>The same thing happens with package archives, where the version from the manifest file survives but the commit that produced it doesn’t unless the build backend explicitly stamped it in. (An <a href="https://github.com/npm/npm/issues/20213">npm bug in 6.9.1</a> once accidentally included <code class="language-plaintext highlighter-rouge">.git</code> directories in published tarballs, and it was treated as a serious defect.) A developer tags a commit, CI builds an artifact from that tag, the build process strips <code class="language-plaintext highlighter-rouge">.git</code>, and the resulting package carries only the version string.</p>

<h2 id="trusted-publishing-and-embedded-stamping">Trusted publishing and embedded stamping</h2>

<p>Trusted publishing through Sigstore addresses this from the registry side. When a package is published from CI with OIDC-based trusted publishing, the registry records which commit, repository, and workflow produced it, with a cryptographic signature in a transparency ledger. npm and PyPI both support this today. The provenance metadata lives at the registry rather than in the artifact, but you can look up an artifact’s attestation by its hash, so if you have the artifact you can trace it back to the commit that produced it without the artifact itself needing to carry that information.</p>

<p><a href="https://www.softwareheritage.org/">Software Heritage</a> could eventually enable something similar from the source side. They archive public source code repositories and assign intrinsic identifiers (<a href="https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html">SWHIDs</a>) based on content hashes, so in principle you could go the other direction too: given a source tree or file, look up which commits and repositories it appeared in. That archive is already large and growing, though the tooling to make these lookups practical for everyday debugging isn’t there yet.</p>

<p>All this research got me thinking about how it could integrate with <a href="https://github.com/git-pkgs/git-pkgs">git-pkgs</a>, which already tracks the dependency side of this: who added a package, when it changed, what the version history looks like in your repo. Its <code class="language-plaintext highlighter-rouge">browse</code> command opens the installed source of a package in your editor, but that’s the installed files with no git history.</p>

<p>If packages reliably carried their source commit, there’s a more interesting version of that command: clone the upstream repository and check out the exact commit your installed version was built from. You’d get <code class="language-plaintext highlighter-rouge">git log</code>, <code class="language-plaintext highlighter-rouge">git blame</code>, the full context of what changed between the version you have and the version you’re upgrading to, all from your local terminal. The stamping metadata is the missing link between “I depend on this package at this version” and “here is the code that produced it, with its history.”</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="package-managers" /><category term="security" /><category term="supply-chain" /><summary type="html"><![CDATA[Tracing a dependency back to its source commit.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The Cathedral and the Catacombs</title><link href="https://nesbitt.io/2026/04/06/the-cathedral-and-the-catacombs.html" rel="alternate" type="text/html" title="The Cathedral and the Catacombs" /><published>2026-04-06T10:00:00+00:00</published><updated>2026-04-06T10:00:00+00:00</updated><id>https://nesbitt.io/2026/04/06/the-cathedral-and-the-catacombs</id><content type="html" xml:base="https://nesbitt.io/2026/04/06/the-cathedral-and-the-catacombs.html"><![CDATA[<p>Eric Raymond’s <a href="http://www.catb.org/~esr/writings/cathedral-bazaar/">The Cathedral and the Bazaar</a> is almost thirty years old and people are still finding new ways to extend the metaphor. 
Drew Breunig recently described a third mode, the <a href="https://www.dbreunig.com/2026/03/26/winchester-mystery-house.html">Winchester Mystery House</a>, for the sprawling codebases that agentic AI produces: rooms that lead nowhere, staircases into ceilings, a single builder with no plan. That piece got me thinking, though it shares a blind spot with every other response to Raymond I’ve read.</p>

<p>As the P2P Foundation <a href="https://blog.p2pfoundation.net/revisiting-the-cathedralbazaar-metaphor-why-both-eric-raymond-and-nicholas-carr-got-it-partly-wrong/2007/08/11">pointed out</a>, historical cathedrals were communal projects that mobilized entire communities through donations and voluntary labour, not top-down designs imposed by a single architect, and the bazaar isn’t really a market when nothing is priced and there are no merchants. But the responses all stay within the same frame: process, governance, and who builds.</p>

<p>I find it odd that in nearly three decades of cathedral-and-bazaar discourse, nobody has written about the catacombs: the dependency graph underneath every project, the deep network of transitive packages and shared libraries and unmaintained infrastructure that the visible building rests on, regardless of whether a cathedral architect or a bazaar crowd built it.</p>

<p>When Raymond wrote that “given enough eyeballs, all bugs are shallow”, he was talking about the thing you can see: the project, its source, its public development process. Linus’s law assumes people are looking. The dependency tree breaks that assumption.</p>

<p>A typical JavaScript project can pull in hundreds of transitive dependencies that nobody on the team has read, written by maintainers they’ve never heard of, last updated at various points over the past several years. The cathedral’s architects didn’t inspect the catacombs before building on top of them, and the bazaar’s crowd didn’t either, because in both cases the construction process is what gets all the attention while the foundations are treated as someone else’s concern.</p>

<p>Josh Bressers <a href="https://opensourcesecurity.io/2026/01-cathedral-megachurch-bazaar/">argued</a> that successful open source projects are really megachurches now, large structured organizations with budgets and governance, while the actual bazaar is the neglected hobbyist layer underneath. He comes closest to this when he identifies that neglected layer, and Nadia Eghbal’s <a href="https://www.fordfoundation.org/work/learning/research-reports/roads-and-bridges-the-unseen-labor-behind-our-digital-infrastructure/">Roads and Bridges</a> documented the same neglect as an infrastructure funding problem back in 2016. But both are talking about maintainers and their working conditions, which is still a question about people and process.</p>

<p>It’s not just that the maintainers of your transitive dependencies are overworked or under-resourced (though they are). It’s that the dependency graph itself is a load-bearing structure that nobody designed and nobody audits as a whole. There are partial efforts: lockfiles, SBOMs, dependency scanners, distro maintainers who vet packages one at a time. But none of them look at the graph as a connected system. It assembled itself through thousands of independent decisions by maintainers who each added whatever looked useful, and the result is an unmapped network of tunnels under the building that happens to hold the floor up.</p>

<p>Real catacombs are underground networks that were built for one purpose, repurposed for another, and eventually forgotten about until someone discovers they’ve been structurally compromised or that unauthorized people have been using them to get into buildings above. A package gets written to solve a small problem, other packages start depending on it, applications pull it in transitively, and eventually it’s load-bearing infrastructure maintained by someone who wrote it on a weekend years ago and barely remembers it exists. Every package ecosystem has some version of this, though <a href="/2026/03/31/npms-defaults-are-bad.html">npm’s defaults</a> are especially good at making it worse.</p>

<p>And like real catacombs, they get used as ways in. The <a href="https://www.openwall.com/lists/oss-security/2024/03/29/4">xz backdoor</a> didn’t try to get through the front door of any distribution. A co-maintainer spent two years building trust in a compression library that sits deep in the dependency graph of almost every Linux system, then planted obfuscated code in the build system. The <a href="https://blog.npmjs.org/post/180565383195/details-about-the-event-stream-incident">event-stream attack</a> took over a single abandoned npm package and used it to target a completely different application downstream. Neither attack targeted the cathedral or the bazaar directly, they used the dependency graph as a tunnel network to reach targets that were well-defended at every visible entrance.</p>

<p>Whether your project is built cathedral-style with careful central control, or bazaar-style with open contribution, or Winchester Mystery House-style by an AI that doesn’t know what a staircase is for, makes very little difference to the structural risk underneath. A cathedral with meticulous code review and a strict merge process installs its dependencies from the same registries as the most chaotic bazaar project, inherits the same transitive chains, runs the same lifecycle scripts during build. The governance model describes how the floors are laid, but the dependency graph underneath comes from the same place.</p>

<p>Can you imagine what the basement of the Winchester Mystery House looks like? AI coding agents tend to pull in dependencies much more aggressively than most humans would, extending the graph in ways that are hard to review even in principle. And since early 2026, a growing number of people have been pointing AI at open source projects to find security vulnerabilities, sending automated explorers into the catacombs and filing reports faster than maintainers can triage them.</p>]]></content><author><name>Andrew Nesbitt</name><email>andrew@ecosyste.ms</email></author><category term="open-source" /><category term="dependencies" /><category term="security" /><summary type="html"><![CDATA[Stretching a metaphor deep into the floor.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://nesbitt.io/images/boxes.png" /><media:content medium="image" url="https://nesbitt.io/images/boxes.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>