Adventures in Extension Packaging
I gave a presentation at PGConf.dev last week, Adventures in Extension Packaging. It summarizes stuff I learned in the past year in developing the PGXN Meta v2 RFC, re-packaging all of the extensions on pgt.dev, and experimenting with the CloudNativePG community’s proposal to mount extension OCI images in immutable PostgreSQL containers.
Turns out a ton of work and experimentation remains to be done.
Previous work covers the first half of the talk, including:
- A brief introduction to PGXN, borrowing from the State of the Extensions Ecosystem
- The metadata designed to enable automated packaging of extensions added to the PGXN Meta v2 RFC
- The Trunk Packaging Format, a.k.a., PGXN RFC 2
- OCI distribution of Trunk packages
The rest of the talk encompasses newer work. Read on for details.
Automated Packaging Challenges
Back in December I took over maintenance of the Trunk registry, a.k.a., pgt.dev, refactoring and upgrading all 200+ extensions and adding Postgres 17 builds. This experience opened my eyes to the wide variety of extension build patterns and configurations, even when supporting a single OS (Ubuntu 22.04 “Jammy”). Some examples:
- pglogical requires an extra `make` param to build on PostgreSQL 17: `make -C LDFLAGS_EX="-L/usr/lib/postgresql/17/lib"`
- Some pgrx extensions require additional params, for example:
  - pg_search needs the `--features` flag to enable `icu`
  - vectorscale requires the environment variable `RUSTFLAGS="-C target-feature=+avx2,+fma"`
- pljava needs a pointer to `libjvm`: `mvn clean install -Dpljava.libjvmdefault=/usr/lib/x86_64-linux-gnu/libjvm.so`
- plrust needs files to be moved around, a shell script to be run, and to be built from a subdirectory
- bson also needs files to be moved around and a pointer to `libbson`
- timescale requires an environment variable and shell script to run before building
- Many extensions require patching to build for various configurations and OSes, like this tweak to build pguri on Postgres 17 and this patch to get duckdb_fdw to build at all
Doubtless there’s much more. These sorts of challenges led the RPM and APT packaging systems to support explicit scripting and patches for every package. I don’t think it would be sensible to support build scripting in the meta spec.
However, the PGXN meta SDK I developed last year supports the merging of
multiple META.json files, so that downstream packagers could maintain files
with additional configurations, including explicit build steps or lists of
packages, to support these use cases.
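For example, a downstream packager might maintain an overlay to merge atop an upstream META.json. A sketch, borrowing the pljava example above, where the x_build key is hypothetical and not part of the spec:

{
  "dependencies": {
    "packages": {
      "build": {
        "requires": {
          "pkg:generic/libjvm": 0
        }
      }
    }
  },
  "x_build": {
    "steps": [
      "mvn clean install -Dpljava.libjvmdefault=/usr/lib/x86_64-linux-gnu/libjvm.so"
    ]
  }
}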
Furthermore, the plan to add reporting to PGXN v2 means that downstream packagers could report build failures, which would appear on PGXN, where they’d encourage some maintainers, at least, to fix issues within their control.
Dependency Resolution
Dependencies present another challenge. The v2 spec supports third party dependencies — those not part of Postgres itself or the ecosystem of extensions. Ideally, an extension like pguri would define its dependence on the uriparser library like so:
{
  "dependencies": {
    "postgres": { "version": ">= 9.3" },
    "packages": {
      "build": {
        "requires": {
          "pkg:generic/uriparser": 0
        }
      }
    }
  }
}
An intelligent build client will parse the dependencies, provided as purls,
to determine the appropriate OS packages to install to satisfy them. For example,
building on a Debian-based system, it would know to install liburiparser-dev
to build the extension and require liburiparser1 to run it.
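A minimal sketch of that resolution step on a Debian-based system, with a hard-coded mapping standing in for whatever database a real client would consult:

#!/bin/sh
# Hypothetical purl-to-package resolution; a real client would
# consult a curated mapping of purls to OS package names.
resolve_build_dep() {
    case "$1" in
        pkg:generic/uriparser) echo liburiparser-dev ;;
        *) echo "no mapping for $1" >&2; return 1 ;;
    esac
}

# Install the build-time dependency for pkg:generic/uriparser.
apt-get install -y "$(resolve_build_dep pkg:generic/uriparser)"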
With the aim to support multiple OSes and versions — not to mention Postgres versions — the proposed PGXN binary registry would experience quite the combinatorial explosion to support all possible dependencies on all possible OSes and versions. While I propose to start simple (Linux and macOS, Postgres 14-18) and gradually grow, it could quickly get quite cumbersome.
So much so that I can practically hear Christoph’s and Devrim’s reactions from here:
Photo of Christoph, Devrim, and other long-time packagers laughing at me.
Or perhaps:
Photo of Christoph and Devrim laughing at me.
I hardly blame them.
A CloudNativePG Side Quest
Gabriele Bartolini blogged the proposal to deploy extensions to
CloudNativePG containers without violating the immutability of the
container. The introduction of the extension_control_path GUC in Postgres
18 and the ImageVolume feature in Kubernetes 1.33 enable the pattern, likely
to be introduced in CloudNativePG v1.27. Here’s a sample CloudNativePG cluster
manifest with the proposed extension configuration:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: extension-example
spec:
  instances: 1
  imageName: ghcr.io/cloudnative-pg/postgresql-trunk:18-devel
  postgresql:
    extensions:
      - name: vector
        image:
          reference: ghcr.io/cloudnative-pg/pgvector-18-testing
  storage:
    size: 1Gi
The extensions object configures pgvector simply by
referencing an OCI image that contains nothing but the files for the
extension. To “install” the extension, the proposed patch triggers a rolling
update, replicas first. For each instance, it takes the following steps:
1. Mounts each extension as a read-only ImageVolume under `/extensions`; in this example, `/extensions/vector` provides the complete contents of the image
2. Updates `LD_LIBRARY_PATH` to include the path to the `lib` directory of each extension, e.g., `/extensions/vector/lib`
3. Updates the `extension_control_path` and `dynamic_library_path` GUCs to point to the `share` and `lib` directories of each extension, in this example:

   extension_control_path = '$system:/extensions/vector/share'
   dynamic_library_path = '$libdir:/extensions/vector/lib'
This works! Alas, the pod restart is absolutely necessary, whether or not any extension requires it,1 because:

- Kubernetes resolves volume mounts, including ImageVolumes, at pod startup
- The `dynamic_library_path` and `extension_control_path` GUCs require a Postgres restart
- Each extension requires another path to be appended to both of these GUCs, as well as to `LD_LIBRARY_PATH`
Say we wanted to use five extensions. The extensions part of the manifest
would look something like this:
extensions:
  - name: vector
    image:
      reference: ghcr.io/cloudnative-pg/pgvector-18-testing
  - name: semver
    image:
      reference: ghcr.io/example/semver:0.40.0
  - name: auto_explain
    image:
      reference: ghcr.io/example/auto_explain:18
  - name: bloom
    image:
      reference: ghcr.io/example/bloom:18
  - name: postgis
    image:
      reference: ghcr.io/example/postgis:18
To support this configuration, CNPG must configure the GUCs like so:
extension_control_path = '$system:/extensions/vector/share:/extensions/semver/share:/extensions/auto_explain/share:/extensions/bloom/share:/extensions/postgis/share'
dynamic_library_path = '$libdir:/extensions/vector/lib:/extensions/semver/lib:/extensions/auto_explain/lib:/extensions/bloom/lib:/extensions/postgis/lib'
And also LD_LIBRARY_PATH:
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/extensions/vector/lib:/extensions/semver/lib:/extensions/auto_explain/lib:/extensions/bloom/lib:/extensions/postgis/lib"
In other words, every additional extension requires another prefix to be
appended to each of these configurations. Ideally we could use a single prefix
for all extensions, avoiding the need to update these configs and therefore to
restart Postgres. Setting aside the ImageVolume limitation2 for the
moment, this pattern would require no rolling restarts and no GUC updates
unless a newly-added extension requires pre-loading via
shared_preload_libraries.
Getting there, however, requires a different extension file layout than PostgreSQL currently uses.
RFC: Extension Packaging and Lookup
Imagine this:
- A single extension search path GUC
- Each extension in its own eponymous directory
- Pre-defined subdirectory names used inside each extension directory
The search path might look something like:
extension_search_path = '$system:/extensions:/usr/local/extensions'
Looking at one of these directories, /extensions, its contents would be
extension directories:
❯ ls -1 extensions
auto_explain
bloom
postgis
semver
vector
And the contents of one of these extension directories would be something like:
❯ tree extensions/semver
extensions/semver
├── doc
│ └── semver.md
├── lib
│ └── semver.so
├── semver.control
└── sql
├── semver--0.31.0--0.31.1.sql
├── semver--0.31.1--0.31.2.sql
├── semver--0.31.2--0.32.0.sql
└── semver--0.5.0--0.10.0.sql
For this pattern, Postgres would look for the appropriately-named
directory with a control file in each of the paths. To find the semver
extension, for example, it would find /extensions/semver/semver.control.
All the other files for the extension would live in specifically-named
subdirectories: doc for documentation files, lib for shared libraries,
sql for SQL deployment files, plus bin, man, html, include,
locale, and any other likely resources.
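In shell terms, the lookup logic amounts to something like this sketch (the paths here are illustrative, with `$system` expanded to a typical Debian location):

# Find the control file for "semver" in the first matching
# entry of the extension search path.
extension_search_path="/usr/share/postgresql/18/extension:/extensions:/usr/local/extensions"
IFS=':'
for dir in $extension_search_path; do
    if [ -f "$dir/semver/semver.control" ]; then
        echo "found extension at $dir/semver"
        break
    fi
done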
With all of the files required for an extension bundled into well-defined
subdirectories of a single directory, this layout lends itself to the
proposed binary distribution format. Couple it with OCI
distribution and it becomes a natural fit for ImageVolume deployment:
simply map each extension OCI image to a subdirectory of the desired search
path and you’re done. The extensions object in the CNPG Cluster manifest
remains unchanged, and CNPG no longer needs to manipulate any GUCs.
Some might recognize this proposal from a previous RFC post. It not only simplifies the CloudNativePG use cases, but because it houses all of the files for an extension in a single bundle, it also vastly simplifies installation on any system:
- Download the extension package
- Validate its signature & contents
- Unpack its contents into a directory named for the extension in the extension search path
Simple!
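In other words, a client install could be as simple as this sketch, where the URL, file name, signature tooling, and archive format are all assumptions:

# Hypothetical Trunk install: fetch, verify, unpack into the search path.
curl -fsSLO https://trunk.example.com/semver-0.40.0.trunk
curl -fsSLO https://trunk.example.com/semver-0.40.0.trunk.sig
gpg --verify semver-0.40.0.trunk.sig semver-0.40.0.trunk
mkdir -p /extensions/semver
tar -xf semver-0.40.0.trunk -C /extensions/semver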
Fun With Dependencies
Many extensions depend on external libraries, and rely on the OS to find them. OS packagers follow the dependency patterns of their packaging systems: requiring the installation of other packages to satisfy the dependencies.
How could a pattern be generalized by the Trunk Packaging Format to work on all OSes? I see two potential approaches:
1. List the dependencies as purls that the installing client translates to the appropriate OS packages it installs
2. Bundle dependencies in the Trunk package itself
Option 1 will work well for most use cases, but not immutable systems like
CloudNativePG. Option 2 could work for such situations. But perhaps you
noticed the omission of LD_LIBRARY_PATH manipulation in the packaging and
lookup discussion above. Setting aside the multitude of reasons to avoid
LD_LIBRARY_PATH3, how else could the OS find shared libraries needed by
an extension?
Typically, one installs shared libraries in one of a few directories known to
tools like ldconfig, which must run after each install to cache their
locations. But one cannot rely on ldconfig in immutable environments,
because the cache of course cannot be mutated.
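For contrast, here's the typical mutable-system flow that an immutable container rules out:

# Install a shared library where the dynamic linker expects it,
# then rebuild the cache in /etc/ld.so.cache.
install -m 0755 liburiparser.so.1.0.30 /usr/local/lib/
ldconfig    # mutates the cache; impossible on an immutable image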
We could, potentially, rely on rpath, a feature of modern dynamic linkers
that reads a list of known paths from the header of a binary file. In fact,
most modern OSes support $ORIGIN as an rpath value4 (or
@loader_path on Darwin/macOS), which refers to the same directory in which
the binary file appears. Imagine this pattern:
- The Trunk package for an extension includes dependency libraries alongside the extension module
- The module is compiled with `rpath=$ORIGIN`
To test this pattern, let’s install the Postgres 18 beta and try the pattern
with the pguri extension. First, remove the $libdir/ prefix (as discussed
previously) and patch the extension for Postgres 17+:
perl -i -pe 's{\$libdir/}{}' pguri/uri.control pguri/*.sql
perl -i -pe 's/^(PG_CPPFLAGS.+)/$1 -Wno-int-conversion/' pguri/Makefile
Then compile it with CFLAGS to set rpath and install it with a prefix
parameter:
make CFLAGS='-Wl,-rpath,\$$ORIGIN'
make install prefix=/usr/local/postgresql
With the module installed, move the liburiparser shared library from OS
packaging to the lib directory under the prefix, resulting in these
contents:
❯ ls -1 /usr/local/postgresql/lib
liburiparser.so.1
liburiparser.so.1.0.30
uri.so
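That move might have been accomplished like so (the source path is an assumption for a Debian arm64 system, matching the ldd output below):

# Copy the runtime library (and its soname symlink) out of the
# OS package location into the extension's lib directory.
cp -a /usr/lib/aarch64-linux-gnu/liburiparser.so.1* /usr/local/postgresql/lib/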
The chrpath utility shows that the extension module, uri.so, has its
RUNPATH (the modern implementation of rpath) properly configured:
❯ chrpath /usr/local/postgresql/lib/uri.so
uri.so: RUNPATH=$ORIGIN
Will the OS be able to find the dependency? Use ldd to find out:
❯ ldd /usr/local/postgresql/lib/uri.so
linux-vdso.so.1
liburiparser.so.1 => /usr/local/postgresql/lib/liburiparser.so.1
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6
/lib/ld-linux-aarch64.so.1
The second line of output shows that it does in fact find liburiparser.so.1
where we put it. So far so good. Just need to tell the GUCs where to find them
and restart Postgres:
extension_control_path = '$system:/usr/local/postgresql/share'
dynamic_library_path = '$libdir:/usr/local/postgresql/lib'
And then it works!
❯ psql -c "CREATE EXTENSION uri"
CREATE EXTENSION
❯ psql -c "SELECT 'https://example.com/'::uri"
uri
----------------------
https://example.com/
Success! So we can adopt this pattern, yes?
A Wrinkle
Well, maybe. Try it with a second extension, http, once again building it
with rpath=$ORIGIN and installing it in the custom lib directory:
perl -i -pe 's{\$libdir/}{}g' *.control
make CFLAGS='-Wl,-rpath,\$$ORIGIN'
make install prefix=/usr/local/postgresql
Make sure it took:
❯ chrpath /usr/local/postgresql/lib/http.so
http.so: RUNPATH=$ORIGIN
Now use ldd to see what shared libraries it needs:
❯ ldd /usr/local/postgresql/lib/http.so
linux-vdso.so.1
libcurl.so.4 => not found
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6
Naturally it needs libcurl; let’s copy it from another system and try again:
❯ ldd /usr/local/postgresql/lib/http.so
linux-vdso.so.1
libcurl.so.4 => /usr/local/postgresql/lib/libcurl.so.4
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6
libnghttp2.so.14 => not found
…

The output shows that it found libcurl.so.4 where we put it, but also lists a
bunch of new dependencies that need to be satisfied. These did not appear
before because the http.so module doesn’t depend on them; the libcurl.so
library does. Let’s add libnghttp2 and try again:
❯ ldd /usr/local/postgresql/lib/http.so
linux-vdso.so.1
libcurl.so.4 => /usr/local/postgresql/lib/libcurl.so.4
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6
libnghttp2.so.14 => not found
…

Sadly, it still can’t find libnghttp2.so.
It turns out that rpath works only for immediate dependencies. To solve this
problem, libcurl and all other shared libraries must also be compiled with
rpath=$ORIGIN, which means we can’t simply copy those libraries from OS
packages5. In the meantime, only direct dependencies can be
bundled with an extension.
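One possible workaround, where rebuilding is impractical, is to rewrite the bundled library's rpath after the fact with a tool like patchelf, effectively repackaging the OS library:

# Stamp $ORIGIN into the bundled libcurl so that its own dependencies
# (libnghttp2, etc.) also resolve from the same directory.
patchelf --set-rpath '$ORIGIN' /usr/local/postgresql/lib/libcurl.so.4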
Project Status
The vision of accessible, easy-install extensions everywhere remains intact. I’m close to completing a first release of the PGXN v2 build SDK with support for meta spec v1 and v2, PGXS, and pgrx extensions. I expect the first deliverable to be a command-line client to complement and eventually replace the original CLI. It will be put to work building all the extensions currently distributed on PGXN, which will surface new issues and patterns that inform the development and completion of the v2 meta spec.
In the future, I’d also like to:
- Finish working out Trunk format and dependency patterns
- Develop and submit the proposed `extension_search_path` patch
- Submit ImageVolume feedback to Kubernetes to allow runtime mounting
- Start building and distributing OCI Trunk packages
- Make the pattern available for distributed registries, so anyone can build their own Trunk releases!
- Hack fully-dynamic extension loading into CloudNativePG
Let’s Talk
I recognize the ambition here, but feel equal to it. Perhaps not every bit will work out, but I firmly believe in setting a clear vision and executing toward it while pragmatically revisiting and revising it as experience warrants.
If you’d like to contribute to the project or employ me to continue working on it, let’s talk! Hit me up via one of the services listed on the about page.
1. The feature does not yet support pre-loading shared libraries. Presumably a flag will be introduced to add the extension to `shared_preload_libraries`. ↩︎
2. Though we should certainly request the ability to add new ImageVolume mounts without a restart. We can’t be the only ones thinking about this kind of feature, right? ↩︎
3. In general, one should avoid `LD_LIBRARY_PATH` for a variety of reasons, not least of which its bluntness. For various security reasons, macOS ignores it unless SIP is disabled, and SELinux prevents its propagation to new processes. ↩︎
4. Although not Windows, alas. ↩︎
5. Unless packagers could be persuaded to build all libraries with `rpath=$ORIGIN`, which seems like a tall order. ↩︎