I have attended over one hundred Postgres meetups over the years. The usual format is: food with individual discussion, lecture with group questions, and finally more individual discussion. I just spoke at an Armenia PostgreSQL User Group meetup and the event organizer Emma Saroyan did something different — she did a group mobile phone quiz after my lecture. To create the quiz questions before the event, she studied my slides and viewed a previous recording of my talk. The winner got a copy of her book.
I think the quiz was a great post-lecture group activity, and gave dynamism to the content. I know people will remember the quiz fondly when they think about the meetup. I encourage all meetups to have similar quizzes, though they do take work to create.
In the past there have been many discussions about using UUID as a primary key in PostgreSQL. For some applications, even a BIGINT column does not have sufficient range: it is a signed 8‑byte integer with range −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807. Although these values look big enough, if we think about web services that collect billions or more records daily, this number becomes less impressive. Simple integer values can also cause conflicts of values in distributed system, in Data Lakehouses when combining data from multiple source databases etc.
However, the main practical problem with UUIDv4 as a primary key in PostgreSQL was not lack of range, but the complete randomness of the values. This randomness causes frequent B‑tree page splits, a highly fragmented primary key index, and therefore a lot of random disk I/O. There have already been many articles and conference talks describing this problem. What many of these resources did not do, however, was dive deep into the on‑disk structures. That’s what I wanted to explore here.
UUID (Universally Unique Identifier) is a 16‑byte integer value (128 bits), which has 2^128 possible combinations (approximately 3.4 × 10^38). This range is so large that, for most applications, the probability of a duplicate UUID is practically zero. Wikipedia shows a calculation demonstrating that the probability to find a duplicate within 103 trillion version‑4 UUIDs is about one in a billion. Another often‑quoted rule of thumb is that to get a 50% chance of one collision, you’d have to generate roughly 1 billion UUIDs every second for about 86 years.
Values are usually represented as a 36‑character string with hexadecimal digits and hyphens, for example: f47ac10b-58cc-4372-a567-0e02b2c3d479. The canonical layout is 8‑4‑4‑4‑12 characters. The first character in the third block and the first character in the fourth block have special meaning: xxxxxxxx-xxxx-Vxxx-Wxxx-xxxxxxxxxxxx – V marks UUI
[...]The PostgreSQL Hacking Workshop will be taking a well-earned Christmas break in December of 2025. The future of the workshop is a little bit unclear, because I'm continuing to have a bit of trouble finding enough good talks online to justify doing one per month: the best source of talks for the event is pgconf.dev, but not all of those talks are about hacking on PostgreSQL, and not all of those that are about hacking are equally interesting to potential attendees. Also, we've already eaten through much of the backlog of older hacking-related talks that are posted online. Many conferences have few hacking-related talks, or just don't post any videos (which are expensive and time-consuming to produce).
Read more »In our first JIT episode, we discussed how we could, using copy-patch, easily create a JIT compiler for PostgreSQL, with a slight improvement in performance compared to the PostgreSQL interpreter.
In our second episode, I talked about the performance wall and how hard it was to have a real leap in performance compared to the interpreter. But it ended with a positive outlook, a nice performance jump that I was preparing at that moment…
The interpreter will run each opcode for every record it has to process. Everything it has to do for each record that could be done only once is better done once, obviously. And this is where a JIT can beat it. The JIT compiler can choose optimizations that would require checks at each opcode for the interpreter, and thus self-defeating for the interpreter. For instance, I mentioned creating inlined opcodes for common function calls like int4eq : replacing the indirect call to int4eq with a comparison of the function pointer and then an inlined version would indeed be silly, since the comparison is going to waste a lot of time already.
So, what can’t the interpreter do? It sure can’t easily remove indirect calls, but this is a 1% performance gain, 2% at most. You won’t get to the headlines with that, right? Well, when in doubt, look at the past…
A decade ago, I worked at a small company where I heard the weirdest thing ever regarding system performance: “our application is slower when built in 64 bits mode because the bigger pointer size makes it slower”. I didn’t buy this, spent two days digging into the code, and found that it was the opposite: 64 bits brought such a performance improvement that the entire system collapsed on a mutex that held a core structure in the application… Removing the mutex made the application fly in both 32 and 64 bits, with 64 bits beating 32 bits obviously.
But why is 64 bits faster? We are talking databa
[...]
The secret to unlocking performance gains often lies not just in what you ask in a query, but in how Postgres finds the answer. The Postgres EXPLAIN system is great for understanding how data is being queried. One of secretes to reading EXPLAIN plans is understanding the type of scan done to retrieve the data. The scan type can be the difference between a lightning-fast response or a slow query.
Today I’ll break down the most common scan types, how they work, and when you’ll see them in your queries.
This type of data scan reads the entire table, row by row checking to see what matches the query conditions. If you have a WHERE or FILTER, Postgres just scans each row looking for matches.
Sequence scans are kind of the foundation of how scans are done and for many searches, this is what Postgres will use. For very large data sets, or those queried often, sequential scans are not ideal and an index scan may be faster. For that reason - knowing how to spot a seq scan vs index scan when reading an EXPLAIN plan is one the most important parts of reading a scan type in a query plan.
EXPLAIN select * from accounts;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on accounts (cost=0.00..22.70 rows=1270 width=36)
(1 row)
When you create an index in Postgres, you’re creating a column or multi-column reference that is stored on disk. Postgres is able to use this index as a map to the data stored in the table. A basic index scan uses a B-tree to quickly find the exact location of the data using a a two-step process: first Postgres finds the entry in the index, uses the reference, and then it fetches the rest of the row data from the table.
EXPLAIN select * from accounts where id = '5';
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using accounts_pkey on accounts (cost=0.15..2.37 rows=1 width=36)
Index Cond: (id = 5)
(2 [...]
As a PostgreSQL-specific monitoring solution, pgwatch is mostly known for storing collected metrics in a PostgreSQL database. While great, as you probably should "just Use Postgres for everything" xD... in some scenarios and specific setups, this might be a limitation.
You may want to store your metrics in a particular OLAP database, or a specific format, etc. This is where pgwatch's gRPC client shines, as it enables users to integrate custom storage solutions by developing a compatible gRPC server that writes the metrics to their desired destination.
In this post, we will go through the process of developing a PoC for a custom gRPC server in Go that pgwatch can use to write metrics to ClickHouse.
First, let's understand the pgwatch gRPC client protobuf file.
In the file, you can see the main Receiver service defined with a couple of methods; let's explain them one by one.
UpdateMeasurements(MeasurementEnvelope) returns (Reply).
MeasurementEnvelope data type consists of DBName, which is the monitored source name, Tags, representing the tags associated with this source, MetricName, the metric being measured, and Data, a list of rows that resulted from running the metric query.
Reply.SyncMetric(SyncReq) returns (Reply).
SyncReq data type consists of DBName, representing the source name under which the metric got added/deleted, MetricName, the metric added/deleted, and Operation, either addition or deletion.
Reply.DefineMetrics(google.protobuf.Struct) returns (Reply).
Retrieval-Augmented Generation (RAG) has become one of the most practical ways to give Large Language Models (LLMs) access to your own data. Rather than fine-tuning a model or hoping it somehow knows about your documentation, RAG lets you retrieve relevant content from your own sources and provide it as context to the LLM at query time. The result is accurate, grounded responses based on your actual content.In this three-part series, I'll walk through building a complete RAG server using PostgreSQL as the foundation. We'll cover:
In a PostgreSQL database, sequences provide a convenient way to generate a unique identifier, and are often used for key generation. From the community, PostgreSQL provides functions and SQL language to help manage sequence generation, but the sequences themselves are not without limitations in a multi-master environment. As a result, we introduced 100% open-source Snowflake sequences that work seamlessly in a multi-master PostgreSQL cluster to remove those limitations and ensure your data can thrive at the network edge. (Check out the extension on GitHub!) The easiest way to get started using Snowflake sequences is to try out pgEdge Postgres, either the Enterprise or the Distributed edition, available for self-hosting or fully managed hosting on virtual machines, containers, or the cloud. The pgEdge Control Plane (also 100% open source and available on GitHub) makes it even easier to deploy and orchestrate Postgres clusters, and comes with the snowflake extension by default (amongst others).
PostgreSQL Belgium User Group had a new meetup on November 25, in Haasrode, Leuven, organized by Kim Jansen, An Vercammen, Boriss Mejías and Stefan Fercot.
Speakers:
Related blog post by Kim Jansen
Prague PostgreSQL Meetup hosted a new meetup on November 24, organized by Gulcin Yildirim Jelinek
Speakers:

© Laurenz Albe 2025
Sometimes a string column should only contain one of a limited number of distinct values. Examples would be the current state of a service request or the US state in an address. There are several ways to implement such a column in PostgreSQL, the most interesting being an enum type or a lookup table. In this article, I will explore the benefits and disadvantages of these methods.
To show this implementation, let's define a simplified table of Austrian citizens with their residential state:
CREATE UNLOGGED TABLE person_s (
person_id bigint NOT NULL,
state text NOT NULL,
CHECK (state IN ('Wien', 'Niederösterreich', 'Burgenland',
'Steiermark', 'Oberösterreich', 'Kärnten',
'Tirol', 'Salzburg', 'Vorarlberg'))
);
There are around 8 million Austrians, so let's insert 8 million rows. Not all states are equally populous, so I use a CASE expression to assign states in an approximately realistic distribution:
INSERT INTO person_s (person_id, state)
SELECT i,
CASE WHEN hashint4(i) < -1198295875 THEN 'Wien'
WHEN hashint4(i) < -390842024 THEN 'Niederösterreich'
WHEN hashint4(i) < -249108103 THEN 'Burgenland'
WHEN hashint4(i) < 343597384 THEN 'Steiermark'
WHEN hashint4(i) < 1060856922 THEN 'Oberösterreich'
WHEN hashint4(i) < 1327144894 THEN 'Kärnten'
WHEN hashint4(i) < 1692217115 THEN 'Tirol'
WHEN hashint4(i) < 1958505087 THEN 'Salzburg'
ELSE 'Vorarlberg'
END
FROM generate_series(1, 8000000) AS i;
I'll add the primary key constraint and an index on the states to the table after loading the data because that is faster. Also, I'll run VACUUM and ANALYZE to improve the query performance:
ALTER TABLE person_s ADD CONSTRAINT person_s_pkey PRIMARY KEY (person_id); CREATE INDEX person_s_state_idx ON person_s (state); VACUUM (ANALYZE) person_s;
Say goodbye to the old way of distributing Postgres extensions as part of the main pre-built operand image. Leveraging the Kubernetes ImageVolume feature, CloudNativePG now allows you to mount extensions like pgvector and PostGIS from separate, dedicated images. This new declarative method completely decouples the PostgreSQL core from the extension binaries, enabling dynamic addition, easier evaluation, and simplified updates without ever having to build or manage monolithic custom container images.
This blog provides my opinion on how Postgres, according to the annual Stack Overflow developer survey, became the most admired, desired and used database by developers. This is part three in my series about the Postgres journey to the top with developers. Here are the first parts of this story:
And our last 2025 recording is here! Check out Jay Miller’s talk!
Seattle Postgres Users Group (SEAPUG) maintained the PostgreSQL booth at PASS Data Community Summit 2025 from November 17-21, 2025:
Speakers:
The Mumbai PostgreSQL User Group meetup took place on 20th Nov, 2025, organized by Sovenath Shaw, Ajit Gadge and Deepak Mahto.
Speakers:
PostgreSQL Conference Japan 2025 took place on November 21st. It was organized by Haruka Takatsuka, Kazushiro Takeda, Rintaro Ikeda, Tetsuo Sakata, Kousuke Kida, and Junichi Tado.
During the conference, the following volunteers helped running the conference: Tamotsu Odaka, Taiki Koshino, Nozomi Anzai, Jun Kuwamura, Hiroki Kataoka, Makoto Kaga, Tetsuji Koyama, Seiji Goto, Shinobu Honma, Kenji Sato, Katsuaki Fukushima, Thiduong Dang, Yung-Chung Ku, Wataru Takahashi, and Kouhei Ito.
PASS Data Community Summit 2025 wrapped up last week. This conference originated 25 years ago with the independent, user-led, not-for-profit “Professional Association for SQL Server (PASS)” and the annual summit in Seattle continues to attract thousands of database professionals each year. After the pandemic it was reorganized and broadened as a “Data Community” event, including a Postgres track.
Starting in 2023, volunteers from the Seattle Postgres User Group have staffed a postgres community booth on the exhibition floor. We provide information about Postgres User Groups around the world and do our best to answer all kinds of questions people have about Postgres. The booth consistently gets lots of traffic and questions.
The United States PostgreSQL Association has generously supplied one of their booth kits each year, which has a banner/background and some booth materials like stickers and a map with many user groups and a “welcome to postgres” handout and postgres major version handouts. We supplement with extra pins and stickers and printouts like the happiness hints I’ve put together, a list of common extensions that Rox made, and a list of Postgres events that Lloyd made. Every year, we also bring leftover Halloween candy that we want to get rid of and we put it in a big bowl on the table.
One of the top questions people ask is how and where they can learn more about Postgres. Next year I might just print out the Links section from my blog, which has a bunch of useful free resources. Another idea I have is for RedGate and EnterpriseDB – I think both of these companies have paid training but also give free access to a few introductory classes – it would be nice if they made a small card with a link to their free training. I think we could have a stack of these cards at our user groups and at the PASS booth. The company can promote paid training, but the free content can benefit anyone even if they aren’t interested in the paid training. I might also reach out to other companies who have paid
[...]
Recently, I read an article titled "Alias Settings of Engineers Around the World" (in Japanese). As a PostgreSQL engineer, this got me thinking: "If they are customizing their bash aliases, how are they configuring their psql environments?"
Driven by curiosity, I decided to investigate GitHub repositories to see how developers commonly configure their psqlrc files.
psql is the terminal-based front-end for PostgreSQL, allowing you to execute SQL interactively. Its configuration file is named psqlrc, and it has the following characteristics:
psqlrc file is stored in ../etc/ relative to the directory containing the PostgreSQL executable. This directory can be explicitly set using the environment variable PGSYSCONFDIR.
.psqlrc file is stored in the executing user's home directory. This file can be explicitly set using the environment variable PSQLRC.
.psqlrc-14.
psqlrc.
I used the Search API. Here are the key points to consider:
for loop to fetch all pages.
sleep commands.
#!/bin/bash
GITHUB_TOKEN=hoge # Set your own GitHub token here
# API call for files 0 to 1000 bytes
for size in `seq 0 50 950`;
do
for page in `seq 10`;
do
curl -ksS \
-H "AccepBack in October, before PGConf.EU, I explained the issues impacting the prolonged wait for TDE in PostgreSQL 18. Explanations were needed as users were buzzing with anticipation, and they deserved to understand what caused the delays and what the roadmap looked like. In that blog post I have shared that due to one of the features newly added in 18.0, the Asynchronous IO (AIO), we have decided to give ourselves time until 18.1 has been released to provide a build with TDE. We wanted to ensure best quality of the solution and that takes time.
Quick! You need to create a complex query, but you don't remember how to join the tables. And you need to get it done ASAP.
You are in luck if you have an enterprise edition of DBeaver handy. You can drag and drop the tables and then adjust the clauses as needed without having to dig into the table definitions.
1. Open the database
|
| The PostgreSQL DVD Rental database. |
I am using the PostgreSQL DVDRental dataset for these examples. I have opened the connection and listed the tables.
As you can see, the database contains numerous tables.
2. Open a SQL Console
|
| Open SQL Console |
The second step requires an SQL Console. You will find it under the SQL Editor tab.
3. Select Builder
|
| On the right side, Select Builder |
On the right side of the console pane, tilted 90 degrees, is a tab labeled Builder.
4. You are ready to drag and drop tables
|
| The Query Builder Is Ready to use. |
Now drag the payment, rental, and customer tables from step 1 separately to the top pane, the one that says 'no active query',
You can rearrange the tables to create a harmonious relationship map that aligns with your artistic aesthetic.
You should have something like thi
The call for papers is now open for FOSDEM PGDay 2026. The CfP will run until December 15, 2025 at 11:59 CET. We will aim to notify speakers by December 17th, 2025 and publish the schedule before Christmas.
FOSDEM PGDay 2026 is a one-day conference that will be held ahead of the main FOSDEM event in Brussels, Belgium, on Friday, January 30st, 2025. This will be a PostgreSQL-focused event. This year, FOSDEM PGDay will feature two tracks of talks! This conference day will be for-charge with a registration of EUR 75, and will be held at the Brussels Marriott Hotel. Registration is required to attend and since we have a limited number of seats available for this event, we urge everybody to register as soon as possible once open.
PostgreSQL Europe will not have its usual devroom, but we invite all community members to visit the PostgreSQL booth and the Databases devroom at the main FOSDEM conference. No main FOSDEM events require registration.
For full details about the conference, venue and hotel, see the event website.
We also have a special negotiated room rate with the Brussels Marriott Hotel. For details, see the venue page.
We hope to see you in Brussels!
This blog post recaps my eventful participation in KubeCon + CloudNativeCon North America 2025 in Atlanta, highlighting the key decision by maintainers to formally apply for CNCF Incubation for the CloudNativePG operator. I had the pleasure of delivering three presentations: a CNPG Lightning Talk focused on community contribution; a deep-dive with Yoshiyuki Tabata on implementing modern PostgreSQL authorisation using Keycloak and OAuth for robust database security; and a session with Jeremy Schneider introducing the new quorum-based consistency feature in CNPG 1.28 for safer cluster reconfigurations. Links to the videos of all three talks are shared within the full article.
Features:
branches_to_build keyword UP_TO_ to allow building only certain older branches. e.g. UP_TO_REL_16_STABLE
pg_config.h for meson builds
--restrict-key requirement for dump testing
Because of the reduced log collection, buildfarm owners are highly encouraged to upgrade to the latest release.
The release can be obtained from Github or Buildfarm Server
PostgreSQL 18 has just been born and we are already talking about the blessings of PostgreSQL 19 and beyond? Well, yes, and there is a good reason for it.
Recently, an important series of changes have been committed, which some of our folks (and many others around the world) have been working on for many many years and which have the power to seriously speed up aggregations for basically everyone out there. What is even better is that you don't have to change your code, there is no need to adjust parameters or anything of that sort. You can simply run your code as it is and enjoy the benefits of this important improvement.
To understand why this improvement in the query optimizer is so important, we first have to understand how PostgreSQL has handled grouping up to now. The most simple rule was: "Join first, aggregate later". What does that mean? Consider the following example:
SELECT j.gender_name, count(*)
FROM person AS p, gender AS j
WHERE p.gender_id = j.gender_id
GROUP BY j.gender_name
Let us assume that we only store a handful of genders but millions of people. The way everything before PostgreSQL 19 is handling this type of query is the following. Keep in mind that the example is of course highly simplified to make understanding easy:
person table
gender_name and add to the count in the desired group
What is fundamentally wrong with this approach? Actually nothing. This is how most systems would handle this type of operation. However, there is an inefficiency here: In a large table, millions of people might be female - what the system does is to look up the name for each gender_id again and again. In case almost every id is different, this is fine. However, if there are only a handful of different ids, the operation takes way too long and becomes pretty repetitive.
The big breakthrough in Postgr
[...]Number of posts in the past two months
Number of posts in the past two months
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.