There’s nothing simple about simplification! It is very common to want to slim down the size of geometries, and there are lots of different approaches to the problem.
We will explore different methods, starting from this rendering of the letter “a” produced by ST_Letters.
SELECT ST_Letters('a');
This is a good starting point, but to show the different effects of different algorithms on things like redundant linear points, we need a shape with more vertices along the straights, and fewer along the curves.
SELECT ST_RemoveRepeatedPoints(ST_Segmentize(ST_Letters('a'), 1), 1);
Here we add vertices every one meter with ST_Segmentize, and use ST_RemoveRepeatedPoints to thin out the points along the curves. Already we are simplifying!
Let's apply the same “remove repeated” algorithm, this time with a 10 meter tolerance.
WITH a AS (
SELECT ST_RemoveRepeatedPoints(ST_Segmentize(ST_Letters('a'), 1), 1) AS a
)
SELECT ST_RemoveRepeatedPoints(a, 10) FROM a;
We do have a lot fewer points, and the constant angle curves are well preserved, but some straight lines are no longer legible as such, and there are redundant vertices in the vertical straight lines.
The ST_Simplify function applies the Douglas-Peucker line simplification algorithm to the rings of the polygon. Because it is a line simplifier, it does a cruder job of preserving some aspects of the polygon, such as the squareness of the top ligature.
WITH a AS (
SELECT ST_RemoveRepeatedPoints(ST_Segmentize(ST_Letters('a'), 1), 1) AS a
)
SELECT ST_Simplify(a, 1) FROM a;
The ST_SimplifyVW function applies the Visvalingam–Whyatt algorithm to the rings of the polygon. Visvalingam–Whyatt is better for preserving the shapes of polygons than Douglas-Peucker, but the differences are subtle.
WITH a AS (
SELECT ST_RemoveRepeatedPoints(ST_Segmentize(ST_Letters('a'), 1), 1) AS a
)
SELECT ST_SimplifyVW(a, 5) FROM a;
Coercing a shape onto a fixed precision grid is another form of simplification, sometimes used to force the edges of adjacent objects
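A minimal sketch of that grid-based approach, using ST_SnapToGrid with an assumed 10-unit grid size (my example, not one taken from the post):
WITH a AS (
SELECT ST_RemoveRepeatedPoints(ST_Segmentize(ST_Letters('a'), 1), 1) AS a
)
SELECT ST_SnapToGrid(a, 10) FROM a;
Every vertex is moved to the nearest grid intersection and consecutive duplicates collapse, at the cost of a visibly blocky outline when the grid is coarse.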
[...]PostgreSQL is highly suitable for powering critical applications in all industries. However, running critical applications comes with requirements that are absolutely essential: high availability and automatic failover.
This document explains which options are available and which problems one can solve with PostgreSQL. We have listed the most common setups and some of the most common recommendations.
The first scenario has been designed to provide a solution to the following requirements and scenario:
This is a typical scenario which is commonly seen in real world deployments. An application has to rely on a highly available database and in case of error we simply want to switch to a backup system.
The solution in this case is a Patroni cluster. It uses distributed consensus based on “etcd” and fully automated failover as well as database recovery. The following image shows how this works:
The image shows a 3-node cluster. The primary server is able to handle read as well as write transactions. The remaining two servers will accept reads.
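As a quick sanity check (not part of the original article), you can verify which role a node is currently playing directly in SQL; the function and view used below are standard PostgreSQL:
SELECT pg_is_in_recovery();  -- false on the primary, true on the read-only standbys
SELECT client_addr, state, sync_state
FROM pg_stat_replication;    -- on the primary: one row per connected standby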
This approach has a couple of advantages that are important to understand:
In general, achieving high availability is possible if you follow a simple design pattern: Keep it simple. The more complexity you add to the system, the more likely it is that something goes wrong or something impacts operations. Simplicity is a key advantage of this approach.
When dealing with clusters, there is an important question which has to be answered: H
[...]In Part 1 of this series, we loaded our documentation into PostgreSQL using the pgEdge Document Loader. Our documents are sitting in the database as clean Markdown content, ready for the next step: turning them into something an LLM can search through semantically.
In this post, we'll use pgEdge Vectorizer to chunk those documents and generate vector embeddings. By the end, you'll have a searchable vector database that can find relevant content based on meaning rather than just keywords.
On Nov. 20, the day after GIS Day, Elizabeth Christensen and I hosted the 7th annual PostGIS Day, a celebration of the Little Spatial Database That Could. Brought to you this year by Snowflake, the event featured an amazing collection of speakers from around the globe — from India to Africa, Europe to North America.
The themes this year were, if you can forgive the pun, all over the map! Moving beyond the generic hype of AI into the practical reality of agents, we saw tools from Felt, Carto and Bunting Labs that can actually write, execute and debug spatial SQL on the fly. We also saw the lakehouse architecture take center stage, with PostGIS acting less as a data silo and more as a high-performance connector to the vast world of object storage, Iceberg and GeoParquet via tools such as pg_lake and Apache Sedona.
But it wasn't all bleeding edge; we grounded the day in massive, industrial-scale success stories. From enabling mapping at IGN France for more than 20 years to powering State Farm’s critical disaster response platforms, PostGIS remains the bedrock of modern geospatial infrastructure. A full playlist of PostGIS Day 2025 is available on the Snowflake YouTube channel. Here's a deeper look at what we learned.
We've all heard the buzzwords, but this year, we saw the code. The concept of AI agents was everywhere, but crucially, it wasn't just chatbots hallucinating SQL. We saw systems designed to iterate.
Brendan Ashworth from Bunting Labs drilled down to what "agent" actually means in the geospatial context. It isn't just asking a question and getting an answer. It's a loop. The agent writes a query, runs it against PostGIS, reads the error message if it fails, corrects itself, and runs it again. He showed us Mundi, an AI-native WebGIS that treats PostGIS as its execution engine, solving complex spatial problems without a human holding its hand.
Jaime Sanchez and Mamata Akella from Felt showed how they are teaching Claude to think spatially. They are
PGUG.EE met on December 3, 2025 in Estonia, organized by Kaarel Moppel & Ervin Weber
Talks
Bruce Momjian spoke at the PG Armenia Community Meetup, organized by Emma Saroyan on December 4, 2025.
I have attended over one hundred Postgres meetups over the years. The usual format is: food with individual discussion, lecture with group questions, and finally more individual discussion. I just spoke at an Armenia PostgreSQL User Group meetup and the event organizer Emma Saroyan did something different — she did a group mobile phone quiz after my lecture. To create the quiz questions before the event, she studied my slides and viewed a previous recording of my talk. The winner got a copy of her book.
I think the quiz was a great post-lecture group activity, and gave dynamism to the content. I know people will remember the quiz fondly when they think about the meetup. I encourage all meetups to have similar quizzes, though they do take work to create.
In the past there have been many discussions about using UUID as a primary key in PostgreSQL. For some applications, even a BIGINT column does not have sufficient range: it is a signed 8-byte integer with a range of −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807. Although these values look big enough, if we think about web services that collect billions or more records daily, the number becomes less impressive. Simple integer values can also cause value conflicts in distributed systems, or in data lakehouses when combining data from multiple source databases.
However, the main practical problem with UUIDv4 as a primary key in PostgreSQL was not lack of range, but the complete randomness of the values. This randomness causes frequent B‑tree page splits, a highly fragmented primary key index, and therefore a lot of random disk I/O. There have already been many articles and conference talks describing this problem. What many of these resources did not do, however, was dive deep into the on‑disk structures. That’s what I wanted to explore here.
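As a rough sketch of the kind of experiment involved (my own setup, not the article's), you could compare the primary key index sizes of a bigint table and a random-UUID table after identical bulk inserts:
CREATE TABLE t_bigint (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, payload text);
CREATE TABLE t_uuid   (id uuid DEFAULT gen_random_uuid() PRIMARY KEY, payload text);
INSERT INTO t_bigint (payload) SELECT 'x' FROM generate_series(1, 1000000);
INSERT INTO t_uuid   (payload) SELECT 'x' FROM generate_series(1, 1000000);
SELECT pg_size_pretty(pg_relation_size('t_bigint_pkey')) AS bigint_pk_index,
       pg_size_pretty(pg_relation_size('t_uuid_pkey'))   AS uuid_pk_index;
The UUID index typically comes out larger and more fragmented, because the random key values land on random leaf pages and trigger page splits.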
UUID (Universally Unique Identifier) is a 16-byte integer value (128 bits), which has 2^128 possible combinations (approximately 3.4 × 10^38). This range is so large that, for most applications, the probability of a duplicate UUID is practically zero. Wikipedia shows a calculation demonstrating that the probability of finding a duplicate within 103 trillion version-4 UUIDs is about one in a billion. Another often-quoted rule of thumb is that to get a 50% chance of one collision, you'd have to generate roughly 1 billion UUIDs every second for about 86 years.
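A quick back-of-the-envelope check of that figure: a version-4 UUID has 122 random bits, so there are N = 2^122 ≈ 5.3 × 10^36 possible values, and the birthday approximation gives a collision probability of p ≈ n^2 / (2N). With n = 1.03 × 10^14 (103 trillion), p ≈ (1.03 × 10^14)^2 / (1.06 × 10^37) ≈ 10^-9, which matches the one-in-a-billion figure.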
Values are usually represented as a 36‑character string with hexadecimal digits and hyphens, for example: f47ac10b-58cc-4372-a567-0e02b2c3d479. The canonical layout is 8‑4‑4‑4‑12 characters. The first character in the third block and the first character in the fourth block have special meaning: xxxxxxxx-xxxx-Vxxx-Wxxx-xxxxxxxxxxxx – V marks UUI
[...]The PostgreSQL Hacking Workshop will be taking a well-earned Christmas break in December of 2025. The future of the workshop is a little bit unclear, because I'm continuing to have a bit of trouble finding enough good talks online to justify doing one per month: the best source of talks for the event is pgconf.dev, but not all of those talks are about hacking on PostgreSQL, and not all of those that are about hacking are equally interesting to potential attendees. Also, we've already eaten through much of the backlog of older hacking-related talks that are posted online. Many conferences have few hacking-related talks, or just don't post any videos (which are expensive and time-consuming to produce).
In our first JIT episode, we discussed how we could, using copy-patch, easily create a JIT compiler for PostgreSQL, with a slight improvement in performance compared to the PostgreSQL interpreter.
In our second episode, I talked about the performance wall and how hard it was to have a real leap in performance compared to the interpreter. But it ended with a positive outlook, a nice performance jump that I was preparing at that moment…
The interpreter runs each opcode for every record it has to process. Anything that must be done for each record but could be done only once is obviously better done once, and this is where a JIT can beat it. The JIT compiler can apply optimizations that would require per-opcode checks in the interpreter, and would thus be self-defeating there. For instance, I mentioned creating inlined opcodes for common function calls like int4eq: in the interpreter, replacing the indirect call to int4eq with a comparison of the function pointer followed by an inlined version would indeed be silly, since the comparison itself already wastes a lot of time.
So, what can’t the interpreter do? It sure can’t easily remove indirect calls, but this is a 1% performance gain, 2% at most. You won’t get to the headlines with that, right? Well, when in doubt, look at the past…
A decade ago, I worked at a small company where I heard the weirdest thing ever regarding system performance: “our application is slower when built in 64-bit mode because the bigger pointer size makes it slower”. I didn't buy it, spent two days digging into the code, and found that the opposite was true: 64-bit mode brought such a performance improvement that the entire system collapsed on a mutex protecting a core structure in the application… Removing the mutex made the application fly in both 32-bit and 64-bit mode, with 64-bit winning, obviously.
But why is 64 bits faster? We are talking databa
[...]
The secret to unlocking performance gains often lies not just in what you ask in a query, but in how Postgres finds the answer. The Postgres EXPLAIN system is great for understanding how data is being queried. One of the secrets to reading EXPLAIN plans is understanding the type of scan used to retrieve the data. The scan type can be the difference between a lightning-fast response and a slow query.
Today I’ll break down the most common scan types, how they work, and when you’ll see them in your queries.
This type of data scan reads the entire table, row by row, checking to see what matches the query conditions. If you have a WHERE or FILTER clause, Postgres simply scans each row looking for matches.
Sequential scans are the foundation of how scans are done, and for many searches this is what Postgres will use. For very large data sets, or tables that are queried often, sequential scans are not ideal and an index scan may be faster. For that reason, knowing how to spot a sequential scan versus an index scan is one of the most important parts of reading an EXPLAIN plan.
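If you want to follow along, a minimal accounts table like this one (an assumed schema, not one given in the post) is enough to reproduce plans of the same shape:
CREATE TABLE accounts (id integer PRIMARY KEY, name text);
INSERT INTO accounts SELECT i, 'account ' || i FROM generate_series(1, 1000) AS i;
ANALYZE accounts;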
EXPLAIN select * from accounts;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on accounts (cost=0.00..22.70 rows=1270 width=36)
(1 row)
When you create an index in Postgres, you're creating a column or multi-column reference that is stored on disk. Postgres is able to use this index as a map to the data stored in the table. A basic index scan uses a B-tree to quickly find the exact location of the data in a two-step process: first Postgres finds the entry in the index, then it follows that reference to fetch the rest of the row data from the table.
EXPLAIN select * from accounts where id = '5';
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using accounts_pkey on accounts (cost=0.15..2.37 rows=1 width=36)
Index Cond: (id = 5)
(2 [...]
As a PostgreSQL-specific monitoring solution, pgwatch is mostly known for storing collected metrics in a PostgreSQL database. While that is great, as you probably should "just use Postgres for everything" xD, in some scenarios and specific setups it might be a limitation.
You may want to store your metrics in a particular OLAP database, or a specific format, etc. This is where pgwatch's gRPC client shines, as it enables users to integrate custom storage solutions by developing a compatible gRPC server that writes the metrics to their desired destination.
In this post, we will go through the process of developing a PoC for a custom gRPC server in Go that pgwatch can use to write metrics to ClickHouse.
First, let's understand the pgwatch gRPC client protobuf file.
In the file, you can see the main Receiver service defined with a couple of methods; let's explain them one by one.
UpdateMeasurements(MeasurementEnvelope) returns (Reply).
MeasurementEnvelope data type consists of DBName, which is the monitored source name, Tags, representing the tags associated with this source, MetricName, the metric being measured, and Data, a list of rows that resulted from running the metric query.
SyncMetric(SyncReq) returns (Reply).
SyncReq data type consists of DBName, representing the source name under which the metric got added/deleted, MetricName, the metric added/deleted, and Operation, either addition or deletion.
DefineMetrics(google.protobuf.Struct) returns (Reply).
Retrieval-Augmented Generation (RAG) has become one of the most practical ways to give Large Language Models (LLMs) access to your own data. Rather than fine-tuning a model or hoping it somehow knows about your documentation, RAG lets you retrieve relevant content from your own sources and provide it as context to the LLM at query time. The result is accurate, grounded responses based on your actual content.
In this three-part series, I'll walk through building a complete RAG server using PostgreSQL as the foundation. We'll cover:
In a PostgreSQL database, sequences provide a convenient way to generate a unique identifier, and are often used for key generation. Community PostgreSQL provides functions and SQL syntax to help manage sequence generation, but sequences themselves are not without limitations in a multi-master environment. As a result, we introduced 100% open-source Snowflake sequences that work seamlessly in a multi-master PostgreSQL cluster to remove those limitations and ensure your data can thrive at the network edge. (Check out the extension on GitHub!)
The easiest way to get started using Snowflake sequences is to try out pgEdge Postgres, either the Enterprise or the Distributed edition, available for self-hosting or fully managed hosting on virtual machines, containers, or the cloud. The pgEdge Control Plane (also 100% open source and available on GitHub) makes it even easier to deploy and orchestrate Postgres clusters, and comes with the snowflake extension by default (amongst others).
PostgreSQL Belgium User Group had a new meetup on November 25, in Haasrode, Leuven, organized by Kim Jansen, An Vercammen, Boriss Mejías and Stefan Fercot.
Speakers:
Related blog post by Kim Jansen
Prague PostgreSQL Meetup hosted a new meetup on November 24, organized by Gulcin Yildirim Jelinek
Speakers:

Sometimes a string column should only contain one of a limited number of distinct values. Examples would be the current state of a service request or the US state in an address. There are several ways to implement such a column in PostgreSQL, the most interesting being an enum type or a lookup table. In this article, I will explore the benefits and disadvantages of these methods.
As a starting point, let's define a simplified table of Austrian citizens that stores each person's residential state as a plain text column with a check constraint:
CREATE UNLOGGED TABLE person_s (
person_id bigint NOT NULL,
state text NOT NULL,
CHECK (state IN ('Wien', 'Niederösterreich', 'Burgenland',
'Steiermark', 'Oberösterreich', 'Kärnten',
'Tirol', 'Salzburg', 'Vorarlberg'))
);
There are around 8 million Austrians, so let's insert 8 million rows. Not all states are equally populous, so I use a CASE expression to assign states in an approximately realistic distribution:
INSERT INTO person_s (person_id, state)
SELECT i,
CASE WHEN hashint4(i) < -1198295875 THEN 'Wien'
WHEN hashint4(i) < -390842024 THEN 'Niederösterreich'
WHEN hashint4(i) < -249108103 THEN 'Burgenland'
WHEN hashint4(i) < 343597384 THEN 'Steiermark'
WHEN hashint4(i) < 1060856922 THEN 'Oberösterreich'
WHEN hashint4(i) < 1327144894 THEN 'Kärnten'
WHEN hashint4(i) < 1692217115 THEN 'Tirol'
WHEN hashint4(i) < 1958505087 THEN 'Salzburg'
ELSE 'Vorarlberg'
END
FROM generate_series(1, 8000000) AS i;
I'll add the primary key constraint and an index on the states to the table after loading the data because that is faster. Also, I'll run VACUUM and ANALYZE to improve the query performance:
ALTER TABLE person_s ADD CONSTRAINT person_s_pkey PRIMARY KEY (person_id);
CREATE INDEX person_s_state_idx ON person_s (state);
VACUUM (ANALYZE) person_s;
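For comparison, the enum-based variant the article goes on to measure could be set up along these lines (the type and table names here are my guesses, not necessarily the ones used in the article):
CREATE TYPE austrian_state AS ENUM ('Wien', 'Niederösterreich', 'Burgenland',
'Steiermark', 'Oberösterreich', 'Kärnten',
'Tirol', 'Salzburg', 'Vorarlberg');
CREATE UNLOGGED TABLE person_e (
person_id bigint NOT NULL,
state austrian_state NOT NULL
);
An enum value is stored internally as a 4-byte OID, so the table and its state index end up smaller than the text version, while a lookup table instead keeps the labels in a separate table referenced by a foreign key.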
Say goodbye to the old way of distributing Postgres extensions as part of the main pre-built operand image. Leveraging the Kubernetes ImageVolume feature, CloudNativePG now allows you to mount extensions like pgvector and PostGIS from separate, dedicated images. This new declarative method completely decouples the PostgreSQL core from the extension binaries, enabling dynamic addition, easier evaluation, and simplified updates without ever having to build or manage monolithic custom container images.
This blog provides my opinion on how Postgres, according to the annual Stack Overflow developer survey, became the most admired, desired and used database by developers. This is part three in my series about the Postgres journey to the top with developers. Here are the first parts of this story:
And our last 2025 recording is here! Check out Jay Miller’s talk!
Seattle Postgres Users Group (SEAPUG) maintained the PostgreSQL booth at PASS Data Community Summit 2025 from November 17-21, 2025:
Speakers:
The Mumbai PostgreSQL User Group meetup took place on November 20, 2025, organized by Sovenath Shaw, Ajit Gadge and Deepak Mahto.
Speakers:
PostgreSQL Conference Japan 2025 took place on November 21st. It was organized by Haruka Takatsuka, Kazushiro Takeda, Rintaro Ikeda, Tetsuo Sakata, Kousuke Kida, and Junichi Tado.
During the conference, the following volunteers helped run the event: Tamotsu Odaka, Taiki Koshino, Nozomi Anzai, Jun Kuwamura, Hiroki Kataoka, Makoto Kaga, Tetsuji Koyama, Seiji Goto, Shinobu Honma, Kenji Sato, Katsuaki Fukushima, Thiduong Dang, Yung-Chung Ku, Wataru Takahashi, and Kouhei Ito.
PASS Data Community Summit 2025 wrapped up last week. This conference originated 25 years ago with the independent, user-led, not-for-profit “Professional Association for SQL Server (PASS)” and the annual summit in Seattle continues to attract thousands of database professionals each year. After the pandemic it was reorganized and broadened as a “Data Community” event, including a Postgres track.
Starting in 2023, volunteers from the Seattle Postgres User Group have staffed a postgres community booth on the exhibition floor. We provide information about Postgres User Groups around the world and do our best to answer all kinds of questions people have about Postgres. The booth consistently gets lots of traffic and questions.
The United States PostgreSQL Association has generously supplied one of their booth kits each year, which includes a banner/background and booth materials such as stickers, a map showing many user groups, a “welcome to postgres” handout, and Postgres major version handouts. We supplement these with extra pins, stickers, and printouts like the happiness hints I’ve put together, a list of common extensions that Rox made, and a list of Postgres events that Lloyd made. Every year, we also bring leftover Halloween candy that we want to get rid of and put it in a big bowl on the table.
One of the top questions people ask is how and where they can learn more about Postgres. Next year I might just print out the Links section from my blog, which has a bunch of useful free resources. Another idea I have is for RedGate and EnterpriseDB – I think both of these companies have paid training but also give free access to a few introductory classes – it would be nice if they made a small card with a link to their free training. I think we could have a stack of these cards at our user groups and at the PASS booth. The company can promote paid training, but the free content can benefit anyone even if they aren’t interested in the paid training. I might also reach out to other companies who have paid
[...]
Recently, I read an article titled "Alias Settings of Engineers Around the World" (in Japanese). As a PostgreSQL engineer, this got me thinking: "If they are customizing their bash aliases, how are they configuring their psql environments?"
Driven by curiosity, I decided to investigate GitHub repositories to see how developers commonly configure their psqlrc files.
psql is the terminal-based front-end for PostgreSQL, allowing you to execute SQL interactively. Its configuration file is named psqlrc, and it has the following characteristics:
The system-wide psqlrc file is stored in ../etc/ relative to the directory containing the PostgreSQL executables. This directory can be explicitly set using the environment variable PGSYSCONFDIR.
The per-user .psqlrc file is stored in the executing user's home directory. This file can be explicitly set using the environment variable PSQLRC.
Both files can be made version-specific by appending the PostgreSQL version to the name, for example .psqlrc-14.
A version-specific file is read in preference to the plain .psqlrc or psqlrc.
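For illustration, here is the kind of .psqlrc such a survey tends to turn up; this particular file is just an example of mine, not one of the article's findings:
\set QUIET 1
\timing on
\x auto
\pset null '(null)'
\set COMP_KEYWORD_CASE upper
\set HISTSIZE 10000
\set PROMPT1 '%n@%/%R%# '
\unset QUIET
This silences startup chatter, turns on query timing, switches to expanded output automatically for wide results, makes NULLs visible, upper-cases keywords on tab completion, enlarges the command history, customizes the prompt, and finally re-enables messages.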
I used the Search API. Here are the key points to consider:
Search results are paginated, so the script uses a for loop to fetch all pages.
To stay within the API rate limits, the script pauses between requests with sleep commands.
#!/bin/bash
GITHUB_TOKEN=hoge # Set your own GitHub token here
# API call for files 0 to 1000 bytes
for size in `seq 0 50 950`;
do
for page in `seq 10`;
do
curl -ksS \
-H "AccepBack in October, before PGConf.EU, I explained the issues impacting the prolonged wait for TDE in PostgreSQL 18. Explanations were needed as users were buzzing with anticipation, and they deserved to understand what caused the delays and what the roadmap looked like. In that blog post I have shared that due to one of the features newly added in 18.0, the Asynchronous IO (AIO), we have decided to give ourselves time until 18.1 has been released to provide a build with TDE. We wanted to ensure best quality of the solution and that takes time.
Number of posts in the past two months
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.