Summary: in this tutorial, you’ll learn how to use PostgreSQL GROUP BY to group rows into groups based on values of one or more columns.
Getting started with PostgreSQL GROUP BY clause #
The PostgreSQL GROUP BY clause allows you to group rows into groups based on values of one or more columns.
Here’s the syntax of the GROUP BY clause:
SELECT column1, column2
FROM table_name
GROUP BY column1, column2;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this syntax:
- First, specify the table name from which you want to select data in the
FROMclause. - Second, provide one or more columns where you want to group values in the
GROUP BYclause. - Third, list the grouped columns in the
SELECTclause.
It’s crucial to note that the columns in the SELECT clause must appear in the GROUP BY clause.
You’ll encounter an error if you specify a column in the SELECT clause that does not appear in the GROUP BY clause.
Basic PostgreSQL GROUP BY clause example #
Suppose we have the following inventories table:
| id | product_name | quantity | warehouse_id | brand_id |
|---|---|---|---|---|
| 1 | iPhone 15 | 55 | 1 | 1 |
| 2 | iPhone 14 Pro Max | 15 | 2 | 1 |
| 3 | iPhone 13 | 35 | 2 | 1 |
| 4 | Galaxy S24 | 50 | 1 | 2 |
| 5 | Galaxy Note 23 | 40 | 2 | 2 |
| 6 | Galaxy Z Fold 6 | 25 | 2 | 2 |
| 7 | Galaxy Z Flip 6 | 15 | 2 | 2 |
| 8 | Pixel 9 | 50 | 1 | 3 |
| 9 | Pixel 8 Pro | 25 | 2 | 3 |
| 10 | Pixel Fold | 60 | 1 | 3 |
SQL script to create tables and populate sample data
CREATE TABLE warehouses(
warehouse_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
warehouse_name VARCHAR(255) NOT NULL UNIQUE
);
CREATE TABLE brands(
brand_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
brand_name VARCHAR(255) NOT NULL UNIQUE
);
CREATE TABLE inventories(
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
product_name VARCHAR(255) NOT NULL,
quantity INT NOT NULL,
warehouse_id int not null,
brand_id int not null,
FOREIGN KEY (warehouse_id) REFERENCES warehouses(warehouse_id) ON DELETE CASCADE,
FOREIGN KEY (brand_id) REFERENCES brands(brand_id) ON DELETE CASCADE
);
INSERT INTO warehouses (warehouse_name)
VALUES ('San Jose'),
('San Francisco');
INSERT INTO brands (brand_name)
VALUES ('Apple'),
('Samsung'),
('Google');
INSERT INTO inventories (product_name, quantity, warehouse_id, brand_id)
VALUES
('iPhone 15', 55, 1, 1),
('iPhone 14 Pro Max', 15, 2, 1),
('iPhone 13', 35, 2, 1),
('Galaxy S24', 50, 1, 2),
('Galaxy Note 23', 40, 2, 2),
('Galaxy Z Fold 6', 25, 2, 2),
('Galaxy Z Flip 6', 15, 2, 2),
('Pixel 9', 50, 1, 3),
('Pixel 8 Pro', 25, 2, 3),
('Pixel Fold', 60, 1, 3);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)The following statement uses the GROUP BY clause to group rows in the products table by the values in the warehouse_id column:
SELECT
warehouse_id
FROM
inventories
GROUP BY
warehouse_id;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
warehouse_id
--------------
2
1Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)How it works:
First, the FROM clause starts examining all rows from the inventories table.
Second, the GROUP BY clause selects unique values in the warehouse_id column. For each warehouse_id value, the GROUP BY gathers rows into a group. So we have two groups with warehouse_id 1 and 2:
| warehouse_id | product_name | quantity | brand_id |
|---|---|---|---|
| 1 | iPhone 15 | 55 | 1 |
| Galaxy S24 | 50 | 2 | |
| Pixel 9 | 50 | 3 | |
| Pixel Fold | 60 | 3 | |
| 2 | Galaxy Note 23 | 40 | 2 |
| Galaxy Z Fold 6 | 25 | 2 | |
| Galaxy Z Flip 6 | 15 | 2 | |
| iPhone 14 Pro Max | 15 | 1 | |
| iPhone 13 | 35 | 1 | |
| Pixel 8 Pro | 25 | 3 |
Finally, the SELECT clause selects the grouped column warehouse_id :
| warehouse_id |
|---|
| 1 |
| 2 |
The GROUP BY clause is more useful when used with aggregate functions.
Aggregate functions #
An aggregate function takes a group of values, performs a calculation, and returns a single value.
Here are the most commonly used aggregate functions:
MINfunction returns the minimum value in a set.MAXfunction returns the maximum value in a set.SUMfunction returns the total of a set.AVGfunction returns the average value of a set.COUNTfunction returns the number of items of a set.
MIN function #
The following statement uses the MIN() statement to find the lowest quantity in the inventories table:
SELECT
MIN(quantity)
FROM
inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
min
-----
15Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)How the query works:
- First, the
FROMclause examines every row in theinventoriestable. - Second, the
MINfunction returns the lowest quantity of all the rows.
MAX Function #
The following example uses the MAX() function to find the highest quantity of a product in the inventory:
SELECT
MAX(quantity)
FROM
inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
max
-----
60Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)SUM Function #
The following SELECT statement uses the SUM() function to calculate the total quantity of all products in the inventories table:
SELECT
SUM(quantity)
FROM
inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
sum
-----
370Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)AVG Function #
The following statement uses the AVG() function to calculate the average quantity of all products in the inventories table:
SELECT
AVG(quantity)
FROM
inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
avg
---------------------
37.0000000000000000Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)COUNT Function #
The following statement uses the COUNT() function to return the number of rows in the inventories table:
SELECT
COUNT(*)
FROM
inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
count
-------
10Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Using PostgreSQL GROUP BY clause with aggregate functions #
Here’s the syntax for using a GROUP BY clause with aggregate functions:
SELECT
column1,
aggregate_function (column2)
FROM
table_name
GROUP BY
column1;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this syntax:
First, the GROUP BY clause groups the rows from the table into groups based on the column1.
Second, the aggregate function performs a calculation and returns a single value for each group:
Third, the SELECT clause returns the grouped column (column1) and an aggregated value for each group.
For example, the following shows how to calculate the total quantity for each warehouse:
SELECT
warehouse_id,
SUM(quantity)
FROM
inventories
GROUP BY
warehouse_id;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
warehouse_id | sum
--------------+-----
2 | 155
1 | 215Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this example:
The FROM clause starts examining all rows in the inventories table.
The GROUP BY clause divides all the rows in the inventories into two groups:
| warehouse_id | sum | product_name | quantity | brand_id |
|---|---|---|---|---|
| 1 | 215 | iPhone 15 | 55 | 1 |
| Galaxy S24 | 50 | 2 | ||
| Pixel 9 | 50 | 3 | ||
| Pixel Fold | 60 | 3 | ||
| 2 | 155 | Galaxy Note 23 | 40 | 2 |
| Galaxy Z Fold 6 | 25 | 2 | ||
| Galaxy Z Flip 6 | 15 | 2 | ||
| iPhone 14 Pro Max | 15 | 1 | ||
| iPhone 13 | 35 | 1 | ||
| Pixel 8 Pro | 25 | 2 |
The SUM function then calculates the total of the quantity for each group.
The SELECT clause returns the warehouse_id and total quantity for each group:
Using PostgreSQL GROUP BY with INNER JOIN #
You can use a join to merge rows of multiple tables and use the GROUP BY clause to group rows into groups.
For example, the following query returns the warehouse name and the total quantity of products for each warehouse:
SELECT
warehouse_name,
SUM(quantity)
FROM
inventories
INNER JOIN warehouses USING (warehouse_id)
GROUP BY
warehouse_name;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
warehouse_name | sum
----------------+-----
San Jose | 215
San Francisco | 155Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)How it works.
First, the FROM examines all rows in the inventories table.
Next, the INNER JOIN merge rows of the inventories table with the rows in the warehouses table based on the values in the warehouse_id column:
| id | product_name | quantity | warehouse_id | brand_id | warehouse_id | warehouse_name |
|---|---|---|---|---|---|---|
| 1 | iPhone 15 | 55 | 1 | 1 | 1 | San Jose |
| 2 | iPhone 14 Pro Max | 15 | 2 | 1 | 2 | San Francisco |
| 3 | iPhone 13 | 35 | 2 | 1 | 2 | San Francisco |
| 4 | Galaxy S24 | 50 | 1 | 2 | 1 | San Jose |
| 5 | Galaxy Note 23 | 40 | 2 | 2 | 2 | San Francisco |
| 6 | Galaxy Z Fold 6 | 25 | 2 | 2 | 2 | San Francisco |
| 7 | Galaxy Z Flip 6 | 15 | 2 | 2 | 2 | San Francisco |
| 8 | Pixel 9 | 50 | 1 | 3 | 1 | San Jose |
| 9 | Pixel 8 Pro | 25 | 2 | 3 | 2 | San Francisco |
| 10 | Pixel Fold | 60 | 1 | 3 | 1 | San Jose |
Then, the GROUP BY group merged rows by the values in the warehouse_name:
| warehouse_name | id | product_name | quantity | warehouse_id | brand_id |
|---|---|---|---|---|---|
| San Jose | 1 | iPhone 15 | 55 | 1 | 1 |
| 4 | Galaxy S24 | 50 | 1 | 2 | |
| 8 | Pixel 9 | 50 | 1 | 3 | |
| 10 | Pixel Fold | 60 | 1 | 3 | |
| San Francisco | 2 | iPhone 14 Pro Max | 15 | 2 | 1 |
| 3 | iPhone 13 | 35 | 2 | 1 | |
| 5 | Galaxy Note 23 | 40 | 2 | 2 | |
| 6 | Galaxy Z Fold 6 | 25 | 2 | 2 | |
| 7 | Galaxy Z Flip 6 | 15 | 2 | 2 | |
| 9 | Pixel 8 Pro | 25 | 2 | 3 |
After that, the SUM function calculates the total for each group:
| warehouse_name | sum | id | product_name | quantity | warehouse_id | brand_id |
|---|---|---|---|---|---|---|
| San Jose | 215 | 1 | iPhone 15 | 55 | 1 | 1 |
| 4 | Galaxy S24 | 50 | 1 | 2 | ||
| 8 | Pixel 9 | 50 | 1 | 3 | ||
| 10 | Pixel Fold | 60 | 1 | 3 | ||
| San Francisco | 155 | 2 | iPhone 14 Pro Max | 15 | 2 | 1 |
| 3 | iPhone 13 | 35 | 2 | 1 | ||
| 5 | Galaxy Note 23 | 40 | 2 | 2 | ||
| 6 | Galaxy Z Fold 6 | 25 | 2 | 2 | ||
| 7 | Galaxy Z Flip 6 | 15 | 2 | 2 | ||
| 9 | Pixel 8 Pro | 25 | 2 | 3 |
Finally, the SELECT clause returns the warehouse names and total quantity of each:
| warehouse_name | sum |
|---|---|
| San Jose | 215 |
| San Francisco | 200 |
Group by multiple columns #
The following statement uses the GROUP BY clause to group rows in the inventories table by warehouse_id and brand_id:
SELECT
warehouse_id,
brand_id,
SUM(quantity)
FROM
inventories
GROUP BY
warehouse_id,
brand_id;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
warehouse_id | brand_id | sum
--------------+----------+-----
1 | 1 | 55
2 | 3 | 25
1 | 3 | 110
2 | 2 | 80
1 | 2 | 50
2 | 1 | 50Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this example, the combination of values in the warehouse_id and brand_id column forms a group. The SUM() function returns the total of product quantity for each group.
Summary #
- Use the
GROUP BYclause to group rows in an SQL query into summary rows based on one or more columns. - Use aggregate functions with the
GROUP BYclause to calculate a value for each group. - Use
JOINwithGROUP BYto merge rows from multiple table and group rows into groups.