Summary: in this tutorial, you’ll learn how to use PostgreSQL CUBE to generate all possible aggregations of the specified columns.
Introduction to PostgreSQL CUBE #
The GROUP BY clause allows you to group rows and calculate an aggregation of a single grouping set.
To calculate aggregations of all possible combinations of a set of columns, you can use the GROUP BY clause with the CUBE option.
Here’s the syntax of the GROUP BY with the CUBE:
SELECT
column1,
column2,
aggregate_function (column3)
FROM
table_name
GROUP BY
CUBE (column1, column2);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this syntax, CUBE will generate the following groupings:
(column1, column2): Individual groups bycolumn1andcolumn2.(column1, NULL): Groups for values incolumn2.(NULL, column2): Groups for values incolumn1.(NULL, NULL): A group for all columns.
PostgreSQL CUBE Examples #
We’ll use the inventory_reports table to demonstrate the GROUP BY CUBE:
Here’s the script for generating inventory_reports table:
CREATE TABLE inventory_reports(
warehouse VARCHAR NOT NULL,
brand VARCHAR NOT NULL,
quantity INT NOT NULL
);
INSERT INTO inventory_reports(warehouse, brand, quantity)
VALUES
('San Jose', 'Apple', 100),
('San Francisco', 'Apple', 200),
('Texas', 'Apple', 300),
('San Jose', 'Samsung', 50),
('San Francisco', 'Samsung', 100),
('Texas', 'Samsung', 150)
RETURNING *;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Using PostgreSQL CUBE with One Column #
The following query uses a GROUP BY clause to calculate the total quantity of each brand:
SELECT
brand,
SUM(quantity) total_quantity
FROM
inventory_reports
GROUP BY
brand;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
brand | total_quantity
---------+----------------
Samsung | 300
Apple | 600Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)To include a row that calculates the total quantity for all brands, you can use the GROUP BY CUBE as follows:
SELECT
brand,
SUM(quantity) total_quantity
FROM
inventory_reports
GROUP BY
CUBE (brand)
ORDER BY
brand NULLS LAST;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
brand | total_quantity
---------+----------------
Apple | 600
Samsung | 300
NULL | 900 -> grand totalCode language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this example, the CUBE generates two groupings:
(brand): Calculate the sum of quantity by brand.(): Calculate the sum of the quantity of all brands.
Using CUBE with Multiple Columns #
The following query uses the GROUP BY CUBE to generate the subtotals and total:
SELECT
brand,
warehouse,
SUM(quantity) total_quantity
FROM
inventory_reports
GROUP BY
CUBE (brand, warehouse)
ORDER BY
brand NULLS LAST,
warehouse NULLS LAST;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
brand | warehouse | total_quantity
---------+---------------+----------------
Apple | San Francisco | 200
Apple | San Jose | 100
Apple | Texas | 300
Apple | NULL | 600
Samsung | San Francisco | 100
Samsung | San Jose | 50
Samsung | Texas | 150
Samsung | NULL | 300
NULL | San Francisco | 300
NULL | San Jose | 150
NULL | Texas | 450
NULL | NULL | 900Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this example, the CUBE generates four grouping sets:
(brand, warehouse): Calculate the total quantity by brand and warehouse.(brand, NULL): Calculate the subtotal per brand, summing all warehouses for that brand.NULLin thewarehousecolumn.(NULL, warehouse): Calculate the subtotal per warehouse, summing across all brands.NULLin thebrandcolumn.(NULL, NULL): Calculate the grand total, summing all warehouses and brands together.
Differences Between GROUPING SETS, ROLLUP, and CUBE #
The following table explains the differences between GROUPING SETS, ROLLUP and CUBE:
| Option | Description |
|---|---|
| CUBE | Generates all possible combinations of specified columns. |
| ROLLUP | Generates hierarchical summaries. |
| GROUPING SETS | Generates specified grouping sets instead of generating all possibilities. |
Summary #
- Use PostgreSQL
CUBEto create multi-dimensional summary by calculating all possible aggregations of the specified columns.