Data Science with PostgreSQL – Aggregate Window Functions

Here we go, after weeks for procrastination finally the 2nd post in my series featuring Window Function in PostgreSQL. In this post, I’ll explain how to use Aggregate window functions – that we can use to calculate various aggregations such as average, counts, minimum / maximum values, and sum within each window or partition.

Image

Super Simple test dataset

Before we begin, let’s create a test table and insert sample records to test our queries. Here we have an Orders tables with columns like order id, order date, customer name, City and Amount.

--Create test table
CREATE TABLE IF NOT EXISTS Orders
(
	order_id INT,
	order_date DATE,
	customer_name VARCHAR(250),
	city VARCHAR(100),	
	order_amount INT
)

-- Insert sample rows
INSERT INTO Orders
VALUES	(101,'2022-02-01','David Smith','Charleston',20000),
  		(102,'2022-02-02','David Jones','Savannah',30000),
  		(103,'2022-02-03','John Smith','Seattle',6000),
  		(104,'2022-02-04','Michael Smith','Charleston',16000),
  		(105,'2022-02-05','David Williams','Seattle',8000),
  		(106,'2022-02-06','Paum Smith','Charleston',26000),
  		(107,'2022-02-10','Andrew Smith','Savannah',16000),
  		(108,'2022-02-11','David Brown','Savannah',3000),
  		(109,'2022-02-20','Robert Smith','Seattle',2000),
  		(110,'2022-02-25','Peter Smith','Charleston',600),
  		(111,'2022-02-26','Smith Ross','Savannah',600),
  		(112,'2022-02-27','Mike Pearl','Seattle',600);

Types of Aggregate window functions

1. AVG() – Computes the average of a given list of elements or buckets

2. COUNT() – Computes the total number of elements in a given list of elements or buckets

3. MAX() – Outputs maximum value in a given list of elements or buckets

4. MIN() – Outputs minimum value in a given list of elements or buckets

5. SUM() – Computes the total number of elements in a given list of elements or buckets

Using Aggregate Window Functions

-- AGGREGATE WINDOW FUNCTIONS 
SELECT city,
       customer_name,
       order_amount,
       AVG(order_amount)
         OVER(
           partition BY city) AS "average_order_amount",
       COUNT(order_amount)
         OVER(
           partition BY city) AS "number_of_customers",
       MIN(order_amount)
         OVER(
           partition BY city) AS "minimum_order_amount",
       MAX(order_amount)
         OVER(
           partition BY city) AS "minimum_order_amount",
       SUM(order_amount)
         OVER(
           partition BY city) AS "sum_order_amount"
FROM   orders; 

The key thing to note is this statement OVER(partition BY city) that tells the query to GROUP BY CITY (i.e. Charleston, Savannah etc.) and calculate the Average for the order_amount.

Image

Easy to grasp right! So we’re getting used to this concept of PARTITION BY. In the next post, I’ll cover next category of Ranking Window Functions. To read additional details about postgres window functions, please refer to the PostgreSQL Window Function Docs.

This is 2nd article in four-part series on Window Functions in PostgreSQL
Enjoy what you’re reading?

Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 🧑🏻‍💻

Unknown's avatar

Author: Varun Dhawan

I’m Varun. I used to be a Software Engineer building data applications for large corporations like McKinsey and Target. Now, I’m a Product Manager at Microsoft, making Azure PostgreSQL the go-to platform for running mission-critical workloads (and no, I’m not obsessing over every little detail… I swear). When I’m not working, you can find me blogging at data-nerd.blog, where I help fellow data enthusiasts master PostgreSQL, sharpen their coding skills, and navigate their careers with confidence. And if there’s one thing you absolutely need to know about me, it’s that…I'm apparently a great cook—just don’t ask why I’m the only one who eats my food.

3 thoughts on “Data Science with PostgreSQL – Aggregate Window Functions”

Leave a comment