Data Science with PostgreSQL – Using the Window frame_clause

Image

This is my 5th (and final) post on Window Function in PostgreSQL series. In previous posts on this topic, I have covered window function basics, using aggregate window functionsranking window function and value window functions. While you’re here, I’ll recommend to you check the previous posts on this topic.

In this article, we’ll learn a new concept frame_clause. Let’s jump right in!

What’s a window frame?

Windows function works on a group of rows called partitions (or window). A PARTITION BY clause divides the rows into partitions, so we can choose the specific rows to be included in a computation. A FRAME is a subset of this “current partition” and is defied using a frame clause.

Syntax for frame_clause

ROWS BETWEEN <starting_row> AND <ending_row>

Below are the possible values that can be specified for <starting_row> and the <ending_row> bounds of a window frame:

  • UNBOUNDED PRECEDING — all rows before the current row in the partition, i.e. the first row of the partition
  • [N] PRECEDING — No. of rows before the current row
  • CURRENT ROW — the current row
  • [M] FOLLOWING— No. of rows after the current row
  • UNBOUNDED FOLLOWING — all rows after the current row in the partition, i.e. the last row of the partition
Image

Let’s see this using sample dataset of a car dealership group. Here, we have a table car_sales_by_month that stores car sales information by months.

CREATE TABLE IF NOT EXISTS car_sales_by_month
(
	year INT,
	month INT,
	brand VARCHAR(250),
	model VARCHAR(250),
	type VARCHAR(250),
	quantity INT,
	sales INT
);
INSERT INTO car_sales_by_month VALUES 
	(2022, 1 ,'Toyota','RAV4','Suv',50,1500000),
	(2022, 1 ,'Toyota','Corolla','Sedan',10,2010000),
	(2022, 1 ,'Honda','CRV','Suv',30,8000000),
	(2022, 2 ,'Honda','CRV','Suv',40,33000000),
	(2022, 2 ,'Toyota','RAV4','Suv',30,1000000),
	(2022, 2 ,'Toyota','Corolla','Sedan',20,2050000),
	(2022, 3 ,'Honda','Civic','Sedan',60,10000000),
	(2022, 3 ,'Honda','NSX','Hybrid ',25,2004000),
	(2022, 3 ,'Toyota','Corolla','Sedan',30,1080000),
	(2022, 4 ,'Honda','Civic','Sedan',60,30000000),
	(2022, 4 ,'Honda','NSX','Hybrid ',25,2004000),
	(2022, 4 ,'Toyota','Corolla','Sedan',35,4520000);
Image
Scenario 1. Query to report yearly revenue by make of the car.
SELECT brand,
       SUM(sales) AS "Total sales $"
FROM   car_sales_by_month
	GROUP BY brand;
Image
Scenario 2. Query to report monthly sales revenue changes for a specific car model.
SELECT brand,
       model,
       month,
       sales AS current_month_sales,
       LAG(sales) OVER ( ORDER BY month) AS previous_month_sales,
       sales - LAG (sales) OVER (ORDER BY month) AS difference
FROM car_sales_by_month
WHERE year = 2022
  AND model = 'Corolla'
Image
Scenario 3.  Query to report the total sales for current month and previous month, alongwith the maximum sales in any individual month throughout the year.
SELECT brand,
       model,
       month,
       sales AS current_month_sales,
       LAG(sales) OVER (PARTITION BY brand, model ORDER BY month) AS prev_month,
       MAX(sales) OVER (PARTITION BY brand, model ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_year_sales
FROM car_sales_by_month
WHERE year = 2022
Image

💡 One worthy note is that anytime that you specify an ORDER BY clause, PostgreSQL will automatically set the default window as ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

Conclusion

The real magic of window functions is actually the frame i.e. subset of records over which the window function works. Here we just learned that a window frame depends on the current row and is defined by over clause.

If you have followed through all my posts, you now have seen most of the common Window functions/patterns. Great Job! Remember, window functions often get discussed during data engineer / scientist job interviews. So if you want to deepen your knowledge, I’d recommend reading PostgreSQL docs. And I hope the details and examples shared in this series will help in acing your next interview. All the best 👍

Enjoy what you’re reading?

Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 🧑🏻‍💻

Unknown's avatar

Author: Varun Dhawan

I’m Varun. I used to be a Software Engineer building data applications for large corporations like McKinsey and Target. Now, I’m a Product Manager at Microsoft, making Azure PostgreSQL the go-to platform for running mission-critical workloads (and no, I’m not obsessing over every little detail… I swear). When I’m not working, you can find me blogging at data-nerd.blog, where I help fellow data enthusiasts master PostgreSQL, sharpen their coding skills, and navigate their careers with confidence. And if there’s one thing you absolutely need to know about me, it’s that…I'm apparently a great cook—just don’t ask why I’m the only one who eats my food.

2 thoughts on “Data Science with PostgreSQL – Using the Window frame_clause”

Leave a comment