aggregate functions

Using AVG() with DISTINCT in SQL Server

October 3, 2025 by Ian

When working with averages in SQL Server, it’s easy to assume that AVG() just takes all rows into account and calculates a simple mean. And that’s true. By default, AVG() includes every value in the column you point it to. But sometimes, you may want to average only unique values in that column, which is where DISTINCT comes into play.

Let’s explore this with a simple example.

Why Your SQL Server Averages Keep Losing Decimals

September 18, 2025 by Ian

At first glance, calculating an average in SQL Server seems straightforward. Just wrap a column in the AVG() function and you’re done. But there’s a subtle catch when working with integer columns. If you pass an integer to AVG() the result will be an integer, even if the actual average includes a fractional part. If you’re not aware of this when calculating averages, you could potentially draw the wrong conclusion from your query results.

Let’s unpack the behavior and then see how we can fix it.

A Quick Look at SQL Server’s MIN() Function

September 11, 2025 by Ian

The MIN() function in SQL Server returns the smallest value from a set of rows. It’s commonly used to find earliest dates, lowest prices, or in general the minimum of any column. While the function itself is simple, you may encounter it written with options like DISTINCT, ALL, or as a window function with OVER(). Some of these options don’t actually change the result in SQL Server but exist for standards compatibility, so it’s worth understanding what they mean if you ever see them in code.

Let’s take a look at a few simple examples to see how it works.

Using SUM() and AVG() with GROUP BY in SQL Server

September 5, 2025 by Ian

When working with data, we often need to roll up numbers by categories. For example, calculating sales totals by region, or averaging test scores by class. SQL Server’s SUM() and AVG() functions can work perfectly for this scenario when combined with the GROUP BY clause. This combo can provide quick insights without having to do the math yourself. Let’s walk through how this works with an example.

Using MAX() in SQL Server

September 4, 2025 by Ian

The MAX() function is one of SQL Server’s simplest aggregate functions. It returns the largest value from a column. While it’s usually straightforward, there are a few useful ways to apply it depending on whether you’re using it as a plain aggregate or as a window function with OVER().

You might also see MAX() that use a DISTINCT clause. Truth be told, this doesn’t actually change the results. That clause is only for standards compatibility.

In any case, let’s walk through some examples to see how it all works.

Convert a Table to a JSON Document with JSON_GROUP_OBJECT() in DuckDB

July 20, 2025 by Ian

In DuckDB, the json_group_object() function is a convenient way to aggregate data into JSON objects by pairing keys and values across rows within groups.

The function is especially useful when we’re transforming tabular data into a more hierarchical or nested JSON structure for web APIs, reporting, or downstream processing. It helps pivot rows into a single JSON object, making the data more compact and easier to consume in applications that require JSON formats.

Using JSON_TRANSFORM_STRICT() to Perform Strict JSON Transformations in DuckDB

July 11, 2025 by Ian

DuckDB provides us with a json_transform_strict() function that works just like the json_transform() function, except that it throws an error when type casting fails. The json_transform() function on the other hand, returns NULL for the respective field in such cases.

Understanding DuckDB’s ARG_MIN_NULL() Function

June 14, 2025 by Ian

DuckDB has a arg_min_null() function that works in a similar way to the arg_min() function. That is, it finds the row with the minimum value in one column and returns the corresponding value from another column at that row.

But there’s also a difference between these two functions. The main difference is in the way they deal with NULL values. Also, arg_min_null() only accepts two arguments, whereas arg_min() accepts an optional third argument. Additionally, there aren’t any aliases for arg_min_null() at the time of writing (arg_min() has a couple of aliases).

In this article we’ll look at how arg_min_null() works, and we’ll compare it with arg_min() to see how each function handles NULL values.

Convert Column Values to a JSON Array with DuckDB’s JSON_GROUP_ARRAY() Function

June 10, 2025 by Ian

The json_group_array() function in DuckDB is used to aggregate values into a JSON array, making it especially useful when working with structured or semi-structured data such as JSON. This function is part of DuckDB’s JSON extension and works similarly to DuckDB’s string_agg() or the group_concat() function in some other RDBMSs, but instead of returning a delimited string, it returns a well-formed JSON array.

This function is particularly helpful when we need to represent grouped or hierarchical data in a JSON format for export, reporting, or further transformation.

3 Ways to Get the Weighted Average in DuckDB

June 6, 2025 by Ian

Weighted averages are common calculations in data analysis, allowing us to assign different levels of importance to individual values in our dataset. Unlike simple averages, where each value has equal impact, weighted averages let us incorporate the relative significance of each observation. This is particularly valuable for scenarios like calculating GPA (where courses have different credit weights), investment portfolio returns (where assets have varying allocations), or quality ratings (where reviewers have different expertise levels).

In this article, we’ll explore three ways of calculating weighted averages in DuckDB.