Using AVG() with DISTINCT in SQL Server

When working with averages in SQL Server, it’s easy to assume that AVG() just takes all rows into account and calculates a simple mean. And that’s true. By default, AVG() includes every value in the column you point it to. But sometimes, you may want to average only unique values in that column, which is where DISTINCT comes into play.

Let’s explore this with a simple example.

Read more

Why Your SQL Server Averages Keep Losing Decimals

At first glance, calculating an average in SQL Server seems straightforward. Just wrap a column in the AVG() function and you’re done. But there’s a subtle catch when working with integer columns. If you pass an integer to AVG() the result will be an integer, even if the actual average includes a fractional part. If you’re not aware of this when calculating averages, you could potentially draw the wrong conclusion from your query results.

Let’s unpack the behavior and then see how we can fix it.

Read more

A Quick Look at SQL Server’s MIN() Function

The MIN() function in SQL Server returns the smallest value from a set of rows. It’s commonly used to find earliest dates, lowest prices, or in general the minimum of any column. While the function itself is simple, you may encounter it written with options like DISTINCT, ALL, or as a window function with OVER(). Some of these options don’t actually change the result in SQL Server but exist for standards compatibility, so it’s worth understanding what they mean if you ever see them in code.

Let’s take a look at a few simple examples to see how it works.

Read more

Using SUM() and AVG() with GROUP BY in SQL Server

When working with data, we often need to roll up numbers by categories. For example, calculating sales totals by region, or averaging test scores by class. SQL Server’s SUM() and AVG() functions can work perfectly for this scenario when combined with the GROUP BY clause. This combo can provide quick insights without having to do the math yourself. Let’s walk through how this works with an example.

Read more

Using MAX() in SQL Server

The MAX() function is one of SQL Server’s simplest aggregate functions. It returns the largest value from a column. While it’s usually straightforward, there are a few useful ways to apply it depending on whether you’re using it as a plain aggregate or as a window function with OVER().

You might also see MAX() that use a DISTINCT clause. Truth be told, this doesn’t actually change the results. That clause is only for standards compatibility.

In any case, let’s walk through some examples to see how it all works.

Read more

Convert a Table to a JSON Document with JSON_GROUP_OBJECT() in DuckDB

In DuckDB, the json_group_object() function is a convenient way to aggregate data into JSON objects by pairing keys and values across rows within groups.

The function is especially useful when we’re transforming tabular data into a more hierarchical or nested JSON structure for web APIs, reporting, or downstream processing. It helps pivot rows into a single JSON object, making the data more compact and easier to consume in applications that require JSON formats.

Read more

Understanding DuckDB’s ARG_MIN_NULL() Function

DuckDB has a arg_min_null() function that works in a similar way to the arg_min() function. That is, it finds the row with the minimum value in one column and returns the corresponding value from another column at that row.

But there’s also a difference between these two functions. The main difference is in the way they deal with NULL values. Also, arg_min_null() only accepts two arguments, whereas arg_min() accepts an optional third argument. Additionally, there aren’t any aliases for arg_min_null() at the time of writing (arg_min() has a couple of aliases).

In this article we’ll look at how arg_min_null() works, and we’ll compare it with arg_min() to see how each function handles NULL values.

Read more

Convert Column Values to a JSON Array with DuckDB’s JSON_GROUP_ARRAY() Function

The json_group_array() function in DuckDB is used to aggregate values into a JSON array, making it especially useful when working with structured or semi-structured data such as JSON. This function is part of DuckDB’s JSON extension and works similarly to DuckDB’s string_agg() or the group_concat() function in some other RDBMSs, but instead of returning a delimited string, it returns a well-formed JSON array.

This function is particularly helpful when we need to represent grouped or hierarchical data in a JSON format for export, reporting, or further transformation.

Read more

3 Ways to Get the Weighted Average in DuckDB

Weighted averages are common calculations in data analysis, allowing us to assign different levels of importance to individual values in our dataset. Unlike simple averages, where each value has equal impact, weighted averages let us incorporate the relative significance of each observation. This is particularly valuable for scenarios like calculating GPA (where courses have different credit weights), investment portfolio returns (where assets have varying allocations), or quality ratings (where reviewers have different expertise levels).

In this article, we’ll explore three ways of calculating weighted averages in DuckDB.

Read more