Window functions in SQL are powerful tools for performing complex calculations across sets of rows that are related to the current row within a query. These functions are used to perform operations that require awareness of the context of the rows in a result set, such as calculating running totals, averages, or rankings.
Basic Concept of SQL Window Functions
Partitioning Data: Window functions allow you to partition your data into groups, similar to a GROUP BY clause, but without collapsing the rows into a single output row per group. This means you can calculate aggregate values (like sums and averages) while still keeping the individual rows of your data.
Ordering and Framing: Within each partition, you can order the rows and define a frame or window (a subset of rows) to perform calculations. This is useful for operations like running totals or moving averages.
Access to Rows: Window functions can access other rows in the frame without causing a group by effect. This allows calculations across multiple rows while still returning a value for each row in the original dataset.
Examples of Window Functions
Running Total: Calculate a cumulative sum within a partition.
SELECT order_id, amount, SUM(amount) OVER (PARTITION BY seller_id ORDER BY order_date) AS running_total FROM orders;
Ranking: Assign a rank within a partition.
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank FROM employees;
Moving Average: Calculate a moving average in a set of rows.
SELECT order_id, amount, AVG(amount) OVER (PARTITION BY seller_id ORDER BY order_date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS moving_avg FROM orders;
Lead and Lag: Access data from the following or preceding row.
SELECT order_id, amount, LAG(amount, 1) OVER (ORDER BY order_id) AS previous_amount, LEAD(amount, 1) OVER (ORDER BY order_id) AS next_amount FROM orders;
Key Points of Window Functions
Non-Aggregate Calculations: Unlike traditional aggregate functions, window functions do not cause rows to become grouped into a single output row — each row in the result set retains its separate identity.
OVER Clause: The OVER clause defines the window, specifying the partitioning and ordering of rows for the calculation.
Frame Specification: You can further refine the window function by specifying a frame (the subset of rows to be considered in the calculation).