Common Table Expressions

Posted on Jul 9, 2022

Early in my career, I ended up working on Microsoft SQL Server projects as a software developer and a business analyst. Over several years, I became an expert1 in data modeling, performance tuning, ETL, numerous database connection options, numerous database programming libraries. Even after I left my software development job behind, I wrote queries and developed metadata layers for analytical reporting. Over that time, I wrote a ton of SQL queries, and I had a very opinionated way of structuring them, which was essentially clean ANSI SQL.

When I moved to audit, I had no use for SQL for years. Consequently, I did not keep up with Microsoft SQL Server for years. When my job became more technically oriented again, I was surprised to learn that SQL—or at least Microsoft SQL Server’s support T-SQL language—had evolved. One of my favorite features, which premiered in SQL Server 2005, but I was late to learn about, is common table expressions, or CTEs. Learning CTEs changed the way I write queries entirely.

Conceptually, a CTE is a temporary table based on a SQL query. Practically, CTEs let you break up a complex query into several simpler ones. I like to consider each CTE to be a step in a multi-step process that builds the dataset I need.

CTEs are very useful when you want to create a calculation and then group on the calculated value. Without CTEs, you have to copy/paste the entire calculation into the GROUP BY clause, which is error-prone and confusing to read. With CTEs, you can filter the data in one step, then make your calculations in another step, then group by the calculated values in the next step, then sort and select the top 10 calculated values in the final step.

From a performance standpoint, SQL Server optimizes the query you type in. Behind the scenes, it doesn’t actually create a temporary table for every CTE you create. That means that there is usually no, or negligible, impact to using CTEs in your queries.

Now, all but the simplest queries I wrote are a collection of common table expressions. Once you understand what they are, they make complex SQL queries a lot easier to reason through when writing them, and a lot easier to understand when reading them. In my field, audit, understanding the query and being able to reason about its correctness is often the most important consideration.

  1. A local expert, at least. ↩︎