Processing SQL Query

The way we write SQL queries is more on the “declarative way”. For example, we have this query:

SELECT empid, YEAR(orderdate) AS orderyear, COUNT(*) AS numorder
FROM Sales.Orders
WHERE custid = 71
GROUP BY empid, YEAR(orderdate)
HAVING COUNT(*) > 1
ORDER BY empid, orderyear;

It was written, or the intended way to write SQL, is like requesting something in a more English-like/natural language manner. But actually, it was process in a “logical” or “algorithmic way”. If we can rearrange the above query into logical sequence:

FROM Sales.Orders -- get rows from the table
WHERE custid = 71 -- filter the rows
GROUP BY empid, YEAR(orderdate) -- group the rows
HAVING COUNT(*) > 1 -- filter the groups
-- get the specified attributes from each group
SELECT empid, YEAR(orderdate) AS orderyear, COUNT(*) AS numorder
-- finally sort the rows
ORDER BY empid, orderyear;

With that said, if you try to something like this:

SELECT orderid, YEAR(orderdate) AS orderyear
FROM Sales.Orders
WHERE orderyear > 2021;

It will fail because WHERE clause is being processed BEFORE SELECT clause; that’s why it doesn’t know what orderyear is yet.

Same thing with HAVING accessing columns that will be processed only during SELECT clause:

SELECT
	empid,
	YEAR(orderdate) AS orderyear,
	COUNT(*) AS numorders
FROM Sales.Orders
WHERE custid = 71
GROUP BY empid, YEAR(orderdate)
HAVING numorders > 1  -- error!

Including other directives, we have this logical order of how SQL processes queries:

FROM
WHERE
GROUP BY
HAVING
SELECT
1. Expressions
2. DISTINCT
ORDER BY
1. OFFSET and FETCH

References

Ben-Gan, I. (2023) T-SQL Fundamentals (4th Ed.). Pearson Education, Inc.

NoteDat

Explorer

Processing SQL Query

References

Graph View