I’ll link to Microsoft SQL Server, as this is the one I’m working with, but many databases share some concepts so they should be possible to apply in other SQL technologies too.

Operations order

OperationOrder
SELECT5
FROM / JOIN1
WHERE2
GROUP BY3
HAVING4
ORDER BY6
Due to this for example you can’t use column aliases from select in group by / having because those actions are executed before aliases are defined.

Set operators

I’ve many times used UNION ALL, but had no idea there are other operators which work on sets.

UNION / UNION ALL

UNION ALL works like appending rows from second select to the first one, resulting in single table with all the rows.

SELECT 1 As ID, 'Bob' As Name
UNION ALL
SELECT 2 As ID, 'Claire' As Name
UNION ALL
SELECT 1 As ID, 'Bob' As Name
IDName
1Bob
2Claire
1Bob
UNION without ALL works similar, but it removes duplicates after the operation so there will be only one record for Bob.
SELECT 1 As ID, 'Bob' As Name
UNION
SELECT 2 As ID, 'Claire' As Name
UNION
SELECT 1 As ID, 'Bob' As Name
IDName
1Bob
2Claire

INTERSECT

INTERSECT will pick rows which exist in both datasets - first, and next query. In example below - Deborah will be missing, as it doesn’t intersect.

WITH first_query (id, name) AS (
    SELECT 1 As ID, 'Bob' As Name
    UNION ALL
    SELECT 2 As ID, 'Claire' As Name
), second_query (id, name) AS (
    SELECT 1 As ID, 'Bob' As Name
    UNION ALL
    SELECT 2 As ID, 'Claire' As Name
    UNION ALL
    SELECT 3 As ID, 'Deborah' As Name
)
  
SELECT * FROM first_query
INTERSECT
SELECT * FROM second_query
IDName
1Bob
2Claire

MINUS / EXCEPT

EXCEPT is opposite of INTERSECT instead of picking those existing in both sets we are removing from first set values existing in second set - order does matter here.

WITH first_query (id, name) AS (
    SELECT 1 As ID, 'Bob' As Name
    UNION ALL
    SELECT 2 As ID, 'Claire' As Name
), second_query (id, name) AS (
    SELECT 1 As ID, 'Bob' As Name
    UNION ALL
    SELECT 2 As ID, 'Claire' As Name
    UNION ALL
    SELECT 3 As ID, 'Deborah' As Name
 
)
 
SELECT * FROM second_query
EXCEPT
SELECT * FROM first_query
IDName
3Deborah
SELECT * FROM first_query
EXCEPT
SELECT * FROM second_query

Will return 0 rows.

Grouping

Rollup

This is an extension of group by - additionally to aggregated values for specified level, there will be calculated values that summarises on previous scopes.

WITH example_sales (region, country, salesman, sale_value) AS (
    SELECT 'Europe', 'Poland', 'Katarzyna Nowak', 980 UNION ALL
    SELECT 'Europe', 'Romania', 'István Kovács', 1350 UNION ALL
    SELECT 'Europe', 'Romania', 'Andrei Popescu', 1100 UNION ALL
    SELECT 'North America', 'United States', 'John Smith', 2200 UNION ALL
    SELECT 'North America', 'Canada', 'Emily Johnson', 1800 UNION ALL
    SELECT 'North America', 'Canada', 'Michael Brown', 1750 UNION ALL
    SELECT 'North America', 'United States', 'Jessica Williams', 2050 UNION ALL
    SELECT 'Asia', 'China', 'Li Wei', 2500 UNION ALL
    SELECT 'Asia', 'Japan', 'Yuki Tanaka', 2100 UNION ALL
    SELECT 'Asia', 'China', 'Wei Chen', 2300
)
 
SELECT
    region,
    country,
    SUM(sale_value)
FROM
    example_sales
GROUP BY ROLLUP
    (region, country)

This will aggregate on 3 levels:

  • Region + Country
  • Region
  • Total
regioncountrysaleslevel
AsiaChina4800
AsiaJapan2100
AsiaNULL6900Region
EuropePoland980
EuropeRomania2450
EuropeNULL3430Region
North AmericaCanada3550
North AmericaUnited States4250
North AmericaNULL7800Region
NULLNULL18130Total

Cube

Cube works similarly to Rollup, although it will produce all combinantions of specified aggregation columns. Just by changing ROLLUP to CUBE in query above it will produce those aggregation levels:

  • Region + Country
  • Region
  • Country
  • Total
WITH example_sales (region, country, salesman, sale_value) AS (
    SELECT 'Europe', 'Poland', 'Katarzyna Nowak', 980 UNION ALL
    SELECT 'Europe', 'Romania', 'István Kovács', 1350 UNION ALL
    SELECT 'Europe', 'Romania', 'Andrei Popescu', 1100 UNION ALL
    SELECT 'North America', 'United States', 'John Smith', 2200 UNION ALL
    SELECT 'North America', 'Canada', 'Emily Johnson', 1800 UNION ALL
    SELECT 'North America', 'Canada', 'Michael Brown', 1750 UNION ALL
    SELECT 'North America', 'United States', 'Jessica Williams', 2050 UNION ALL
    SELECT 'Asia', 'China', 'Li Wei', 2500 UNION ALL
    SELECT 'Asia', 'Japan', 'Yuki Tanaka', 2100 UNION ALL
    SELECT 'Asia', 'China', 'Wei Chen', 2300
)
  
SELECT
    region,
    country,
    SUM(sale_value)
FROM
    example_sales
GROUP BY CUBE
    (region, country)
regioncountrysaleslevel
North AmericaCanada3550
NULLCanada3550Country
AsiaChina4800
NULLChina4800Country
AsiaJapan2100
NULLJapan2100Country
EuropePoland980
NULLPoland980Country
EuropeRomania2450
NULLRomania2450Country
North AmericaUnited States4250
NULLUnited States4250Country
NULLNULL18130Total
AsiaNULL6900Region
EuropeNULL3430Region
North AmericaNULL7800Region

Grouping Sets

Grouping sets on the opposite to Cube, but it will produce only aggregations specified by you. With Rollup, and Cube additional aggregations are generated automatically, you don’t have to specify totals aggregation, and region level aggregation - with Grouping Sets only specified levels are produced. To achieve same result as in Rollup you would need query like this:

WITH example_sales (region, country, salesman, sale_value) AS (
    SELECT 'Europe', 'Poland', 'Katarzyna Nowak', 980 UNION ALL
    SELECT 'Europe', 'Romania', 'István Kovács', 1350 UNION ALL
    SELECT 'Europe', 'Romania', 'Andrei Popescu', 1100 UNION ALL
    SELECT 'North America', 'United States', 'John Smith', 2200 UNION ALL
    SELECT 'North America', 'Canada', 'Emily Johnson', 1800 UNION ALL
    SELECT 'North America', 'Canada', 'Michael Brown', 1750 UNION ALL
    SELECT 'North America', 'United States', 'Jessica Williams', 2050 UNION ALL
    SELECT 'Asia', 'China', 'Li Wei', 2500 UNION ALL
    SELECT 'Asia', 'Japan', 'Yuki Tanaka', 2100 UNION ALL
    SELECT 'Asia', 'China', 'Wei Chen', 2300
)
 
SELECT
    region,
    country,
    SUM(sale_value)
FROM
    example_sales
GROUP BY GROUPING SETS (
    (region, country),
    (region),
    ()
)

Grouping & Grouping_Id

These are built in functions which are aiming to improve use cases for Rollup, Cube and Grouping Sets, they return information if specified row is being aggregated and on which level. In my opinion Grouping isn’t most readable, at least in example below.

WITH example_sales (region, country, salesman, sale_value) AS (
    SELECT 'Europe', 'Poland', 'Katarzyna Nowak', 980 UNION ALL
    SELECT 'Europe', 'Romania', 'István Kovács', 1350 UNION ALL
    SELECT 'Europe', 'Romania', 'Andrei Popescu', 1100 UNION ALL
    SELECT 'North America', 'United States', 'John Smith', 2200 UNION ALL
    SELECT 'North America', 'Canada', 'Emily Johnson', 1800 UNION ALL
    SELECT 'North America', 'Canada', 'Michael Brown', 1750 UNION ALL
    SELECT 'North America', 'United States', 'Jessica Williams', 2050 UNION ALL
    SELECT 'Asia', 'China', 'Li Wei', 2500 UNION ALL
    SELECT 'Asia', 'Japan', 'Yuki Tanaka', 2100 UNION ALL
    SELECT 'Asia', 'China', 'Wei Chen', 2300
)
 
SELECT
    region,
    country,
    SUM(sale_value) As sales,
    GROUPING(region) As is_region_group,
    GROUPING(country) As is_country_group,
    GROUPING_ID(region, country) As region_group_id,
    CASE GROUPING_ID(region, country)
        WHEN 1 THEN 'Region'
        WHEN 2 THEN 'Country'
        WHEN 3 THEN 'Total'
        ELSE 'Region & Country'
    END As region_group_name
FROM
    example_sales
GROUP BY GROUPING SETS (
    (region, country),
    (region),
    (country),
    ()
)
regioncountrysalesis_region_groupis_country_groupregion_group_idregion_group_name
North AmericaCanada3550000Region & Country
NULLCanada3550102Country
AsiaChina4800000Region & Country
NULLChina4800102Country
AsiaJapan2100000Region & Country
NULLJapan2100102Country
EuropePoland980000Region & Country
NULLPoland980102Country
EuropeRomania2450000Region & Country
NULLRomania2450102Country
North AmericaUnited States4250000Region & Country
NULLUnited States4250102Country
NULLNULL18130113Total
AsiaNULL6900011Region
EuropeNULL3430011Region
North AmericaNULL7800011Region

Query plans

Indexes

CTE

Common Table Expressions are there mainly to increase readability of queries for example with splitting up large selects into smaller parts. You can declare multiple of CTE’s in a single query separating them with a comma (it’s used in Recursion example below). You can also use INSERT/UPDATE/DELETE inside a CTE.

-- A cool example for CTE is for sure example data - you can prepare something like this, and just query it like a table using the name
WITH company_data (employee_id, name, manager_id, job_title) AS (
    SELECT 1, 'Alice', NULL, 'CEO' UNION ALL
    SELECT 2, 'Bob', 1, 'VP of Operations' UNION ALL
    SELECT 3, 'Carla', 1, 'VP of Marketing' UNION ALL
    SELECT 4, 'David', 2, 'Operations Manager' UNION ALL
    SELECT 5, 'Emily', 2, 'Marketing Manager' UNION ALL
    SELECT 6, 'Frank', 4, 'Operations Analyst'
)
 
SELECT * FROM company_data

Recursion

Another use case for CTE is recursion, which might be helpful to retrieve data in a Tree structure. Example below. This query selects CEO, and unions it with all subordinates of that CEO, then for each subordinate it looks for next ones until there won’t be subordinates.

WITH company_data (employee_id, name, manager_id, job_title) AS (
    SELECT 1, 'Alice', NULL, 'CEO' UNION ALL
    SELECT 2, 'Bob', 1, 'VP of Operations' UNION ALL
    SELECT 3, 'Carla', 1, 'VP of Marketing' UNION ALL
    SELECT 4, 'David', 2, 'Operations Manager' UNION ALL
    SELECT 5, 'Emily', 2, 'Marketing Manager' UNION ALL
    SELECT 6, 'Frank', 4, 'Operations Analyst'
), org_chart AS (
    SELECT
        employee_id,
        name,
        manager_id,
        job_title,
        1 AS level,
        CAST(NULL AS NVARCHAR(50)) AS manager_name
    FROM
        company_data
    WHERE
        manager_id IS NULL
 
    UNION ALL
 
    SELECT
        e.employee_id,
        e.name,
        e.manager_id,
        e.job_title,
        oc.level + 1,
        CAST(oc.name AS NVARCHAR(50)) AS manager_name
    FROM
        company_data e
    JOIN
        org_chart oc ON e.manager_id = oc.employee_id
)
 
SELECT * FROM org_chart
ORDER BY level, name;

Analytical functions

PARTITION OVER

LEAD

LAG

User Defined Functions & Stored Procedures

Apply

Transactions

Triggers