SQL GROUP BY Clause: Master Aggregation with HAVING, GROUPING SETS, ROLLUP & Comprehensive Examples

1. Using GROUP BY

Figure: GROUP BY clause diagram showing how rows are grouped and aggregated with functions like COUNT, SUM, and AVG.

Q: What is the GROUP BY clause in SQL?

The GROUP BY clause groups rows with identical values in specified columns into summary rows, typically used with aggregate functions (COUNT, SUM, AVG, etc.) to compute statistics for each group.

Syntax:

SELECT column1, aggregate_function(column2) FROM table_name GROUP BY column1;

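Q: Why is GROUP BY important?

It turns row-level data into per-group summaries, such as headcount or total salary per department, in a single query. Without it, an aggregate function collapses the whole table into one row.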

Q: Can you give an example of GROUP BY with aggregate functions?

-- Create sample table
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Department VARCHAR(50),
    Salary DECIMAL(10, 2),
    Email VARCHAR(100)
);

-- Insert sample data
INSERT INTO Employees (EmployeeID, FirstName, LastName, Department, Salary, Email)
VALUES
    (1, 'John', 'Doe', 'IT', 60000.00, '[email protected]'),
    (2, 'Jane', 'sahil', 'HR', 55000.00, NULL),
    (3, 'kristal', 'Johnson', 'IT', 65000.00, '[email protected]'),
    (4, 'Ram', 'Williams', 'HR', 55000.00, NULL),
    (5, 'hari', 'Brown', 'Marketing', NULL, '[email protected]'),
    (6, 'Sashi', 'Sashi', 'IT', 62000.00, '[email protected]');

-- GROUP BY: Summarize employees by department
SELECT Department, COUNT(*) AS TotalEmployees, SUM(Salary) AS TotalSalary, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY Department;

Output:

Department | TotalEmployees | TotalSalary | AvgSalary
-----------|---------------|-------------|-----------
HR | 2 | 110000.00 | 55000.00
IT | 3 | 187000.00 | 62333.33
Marketing | 1 | NULL | NULL

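Note: SUM and AVG ignore NULLs, while COUNT(*) counts every row. Marketing's only employee (hari) has a NULL salary, so there is nothing to aggregate and both SUM and AVG return NULL for that group.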
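
To make the NULL handling concrete, the following sketch compares COUNT(*) with COUNT(Salary) on the sample Employees table. The column aliases are illustrative and not part of the original example.

```sql
-- COUNT(*) counts rows; COUNT(Salary) counts only non-NULL salaries.
-- COALESCE replaces a NULL salary with 0 before summing.
SELECT Department,
       COUNT(*)                 AS AllRows,
       COUNT(Salary)            AS RowsWithSalary,
       SUM(COALESCE(Salary, 0)) AS TotalSalaryOrZero
FROM Employees
GROUP BY Department
ORDER BY Department;
```

For Marketing this should return AllRows = 1 and RowsWithSalary = 0, and the total becomes 0 rather than NULL because the NULL is replaced before aggregation.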

2. Filtering Groups with HAVING

Figure: HAVING vs. WHERE comparison diagram showing execution order and filtering stages.

Q: What is the HAVING clause in SQL?

The HAVING clause filters groups created by GROUP BY based on conditions applied to aggregate results. It is similar to WHERE but applies after grouping, whereas WHERE filters individual rows before grouping.

Syntax:

SELECT column1, aggregate_function(column2) FROM table_name GROUP BY column1 HAVING condition;

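Q: How does HAVING differ from WHERE?

WHERE filters individual rows before they are grouped and cannot reference aggregate functions. HAVING filters whole groups after aggregation, so its conditions can use COUNT, SUM, AVG, and similar functions.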
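
A minimal sketch of the difference, using the Employees table from section 1. The threshold 56000 is an arbitrary value chosen for illustration.

```sql
-- Invalid: WHERE runs before grouping, so it cannot see aggregate results.
-- SELECT Department FROM Employees WHERE AVG(Salary) > 56000 GROUP BY Department;

-- Valid: the row-level condition goes in WHERE, the aggregate condition in HAVING.
SELECT Department, AVG(Salary) AS AvgSalary
FROM Employees
WHERE Department <> 'Marketing'   -- filters rows before grouping
GROUP BY Department
HAVING AVG(Salary) > 56000;       -- filters groups after aggregation
```

With the sample data only IT should remain, since HR averages 55000.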

Q: Can you give an example of HAVING?

-- Filter departments with more than 1 employee and total salary > 100000
SELECT Department, COUNT(*) AS TotalEmployees, SUM(Salary) AS TotalSalary
FROM Employees
WHERE Salary IS NOT NULL
GROUP BY Department
HAVING COUNT(*) > 1 AND SUM(Salary) > 100000;

Output:

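Department | TotalEmployees | TotalSalary
-----------|---------------|-------------
HR | 2 | 110000.00
IT | 3 | 187000.00

Description:

WHERE first removes the row with a NULL salary (hari in Marketing), so Marketing never forms a group. GROUP BY then builds the HR and IT groups, and HAVING keeps both because each has more than one employee and a total salary above 100000.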

3. Grouping Sets and Rollups

Figure: GROUPING SETS, ROLLUP, and CUBE visualization with subtotals.

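Q: What are GROUPING SETS and ROLLUP in SQL?

Both are extensions of GROUP BY that compute several levels of aggregation in one query. GROUPING SETS lists exactly which column combinations to group by. ROLLUP is shorthand for a hierarchy of grouping sets, producing a subtotal for each prefix of the column list plus a grand total.

Q: How do GROUPING SETS and ROLLUP work?

Each grouping set is aggregated as if it were a separate GROUP BY, and the results are combined into one result set. Columns that are not part of a given grouping set are returned as NULL in its rows. ROLLUP (a, b) is equivalent to GROUPING SETS ((a, b), (a), ()). Support varies by database: PostgreSQL, SQL Server, and Oracle support both forms, while MySQL offers only WITH ROLLUP.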
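
One way to picture a grouping-sets query is as a UNION ALL of ordinary GROUP BY queries, one per grouping set. The sketch below is a conceptual equivalent of GROUPING SETS ((Department, FirstName), (Department), ()) on the Employees table, not a description of how a database actually executes it. The CAST(NULL AS VARCHAR(50)) placeholders are an assumption made to keep the column types aligned across branches.

```sql
-- Grouping set (Department, FirstName): detail rows
SELECT Department, FirstName, COUNT(*) AS TotalEmployees, SUM(Salary) AS TotalSalary
FROM Employees
WHERE Salary IS NOT NULL
GROUP BY Department, FirstName

UNION ALL

-- Grouping set (Department): per-department subtotals, FirstName filled with NULL
SELECT Department, CAST(NULL AS VARCHAR(50)), COUNT(*), SUM(Salary)
FROM Employees
WHERE Salary IS NOT NULL
GROUP BY Department

UNION ALL

-- Grouping set (): grand total, both grouping columns filled with NULL
SELECT CAST(NULL AS VARCHAR(50)), CAST(NULL AS VARCHAR(50)), COUNT(*), SUM(Salary)
FROM Employees
WHERE Salary IS NOT NULL;
```

The single GROUPING SETS form is usually preferable because the table is named once and the database can plan the aggregation as a whole.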

Q: Can you give an example of GROUPING SETS and ROLLUP?

-- GROUPING SETS: Summarize by Department, Department+FirstName, and grand total
SELECT Department, FirstName, COUNT(*) AS TotalEmployees, SUM(Salary) AS TotalSalary
FROM Employees
WHERE Salary IS NOT NULL
GROUP BY GROUPING SETS ((Department, FirstName), (Department), ())
ORDER BY Department NULLS LAST, FirstName NULLS LAST; 
-- ROLLUP: Summarize by Department, FirstName hierarchy
SELECT Department, FirstName, COUNT(*) AS TotalEmployees, SUM(Salary) AS TotalSalary
FROM Employees
WHERE Salary IS NOT NULL
GROUP BY ROLLUP (Department, FirstName)
ORDER BY Department NULLS LAST, FirstName NULLS LAST;

Output (GROUPING SETS):

Department | FirstName | TotalEmployees | TotalSalary
-----------|-----------|---------------|-------------
HR | Jane | 1 | 55000.00
HR | Ram | 1 | 55000.00
HR | NULL | 2 | 110000.00
IT | John | 1 | 60000.00
IT | kristal | 1 | 65000.00
IT | Sashi | 1 | 62000.00
IT | NULL | 3 | 187000.00
NULL | NULL | 5 | 297000.00

Output (ROLLUP):

Department | FirstName | TotalEmployees | TotalSalary
-----------|-----------|---------------|-------------
HR | Jane | 1 | 55000.00
HR | Ram | 1 | 55000.00
HR | NULL | 2 | 110000.00
IT | John | 1 | 60000.00
IT | kristal | 1 | 65000.00
IT | Sashi | 1 | 62000.00
IT | NULL | 3 | 187000.00
NULL | NULL | 5 | 297000.00

Note: Both queries use the same ORDER BY, so they return rows in the same order. Within IT, whether kristal sorts before or after Sashi depends on the database collation.
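
In the output above, a NULL in a grouping column can mean either "subtotal row" or "the data itself is NULL". A hedged sketch of how the standard GROUPING() function can separate the two cases, assuming a database that supports it (PostgreSQL, SQL Server, and Oracle do); the alias names are illustrative:

```sql
-- GROUPING(col) returns 1 when col was aggregated away in this row
-- (a subtotal or grand-total row) and 0 when col is a real grouping value.
SELECT Department,
       FirstName,
       GROUPING(Department) AS IsDepartmentRolledUp,
       GROUPING(FirstName)  AS IsFirstNameRolledUp,
       COUNT(*)             AS TotalEmployees
FROM Employees
WHERE Salary IS NOT NULL
GROUP BY ROLLUP (Department, FirstName);
```

Detail rows should show 0 in both flag columns, department subtotals 0 and 1, and the grand total 1 and 1.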

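Description:

Both queries return the same eight rows, because ROLLUP (Department, FirstName) expands to the grouping sets written out in the first query. Rows where only FirstName is NULL are per-department subtotals, and the row with NULL in both columns is the grand total. Marketing is absent because the WHERE clause removes its only employee.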

4. Comprehensive Example Combining All Concepts

Q: Can you provide a comprehensive example using GROUP BY, HAVING, GROUPING SETS, ROLLUP, and prior concepts (WHERE, aggregate functions, scalar functions)?

-- Comprehensive query combining GROUP BY, HAVING, GROUPING SETS, and prior concepts
-- Written for PostgreSQL. MySQL and SQLite lack GROUPING SETS; SQL Server needs
-- TOP 5, LEN(), and GETDATE() and has no NULLS LAST.
SELECT Department,
       UPPER(FirstName) AS UpperFirstName,
       COUNT(*) AS TotalEmployees,
       SUM(Salary) AS TotalSalary,
       AVG(COALESCE(Salary, 0)) AS AvgSalaryWithZeros,
       LENGTH(Email) AS EmailLength,
       NOW() AS QueryTime
FROM Employees
WHERE Department IN ('IT', 'HR') AND FirstName LIKE 'J%'
GROUP BY GROUPING SETS ((Department, FirstName, Email), (Department), ())
HAVING COUNT(*) >= 1 -- always true for a non-empty group; shown for syntax only
ORDER BY Department NULLS LAST, UpperFirstName NULLS LAST
LIMIT 5;

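Description:

WHERE keeps only IT and HR employees whose first name starts with J (John and Jane). GROUPING SETS then returns one detail row per employee, one subtotal row per department, and a grand total row. UPPER and LENGTH are applied to grouping columns, so they return NULL on subtotal and total rows. AVG(COALESCE(Salary, 0)) counts a NULL salary as zero instead of skipping it.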

5. Common Mistakes and Best Practices

Q: What are common mistakes when using GROUP BY, HAVING, and GROUPING SETS/ROLLUP?

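GROUP BY: Selecting a column that is neither listed in GROUP BY nor wrapped in an aggregate function. Forgetting that SUM and AVG skip NULLs, so a group containing only NULL values returns NULL.

HAVING: Using an aggregate function in WHERE, which is an error. Putting row-level conditions in HAVING instead of WHERE, which can make the database group rows it will later discard.

GROUPING SETS/ROLLUP: Mistaking the NULL that marks a subtotal row for a NULL in the data. Assuming every database supports the syntax.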

Q: What are best practices for grouping and aggregation in SQL?

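GROUP BY: List every non-aggregated SELECT column in GROUP BY. Handle NULLs explicitly with COALESCE or COUNT(column) when they matter.

HAVING: Filter rows in WHERE first, and reserve HAVING for conditions on aggregates.

GROUPING SETS/ROLLUP: Use the GROUPING() function to tell subtotal rows from real NULLs. Add an ORDER BY so subtotals appear next to their detail rows.

General: Add an ORDER BY whenever row order matters, since GROUP BY alone does not guarantee it. Check which grouping features your database supports before relying on them.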