The SQL GROUP BY clause is a powerful feature that allows you to group rows based on one or more columns. It's commonly used with aggregate functions to perform calculations on grouped data.
GROUP BY serves two primary purposes:
This clause is particularly useful when you need to analyze data across categories or perform summary calculations.
SELECT column1, column2, aggregate_function(column3)
FROM table_name
GROUP BY column1, column2;
In this syntax, columns after SELECT that are not inside an aggregate function must appear in the GROUP BY clause.
SELECT department, COUNT(*) as employee_count
FROM employees
GROUP BY department;
This query counts the number of employees in each department.
SELECT product_category, AVG(price) as average_price
FROM products
GROUP BY product_category;
Here, we're calculating the average price for each product category.
SELECT department, job_title, AVG(salary) as avg_salary
FROM employees
GROUP BY department, job_title;
This query groups employees by both department and job title, then calculates the average salary for each group.
GROUP BY is often used in conjunction with other SQL clauses to create more complex queries:
SELECT department, COUNT(*) as employee_count
FROM employees
WHERE hire_date > '2020-01-01'
GROUP BY department
HAVING COUNT(*) > 5
ORDER BY employee_count DESC;
This query combines WHERE, GROUP BY, HAVING, and ORDER BY clauses to find departments with more than 5 employees hired after January 1, 2020, sorted by the number of employees.
By mastering the GROUP BY clause, you'll be able to efficiently analyze and summarize data in your SQL queries, unlocking powerful insights from your databases.