Skip to content

SQL GROUP BY Clause

The GROUP BY statement in SQL is used for organizing and summarizing data based on identical values in specified columns. By using the GROUP BY clause, users can apply aggregate functions like SUM, COUNT, AVG, MIN, and MAX to each group, making it easier to perform detailed data analysis.

In this article, we will learn the SQL GROUP BY syntax, explore practical examples with single and multiple columns, and demonstrate advanced use cases with the HAVING clause for conditional grouping. Whether you’re new to SQL or an experienced professional, this article will help you master the GROUP BY clause for efficient data querying.

GROUP BY Clause in SQL

The GROUP BY statement in SQL is used to arrange identical data into groups based on specified columns. If a particular column has the same values in multiple rows, the GROUP BY clause will group these rows together.

Key Points About GROUP BY

  • GROUP BY clause is used with the SELECT statement.

  • In the query, the GROUP BY clause is placed after the WHERE clause.

  • In the query, the GROUP BY clause is placed before the ORDER BY clause if used.

  • In the query, the Group BY clause is placed before the Having clause.

  • Place condition in the having clause.

Syntax

SELECT column1, function_name(column2)
FROM table_name
GROUP BY column1, column2

Key Terms

  • function_name: Name of the function used for example, SUM() , AVG().

  • table_name: Name of the table.

  • condition: Condition used.

Examples of GROUP BY in SQL

Let’s assume that we have two tables Employee and Student Sample Table is as follows after adding two tables we will do some specific operations to learn about GROUP BY.

Employee Table

CREATE TABLE emp (
  emp_no INT PRIMARY KEY,
  name VARCHAR(50),
  sal DECIMAL(10,2),
  age INT
);

Insert some random data into a table and then we will perform some operations in GROUP BY.

INSERT INTO emp (emp_no, name, sal, age) VALUES
(1, 'Aarav', 50000.00, 25),
(2, 'Aditi', 60000.50, 30),
(3, 'Aarav', 75000.75, 35),
(4, 'Anjali', 45000.25, 28),
(5, 'Chetan', 80000.00, 32),
(6, 'Divya', 65000.00, 27),
(7, 'Gaurav', 55000.50, 29),
(8, 'Divya', 72000.75, 31),
(9, 'Gaurav', 48000.25, 26),
(10, 'Divya', 83000.00, 33);
SELECT * from emp;

Output

Employee Table

Student Table

CREATE TABLE student (
  name VARCHAR(50),
  year INT,
  subject VARCHAR(50)
);
INSERT INTO student (name, year, subject) VALUES
('Alice', 1, 'Mathematics'),
('Bob', 2, 'English'),
('Charlie', 3, 'Science'),
('David', 1, 'Mathematics'),
('Emily', 2, 'English'),
('Frank', 3, 'Science');

Output

Student Table

Example 1 : Group By Single Column

Group By single column means, placing all the rows with the same value of only that particular column in one group. Consider the query as shown below:

Query

SELECT name, SUM(sal) FROM emp 
GROUP BY name;

Output

Example 1

Explanation

As you can see in the above output, the rows with duplicate NAMEs are grouped under the same NAME and their corresponding SALARY is the sum of the SALARY of duplicate rows. The SUM() function of SQL is used here to calculate the sum. The NAMES that are added are Aarav, Divya and Gaurav.

Example 2 : Group By Multiple Columns

Group by multiple columns is say, for example, GROUP BY column1, column2. This means placing all the rows with the same values of columns column 1 and column 2 in one group. Consider the below query:

Query

SELECT SUBJECT, YEAR, Count(*)
FROM Student
GROUP BY SUBJECT, YEAR;

Output

Example 2

Explanation

As we can see in the above output the students with both the same SUBJECT and YEAR are placed in the same group. And those whose only SUBJECT is the same but not YEAR belong to different groups. So here we have grouped the table according to two columns or more than one column. The Grouped subject and years are (English,2) , (Mathematics,1) and (Science,3). The above mentioned all groups and years are repeated twice.

HAVING Clause in GROUP BY Clause

We know that the WHERE clause is used to place conditions on columns but what if we want to place conditions on groups? This is where the HAVING clause comes into use. We can use the HAVING clause to place conditions to decide which group will be part of the final result set. Also, we can not use aggregate functions like SUM(), COUNT(), etc. with the WHERE clause. So we have to use the HAVING clause if we want to use any of these functions in the conditions.

Syntax

SELECT column1, function_name(column2)
FROM table_name
WHERE condition
GROUP BY column1, column2
HAVING condition
ORDER BY column1, column2;

Key Terms

  • function_name: Name of the function used for example, SUM() , AVG().

  • table_name: Name of the table.

  • condition: Condition used.

Example

SELECT NAME, SUM(sal) FROM Emp
GROUP BY name
HAVING SUM(sal)>50000; 

Output

Having Clause in Group by Clause

Explanation

In the result, only employees whose total salary (SUM(sal)) exceeds 50,000 are displayed. For example, if Anjali has a total salary less than 50,000, she will be excluded from the output.

Conclusion

The GROUP BY function in SQL organizes identical data into groups, enabling aggregate analysis on each group. It is commonly used with aggregate functions like SUM(), COUNT(), AVG(), etc., to summarize data efficiently. The HAVING clause further refines the results by applying conditions to these grouped records. GROUP BY can operate on single or multiple columns, making it a versatile tool for data retrieval and reporting.