When working with databases in real-life scenarios, you often encounter duplicate records. These duplicates can lead to inaccurate reporting, double counting, or performance issues if they are not removed. In this post, we will explain how to identify duplicate records in SQL and how to remove them while keeping only one copy of each record.
Follow these steps to find and remove duplicate records while keeping just one copy.
Step-1: Create a Sample Table
We will start by creating a table named Employees.
CREATE TABLE Employees (
    EmployeeID   INT,
    EmployeeName VARCHAR(50),
    Department   VARCHAR(50)
);
Step-2: Insert Sample Records with Duplicates
We will insert some employee data, including duplicate records.
INSERT INTO Employees (EmployeeID, EmployeeName, Department)
VALUES
    (1, 'Sandeep', 'Human Resources'),
    (2, 'Ravi', 'Finance'),
    (3, 'Ankit', 'Information Technology'),
    (3, 'Ankit', 'Information Technology'), -- duplicate
    (4, 'Neha', 'Human Resources'),
    (2, 'Ravi', 'Finance');                 -- duplicate

Step-3: Identify Duplicate Records
To identify duplicate records, we can group by every column that defines a duplicate and use the HAVING clause to keep only the groups that contain more than one row:
SELECT EmployeeID, EmployeeName, Department, COUNT(*) AS DuplicateCount
FROM Employees
GROUP BY EmployeeID, EmployeeName, Department
HAVING COUNT(*) > 1;

When you execute the above query against the sample data, it returns two rows: EmployeeID 2 (Ravi) and EmployeeID 3 (Ankit), each with a DuplicateCount of 2.
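The grouped query returns one row per duplicate group. If you would rather see every individual copy, a windowed count is one option. The following is a minimal sketch, assuming a database that supports window functions (for example SQL Server, PostgreSQL, or MySQL 8 and later); the names CopyCount and Counted are illustrative only.

```sql
-- List every row that belongs to a duplicate group.
-- CopyCount and Counted are illustrative names, not part of the original table.
SELECT EmployeeID, EmployeeName, Department, CopyCount
FROM (
    SELECT EmployeeID,
           EmployeeName,
           Department,
           -- Number of rows sharing the same values in all three columns
           COUNT(*) OVER (
               PARTITION BY EmployeeID, EmployeeName, Department
           ) AS CopyCount
    FROM Employees
) AS Counted
WHERE CopyCount > 1
ORDER BY EmployeeID;
```

With the sample data, this lists both copies of the Ravi record and both copies of the Ankit record.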
Step-4: Remove Duplicate Records and Keep One Record
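To remove duplicates while keeping only one copy, we can use a Common Table Expression (CTE) with the ROW_NUMBER() function. Deleting through a CTE in this way is supported by SQL Server; other databases such as PostgreSQL and MySQL do not accept this form of DELETE: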
WITH CTE AS (
    SELECT *,
           -- Number the rows within each group of identical records
           ROW_NUMBER() OVER (
               PARTITION BY EmployeeID, EmployeeName, Department
               ORDER BY EmployeeID
           ) AS RowNum
    FROM Employees
)
DELETE FROM CTE
WHERE RowNum > 1;

This method numbers the rows within each group of identical records, starting at 1, and then deletes every row numbered above 1. Exactly one record from each set of duplicates remains, and rows that were never duplicated are left untouched.
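On databases that do not allow deleting through a CTE, one alternative is to rebuild the table contents from the distinct rows. This is a different technique from the ROW_NUMBER() approach, shown here as a rough sketch: it assumes the duplicates are identical in every column (as in the sample data), that the database supports CREATE TABLE ... AS SELECT (for example PostgreSQL or MySQL), and Employees_Dedup is a hypothetical staging table name.

```sql
-- Employees_Dedup is a hypothetical staging table used only for this sketch.
-- 1. Copy one instance of each distinct row into the staging table.
CREATE TABLE Employees_Dedup AS
SELECT DISTINCT EmployeeID, EmployeeName, Department
FROM Employees;

-- 2. Empty the original table.
DELETE FROM Employees;

-- 3. Reload the de-duplicated rows.
INSERT INTO Employees (EmployeeID, EmployeeName, Department)
SELECT EmployeeID, EmployeeName, Department
FROM Employees_Dedup;

-- 4. Remove the staging table.
DROP TABLE Employees_Dedup;
```

Because this empties and reloads the table, it is best suited to small tables or maintenance windows, and it should be run inside a transaction where the database supports that.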

Summary Points:
- Use GROUP BY with the HAVING clause to identify duplicate records.
- Use the ROW_NUMBER() function with a Common Table Expression to remove duplicates while keeping one record.
- Always test queries on a copy of the table or create a backup before deleting records in a live database.
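
One way to follow the last point is to run the delete inside a transaction, check the result, and only then decide whether to keep it. The following is a minimal sketch in SQL Server syntax, assuming the Employees table from this post.

```sql
-- Trial run: delete duplicates inside a transaction so the change can be undone.
BEGIN TRANSACTION;

WITH CTE AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY EmployeeID, EmployeeName, Department
               ORDER BY EmployeeID
           ) AS RowNum
    FROM Employees
)
DELETE FROM CTE
WHERE RowNum > 1;

-- Verification: this should return no rows if all duplicates are gone.
SELECT EmployeeID, EmployeeName, Department, COUNT(*) AS DuplicateCount
FROM Employees
GROUP BY EmployeeID, EmployeeName, Department
HAVING COUNT(*) > 1;

-- Undo the trial run. Replace with COMMIT TRANSACTION once the result looks right.
ROLLBACK TRANSACTION;
```

A rollback restores the deleted rows, so the same script can be rerun as often as needed before committing.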