How to remove duplicate rows in MySQL?
Problem
Given a table in MySQL, how do you remove duplicate rows?
Input
| first_name | last_name | |
|---|---|---|
| John | Doe | john.doe@email.com |
| Jane | Smith | jane.smith@email.com |
| John | Doe | john.doe@email.com |
| Bob | Johnson | bob.johnson@email.com |
| Alice | Williams | alice.williams@email.com |
| Jane | Smith | jane.smith@email.com |
Try Hands-On: Fiddle
Create Input Table: Gist
Desired Output
| first_name | last_name | |
|---|---|---|
| John | Doe | john.doe@email.com |
| Jane | Smith | jane.smith@email.com |
| Bob | Johnson | bob.johnson@email.com |
| Alice | Williams | alice.williams@email.com |
Solution 1:
Using SELECT DISTINCT
CREATE TABLE employees_no_duplicates AS
SELECT DISTINCT *
FROM employees;
Explanation:
To remove duplicate rows from the employees table, you can use the DISTINCT keyword in a SELECT statement to identify unique rows and then insert the results into a new table.
Alternately you can use the DELETE statement to remove duplicates from the original table. Let’s see how to do that.
Solution 2:
Using a SELF JOIN
This approach works if there is an id field which does not have the values duplicated.
DELETE e1
FROM employees e1
JOIN employees e2
ON e1.id < e2.id
AND e1.first_name = e2.first_name
AND e1.last_name = e2.last_name
AND e1.email = e2.email;
Output:
| id | first_name | last_name | |
|---|---|---|---|
| 3 | John | Doe | john.doe@email.com |
| 4 | Bob | Johnson | bob.johnson@email.com |
| 5 | Alice | Williams | alice.williams@email.com |
| 6 | Jane | Smith | jane.smith@email.com |
Try Hands-On: Fiddle
Explanation:
This approach uses a DELETE statement with a self-join to remove duplicate rows from the original table.
It directly modifies the existing table, so be cautious when using this approach, as it permanently removes data.
Recommended Courses
Recommended Tutorial
More SQL Questions
Build a strong Python foundation with hands-on exercises designed for aspiring Data Scientists and AI/ML Engineers.
Start Free Course →