How can I remove duplicate records in my tables?

By: Denny Cherry

Published On: May 4, 2009

All too often we end up with duplicate rows in a table. The best way to keep duplicate rows out of the database is to not let them in, but assume they are already there. This sample code shows how to delete those duplicate rows quickly and easily in a single statement. No temp tables are required (I use a temp table here only to hold the example data). The code works on SQL Server 2005 and up because it uses features introduced in SQL Server 2005: common table expressions and ROW_NUMBER(). SQL Server 2000 would require a totally different technique.

CREATE TABLE #DuplicateRows /*Create a new table*/
(Col1 INT,
Col2 INT,
Col3 INT)

INSERT INTO #DuplicateRows /*Load up duplicate rows*/
SELECT 1,1,1
UNION ALL
SELECT 1,1,1
UNION ALL
SELECT 1,1,1
UNION ALL
SELECT 2,2,2
UNION ALL
SELECT 2,2,2
UNION ALL
SELECT 2,2,2

SELECT *
FROM #DuplicateRows; /*Check that the data is actually hosed*/

WITH Cleaning AS (
    /*Number every row; identical rows get consecutive numbers*/
    SELECT ROW_NUMBER() OVER (ORDER BY Col1, Col2, Col3) AS RowNum,
        Col1,
        Col2,
        Col3
    FROM #DuplicateRows)

DELETE FROM Cleaning /*Delete the rows which are duplicates*/
WHERE RowNum NOT IN (SELECT MIN(RowNum) /*Keep the lowest-numbered row of each group*/
    FROM Cleaning
    GROUP BY Col1, Col2, Col3);

SELECT * /*Check the table to see that it is clean*/
FROM #DuplicateRows

DROP TABLE #DuplicateRows /*Clean up the table*/
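
A common variation on the same idea, sketched here as an alternative rather than as the method used above, is to restart the numbering for each set of identical values with PARTITION BY and then delete every row numbered above 1. It also needs SQL Server 2005 or later. The table name #DuplicateRowsDemo, the alias RowNum, and the index name UX_DuplicateRowsDemo are made up for this illustration, and the unique index at the end is just one possible way to act on the "don't let them in" advice.

```sql
/*Hypothetical demo table with the same shape as the example above*/
CREATE TABLE #DuplicateRowsDemo
(Col1 INT,
Col2 INT,
Col3 INT);

INSERT INTO #DuplicateRowsDemo /*Load up duplicate rows*/
SELECT 1,1,1
UNION ALL
SELECT 1,1,1
UNION ALL
SELECT 2,2,2
UNION ALL
SELECT 2,2,2;

/*Number the rows within each group of identical values*/
WITH Numbered AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY Col1, Col2, Col3
                              ORDER BY Col1) AS RowNum
    FROM #DuplicateRowsDemo)
DELETE FROM Numbered /*Row 1 of each group survives*/
WHERE RowNum > 1;

/*Once the table is clean, a unique index rejects new duplicates*/
CREATE UNIQUE INDEX UX_DuplicateRowsDemo
ON #DuplicateRowsDemo (Col1, Col2, Col3);

SELECT * /*Two rows remain: 1,1,1 and 2,2,2*/
FROM #DuplicateRowsDemo;

DROP TABLE #DuplicateRowsDemo; /*Clean up the table*/
```

Because the numbering restarts inside each partition, the DELETE needs no subquery: anything with a RowNum above 1 is by definition an extra copy.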

Hopefully you find this code useful.

Denny

One Response

Sqllion says:

May 28, 2009 at 7:03 pm

Thank you so much for the valuable information on removing redundancy from a table. Please follow the link below to get more information on Deleting duplicate records from a table efficiently.
http://www.sqllion.com/2009/05/delete-duplicate-records/

