Finding duplicate values in a SQL table
It's easy to find duplicates
with one field:
SELECT name, COUNT(email)
FROM users
GROUP BY email
HAVING COUNT(email) > 1
So if we have a table
ID NAME EMAIL
1 John asd@asd.com
2 Sam asd@asd.com
3 Tom asd@asd.com
4 Bob bob@asd.com
5 Tom asd@asd.com
This query will give us John, Sam, Tom, Tom because they all have the same email
.
However, what I want is to get duplicates with the same email
and name
.
That is, I want to get "Tom", "Tom".
The reason I need this: I made a mistake, and allowed to insert duplicate name
and email
values. Now I need to remove/change the duplicates, so I need to find them first.
SELECT
name, email, COUNT(*)
FROM
users
GROUP BY
name, email
HAVING
COUNT(*) > 1
Simply group on both of the columns.
Note: the ANSI standard is to have all non aggregated columns in the GROUP BY. MySQL allows you to avoid this, but results are unpredictable:
In MySQL you need sql_mode=only_full_group_by
try this:
declare @YourTable table (id int, name varchar(10), email varchar(50))
INSERT @YourTable VALUES (1,'John','John-email')
INSERT @YourTable VALUES (2,'John','John-email')
INSERT @YourTable VALUES (3,'fred','John-email')
INSERT @YourTable VALUES (4,'fred','fred-email')
INSERT @YourTable VALUES (5,'sam','sam-email')
INSERT @YourTable VALUES (6,'sam','sam-email')
SELECT
name,email, COUNT(*) AS CountOf
FROM @YourTable
GROUP BY name,email
HAVING COUNT(*)>1
OUTPUT:
name email CountOf
---------- ----------- -----------
John John-email 2
sam sam-email 2
(2 row(s) affected)
if you want the IDs of the dups use this:
SELECT
y.id,y.name,y.email
FROM @YourTable y
INNER JOIN (SELECT
name,email, COUNT(*) AS CountOf
FROM @YourTable
GROUP BY name,email
HAVING COUNT(*)>1
) dt ON y.name=dt.name AND y.email=dt.email
OUTPUT:
id name email
----------- ---------- ------------
1 John John-email
2 John John-email
5 sam sam-email
6 sam sam-email
(4 row(s) affected)
to delete the duplicates try:
DELETE d
FROM @YourTable d
INNER JOIN (SELECT
y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank
FROM @YourTable y
INNER JOIN (SELECT
name,email, COUNT(*) AS CountOf
FROM @YourTable
GROUP BY name,email
HAVING COUNT(*)>1
) dt ON y.name=dt.name AND y.email=dt.email
) dt2 ON d.id=dt2.id
WHERE dt2.RowRank!=1
SELECT * FROM @YourTable
OUTPUT:
id name email
----------- ---------- --------------
1 John John-email
3 fred John-email
4 fred fred-email
5 sam sam-email
(4 row(s) affected)
尝试这个:
SELECT name, email
FROM users
GROUP BY name, email
HAVING ( COUNT(*) > 1 )
链接地址: http://www.djcxy.com/p/5164.html
上一篇: 一个表,三列MYSQL查询问题
下一篇: 在SQL表中查找重复的值