MySQL - Query a column for duplicates and return both the original and duplicate rows
I have a table that I use to store some systematically chosen "serial numbers" for each product that is bought...
The problem is, a CSV was uploaded that I believe contained some duplicate "serial numbers", which means that when the application tries to modify a row, it may not be modifying the correct one.
I need to query the database and get every row whose serial_number value appears more than once. The result should look something like this:
ID, serial_number, meta1, meta2, meta3
3, 123456, 0, 2, 4
55, 123456, 0, 0, 0
6, 345678, 0, 1, 2
99, 345678, 0, 1, 2
So as you can see, I need to see both the original row and the duplicate row, with all of their columns, so that I can compare them and determine which data is now inconsistent.
Some versions of MySQL implement IN with a subquery very inefficiently. A safe alternative is a join:
SELECT t.*
FROM t
JOIN (SELECT serial_number, COUNT(*) AS cnt
      FROM t
      GROUP BY serial_number
     ) tsum
  ON tsum.serial_number = t.serial_number AND tsum.cnt > 1
ORDER BY t.serial_number;
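To check the join approach against the sample rows from the question, here is a minimal sketch. It makes a few assumptions that are not in the original post: it runs on an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, the table is called t with the columns shown in the question, an extra unique row (id 7) is added to show that non-duplicates are filtered out, and t.id is added to the ORDER BY so the output order is deterministic.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table name `t` and column types are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, serial_number TEXT, "
    "meta1 INTEGER, meta2 INTEGER, meta3 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (3, "123456", 0, 2, 4),
        (55, "123456", 0, 0, 0),
        (6, "345678", 0, 1, 2),
        (99, "345678", 0, 1, 2),
        (7, "999999", 1, 1, 1),  # hypothetical unique serial number, must not be returned
    ],
)

# Join each row to the per-serial-number counts and keep only counts above 1.
JOIN_QUERY = """
SELECT t.*
FROM t
JOIN (SELECT serial_number, COUNT(*) AS cnt
      FROM t
      GROUP BY serial_number) tsum
  ON tsum.serial_number = t.serial_number AND tsum.cnt > 1
ORDER BY t.serial_number, t.id
"""

duplicate_rows = conn.execute(JOIN_QUERY).fetchall()
for row in duplicate_rows:
    print(row)
```

Every row that shares its serial_number with at least one other row comes back, grouped together, so the original and its duplicate sit next to each other for comparison.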
Another alternative is to use an EXISTS clause:
SELECT t.*
FROM t
WHERE EXISTS (SELECT *
              FROM t t2
              WHERE t2.serial_number = t.serial_number AND t2.id <> t.id)
ORDER BY t.serial_number;
Both of these queries (as well as the one proposed by @fthiella) are standard SQL. Both would benefit from an index on (serial_number, id).
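As a rough illustration of the index suggestion combined with the EXISTS query, the sketch below again uses an in-memory SQLite database via Python's sqlite3 module instead of MySQL. The index name idx_t_serial_id is made up for the example, as is the extra unique row; the CREATE INDEX statement itself uses syntax that MySQL also accepts. It only shows that the index can be created and that the query returns the duplicated rows, not how much faster it runs.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table `t` mirrors the question's columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, serial_number TEXT, "
    "meta1 INTEGER, meta2 INTEGER, meta3 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (3, "123456", 0, 2, 4),
        (55, "123456", 0, 0, 0),
        (6, "345678", 0, 1, 2),
        (99, "345678", 0, 1, 2),
        (7, "999999", 1, 1, 1),  # hypothetical unique serial number
    ],
)

# Composite index on (serial_number, id); the index name is hypothetical.
conn.execute("CREATE INDEX idx_t_serial_id ON t (serial_number, id)")

# A row is a duplicate if another row with a different id shares its serial number.
EXISTS_QUERY = """
SELECT t.*
FROM t
WHERE EXISTS (SELECT 1
              FROM t t2
              WHERE t2.serial_number = t.serial_number
                AND t2.id <> t.id)
ORDER BY t.serial_number, t.id
"""

exists_rows = conn.execute(EXISTS_QUERY).fetchall()
# PRAGMA index_list is SQLite-specific; in MySQL, SHOW INDEX FROM t lists indexes instead.
index_names = [r[1] for r in conn.execute("PRAGMA index_list('t')")]
print(exists_rows)
print(index_names)
```

The index lets the inner lookup match on serial_number and compare id without reading the full row, which is why it suits both the join and the EXISTS form.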
SELECT *
FROM yourtable
WHERE serial_number IN (SELECT serial_number
                        FROM yourtable
                        GROUP BY serial_number
                        HAVING COUNT(*) > 1)
ORDER BY serial_number, id