How to ignore duplicate keys when using OPENQUERY to pull data while joining two tables?
I am trying to insert records into a MySQL database from MS SQL Server using OPENQUERY, and I want to ignore duplicate-key errors: when the query runs into a duplicate, it should skip that row and keep going.
What can I do to ignore the duplicates?
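Here is what I am doing:
- Pulling records from MySQL using OPENQUERY to define the MySQL "A.record_id".
- Joining those records to records in MS SQL Server on specific criteria rather than a direct id, which gives me a new related "B.new_id" record identifier in SQL Server.
- Inserting the results into a new table in MySQL as A.record_id, B.new_id. In the new table, A.record_id is set as the primary key.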
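The three steps above can be sketched as a single statement. This is only an illustration under assumptions: MYSQL_LINK is a hypothetical linked server name, and a_table, b_table, new_table and match_col are hypothetical names that do not come from the question.

```sql
-- Sketch only: MYSQL_LINK, a_table, b_table, new_table and match_col
-- are hypothetical names used for illustration.
INSERT INTO OPENQUERY(MYSQL_LINK, 'SELECT record_id, new_id FROM new_table')
SELECT a.record_id, b.new_id
FROM OPENQUERY(MYSQL_LINK, 'SELECT record_id, match_col FROM a_table') AS a
JOIN dbo.b_table AS b
  ON b.match_col = a.match_col;  -- joined on criteria, not on a direct id
```

If the join returns more than one b_table row for a given record_id, the same record_id is sent to MySQL more than once, and the primary key on new_table rejects it.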
The problem is that when joining table A to table B, I sometimes find two or more records in table B matching the criteria I am looking for. This causes the value A.record_id to appear two or more times in my data set before it is inserted into the new table (where A.record_id is the primary key), which causes the problem. Note that I can use an aggregate function to eliminate the extra records.
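The aggregate approach mentioned here could look roughly like the following sketch. The table variable @joined and its sample rows are made up for illustration; they stand in for the result of joining A to B.

```sql
-- Hypothetical stand-in for the joined result set (A.record_id, B.new_id).
DECLARE @joined TABLE (record_id INT, new_id INT);
INSERT INTO @joined (record_id, new_id)
VALUES (1, 10), (1, 11), (2, 20);   -- record_id 1 matched two rows in B

-- Collapse to one row per record_id by keeping the smallest new_id.
SELECT record_id, MIN(new_id) AS new_id
FROM @joined
GROUP BY record_id;
```

Using MIN is an arbitrary choice here; any aggregate that yields a single new_id per record_id would remove the duplicates.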
I don't think there is a specific option for this, but it is easy enough to do:
insert into oldtable(. . .)
select . . .
from newtable
where not exists (select 1 from oldtable where oldtable.id = newtable.id)
If there is more than one set of unique keys, you can add an additional not exists condition for each of them.
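As a sketch of what that might look like with two unique keys, the example below assumes a hypothetical second unique column named code; the table variables and sample rows are invented for illustration.

```sql
-- Hypothetical tables: id is the primary key, code is a second unique key.
DECLARE @oldtable TABLE (id INT PRIMARY KEY, code VARCHAR(10) UNIQUE);
DECLARE @newtable TABLE (id INT, code VARCHAR(10));

INSERT INTO @oldtable (id, code) VALUES (1, 'A'), (2, 'B');
INSERT INTO @newtable (id, code) VALUES (1, 'X'), (3, 'B'), (4, 'C');

-- One NOT EXISTS per unique key: skip rows that would collide on either.
INSERT INTO @oldtable (id, code)
SELECT nt.id, nt.code
FROM @newtable nt
WHERE NOT EXISTS (SELECT 1 FROM @oldtable o WHERE o.id = nt.id)
  AND NOT EXISTS (SELECT 1 FROM @oldtable o WHERE o.code = nt.code);

SELECT id, code FROM @oldtable ORDER BY id;
```

With this sample data, the row (1, 'X') collides on id and (3, 'B') collides on code, so only (4, 'C') is added.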
For the revised question:
insert into oldtable(. . .)
select . . .
from (select nt.*, row_number() over (partition by id order by (select null)) as seqnum
from newtable nt
) nt
where seqnum = 1 and
not exists (select 1 from oldtable where oldtable.id = nt.id);
The row_number() function assigns a sequential number to each row within a group of rows. The group is defined by the partition by clause. The numbers start at 1 and increment from there. The order by (select null) clause says that you don't care about the order. Exactly one row for each id will have a value of 1; duplicate rows will have a value larger than 1. The seqnum = 1 condition therefore chooses exactly one row per id.
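To see the numbering on its own, here is a small sketch with made-up sample data; @newtable and its rows are hypothetical and exist only to show how seqnum is assigned.

```sql
-- Hypothetical sample data: id 1 appears twice, id 2 appears once.
DECLARE @newtable TABLE (id INT, new_id INT);
INSERT INTO @newtable (id, new_id)
VALUES (1, 10), (1, 11), (2, 20);

-- Number the rows within each id; the order inside a group is unspecified.
SELECT nt.id, nt.new_id, nt.seqnum
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY t.id ORDER BY (SELECT NULL)) AS seqnum
      FROM @newtable t
     ) nt
ORDER BY nt.id, nt.seqnum;
```

The two rows for id 1 get seqnum 1 and 2, and the single row for id 2 gets seqnum 1. Because of order by (select null), which of the two id 1 rows receives seqnum 1 is not guaranteed; if a particular row should win, put a real column in the order by instead.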