How to ignore duplicate keys when using OPENQUERY to pull data while joining two tables?

Question:

I am trying to insert records into a MySQL database from MS SQL Server using OPENQUERY, and I want to ignore the duplicate key errors: when the query runs into a duplicate, it should skip that row and keep going.

What can I do to ignore the duplicates?

Here is what I am doing:

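I pull records from MySQL using OPENQUERY to get the MySQL identifier A.record_id.
I join those records to records in MS SQL Server on specific criteria rather than a direct id, which gives me a new related record identifier, B.new_id, in SQL Server.
I want to insert the results into a new table in MySQL as A.record_id, B.new_id. In that new table, A.record_id is the primary key.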

The problem is that when joining table A to table B, I sometimes find 2+ records in table B that match the criteria I am looking for. That makes the value A.record_id appear 2+ times in my data set before it is inserted into table A, which causes the problem. Note I can use an aggregate function to eliminate the records.

Answer:

I don't think there is a specific option for this. But it is easy enough to do:

insert into oldtable(. . .)
    select . . .
    from newtable
    where not exists (select 1 from oldtable where oldtable.id = newtable.id)

If there is more than one set of unique keys, you can add an additional not exists condition for each one.
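As a way to try the not exists pattern without a linked server, here is a minimal sketch that runs the same statement against an in-memory SQLite database from Python. SQLite stands in for the SQL Server/MySQL pair, and the new_id column and the sample rows are assumptions made up for illustration; only the table names oldtable and newtable come from the answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real servers (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE oldtable (id INTEGER PRIMARY KEY, new_id INTEGER);
    CREATE TABLE newtable (id INTEGER, new_id INTEGER);
    -- id 1 already exists in the target table
    INSERT INTO oldtable VALUES (1, 100);
    INSERT INTO newtable VALUES (1, 101), (2, 200), (3, 300);
""")

# Insert only the rows whose key is not already present in oldtable.
conn.execute("""
    INSERT INTO oldtable (id, new_id)
    SELECT nt.id, nt.new_id
    FROM newtable AS nt
    WHERE NOT EXISTS (SELECT 1 FROM oldtable AS ot WHERE ot.id = nt.id)
""")

rows = conn.execute("SELECT id, new_id FROM oldtable ORDER BY id").fetchall()
print(rows)  # [(1, 100), (2, 200), (3, 300)]
```

The existing row for id 1 keeps its original value, and no primary key violation is raised. This filter only guards against keys that are already in the target; it does not help when the source itself contains the same id twice.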

For the revised question:

insert into oldtable(. . .)
    select . . .
    from (select nt.*, row_number() over (partition by id order by (select null)) as seqnum
          from newtable nt
         ) nt
    where seqnum = 1 and
          not exists (select 1 from oldtable where oldtable.id = nt.id);

The row_number() function assigns a sequential number to each row within a group of rows. The group is defined by the partition by clause. The numbers start at 1 and increment from there. The order by (select null) clause says that you don't care about the order within each group, so which duplicate gets number 1 is arbitrary. Exactly one row for each id will have a value of 1. Duplicate rows will have a value larger than 1. The condition seqnum = 1 therefore chooses exactly one row per id.
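To make the numbering concrete, here is a small sketch of the same query run against an in-memory SQLite database from Python (SQLite 3.25 or later is assumed for window function support). The new_id column and the sample rows are hypothetical, and the sketch orders by new_id instead of (select null) purely so that the chosen row is predictable.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real servers (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE oldtable (id INTEGER PRIMARY KEY, new_id INTEGER);
    CREATE TABLE newtable (id INTEGER, new_id INTEGER);
    INSERT INTO oldtable VALUES (1, 100);
    -- ids 2 and 3 appear more than once, as after a one-to-many join
    INSERT INTO newtable VALUES
        (1, 101), (2, 200), (2, 201), (3, 300), (3, 301), (3, 302);
""")

# Show the sequence number assigned to every source row.
numbered = conn.execute("""
    SELECT id, new_id,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY new_id) AS seqnum
    FROM newtable
    ORDER BY id, seqnum
""").fetchall()
for row in numbered:
    print(row)

# Keep one row per id, then skip ids already present in oldtable.
conn.execute("""
    INSERT INTO oldtable (id, new_id)
    SELECT nt.id, nt.new_id
    FROM (SELECT n.*,
                 ROW_NUMBER() OVER (PARTITION BY n.id ORDER BY n.new_id) AS seqnum
          FROM newtable AS n) AS nt
    WHERE nt.seqnum = 1
      AND NOT EXISTS (SELECT 1 FROM oldtable AS ot WHERE ot.id = nt.id)
""")

final_rows = conn.execute("SELECT id, new_id FROM oldtable ORDER BY id").fetchall()
print(final_rows)  # [(1, 100), (2, 200), (3, 300)]
```

With an explicit ordering column the surviving row per id is deterministic (here, the smallest new_id). With order by (select null), as in the answer, any one of the duplicates may be kept.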
