SQL查找两个表之间匹配的大多数行

问题描述:

我正在使用SQL Server 2012,我有两个如下表

I am using SQL Server 2012 I have two tables like the following

表1和表2都有许多组,由组列指示.该组的名称可能在两个表中都匹配,但是可能不匹配.重要的是找到表2上与表1的组中的成员匹配最多的组.

Table1 and Table 2 both have many groups, indicated by the group column. The name of the group may match in both tables, but it may not. What is important is finding the group on Table 2 that has the most members that match members in a group on Table1.

我首先尝试使用vlookup进行此操作,但问题是vlookup会拉出具有匹配项的组"列中的第一个条目,而不是匹配项最多的组.在vlookup以下将拉动BBB,但正确的结果是CCC.

I first tried doing this with a vlookup, but the problem is vlookup pulls the first entry in the Group column that has a match, not the group with the most matches. Below vlookup would pull BBB, but the correct result is CCC.

可能会发生束缚.表2上可能有多个组与表1的成员数相同,因此最好的办法是计算匹配数,但是有成千上万的组,因此对带有计数的列进行排序和筛选并不理想.我需要类似case语句的内容,其中如果存在MAX(match),则Table1将在派生列BestMatch中显示带有MAX(Match)的组名.最理想的情况是,该列可以显示表2上所有具有MAX(Match的组,它们可以是一个或多个.也许可以用逗号分隔.

Ties may occur. There might be more than one group on Table2 that match Table1 with the same number of members thus the best thing may be to count the number of matches, but there are thousands of groups so it's not ideal to sort and sift through a column with counts. I need something like a case statement where if there is a MAX(match) then Table1 would show the group name with MAX(Match) in the derived column BestMatch. It'd be most ideal if the column could display all the groups on table 2 that have MAX(Match which may be one or more. Perhaps it could be comma separated.

如果不是,如果该列只能说领带,我可以寻找领带,那么这是最好的选择,这是理想的选择,当出现领带一词时,它会在匹配的每个成员旁边重复出现,所以我会知道适用于匹配哪个帐户以及匹配多少个帐户的组.

If not if the column could just say tie and I could look for the tie, it'd be ideal if this is the best option, when the word tie appears it repeats besides every member that matches so I will know to look for groups that matching which accounts and how many that matched.

我们确实可以使用一些预期的输出来帮助澄清问题.

We really could do with some expected output to help clarify the question.

但是,如果我对您的理解正确,那么此查询将使您接近所需的结果:

If I understand you correctly however, this query will get you close to the results you require:

;with cte as
(    SELECT t1a.[group] AS Group1
          , t2a.[Group] AS Group2
          , RANK() OVER(PARTITION BY t1a.[group] 
                            ORDER BY COUNT(t2a.[Group]) DESC) AS MatchRank
       FROM Table1 t1a
       JOIN Table2 t2a
         ON t1a.member = t2a.member
   GROUP BY t1a.[group], t2a.[GRoup])
 SELECT * 
   FROM cte
  WHERE MatchRank=1

该查询无法识别平局,但会显示所有相关结果...

The query doesn't identify ties, but it will display any tied results...

如果您是通用表表达式(; with语句)的新手,则有一个有用的描述

If you are a newbie to common table expressions(the ;with statement) there is a useful description here.