Subquery with EXISTS vs IN - MySQL

Problem description:

The two queries below both use subqueries. They return the same result, and both work fine for me. The problem is that the Method 1 query takes about 10 seconds to execute, while the Method 2 query takes under 1 second.

I was able to convert the Method 1 query into Method 2, but I don't understand what is happening in the query. I have been trying to figure it out myself. I would really like to learn what the difference between the two queries is and where the performance gain comes from. What is the logic behind it?

I'm new to these advanced techniques and hope someone can help me out here. I have read the docs, but they did not give me a clue.

Method 1:

SELECT
   *       
FROM
   tracker       
WHERE
   reservation_id IN (
      SELECT
         reservation_id                                 
      FROM
         tracker                                 
      GROUP  BY
         reservation_id                                 
      HAVING
         (
            method = 1                                          
            AND type = 0                                          
            AND Count(*) > 1 
         )                                         
         OR (
            method = 1                                              
            AND type = 1                                              
            AND Count(*) > 1 
         )                                         
         OR (
            method = 2                                              
            AND type = 2                                              
            AND Count(*) > 0 
         )                                         
         OR (
            method = 3                                              
            AND type = 0                                              
            AND Count(*) > 0 
         )                                         
         OR (
            method = 3                                              
            AND type = 1                                              
            AND Count(*) > 1 
         )                                         
         OR (
            method = 3                                              
            AND type = 3                                              
            AND Count(*) > 0 
         )
   )

Method 2:

SELECT
   *                                
FROM
   `tracker` t                                
WHERE
   EXISTS (
      SELECT
         reservation_id                                              
      FROM
         `tracker` t3                                              
      WHERE
         t3.reservation_id = t.reservation_id                                              
      GROUP BY
         reservation_id                                              
      HAVING
         (
            METHOD = 1 
            AND TYPE = 0 
            AND COUNT(*) > 1
         ) 
         OR                                                     
         (
            METHOD = 1 
            AND TYPE = 1 
            AND COUNT(*) > 1
         ) 
         OR                                                    
         (
            METHOD = 2 
            AND TYPE = 2 
            AND COUNT(*) > 0
         ) 
         OR                                                     
         (
            METHOD = 3 
            AND TYPE = 0 
            AND COUNT(*) > 0
         ) 
         OR                                                     
         (
            METHOD = 3 
            AND TYPE = 1 
            AND COUNT(*) > 1
         ) 
         OR                                                     
         (
            METHOD = 3 
            AND TYPE = 3 
            AND COUNT(*) > 0
         )                                             
   )

An EXPLAIN plan would have shown you exactly why you should use EXISTS. Usually the question is EXISTS vs COUNT(*). EXISTS is faster. Why?

  • With regard to the challenges posed by NULL: when the subquery result contains a NULL, an IN comparison that finds no match evaluates to NULL rather than FALSE, so you need to handle that as well (it matters most with NOT IN). With EXISTS the result is simply TRUE or FALSE, which is much easier to cope with. IN cannot match anything against NULL, whereas EXISTS only checks whether the subquery returns a row.
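The NULL behaviour described above can be checked with a small sketch. It uses Python's built-in sqlite3 module only so that it runs without a MySQL server; SQLite is a stand-in here, and the tables `a` and `b` and their contents are made up for illustration. The three-valued logic for IN and NOT IN shown here is standard SQL behaviour, which MySQL follows as well.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1), (NULL);   -- the subquery result contains a NULL
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# IN: 1 matches; for 2 and 3 the comparison is NULL (unknown), so they are dropped.
in_rows = ids("SELECT id FROM a WHERE id IN (SELECT id FROM b) ORDER BY id")

# NOT IN: 2 NOT IN (1, NULL) is NULL, not TRUE, so no row qualifies at all.
not_in_rows = ids("SELECT id FROM a WHERE id NOT IN (SELECT id FROM b) ORDER BY id")

# NOT EXISTS: the correlated subquery either returns a row or it does not.
not_exists_rows = ids(
    "SELECT id FROM a "
    "WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id) ORDER BY id"
)

# The raw value of an IN test with no match and a NULL in the list is NULL.
in_value = conn.execute("SELECT 2 IN (SELECT id FROM b)").fetchone()[0]

print(in_rows, not_in_rows, not_exists_rows, in_value)
```

The point of the sketch is the middle case: NOT IN silently returns nothing once a NULL appears in the subquery, while NOT EXISTS still returns the rows that have no partner.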

For example, with EXISTS (SELECT * FROM yourtable WHERE bla = 'blabla'); you get TRUE the moment the first matching row is found, and FALSE if there is none.

In this case IN more or less takes the place of COUNT(*): it selects ALL rows matching the WHERE clause, because it compares against every value.
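To make the relationship between the two forms concrete, here is a minimal sketch showing that an uncorrelated IN subquery and a correlated EXISTS subquery select the same rows. It is not a reproduction of the timing difference from the question: it runs on SQLite through Python's sqlite3 module as a stand-in for MySQL, the `tracker` rows are invented, and the HAVING clause is reduced to a single `COUNT(*) > 1` condition for brevity.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracker (id INTEGER PRIMARY KEY, reservation_id INTEGER);
    -- Invented sample data: reservation 10 twice, 20 once, 30 three times.
    INSERT INTO tracker (reservation_id) VALUES (10), (10), (20), (30), (30), (30);
""")

# Method 1 shape: the subquery groups the whole table once, and the outer
# query compares each row against the full list of qualifying reservation_ids.
in_sql = """
    SELECT id FROM tracker
    WHERE reservation_id IN (
        SELECT reservation_id FROM tracker
        GROUP BY reservation_id
        HAVING COUNT(*) > 1
    )
    ORDER BY id
"""

# Method 2 shape: the subquery is correlated with the outer row, so it only
# looks at rows sharing that row's reservation_id and answers true or false.
exists_sql = """
    SELECT id FROM tracker t
    WHERE EXISTS (
        SELECT 1 FROM tracker t3
        WHERE t3.reservation_id = t.reservation_id
        GROUP BY t3.reservation_id
        HAVING COUNT(*) > 1
    )
    ORDER BY id
"""

in_ids = [row[0] for row in conn.execute(in_sql)]
exists_ids = [row[0] for row in conn.execute(exists_sql)]
print(in_ids == exists_ids, in_ids)
```

Both queries return the rows of reservations 10 and 30. How fast each form runs depends on the optimizer and the available indexes, which is what the EXPLAIN output on the real table would show.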

But don't forget this either:

  • EXISTS executes faster than IN when the subquery result is very large.
  • IN gets ahead of EXISTS when the subquery result is very small.
