使用LEFT JOIN查询时,MySQL空表返回TRUE

问题描述:

我刚刚在MySQL 5.7中遇到了一个奇怪的行为(错误?),但在MySQL 5.5中却没有出现.

I just encountered a strange behaviour (bug?) in MySQL 5.7 that doesn't appear in MySQL 5.5.

很尴尬,我需要一个例子来解释它.

So awkward, I need an example to explain it .

  1. 在2个表上使用左联接创建表
  2. 确保第二个表为空(无记录),但是通过在其中一个字段中写入静态值来构造它.

无条件的左联接产生N行(如预期)

The Left join with no conditions produces N lines (as expected)

具有一个永远不匹配的条件的左联接ALSO会产生N行.

The Left join with a condition that never matches ALSO produces N lines.

### EXAMPLE ###

## CREATE TABLES

create table PCPL (K1 int);  ## Table 1
create table AUX (K2 int);   ## Table 2

## FILL IN TABLES

insert into PCPL values (1),(2),(3);    ## fill main table with 3 values
truncate table AUX;                     ## No need to do this, just to make things clearer

## TEST 1 : "Dry Left join" => RESULT OK : Resulting Table has 3 rows

select PCPL.K1 as K1 , DERIVED.K2  as K2 
from PCPL
LEFT JOIN (select K2, 1 as staticValue from AUX) DERIVED
ON PCPL.K1 = DERIVED.K2; 

+------+------+
| K1   | K2   |
+------+------+
|    1 | NULL |
|    2 | NULL |
|    3 | NULL |
+------+------+
3 rows in set (0,00 sec)


## TEST 2 : "Never matching condition" => STRANGE : Resulting Table NOT empty 

select PCPL.K1 as K1 , DERIVED.K2  as K2  
from PCPL
LEFT JOIN (select K2, 1 as staticValue from AUX) DERIVED
ON PCPL.K1 = DERIVED.K2
where staticValue=1;   ##### THIS CONDITION IS NEVER MET SINCE TABLE AUX IS EMPTY

+------+------+
| K1   | K2   |
+------+------+
|    1 | NULL |
|    2 | NULL |
|    3 | NULL |
+------+------+
3 rows in set (0,00 sec)

THIS SHOULDN'T HAPPEN !

MySQL 5.5不会发生此行为

This behaviour does not occur with MySQL 5.5

这是我在5.7中忘记设置的5.5中的错误还是某些参数?

Is this a bug or some parameter in 5.5 that I forgot to set in 5.7?

感谢您的时间!

这特别看起来像是子查询合并中的错误.这是一种使MySQL避免实现子查询的开销的策略.

This specifically looks like a bug in subquery merging. This is a strategy that lets MySQL avoid the overhead of materializing subqueries.

请考虑以下两个查询(第一个是原始查询):

Consider the following two queries (the first is the original):

select PCPL.K1 as K1, DERIVED.K2 as K2,
       (CASE WHEN DERIVED.K2 IS NULL THEN 'is null' ELSE 'not null' END) as K2_null,
       staticValue
       (CASE WHEN staticValue IS NULL THEN 'is null' ELSE 'not null' END) as staticValue_null
from PCPL LEFT JOIN
     (select K2, 1 as staticValue from AUX) DERIVED
     ON PCPL.K1 = DERIVED.K2
where staticValue IS NOT NULL;

和此修改后的形式唯一的区别在于DERIVED:

And this modified form where the only difference in DERIVED:

select PCPL.K1 as K1, DERIVED.K2 as K2,
       (CASE WHEN DERIVED.K2 IS NULL THEN 'is null' ELSE 'not null' END) as K2_null,
       staticValue
       (CASE WHEN staticValue IS NULL THEN 'is null' ELSE 'not null' END) as staticValue_null
from PCPL LEFT JOIN
     (select K2, 1 as staticValue from AUX order by k2 limit 3) DERIVED
-------------------------------------------^XXXXXXXXXXXXXXXXXX
     ON PCPL.K1 = DERIVED.K2
where staticValue IS NOT NULL;

order by k2 limit 3的目的是强制实现子查询.第一个版本返回三行(不正确).第二个版本返回零行(正确).在这两种情况下,子查询都返回相同的行集.

The purpose of order by k2 limit 3 is to force the materialization of the subquery. The first version returns three rows (incorrect). The second version returns zero rows (correct). In both cases, the subquery returns the same row set.

区别在于,强制实现解决了该问题.因此,这看起来像是子查询合并中的错误.

The difference is that the forced materialization fixes the problem. Hence, this looks like a bug in subquery merging.

您可以在文档中了解有关子查询合并的信息. .

一小篇社论说明.复杂的数据库将查询解析为描述处理的DAG(有向无环图).然后,他们有了复杂的规则来推动操作通过图表-尤其是过滤操作和列计算.

A small editorial note. Sophisticated databases parse a query into a DAG (directed acyclic graph) that describes the processing. They then have sophisticated rules for pushing operations through the graph -- particularly filtering operations and column computations.

MySQL从更简单的查询模型开始.开发人员正在朝着更复杂的模型发展.像这样的小错误并不是意料之外的,但是要解决这些错误需要一两个版本.

MySQL starts from a simpler model of the query. The developers are moving toward a more sophisticated model. Small bugs like this are not unexpected, but it will take a version or two to work them out.