Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?
I have a set of consecutive rows I want to get based on their primary key, which is an auto-incrementing integer. Assuming that there are no holes, is there any performance difference between:
SELECT * FROM `theTable` WHERE `id` IN (n, ... nk);
and:
SELECT * FROM `theTable` WHERE `id` BETWEEN n AND nk;
BETWEEN should outperform IN in this case (but do measure and check the execution plans, too!), especially as n grows and as long as the statistics are still accurate. Let's assume:
- m is the size of your table
- n is the size of your range
In theory, BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, which then traverses at most n index leaf nodes. The complexity will be O(n + log m).
IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity will always be O(n * log m), which is always worse (though the difference is negligible for very small tables m or very small ranges n).
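The two costs above can be illustrated with a small model. This is only a sketch, not how MySQL or any other engine is implemented: a sorted Python list stands in for the primary key index, and the helper names (`lower_bound`, `between_scan`, `in_lookups`) are made up for this example. It counts key comparisons for one binary search plus a walk (the BETWEEN case) against n separate binary searches (the IN case).

```python
def lower_bound(keys, target, counter):
    """Binary search for the first position with keys[pos] >= target.

    counter is a one-element list used to accumulate key comparisons.
    """
    lo, hi = 0, len(keys)
    while lo < hi:
        mid = (lo + hi) // 2
        counter[0] += 1
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def between_scan(keys, low, high):
    """Model of BETWEEN: one index descent, then walk neighbouring entries."""
    counter = [0]
    pos = lower_bound(keys, low, counter)      # O(log m)
    rows = []
    while pos < len(keys):                     # O(n)
        counter[0] += 1                        # upper-bound check
        if keys[pos] > high:
            break
        rows.append(keys[pos])
        pos += 1
    return rows, counter[0]


def in_lookups(keys, wanted):
    """Model of IN: one independent index descent per listed value."""
    counter = [0]
    rows = []
    for value in wanted:                       # n iterations
        pos = lower_bound(keys, value, counter)  # O(log m) each
        counter[0] += 1                        # equality check for this probe
        if pos < len(keys) and keys[pos] == value:
            rows.append(keys[pos])
    return rows, counter[0]


# Assumed sizes for illustration: m = 1,000,000 keys, n = 1,000 consecutive ids.
m = 1_000_000
keys = list(range(1, m + 1))
low, high = 500_001, 501_000

between_rows, between_cost = between_scan(keys, low, high)
in_rows, in_cost = in_lookups(keys, range(low, high + 1))

print("BETWEEN comparisons:", between_cost)
print("IN comparisons:     ", in_cost)
```

Both functions return the same rows; only the number of comparisons differs. A real optimizer may well treat a dense IN list differently, so the advice to measure and check the execution plan still applies.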
If the index cannot be used, you'll get a full table scan in any case, and the predicate is evaluated on each row:
- BETWEEN needs to evaluate two predicates: one for the lower and one for the upper bound. The complexity is O(m).
- IN needs to evaluate at most n predicates. The complexity is O(m * n), which is again worse, unless the database can optimise the IN list into a hash map rather than a list of predicates, in which case it is perhaps O(m).
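The full-scan case can be modelled the same way. Again, this is a hedged sketch rather than a description of any real engine: a plain Python list plays the unindexed table, and the function names are hypothetical. It counts predicate evaluations for BETWEEN, for IN as a list of equality predicates, and for IN backed by a hash set.

```python
def scan_between(rows, low, high):
    """Full scan with at most two comparisons per row."""
    evaluations = 0
    matches = []
    for value in rows:
        evaluations += 1                 # value >= low
        if value >= low:
            evaluations += 1             # value <= high
            if value <= high:
                matches.append(value)
    return matches, evaluations


def scan_in_list(rows, wanted):
    """Full scan where IN is evaluated as up to n equality predicates per row."""
    evaluations = 0
    matches = []
    for value in rows:
        for candidate in wanted:
            evaluations += 1             # value = candidate
            if value == candidate:
                matches.append(value)
                break
    return matches, evaluations


def scan_in_hash(rows, wanted):
    """Full scan where the IN list was turned into a hash set up front."""
    wanted_set = set(wanted)
    evaluations = 0
    matches = []
    for value in rows:
        evaluations += 1                 # one hash probe per row
        if value in wanted_set:
            matches.append(value)
    return matches, evaluations


# Assumed sizes for illustration: m = 10,000 rows, n = 100 consecutive ids.
m, n = 10_000, 100
rows = list(range(1, m + 1))
low = 5_001
high = low + n - 1
wanted = list(range(low, high + 1))

between_rows, between_evals = scan_between(rows, low, high)
list_rows, list_evals = scan_in_list(rows, wanted)
hash_rows, hash_evals = scan_in_hash(rows, wanted)

print("BETWEEN evaluations:  ", between_evals)
print("IN (list) evaluations:", list_evals)
print("IN (hash) evaluations:", hash_evals)
```

The list variant grows with m * n, while the BETWEEN and hash variants stay proportional to m, which matches the reasoning above.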