Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?

Question:

I have a set of consecutive rows that I want to fetch by their primary key, which is an auto-incrementing integer. Assuming that there are no holes, is there any performance difference between:

SELECT * FROM `theTable` WHERE `id` IN (n, ... nk);

and:

SELECT * FROM `theTable` WHERE `id` BETWEEN n AND nk;

Answer:

BETWEEN should outperform IN in this case (but do measure and check execution plans, too!), especially as n grows and as long as statistics are still accurate. Let's assume:

m is the size of your table
n is the size of your range

In theory, BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, which then traverses at most n index leaf nodes. The complexity will be O(n + log m).

IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity will be O(n * log m), which is always worse (though the difference is negligible for very small tables m or very small ranges n).
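To get a feel for how far apart O(n + log m) and O(n * log m) can be, here is a minimal Python sketch of the two cost formulas above. It is a back-of-the-envelope model only: the function names and the table size are made up for illustration, constants are ignored, and a real optimizer may behave differently, so it does not replace measuring.

```python
import math


def between_cost(m: int, n: int) -> float:
    """Rough cost of one range scan: one index descent plus n leaf entries."""
    return math.log2(m) + n


def in_cost(m: int, n: int) -> float:
    """Rough cost of n separate lookups: one index descent per listed value."""
    return n * math.log2(m)


if __name__ == "__main__":
    m = 1_000_000  # hypothetical table size, not taken from the question
    for n in (1, 10, 1_000, 100_000):
        b, i = between_cost(m, n), in_cost(m, n)
        print(f"n={n:>7}: BETWEEN ~{b:,.0f}  IN ~{i:,.0f}  ratio ~{i / b:.1f}x")
```

In this model the ratio between the two costs approaches log m as n grows, which matches the remark that the gap is negligible for very small tables or very small ranges.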

If the index cannot be used, you'll get a full table scan in either case, and the predicate is evaluated on each row:

BETWEEN needs to evaluate two predicates: one for the lower and one for the upper bound. The complexity is O(m).

IN needs to evaluate at most n predicates. The complexity is O(m * n), which is again worse, or perhaps O(m) if the database can optimise the IN list into a hash map rather than a list of predicates.
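The full-scan case can be simulated in a few lines of Python. This is only a sketch under simplifying assumptions: the "table" is a plain list of ids, the function names are hypothetical, and the counters stand in for predicate evaluations rather than for anything a real database reports.

```python
def scan_between(rows, lo, hi):
    """Full scan, charging two comparisons per row (lower and upper bound)."""
    comparisons = 0
    hits = []
    for row_id in rows:
        comparisons += 2  # upper bound; Python may short-circuit the second test
        if lo <= row_id <= hi:
            hits.append(row_id)
    return hits, comparisons


def scan_in_list(rows, values):
    """Full scan comparing each row against the IN list one value at a time."""
    comparisons = 0
    hits = []
    for row_id in rows:
        for value in values:
            comparisons += 1
            if row_id == value:
                hits.append(row_id)
                break  # stop at the first match, so at most n comparisons per row
    return hits, comparisons


def scan_in_hash(rows, values):
    """Full scan with the IN list turned into a hash set: one probe per row."""
    lookup = set(values)
    probes = 0
    hits = []
    for row_id in rows:
        probes += 1
        if row_id in lookup:
            hits.append(row_id)
    return hits, probes


if __name__ == "__main__":
    table = list(range(1, 101))    # m = 100 rows, ids without holes
    wanted = list(range(10, 20))   # n = 10 consecutive ids
    print(scan_between(table, 10, 19)[1])  # grows with m
    print(scan_in_list(table, wanted)[1])  # grows with m * n
    print(scan_in_hash(table, wanted)[1])  # grows with m
```

All three variants return the same rows; only the amount of per-row work differs, which is the point of the O(m) versus O(m * n) comparison.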
