mysql 分页话语优化
测试数据表结构
现在有两张表 customer,以及order_test, customerId的主键是order_test的外键
CREATE TABLE `customer` ( `customerId` int(11) NOT NULL AUTO_INCREMENT, `phone` varchar(11) NOT NULL COMMENT '手机号', `password` varchar(20) NOT NULL COMMENT '密码', PRIMARY KEY (`customerId`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 CREATE TABLE `order_test` ( `id` int(11) NOT NULL AUTO_INCREMENT, `orderNo` char(17) COLLATE utf8_bin NOT NULL, `customerId` int(11) DEFAULT NULL, PRIMARY KEY (`id`), KEY `111_idx` (`customerId`), CONSTRAINT `111` FOREIGN KEY (`customerId`) REFERENCES `customer` (`customerId`) ON DELETE NO ACTION ON UPDATE NO ACTION ) ENGINE=InnoDB AUTO_INCREMENT=100880001 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
customer表又1亿多条数据,order_test有9000w条数据.
下面针对上面两张表做一些常用的语句进行优化
limit 分页优化方法
explain用法详见: http://zcf9916.iteye.com/blog/2409731
1.假设现在要查处customer表的按customerId排序的第500000到500200中的200条数据
正常的写法就是这样
SELECT customerId FROM customer LIMIT 500000,200; 执行时间是0.108秒
这个语句会查出前500200条数据,然后再丢弃前500000条数据,所以速度很慢
执行explain + sql,可以看到type=all,进行了全表扫描
- id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
- +----+-------------+---------+------+---------------+------+---------+------+--------+----------+-------+
- | 1 | SIMPLE | customer | ALL | NULL | NULL | NULL | NULL | 99808800 | 100.00 | |
如果limit 后面只有一个数字,limit 200,那么数据库只会扫描这两百条,所以我们可以从这里入手.
1. 假设id是递增的,且数据没有删除,
建立主键或唯一索引, 利用索引(假设每页10条)
---语句样式:mysql中,可用如下方法: SELECT * FROM customer WHERE customerId > 500000 LIMIT 200,执行时间0.0032秒
---适应场景: 适用于数据量多的情况(元组数上万)
---原因: 索引扫描,速度会很快.
执行explain + sql,可以看到type=range, possible_keys=primary 说明使用了主键索引
- (ps: range:只检索给定范围的行,使用一个索引来选择行。)
- id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
- +----+-------------+---------+------+---------------+------+---------+------+--------+----------+-------+
- | 1 | SIMPLE | customer | range | PRIMARY | PRIMARY | 4 | NULL | 49808800 | 100.00 |
2.假设id是递增的,且有数据被删除过,假设第499800-499999数据被删除过,那么就不能单纯的靠customerId > 500000来定位了,因为实际id范围是50020-500400了.
---语句样式:可用如下方法: SELECT * FROM customer WHERE customerId >=
(SELECT customerId FROM customer order by customerId LIMIT 500000,1) ORDER BY customerId LIMIT 200
- id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
- +----+-------------+---------+------+---------------+------+---------+------+--------+----------+-------+
- | 1 | PRIMARY | customer | range | PRIMARY | PRIMARY | 4 | NULL | 49808800 | 100.00 | using where
- | 2 | SUBQUERY | customer | index | | PRIMARY | 4 | NULL | 500001 | 100.00 | using index
PS:如果limit offset,数量. 如果offset值过大,那么查询时间还是会很长,所以当offset超过一定量的时候,最好就不做处理。
PS:在实验中发现 。SELECT customerId FROM customer order by customerId LIMIT 0,20;和SELECT customerId FROM customer LIMIT 0,20;虽然cutomer是递增的,但是两句话的结果可能不同,因为不加order就会根据索引的指针顺序去访问,不一定是按顺序来的.