LinkNemo


MySQL: the difference between IN and EXISTS

Posted by CODY on 2017/09/21 18:08:18

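Question:

A while ago, a friend of mine was asked about SQL optimization in an interview, and the claim came up that EXISTS queries are more efficient than IN queries. Is that really the case?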

Preparation

Create the users table

/* users table */
drop table if exists users;
create table users(
	id int primary key auto_increment,
	name varchar(20)
);
insert into users(name) values ('A');
insert into users(name) values ('B');
insert into users(name) values ('C');
insert into users(name) values ('D');
insert into users(name) values ('E');
insert into users(name) values ('F');
insert into users(name) values ('G');
insert into users(name) values ('H');
insert into users(name) values ('I');
insert into users(name) values ('J');        

Create the orders table

/* orders table */
drop table if exists orders;
create table orders(
	id int primary key auto_increment,/* order id */
	order_no varchar(20) not null,/* order number */
	title varchar(20) not null,/* order title */
	goods_num int not null,/* quantity of goods */
	money decimal(7,4) not null,/* order amount */
	user_id int not null	/* id of the user who owns the order */
)engine=myisam default charset=utf8 ;

Create the stored procedure that generates orders

delimiter $
/* "if exists" avoids an error on the first run, when the procedure does not exist yet */
drop procedure if exists batch_orders $

create procedure batch_orders(in max int)
begin
declare i int default 0;
set autocommit = 0;  
 while i < max do
	set i = i + 1;
	/* user_id is i % 10, so it takes the values 0 through 9 */
	insert into orders(order_no,title,goods_num,money,user_id) 
	values (concat('NCS-',floor(1 + rand()*1000000000000 )),concat('订单title-',i),i%50,(100.0000+(i%50)),i%10);
 end while;
commit;
end $
delimiter ;

call batch_orders(10000000);  # generate 10 million rows
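
Before running the comparison, it may be worth checking what the procedure actually generated. The queries below are a minimal sketch using only the two tables defined above. Because user_id is computed as i % 10, its values run from 0 to 9; assuming the auto_increment ids in users start at 1 (the MySQL default, not stated in the post), user_id 0 matches no user and user 10 owns no orders.

```sql
/* total number of generated rows */
select count(*) from orders;

/* rows per user_id; the values are expected to run from 0 to 9 */
select user_id, count(*) as order_count
from orders
group by user_id
order by user_id;

/* range of ids present in users, to compare with the user_id values above */
select min(id), max(id), count(*) from users;
```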

Experiment

Scenario 1: the subquery table (users) is smaller than the main query table (orders)


mysql> select count(1) from orders where user_id in (select id from users) ;
+----------+
| count(1) |
+----------+
|  9000000 |
+----------+
1 row in set (9.47 sec)

mysql> select count(1) from orders where exists (select id from users where orders.user_id = users.id);
+----------+
| count(1) |
+----------+
|  9000000 |
+----------+
1 row in set (12.18 sec)    

Scenario 2: the subquery table (orders) is larger than the main query table (users)

mysql> select count(1) from users where id in (select user_id from orders);
+----------+
| count(1) |
+----------+
|        9 |
+----------+
1 row in set (4.13 sec)

mysql> select count(1) from users where exists (select 1 from orders where users.id=orders.user_id);
+----------+
| count(1) |
+----------+
|        9 |
+----------+
1 row in set (1.35 sec)

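Analysis:

IN: roughly speaking, the subquery inside IN is executed first and acts as the outer loop, with the main query as the inner loop.

EXISTS: the main query acts as the outer loop and the subquery as the inner loop (each row produced by the main query is passed to the subquery as its condition).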
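
The loop order described above is a simplified model, and the plan MySQL actually picks can differ between versions. One way to check it on a given server is to prefix each of the four test queries with EXPLAIN, as sketched below. In the output, the select_type column (for example a value such as DEPENDENT SUBQUERY) and the rows estimate give a hint about which table drives the other.

```sql
/* Scenario 1: users in the subquery, orders in the main query */
explain select count(1) from orders where user_id in (select id from users);
explain select count(1) from orders where exists (select id from users where orders.user_id = users.id);

/* Scenario 2: orders in the subquery, users in the main query */
explain select count(1) from users where id in (select user_id from orders);
explain select count(1) from users where exists (select 1 from orders where users.id = orders.user_id);
```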

                       

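Conclusion

Whether EXISTS outperforms IN depends on the situation:

    If the table in the IN subquery is smaller than the main query's table, EXISTS is slower than IN (Scenario 1: 12.18 sec vs 9.47 sec);

    If the table in the IN subquery is larger than the main query's table, EXISTS is faster than IN (Scenario 2: 1.35 sec vs 4.13 sec).

Always letting the small table drive the large table is the best choice.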
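
The orders table above defines only a primary key, so every lookup by user_id has to scan the table. As a sketch of a follow-up experiment, one could add an index on that column and re-run Scenario 2 to see how much the gap depends on it. The index name idx_orders_user_id is hypothetical, and no timings are claimed here; building the index on 10 million rows will itself take a while.

```sql
/* hypothetical index, not part of the original setup */
create index idx_orders_user_id on orders(user_id);

/* re-run Scenario 2 and compare the timings with the ones reported above */
select count(1) from users where id in (select user_id from orders);
select count(1) from users where exists (select 1 from orders where users.id = orders.user_id);

/* drop the index again to restore the original setup */
drop index idx_orders_user_id on orders;
```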


Comment by Nemo

Without comparison, there is no harm. A case for why side-by-side comparisons are necessary.


