A Brief Analysis of EXISTS and IN in MySQL

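EXISTS loops over the outer table one row at a time and evaluates the EXISTS subquery for each row. If the subquery returns any rows at all (how many does not matter), the condition is true and the current row is returned. If the subquery returns no rows, the current row is discarded. In other words, EXISTS behaves like a boolean condition: true when the subquery yields a result set, false when it does not.

For example: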

select * from user where exists (select 1);

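Each row of the user table is fetched in turn. Because the subquery select 1 always returns a row, every row of user is added to the result set, so the query is equivalent to select * from user;

Another example: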

select * from user where exists (select * from user where userId = 0);

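Here, while looping over the user table, the condition (select * from user where userId = 0) is checked for each row. Since userId is never 0 in this table, the subquery always returns an empty set, the condition is always false, and every row of user is discarded.

NOT EXISTS is the opposite of EXISTS: when the subquery returns a result set, the current row is discarded; otherwise the current row is added to the result set.

In short, if table A has n rows, an EXISTS query fetches those n rows one by one and evaluates the EXISTS condition n times.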
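The EXISTS and NOT EXISTS behaviour described above can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (these queries use standard SQL semantics), and the three sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (userId INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cat")],  # hypothetical sample rows
)


def ids(sql):
    """Run a query and return the sorted userId values it yields."""
    return sorted(row[0] for row in conn.execute(sql))


all_rows = ids("SELECT userId FROM user")

# The subquery always returns a row, so every outer row is kept.
always_true = ids("SELECT userId FROM user WHERE EXISTS (SELECT 1)")

# No row has userId = 0, so the subquery is empty and every outer row is dropped.
always_false = ids(
    "SELECT userId FROM user WHERE EXISTS (SELECT * FROM user WHERE userId = 0)"
)

# NOT EXISTS inverts the test: an empty subquery keeps the row.
negated = ids(
    "SELECT userId FROM user WHERE NOT EXISTS (SELECT * FROM user WHERE userId = 0)"
)

print(all_rows, always_true, always_false, negated)
```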

An IN query is equivalent to a chain of OR conditions, which is easy to understand. Take the following query:

select * from user where userId in (1, 2, 3);

which is equivalent to

select * from user where userId = 1 or userId = 2 or userId = 3;

NOT IN is the opposite of IN, as follows:

select * from user where userId not in (1, 2, 3);

which is equivalent to

select * from user where userId != 1 and userId != 2 and userId != 3;

In short, an IN query first retrieves all the rows of the subquery. Call that result set B, containing m rows. The result set is then broken into m individual values, and m lookups are performed.
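The two equivalences above (IN as a chain of ORs, NOT IN as a chain of != joined by AND) can be verified directly. A minimal sketch, again assuming SQLite via Python's sqlite3 module in place of MySQL, with six made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE user (userId INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO user VALUES (?)", [(i,) for i in range(1, 7)])


def ids(sql):
    """Run a query and return the sorted userId values it yields."""
    return sorted(row[0] for row in conn.execute(sql))


in_list = ids("SELECT userId FROM user WHERE userId IN (1, 2, 3)")
or_chain = ids(
    "SELECT userId FROM user WHERE userId = 1 OR userId = 2 OR userId = 3"
)

not_in_list = ids("SELECT userId FROM user WHERE userId NOT IN (1, 2, 3)")
and_chain = ids(
    "SELECT userId FROM user WHERE userId != 1 AND userId != 2 AND userId != 3"
)

print(in_list, or_chain, not_in_list, and_chain)
```

Note that the list here contains only non-NULL literals; the equivalence is checked for that case only.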

Note that when a single column is compared with IN, the subquery must return exactly one column, for example

select * from user where userId in (select id from B);

and not

select * from user where userId in (select id, age from B);

EXISTS has no such restriction.
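The column-count restriction can be seen in a quick experiment. The sketch below assumes SQLite (through Python's sqlite3) rather than MySQL, so the exact error text differs from what MySQL reports, and the rows in user and B are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE user (userId INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE B (id INTEGER, age INTEGER)")
conn.executemany("INSERT INTO user VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO B VALUES (?, ?)", [(1, 30), (3, 41)])

# One column in the subquery: accepted.
one_column = sorted(
    r[0]
    for r in conn.execute(
        "SELECT userId FROM user WHERE userId IN (SELECT id FROM B)"
    )
)

# Two columns compared against a single column: rejected by the engine.
try:
    list(
        conn.execute(
            "SELECT userId FROM user WHERE userId IN (SELECT id, age FROM B)"
        )
    )
    two_columns_error = None
except sqlite3.OperationalError as exc:
    two_columns_error = str(exc)

# EXISTS only asks whether any row comes back, so the column list is free.
exists_two_columns = sorted(
    r[0]
    for r in conn.execute(
        "SELECT userId FROM user "
        "WHERE EXISTS (SELECT id, age FROM B WHERE B.id = user.userId)"
    )
)

print(one_column, two_columns_error, exists_two_columns)
```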

Now consider the performance of EXISTS and IN.

Take the following SQL statements:

1: select * from A where exists (select * from B where B.id = A.id);

2: select * from A where A.id in (select id from B);

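Query 1 can be expressed as the following pseudocode to make it easier to follow:

$result = [];
for ($i = 0; $i < count(A); $i++) {
    $a = get_record(A, $i);    # fetch the next row from table A
    if (B has a row with id == $a['id'])    # the subquery returns at least one row
        $result[] = $a;
}
return $result;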

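That is the general idea. As the pseudocode shows, query 1 mainly relies on the index on table B; how table A is indexed should have little effect on the efficiency of the query.

Assuming the ids in table B are 1, 2, 3, query 2 can be rewritten as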

select * from A where A.id = 1 or A.id = 2 or A.id = 3;

This is easy to follow: here it is mainly the index on A that is used, and table B has little effect on the query.
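The two evaluation strategies described for query 1 and query 2 can be modelled in a few lines of Python. This is a simplified sketch of the article's mental model, not of what the MySQL optimizer actually does: a Python set or dict plays the role of an index, the function names exists_style and in_style are made up, and A.id is assumed to be unique.

```python
def exists_style(a_rows, b_index):
    """Query 1: loop over A and probe B's index once per A row."""
    result, probes = [], 0
    for a in a_rows:
        probes += 1
        if a["id"] in b_index:  # index lookup on B.id
            result.append(a)
    return result, probes


def in_style(a_index, b_rows):
    """Query 2 as described: collect B's ids, then look each one up via A's index."""
    result, probes = [], 0
    for b_id in sorted({b["id"] for b in b_rows}):  # distinct ids from the subquery
        probes += 1
        if b_id in a_index:  # index lookup on A.id
            result.append(a_index[b_id])
    return result, probes


A = [{"id": i} for i in range(1, 6)]  # hypothetical table A: ids 1..5
B = [{"id": i} for i in (1, 2, 3)]    # hypothetical table B: ids 1, 2, 3

b_index = {b["id"] for b in B}        # stands in for the index on B.id
a_index = {a["id"]: a for a in A}     # stands in for the index on A.id

exists_result, exists_probes = exists_style(A, b_index)
in_result, in_probes = in_style(a_index, B)

print([r["id"] for r in exists_result], exists_probes)
print([r["id"] for r in in_result], in_probes)
```

Both strategies return the same rows; what differs is which table drives the loop and whose index is probed, which is the point the text makes.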

Now look at NOT EXISTS and NOT IN:

1. select * from A where not exists (select * from B where B.id = A.id);

2. select * from A where A.id not in (select id from B);

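Query 1 behaves the same as before and uses the index on B.

Query 2, assuming B.id contains no NULL values, can be rewritten as the following statement: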

select * from A where A.id != 1 and A.id != 2 and A.id != 3;

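By this reasoning NOT IN is a range query, and a != range condition of this kind generally cannot make effective use of an index. In effect, for every row of A the rows of B have to be scanned once to check whether that row is present in B.

So NOT EXISTS is more efficient than NOT IN.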
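Apart from performance, the rewriting of NOT IN into a chain of != conditions only holds when the subquery column contains no NULLs. Under standard SQL three-valued logic, a NULL in B.id makes NOT IN return no rows at all, while NOT EXISTS is unaffected. The sketch below demonstrates this with SQLite via Python's sqlite3 (an assumption; the tables and values are invented), and the same NULL semantics apply in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE A (id INTEGER)")
conn.execute("CREATE TABLE B (id INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany("INSERT INTO B VALUES (?)", [(1,), (2,)])

NOT_EXISTS_SQL = (
    "SELECT id FROM A WHERE NOT EXISTS (SELECT * FROM B WHERE B.id = A.id)"
)
NOT_IN_SQL = "SELECT id FROM A WHERE A.id NOT IN (SELECT id FROM B)"


def ids(sql):
    """Run a query and return the sorted id values it yields."""
    return sorted(row[0] for row in conn.execute(sql))


# Without NULLs in B.id the two forms agree.
not_exists_before = ids(NOT_EXISTS_SQL)
not_in_before = ids(NOT_IN_SQL)

# Add a NULL to B.id and run both again.
conn.execute("INSERT INTO B VALUES (NULL)")
not_exists_after = ids(NOT_EXISTS_SQL)
not_in_after = ids(NOT_IN_SQL)  # every comparison becomes unknown

print(not_exists_before, not_in_before, not_exists_after, not_in_after)
```

If B.id can be NULL, NOT EXISTS (or a subquery filtered with IS NOT NULL) is therefore the safer form regardless of speed.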

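The IN statement in MySQL is commonly described as a hash join between the outer and inner tables, while EXISTS loops over the outer table and queries the inner table on each iteration. It has long been widely believed that EXISTS is more efficient than IN, but that is not accurate; it depends on the situation.

If the two tables are of similar size, there is little difference between IN and EXISTS.

If one table is small and the other is large, use EXISTS when the subquery table is the large one and IN when the subquery table is the small one: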

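For example, with table A (small) and table B (large):

1:

select * from A where cc in (select cc from B);    -- less efficient: uses the index on column cc of table A

select * from A where exists (select cc from B where cc = A.cc);    -- more efficient: uses the index on column cc of table B

Conversely:

2:

select * from B where cc in (select cc from A);    -- more efficient: uses the index on column cc of table B

select * from B where exists (select cc from A where cc = B.cc);    -- less efficient: uses the index on column cc of table A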
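The rule of thumb in these two cases can be written down as a tiny cost comparison. This is a deliberately crude sketch of the article's model only: it assumes EXISTS costs one index probe per outer row and IN costs one index probe per subquery row, ignores everything else the optimizer considers, and the function name cheaper_form and the row counts are hypothetical.

```python
def cheaper_form(outer_rows: int, subquery_rows: int) -> str:
    """Pick IN or EXISTS using the article's probe-count model."""
    exists_probes = outer_rows   # EXISTS: probe the subquery table's index per outer row
    in_probes = subquery_rows    # IN: probe the outer table's index per subquery row
    if exists_probes < in_probes:
        return "exists"
    if in_probes < exists_probes:
        return "in"
    return "either"


SMALL_A = 1_000        # hypothetical row count of table A
LARGE_B = 1_000_000    # hypothetical row count of table B

# Case 1: select * from A where ... (subquery on B)
case1 = cheaper_form(outer_rows=SMALL_A, subquery_rows=LARGE_B)

# Case 2: select * from B where ... (subquery on A)
case2 = cheaper_form(outer_rows=LARGE_B, subquery_rows=SMALL_A)

# Tables of similar size: the model sees no difference.
same_size = cheaper_form(outer_rows=5_000, subquery_rows=5_000)

print(case1, case2, same_size)
```

In practice the plan MySQL chooses is best confirmed with EXPLAIN on the real tables rather than inferred from a model like this.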

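NOT IN and NOT EXISTS: if a query uses NOT IN, both the inner and the outer table are scanned in full and no index is used, whereas the subquery of NOT EXISTS can still use the index on its table. So regardless of which table is larger, NOT EXISTS is generally faster than NOT IN.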

The difference between IN and =

select name from student where name in ('zhang','wang','li','zhao');

and

select name from student where name='zhang' or name='li' or name='wang' or name='zhao';

return the same results.