I. The order in which the keywords of a SELECT statement are written
SELECT DISTINCT <select_list>
FROM <left_table>
<join_type> JOIN <right_table>
ON <join_condition>
WHERE <where_condition>
GROUP BY <group_by_list>
HAVING <having_condition>
ORDER BY <order_by_condition>
LIMIT <limit_number>
II. The order in which the keywords of a SELECT statement are executed
(7) SELECT
(8) DISTINCT <select_list>
(1) FROM <left_table>
(3) <join_type> JOIN <right_table>
(2) ON <join_condition>
(4) WHERE <where_condition>
(5) GROUP BY <group_by_list>
(6) HAVING <having_condition>
(9) ORDER BY <order_by_condition>
(10) LIMIT <limit_number>
III. Preparing the tables and data
1. Create a test database named TestDB and switch to it:
create database TestDB;
use TestDB;
2. Create the test tables table1 and table2:
CREATE TABLE table1
(
customer_id VARCHAR(10) NOT NULL,
city VARCHAR(10) NOT NULL,
PRIMARY KEY(customer_id)
)ENGINE=INNODB DEFAULT CHARSET=UTF8;
CREATE TABLE table2
(
order_id INT NOT NULL auto_increment,
customer_id VARCHAR(10),
PRIMARY KEY(order_id)
)ENGINE=INNODB DEFAULT CHARSET=UTF8;
3. Insert the test data:
INSERT INTO table1(customer_id,city) VALUES('163','hangzhou');
INSERT INTO table1(customer_id,city) VALUES('9you','shanghai');
INSERT INTO table1(customer_id,city) VALUES('tx','hangzhou');
INSERT INTO table1(customer_id,city) VALUES('baidu','hangzhou');
INSERT INTO table2(customer_id) VALUES('163');
INSERT INTO table2(customer_id) VALUES('163');
INSERT INTO table2(customer_id) VALUES('9you');
INSERT INTO table2(customer_id) VALUES('9you');
INSERT INTO table2(customer_id) VALUES('9you');
INSERT INTO table2(customer_id) VALUES('tx');
INSERT INTO table2(customer_id) VALUES(NULL);
Once the setup is done, table1 and table2 should look like this:
mysql> select * from table1;
+-------------+----------+
| customer_id | city |
+-------------+----------+
| 163 | hangzhou |
| 9you | shanghai |
| baidu | hangzhou |
| tx | hangzhou |
+-------------+----------+
4 rows in set (0.00 sec)
mysql> select * from table2;
+----------+-------------+
| order_id | customer_id |
+----------+-------------+
| 1 | 163 |
| 2 | 163 |
| 3 | 9you |
| 4 | 9you |
| 5 | 9you |
| 6 | tx |
| 7 | NULL |
+----------+-------------+
7 rows in set (0.00 sec)
IV. Preparing the test query for logical query processing
# Find the customers from Hangzhou who have fewer than 2 orders.
SELECT a.customer_id, COUNT(b.order_id) as total_orders
FROM table1 AS a
LEFT JOIN table2 AS b
ON a.customer_id = b.customer_id
WHERE a.city = 'hangzhou'
GROUP BY a.customer_id
HAVING count(b.order_id) < 2
ORDER BY total_orders DESC;
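V. Analyzing the execution order
Every step in the execution of this statement produces a virtual table that holds the result of that step (this is the key point). Below I trace how this virtual table changes on the way to the final result, and use that to analyze the order and process of logical query processing.
Executing the FROM clause
The first step is the FROM clause. We first need to know which table to start from, and that is what FROM tells us. Here we have two tables, <left_table> and <right_table>. Do we start from one of them, or do we first combine the two in some way and start from that? And how are they combined? The answer is the Cartesian product.
If you are not familiar with the Cartesian product, please look it up. After FROM performs the Cartesian product of the two tables, we get a virtual table, which I will call VT1 (virtual table 1), with the following content: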
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| 163 | hangzhou | 1 | 163 |
| 9you | shanghai | 1 | 163 |
| baidu | hangzhou | 1 | 163 |
| tx | hangzhou | 1 | 163 |
| 163 | hangzhou | 2 | 163 |
| 9you | shanghai | 2 | 163 |
| baidu | hangzhou | 2 | 163 |
| tx | hangzhou | 2 | 163 |
| 163 | hangzhou | 3 | 9you |
| 9you | shanghai | 3 | 9you |
| baidu | hangzhou | 3 | 9you |
| tx | hangzhou | 3 | 9you |
| 163 | hangzhou | 4 | 9you |
| 9you | shanghai | 4 | 9you |
| baidu | hangzhou | 4 | 9you |
| tx | hangzhou | 4 | 9you |
| 163 | hangzhou | 5 | 9you |
| 9you | shanghai | 5 | 9you |
| baidu | hangzhou | 5 | 9you |
| tx | hangzhou | 5 | 9you |
| 163 | hangzhou | 6 | tx |
| 9you | shanghai | 6 | tx |
| baidu | hangzhou | 6 | tx |
| tx | hangzhou | 6 | tx |
| 163 | hangzhou | 7 | NULL |
| 9you | shanghai | 7 | NULL |
| baidu | hangzhou | 7 | NULL |
| tx | hangzhou | 7 | NULL |
+-------------+----------+----------+-------------+
There are 28 rows in total (rows in table1 * rows in table2 = 4 * 7). This is VT1, and the following steps operate on VT1.
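VT1 is only a logical construct, but it can be reproduced by hand as a quick check. The following is a minimal sketch, assuming the tables and data from section III; the aliases order_customer_id and vt1_rows are my own and do not appear in the original query.

```sql
-- Reproduce VT1: every row of table1 paired with every row of table2.
SELECT a.customer_id, a.city, b.order_id, b.customer_id AS order_customer_id
FROM table1 AS a
CROSS JOIN table2 AS b;

-- Confirm the size of the Cartesian product: 4 * 7 = 28.
SELECT COUNT(*) AS vt1_rows
FROM table1 AS a
CROSS JOIN table2 AS b;
```

The row order returned by the first query is not guaranteed to match the listing above, since no ORDER BY is given.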
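Applying the ON filter
After the Cartesian product, VT1 is filtered with the condition ON a.customer_id = b.customer_id. Only the rows for which the condition is true are kept, which gives the virtual table VT2: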
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| 163 | hangzhou | 1 | 163 |
| 163 | hangzhou | 2 | 163 |
| 9you | shanghai | 3 | 9you |
| 9you | shanghai | 4 | 9you |
| 9you | shanghai | 5 | 9you |
| tx | hangzhou | 6 | tx |
+-------------+----------+----------+-------------+
VT2 is the useful data left after filtering by the ON condition, and the following steps operate on VT2.
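Since no outer rows have been added yet, VT2 is exactly what an inner join would return. The sketch below, again assuming the test data from section III, also shows why order 7 (whose customer_id is NULL) and customer baidu do not appear: a comparison with NULL is never true. The alias names are hypothetical.

```sql
-- VT2 is equivalent to the result of an inner join on the ON condition.
SELECT a.customer_id, a.city, b.order_id, b.customer_id AS order_customer_id
FROM table1 AS a
INNER JOIN table2 AS b
  ON a.customer_id = b.customer_id;

-- Comparing anything with NULL yields NULL (unknown), not true,
-- so the row with order_id = 7 can never satisfy the ON condition.
SELECT NULL = '163' AS comparison_result;
```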
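Adding outer rows
This step only happens when the join type is an OUTER JOIN, such as LEFT OUTER JOIN, RIGHT OUTER JOIN and FULL OUTER JOIN (MySQL itself does not support FULL OUTER JOIN; it is listed here to complete the picture). Most of the time we omit the OUTER keyword, but OUTER is exactly what expresses the concept of outer rows.
LEFT OUTER JOIN marks the left table as the preserved table, and the result is: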
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| 163 | hangzhou | 1 | 163 |
| 163 | hangzhou | 2 | 163 |
| 9you | shanghai | 3 | 9you |
| 9you | shanghai | 4 | 9you |
| 9you | shanghai | 5 | 9you |
| tx | hangzhou | 6 | tx |
| baidu | hangzhou | NULL | NULL |
+-------------+----------+----------+-------------+
RIGHT OUTER JOIN marks the right table as the preserved table, and the result is:
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| 163 | hangzhou | 1 | 163 |
| 163 | hangzhou | 2 | 163 |
| 9you | shanghai | 3 | 9you |
| 9you | shanghai | 4 | 9you |
| 9you | shanghai | 5 | 9you |
| tx | hangzhou | 6 | tx |
| NULL | NULL | 7 | NULL |
+-------------+----------+----------+-------------+
FULL OUTER JOIN marks both the left and the right table as preserved tables, and the result is:
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| 163 | hangzhou | 1 | 163 |
| 163 | hangzhou | 2 | 163 |
| 9you | shanghai | 3 | 9you |
| 9you | shanghai | 4 | 9you |
| 9you | shanghai | 5 | 9you |
| tx | hangzhou | 6 | tx |
| baidu | hangzhou | NULL | NULL |
| NULL | NULL | 7 | NULL |
+-------------+----------+----------+-------------+
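Adding outer rows means taking VT2 and adding back the rows of the preserved table that were removed by the filter condition, with the columns of the non-preserved table set to NULL. The result is the virtual table VT3.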
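The three results above can be reproduced with the test data. This is a sketch under the assumption that it runs on MySQL, where FULL OUTER JOIN is not available, so the third result is emulated by combining a LEFT JOIN with the unmatched rows of a RIGHT JOIN. The alias order_customer_id is my own.

```sql
-- Left table preserved: adds the outer row for customer baidu.
SELECT a.customer_id, a.city, b.order_id, b.customer_id AS order_customer_id
FROM table1 AS a
LEFT JOIN table2 AS b ON a.customer_id = b.customer_id;

-- Right table preserved: adds the outer row for order 7.
SELECT a.customer_id, a.city, b.order_id, b.customer_id AS order_customer_id
FROM table1 AS a
RIGHT JOIN table2 AS b ON a.customer_id = b.customer_id;

-- Emulated full outer join: all rows of the LEFT JOIN, plus the rows of the
-- RIGHT JOIN that found no match (a.customer_id is NOT NULL in table1,
-- so it is NULL here only for unmatched orders).
SELECT a.customer_id, a.city, b.order_id, b.customer_id AS order_customer_id
FROM table1 AS a
LEFT JOIN table2 AS b ON a.customer_id = b.customer_id
UNION ALL
SELECT a.customer_id, a.city, b.order_id, b.customer_id
FROM table1 AS a
RIGHT JOIN table2 AS b ON a.customer_id = b.customer_id
WHERE a.customer_id IS NULL;
```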
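My test query uses LEFT JOIN, and the ON filter removed customer baidu from the preserved table table1 because it has no matching order. As an outer row it looks like this:
| baidu | hangzhou | NULL | NULL |
Adding this row to VT2 gives the following VT3: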
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| 163 | hangzhou | 1 | 163 |
| 163 | hangzhou | 2 | 163 |
| 9you | shanghai | 3 | 9you |
| 9you | shanghai | 4 | 9you |
| 9you | shanghai | 5 | 9you |
| tx | hangzhou | 6 | tx |
| baidu | hangzhou | NULL | NULL |
+-------------+----------+----------+-------------+
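The following steps operate on VT3.
Applying the WHERE filter
The WHERE filter is applied to VT3, the table produced by adding the outer rows. Only the rows that satisfy <where_condition> are written to the virtual table VT4. Executing WHERE a.city = 'hangzhou' gives the following content, which is stored in VT4: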
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| 163 | hangzhou | 1 | 163 |
| 163 | hangzhou | 2 | 163 |
| tx | hangzhou | 6 | tx |
| baidu | hangzhou | NULL | NULL |
+-------------+----------+----------+-------------+
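Two points need attention when using the WHERE clause:
Because the data has not been grouped yet, the WHERE filter cannot use conditions on aggregates such as where_condition=MIN(col);
Because the columns have not been selected yet, column aliases defined in SELECT cannot be used in WHERE either. For example, SELECT city as c FROM t WHERE c='shanghai'; is not allowed.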
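Both restrictions are easy to see on the test tables. In the sketch below the two rejected statements are left as comments and each is followed by a form that works; the subquery rewrite is only one possible workaround and is my own illustration, not part of the original query.

```sql
-- Rejected: the alias c does not exist yet when WHERE is evaluated
-- (MySQL reports an unknown column in the WHERE clause).
-- SELECT city AS c FROM table1 WHERE c = 'shanghai';

-- Works: filter on the underlying column instead of the alias.
SELECT city AS c FROM table1 WHERE city = 'shanghai';

-- Rejected: aggregates cannot be used in WHERE because grouping
-- has not happened yet (MySQL reports an invalid use of a group function).
-- SELECT customer_id FROM table2 WHERE order_id = MIN(order_id);

-- Works: compute the aggregate in a subquery and compare against its result.
SELECT customer_id
FROM table2
WHERE order_id = (SELECT MIN(order_id) FROM table2);
```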
Executing GROUP BY
The GROUP BY clause groups the virtual table produced by the WHERE clause. Executing GROUP BY a.customer_id from the test query gives the following content (only the first row of each group is displayed):
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| 163 | hangzhou | 1 | 163 |
| baidu | hangzhou | NULL | NULL |
| tx | hangzhou | 6 | tx |
+-------------+----------+----------+-------------+
The result is stored in the virtual table VT5, and the following steps operate on VT5.
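The listing above shows only one row per group, which hides what each group actually contains. As a rough way to look inside the groups, the sketch below uses MySQL's GROUP_CONCAT; the column names rows_in_group and order_ids are made up for this illustration.

```sql
-- Inspect each group of VT5: how many rows it holds, how many of them
-- carry a real order, and which order_ids they are.
SELECT a.customer_id,
       COUNT(*)          AS rows_in_group,  -- counts the outer row too
       COUNT(b.order_id) AS total_orders,   -- ignores NULL order_id values
       GROUP_CONCAT(b.order_id ORDER BY b.order_id) AS order_ids
FROM table1 AS a
LEFT JOIN table2 AS b
  ON a.customer_id = b.customer_id
WHERE a.city = 'hangzhou'
GROUP BY a.customer_id;
```

For customer baidu the group consists of the single outer row, so COUNT(*) is 1 while COUNT(b.order_id) is 0. This is why the test query counts b.order_id rather than using COUNT(*).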
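Applying the HAVING filter
The HAVING clause is mainly used together with GROUP BY and filters the grouped virtual table VT5. Executing HAVING count(b.order_id) < 2 from the test query removes the group for customer 163, which has 2 orders, and gives the following content, stored in the virtual table VT6: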
+-------------+----------+----------+-------------+
| customer_id | city | order_id | customer_id |
+-------------+----------+----------+-------------+
| baidu | hangzhou | NULL | NULL |
| tx | hangzhou | 6 | tx |
+-------------+----------+----------+-------------+
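The SELECT list
Only now is the SELECT clause executed. Do not assume that SELECT is executed first just because it is written on the first line.
Executing SELECT a.customer_id, COUNT(b.order_id) as total_orders from the test query picks the content we need from the virtual table VT6. The result, the virtual table VT7, is: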
+-------------+--------------+
| customer_id | total_orders |
+-------------+--------------+
| baidu | 0 |
| tx | 1 |
+-------------+--------------+
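Executing the DISTINCT clause
If the query specifies DISTINCT, an in-memory temporary table is created (if it does not fit in memory, it has to be stored on disk). This temporary table has the same structure as the virtual table VT7 produced by the previous step, except that a unique index is added on the columns that DISTINCT applies to, which is how duplicate rows are removed.
My test query does not use DISTINCT, so in this query this step does not produce a virtual table.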
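Since the test query skips this step, here is a separate small sketch of DISTINCT on the test data. It only shows the logical effect (duplicates removed after the SELECT list is computed) and says nothing about how the server implements it internally.

```sql
-- Without DISTINCT: one row per customer, so hangzhou appears three times.
SELECT city FROM table1;

-- With DISTINCT: duplicates are removed from the selected column list,
-- leaving one row each for hangzhou and shanghai.
SELECT DISTINCT city FROM table1;
```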
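Executing the ORDER BY clause
The content of the virtual table is sorted by the specified columns and a new virtual table, VT8, is returned. Because SELECT has already been executed, the alias total_orders can be used here. Executing ORDER BY total_orders DESC from the test query gives the following content: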
+-------------+--------------+
| customer_id | total_orders |
+-------------+--------------+
| tx | 1 |
| baidu | 0 |
+-------------+--------------+
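Because ORDER BY runs after SELECT, it is the first clause in which the alias total_orders is reliably available. The sketch below drops the WHERE and HAVING filters so that all four customers are visible, and adds a.customer_id as a second sort key, which is my own addition to make the order deterministic when two customers have the same count.

```sql
-- ORDER BY may reference the SELECT alias; the second key breaks ties.
SELECT a.customer_id, COUNT(b.order_id) AS total_orders
FROM table1 AS a
LEFT JOIN table2 AS b
  ON a.customer_id = b.customer_id
GROUP BY a.customer_id
ORDER BY total_orders DESC, a.customer_id;
```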
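Executing the LIMIT clause
The LIMIT clause picks the specified number of rows, starting from the specified position, out of the virtual table VT8 produced by the previous step. A LIMIT without ORDER BY returns rows in no guaranteed order, which is why LIMIT is so often used together with ORDER BY.
MySQL's LIMIT supports the following form:
LIMIT n, m
This skips the first n rows and returns the next m rows (n is an offset that starts at 0). Many developers like to use this form for pagination. For small data sets LIMIT causes no problems, but when the data volume is very large, LIMIT n, m is very inefficient. The reason is that LIMIT scans from the beginning every time: to read 3 rows starting at row 600,000, the server first has to scan through to row 600,000 and only then reads the rows, and that scan is very expensive. So when handling large data sets it is well worth building some kind of caching mechanism in the application layer (most large-scale data processing today relies on caching).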
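To make the offset semantics concrete, the sketch below pages through table2. The second query is a commonly used alternative to large offsets, often called keyset or seek pagination; it is not part of the original text and assumes that pages are ordered by an indexed, unique column such as order_id and that the application remembers the last order_id it displayed.

```sql
-- Offset pagination: skip 4 rows, return the next 3 (order_id 5, 6, 7).
-- The server still has to walk past the 4 skipped rows.
SELECT order_id, customer_id
FROM table2
ORDER BY order_id
LIMIT 4, 3;

-- Keyset pagination (assumed alternative): start right after the last
-- order_id of the previous page, so the primary key index can be used
-- to jump to the starting point instead of scanning from the beginning.
SELECT order_id, customer_id
FROM table2
WHERE order_id > 4
ORDER BY order_id
LIMIT 3;
```

On the test data both queries return the same three rows. The difference only matters when the offset grows large.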