Main text
Here we meet subqueries. A subquery can appear in the select clause, in the where clause, or in the from clause. It produces a result table, and in the from clause we can give that table a name.
Dataset
This article uses PostgreSQL's SQL syntax. We focus on read operations of the form select...from...where, that is, analytical queries.
The dataset can be used directly at https://hyper-db.de/interface.html. That page does not allow write operations, i.e. transactional queries such as insert, update, and delete. create table and drop table are not allowed either.
Schema:
Download:
https://db.in.tum.de/teaching/ws1920/grundlagen/uni_mysql.sql?lang=de
The schema and most of the SQL statements come from the slides and the book by Prof. Alfons Kemper, Ph.D.
Slides:
- https://db.in.tum.de/teaching/bookDBMSeinf/folien/?lang=de
- https://db.in.tum.de/teaching/bookDBMSeinf/folien/pdf/Kapitel4.pdf?lang=de
Book: https://db.in.tum.de/teaching/bookDBMSeinf/?lang=de
Intermediate SQL
- Find the rows of pruefen whose note is below the average:
select *
from pruefen
where note < (
select avg(note)
from pruefen
)
- For each row of professoren, sum the sws of the corresponding vorlesungen:
-- correlated sub-query
select p.persnr, p.name, (
select sum(v.sws) as lehrbelastung
from vorlesungen v
where v.gelesenvon = p.persnr
)
from professoren p
-- no sub-query
select p.persnr, p.name, sum(sws)
from professoren p left outer join vorlesungen v on p.persnr = v.gelesenvon
group by p.name, p.persnr
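The two formulations above can be compared on a few rows. The following is only a sketch: it assumes Python's built-in sqlite3 module in place of PostgreSQL, and the three professors and their lectures are invented for the illustration (only the table and column names come from the article). It also runs an inner join to show why the join version needs left outer join.

```python
import sqlite3

# Toy rows invented for this sketch; 'Curie' gives no lecture on purpose.
con = sqlite3.connect(":memory:")
con.executescript("""
    create table professoren (persnr integer primary key, name text);
    create table vorlesungen (vorlnr integer primary key, titel text,
                              sws integer, gelesenvon integer);
    insert into professoren values (1, 'Sokrates'), (2, 'Kant'), (3, 'Curie');
    insert into vorlesungen values
        (10, 'Ethik', 4, 1), (11, 'Logik', 4, 1), (12, 'Grundzuege', 4, 2);
""")

# Correlated subquery: evaluated once per professor.
correlated = con.execute("""
    select p.persnr, p.name, (
        select sum(v.sws)
        from vorlesungen v
        where v.gelesenvon = p.persnr
    ) as lehrbelastung
    from professoren p
    order by p.persnr
""").fetchall()

# Left outer join: professors without lectures survive with a NULL sum.
joined = con.execute("""
    select p.persnr, p.name, sum(v.sws) as lehrbelastung
    from professoren p left outer join vorlesungen v on p.persnr = v.gelesenvon
    group by p.persnr, p.name
    order by p.persnr
""").fetchall()

# Inner join: the professor without lectures disappears from the result.
inner = con.execute("""
    select p.persnr, p.name, sum(v.sws) as lehrbelastung
    from professoren p join vorlesungen v on p.persnr = v.gelesenvon
    group by p.persnr, p.name
    order by p.persnr
""").fetchall()

print(correlated)
print(joined)
print(inner)
```

On this data the first two queries return the same three rows, with NULL (None in Python) as the sum for the professor who gives no lecture, while the inner join returns only two rows.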
- Find the students who attend more than 2 lectures:
select tmp.matrnr, tmp.name, tmp.vorlanzahl
from (select s.matrnr, s.name, count(*) as vorlanzahl
from studenten s, hoeren h
where s.matrnr = h.matrnr
group by s.matrnr, s.name) tmp
where tmp.vorlanzahl > 2
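Here the result table of the subquery is named tmp. We can do the same thing with a with clause. I personally prefer with: it puts the temporary tables clearly at the top, and it is easier to debug. Both forms return the same result, and their run time is normally the same as well (PostgreSQL 12 and later inline a with query like this one, which is referenced only once; earlier versions always materialized it).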
with tmp as (select s.matrnr, s.name, count(*) as vorlanzahl
from studenten s, hoeren h
where s.matrnr = h.matrnr
group by s.matrnr, s.name)
select tmp.matrnr, tmp.name, tmp.vorlanzahl
from tmp
where tmp.vorlanzahl > 2
- Compute, for each lecture in vorlesungen, the share of all students who attend it:
select h.vorlnr, h.anzProVorl, g.gesamtAnz, cast(h.anzProVorl as decimal(6, 1)) / g.gesamtAnz as MarkAnteil
from (select vorlnr, count(*) as anzProVorl
from hoeren
group by vorlnr) as h,
(select count(*) as gesamtAnz
from studenten) g
-- with-clause version
with h as (select vorlnr, count(*) as anzProVorl
from hoeren
group by vorlnr),
g as (select count(*) as gesamtAnz
from studenten)
select h.vorlnr, h.anzProVorl, g.gesamtAnz, cast(h.anzProVorl as decimal(6, 1)) / g.gesamtAnz as MarkAnteil
from h, g
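The cast in the queries above is there because both counts are integers, and dividing two integers truncates the result. A minimal sketch of the difference, assuming Python's built-in sqlite3 module and invented toy rows; because SQLite has no fixed-point decimal type, the sketch casts to real instead of decimal(6, 1), which is a substitution made only for this illustration.

```python
import sqlite3

# Toy rows invented for this sketch: 4 students, 2 lectures.
con = sqlite3.connect(":memory:")
con.executescript("""
    create table studenten (matrnr integer primary key, name text);
    create table hoeren (matrnr integer, vorlnr integer,
                         primary key (matrnr, vorlnr));
    insert into studenten values (1, 'Anna'), (2, 'Ben'), (3, 'Cleo'), (4, 'Dmitri');
    insert into hoeren values (1, 10), (2, 10), (1, 11);
""")

# Same with-clause query as in the article; only the ratio expression varies.
query = """
    with h as (select vorlnr, count(*) as anzProVorl
               from hoeren
               group by vorlnr),
         g as (select count(*) as gesamtAnz
               from studenten)
    select h.vorlnr, {ratio} as MarkAnteil
    from h, g
    order by h.vorlnr
"""

# Integer / integer: the fraction is cut off.
truncated = con.execute(
    query.format(ratio="h.anzProVorl / g.gesamtAnz")).fetchall()

# Casting one operand first keeps the fraction.
exact = con.execute(
    query.format(ratio="cast(h.anzProVorl as real) / g.gesamtAnz")).fetchall()

print(truncated)
print(exact)
```

Without the cast every share on this data comes out as 0; with it, lecture 10 gets 0.5 and lecture 11 gets 0.25.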
- For each professor, compute the number and the share of students they know through their lectures:
with kenntSich as (
select distinct v.gelesenvon as profpersnr, h.matrnr as studmatrnr
from hoeren h join vorlesungen v on h.vorlnr =v.vorlnr
),
kenntAnzahl as (
select profpersnr, count(*) as anzstudenten
from kenntSich
group by profpersnr),
wieviel as (
select count(*) as gesamtanz
from studenten)
select k.profpersnr, p.name, k.anzstudenten, w.gesamtanz, 1.00 * k.anzstudenten / w.gesamtanz as bekanntheitsgard
from kenntAnzahl k, wieviel w, professoren p
where k.profpersnr = p.persnr
order by bekanntheitsgard desc
- Find the students who attend all lectures with sws = 4:
SELECT s.*
FROM studenten s
where not exists(
select *
from vorlesungen v
where v.sws = 4 and not exists(
select *
from hoeren h
where h.vorlnr = v.vorlnr and h.matrnr = s.matrnr
)
)
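SQL-92 does not define a for-all quantifier (universal quantifier), so we have to rewrite the query, which in the relational calculus reads:
$$\{ s \mid s \in \mathit{studenten} \land \forall v \in \mathit{vorlesungen}\,( v.sws = 4 \Rightarrow \exists h \in \mathit{hoeren}\,( h.vorlnr = v.vorlnr \land h.matrnr = s.matrnr ) ) \}$$
We first rewrite $\forall t \in R\,(P(t))$ as $\neg \exists t \in R\,(\neg P(t))$:
$$\{ s \mid s \in \mathit{studenten} \land \neg \exists v \in \mathit{vorlesungen}\,\neg( v.sws = 4 \Rightarrow \exists h \in \mathit{hoeren}\,( h.vorlnr = v.vorlnr \land h.matrnr = s.matrnr ) ) \}$$
Then we rewrite $A \Rightarrow B$ as $\neg A \lor B$:
$$\{ s \mid s \in \mathit{studenten} \land \neg \exists v \in \mathit{vorlesungen}\,\neg( \neg( v.sws = 4 ) \lor \exists h \in \mathit{hoeren}\,( h.vorlnr = v.vorlnr \land h.matrnr = s.matrnr ) ) \}$$
Finally we simplify with De Morgan's law:
$$\{ s \mid s \in \mathit{studenten} \land \neg \exists v \in \mathit{vorlesungen}\,( v.sws = 4 \land \neg \exists h \in \mathit{hoeren}\,( h.vorlnr = v.vorlnr \land h.matrnr = s.matrnr ) ) \}$$
In words: there is no lecture with sws = 4 that this student does not attend. This form maps directly onto the SQL above.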
Another solution, a trick, uses count:
-- First reduce hoeren to the rows for lectures with sws = 4: hoerenStudentenWith4SWS
with hoerenStudentenWith4SWS (matrnr, vorlnr) as (
select h.matrnr, v.vorlnr
from hoeren h, vorlesungen v
where h.vorlnr = v.vorlnr and v.sws = 4
)
-- Then check whether a student appears in hoerenStudentenWith4SWS once for every lecture with sws = 4
select h.matrnr
from hoerenStudentenWith4SWS h
group by h.matrnr
having count(*) = (select count(*) from vorlesungen v where v.sws = 4)
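The double not exists query and the count trick can be run side by side. The sketch below assumes Python's built-in sqlite3 module instead of PostgreSQL and uses invented toy rows. It also checks one edge case that is not discussed above and is added here only as an observation: when no lecture has sws = 4, the two queries stop agreeing.

```python
import sqlite3

# Toy rows invented for this sketch: lectures 10 and 11 have sws = 4.
con = sqlite3.connect(":memory:")
con.executescript("""
    create table studenten (matrnr integer primary key, name text);
    create table vorlesungen (vorlnr integer primary key, sws integer);
    create table hoeren (matrnr integer, vorlnr integer,
                         primary key (matrnr, vorlnr));
    insert into studenten values (1, 'Anna'), (2, 'Ben'), (3, 'Cleo');
    insert into vorlesungen values (10, 4), (11, 4), (12, 2);
    insert into hoeren values (1, 10), (1, 11), (1, 12), (2, 10), (3, 12);
""")

NOT_EXISTS = """
    select s.matrnr
    from studenten s
    where not exists (
        select * from vorlesungen v
        where v.sws = 4 and not exists (
            select * from hoeren h
            where h.vorlnr = v.vorlnr and h.matrnr = s.matrnr))
    order by s.matrnr
"""
COUNT_TRICK = """
    with hoerenStudentenWith4SWS (matrnr, vorlnr) as (
        select h.matrnr, v.vorlnr
        from hoeren h, vorlesungen v
        where h.vorlnr = v.vorlnr and v.sws = 4)
    select h.matrnr
    from hoerenStudentenWith4SWS h
    group by h.matrnr
    having count(*) = (select count(*) from vorlesungen v where v.sws = 4)
    order by h.matrnr
"""

def matrnrs(sql):
    """Run a query and return the matrnr column as a list."""
    return [row[0] for row in con.execute(sql)]

# Normal case: both queries find the one student who attends 10 and 11.
with_lectures = (matrnrs(NOT_EXISTS), matrnrs(COUNT_TRICK))

# Edge case: afterwards no lecture has sws = 4 any more.
con.execute("update vorlesungen set sws = 2")
without_lectures = (matrnrs(NOT_EXISTS), matrnrs(COUNT_TRICK))

print(with_lectures)
print(without_lectures)
```

In the normal case both queries return student 1. In the edge case the not exists version returns every student, because the condition holds vacuously, while the count version returns nobody, because the grouped table is empty. The count version also relies on hoeren containing no duplicate (matrnr, vorlnr) pairs, which the sketch enforces with a primary key.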
- (A similar exercise) Find the students for whom every exam they have taken belongs to a lecture they attend:
select s.*
from studenten s
where not exists(
select *
from pruefen p
where p.matrnr = s.matrnr and not exists(
select *
from hoeren h
where h.vorlnr = p.vorlnr and h.matrnr = s.matrnr
)
)
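In words: there is no examined lecture that is missing from this student's rows in hoeren.
This requirement applies to each student individually: students take different exams, so the lectures they are required to have attended differ as well. The trick from the previous exercise therefore no longer applies as is. It needs a condition that is the same for all students rather than one that depends on the individual student.
- (A similar exercise) Find the students who have taken and passed (note <= 4) an exam for every lecture they attend: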
select *
from Studenten s
where not exists (
select *
from hoeren h
where h.MatrNr = s.MatrNr and not exists (
select *
from pruefen p
where p.MatrNr = s.MatrNr and p.VorlNr = h.VorlNr and p.Note <= 4
)
)
In words: there is no attended lecture for which this student lacks a passing entry in pruefen.
The trick is again hard to apply here.
- Compute the average number of semesters of the students who attend at least one lecture given by Sokrates:
with vl_von_sokrates as (
select *
from vorlesungen v, professoren p
where v.gelesenvon = p.persnr and p.name = 'Sokrates'
), studenten_von_sokrates as (
select distinct s.name, s.matrnr, s.semester
from studenten s, hoeren h, vl_von_sokrates v
where s.matrnr = h.matrnr and h.vorlnr = v.vorlnr
)
select avg(semester)
from studenten_von_sokrates;
Be careful here: a student may attend several of Sokrates's lectures but must not be counted more than once. We can use distinct for that.
There is also a solution that needs no distinct. Instead of a join it uses a correlated subquery with exists:
with vl_von_sokrates as (
select *
from vorlesungen v, professoren p
where v.gelesenvon = p.persnr and p.name = 'Sokrates'
), studenten_von_sokrates as (
select *
from studenten s
where exists(
select *
from hoeren h, vl_von_sokrates vl
where h.matrnr = s.matrnr and h.vorlnr = vl.vorlnr
)
)
select avg(semester)
from studenten_von_sokrates;
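To see how much the duplicates matter, the sketch below runs the join without distinct, the join with distinct, and the exists version on the same rows. It assumes Python's built-in sqlite3 module in place of PostgreSQL, the rows are invented, and the helper avg_semester is a hypothetical name introduced only for this illustration.

```python
import sqlite3

# Toy rows invented for this sketch: Ben attends both lectures by Sokrates.
con = sqlite3.connect(":memory:")
con.executescript("""
    create table professoren (persnr integer primary key, name text);
    create table vorlesungen (vorlnr integer primary key, gelesenvon integer);
    create table studenten (matrnr integer primary key, name text, semester integer);
    create table hoeren (matrnr integer, vorlnr integer,
                         primary key (matrnr, vorlnr));
    insert into professoren values (1, 'Sokrates'), (2, 'Kant');
    insert into vorlesungen values (10, 1), (11, 1), (12, 2);
    insert into studenten values (1, 'Anna', 2), (2, 'Ben', 10), (3, 'Cleo', 4);
    insert into hoeren values (1, 10), (2, 10), (2, 11), (3, 12);
""")

def avg_semester(studenten_von_sokrates):
    """Plug one definition of studenten_von_sokrates into the outer query."""
    sql = f"""
        with vl_von_sokrates as (
            select v.vorlnr
            from vorlesungen v, professoren p
            where v.gelesenvon = p.persnr and p.name = 'Sokrates'
        ), studenten_von_sokrates as (
            {studenten_von_sokrates}
        )
        select avg(semester) from studenten_von_sokrates
    """
    return con.execute(sql).fetchone()[0]

JOIN = """select {} s.matrnr, s.semester
          from studenten s, hoeren h, vl_von_sokrates v
          where s.matrnr = h.matrnr and h.vorlnr = v.vorlnr"""
EXISTS = """select s.matrnr, s.semester
            from studenten s
            where exists (select * from hoeren h, vl_von_sokrates vl
                          where h.matrnr = s.matrnr and h.vorlnr = vl.vorlnr)"""

avg_join_plain = avg_semester(JOIN.format(""))             # Ben counted twice
avg_join_distinct = avg_semester(JOIN.format("distinct"))  # each student once
avg_exists = avg_semester(EXISTS)                          # each student once

print(avg_join_plain, avg_join_distinct, avg_exists)
```

On this data the plain join averages 2, 10 and 10, while the other two versions average 2 and 10. Note that distinct has to cover matrnr as well as semester; applied to semester alone it would also merge different students who happen to be in the same semester.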
- Compute how many lectures a student attends on average, counting the students who attend no lectures:
with h as (
select count(*) as hcount
from hoeren
),
s as (
select count(*) as scount
from studenten
)
select hcount / (scount * 1.00) as avg_vl
from h, s
Or:
with h as (
select count(*) as hcount
from hoeren
),
s as (
select count(*) as scount
from studenten
)
select hcount / (cast(scount as decimal(10, 4))) as avg_vl
from h, s
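Dividing the two totals is what keeps the students without lectures in the average. The sketch below contrasts it with two alternatives that are not from the article and are shown only for comparison: averaging per-student counts taken from hoeren alone, and averaging per-student counts taken from a left outer join. It assumes Python's built-in sqlite3 module and invented toy rows.

```python
import sqlite3

# Toy rows invented for this sketch: Cleo and Dmitri attend nothing.
con = sqlite3.connect(":memory:")
con.executescript("""
    create table studenten (matrnr integer primary key, name text);
    create table hoeren (matrnr integer, vorlnr integer,
                         primary key (matrnr, vorlnr));
    insert into studenten values (1, 'Anna'), (2, 'Ben'), (3, 'Cleo'), (4, 'Dmitri');
    insert into hoeren values (1, 10), (1, 11), (2, 10);
""")

def scalar(sql):
    """Run a query that returns a single value."""
    return con.execute(sql).fetchone()[0]

# The article's approach: total enrolments divided by total students.
ratio_of_totals = scalar("""
    with h as (select count(*) as hcount from hoeren),
         s as (select count(*) as scount from studenten)
    select hcount / (scount * 1.00) as avg_vl
    from h, s
""")

# Grouping hoeren alone: students without any row there are never seen.
only_attending = scalar("""
    select avg(n)
    from (select count(*) as n from hoeren group by matrnr) t
""")

# Left outer join: count(h.vorlnr) is 0 for students without lectures.
with_left_join = scalar("""
    select avg(n)
    from (select count(h.vorlnr) as n
          from studenten s left outer join hoeren h on s.matrnr = h.matrnr
          group by s.matrnr) t
""")

print(ratio_of_totals, only_attending, with_left_join)
```

With 3 enrolments and 4 students the ratio of totals and the left outer join both give 0.75, whereas grouping hoeren alone gives 1.5 because it averages over the two attending students only.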
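This article is published under the Creative Commons license CC BY-NC 3.0, which requires attribution, non-commercial use, and consistency. It may be reproduced as long as the terms of CC BY-NC 3.0 are met, but please credit the source with a hyperlink. The article reflects only the author's knowledge and opinions; if you see things differently, feel free to reply and discuss.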