一、單個(gè)字段的操作
這是數(shù)據(jù)庫(kù)中的表:
分組介紹:
Select? ?重復(fù)字段? ?From 表? Group By? 重復(fù)字段? Having Count(*)>1;
SELECT F19, COUNT(1) FROM [LiangZB6].[dbo].[HuiZhou_Room_test] GROUP BY F19 HAVING COUNT(*)>1;
查看是否有重復(fù)的數(shù)據(jù):
GROUP? BY? <列名序列>? HAVING? <組條件表達(dá)式>;
查詢出:根據(jù)dname分組,同時(shí)滿足having字句中組條件表達(dá)式(重復(fù)次數(shù)大于1)的那些組
count(*) 與 count(1) 其實(shí)沒(méi)有什么差別,用哪個(gè)都可以
count(*)與count(列名)的區(qū)別:
count(*)將返回表格中所有存在的行的總數(shù)包括值為null的行曙砂,然而count(列名)將返回表格中除去null以外的所有行的總數(shù)(有默認(rèn)值的列也會(huì)被計(jì)入)
1. 查詢?nèi)恐貜?fù)的數(shù)據(jù):
Select? *? From 表? Where 重復(fù)字段 In (Select 重復(fù)字段 From 表 Group By 重復(fù)字段 Having Count(*)>1);
2. 刪除全部重復(fù)試題:
將上面的查詢select改為delete(這樣會(huì)出錯(cuò)的)
DELETE? FROM? dept? WHERE? dname IN (SELECT? dname? FROM? dept? GROUP BY? dname? HAVING? count(1) > 1);
會(huì)出現(xiàn)如下錯(cuò)誤:[Err] 1093 - You can't specify target table 'dept' for update in FROM clause
原因是:更新這個(gè)表的同時(shí)又查詢了這個(gè)表炼列,查詢這個(gè)表的同時(shí)又去更新了這個(gè)表,可以理解為死鎖盗痒。mysql不支持這種更新查詢同一張表的操作
解決辦法:把要更新的幾列數(shù)據(jù)查詢出來(lái)做為一個(gè)第三方表横漏,然后篩選更新谨设。
3. 查詢表中多余重復(fù)試題(根據(jù)depno來(lái)判斷,除了rowid最小的一個(gè))
a. 第一種方法:
SELECT? *? FROM? dept? WHERE? dname IN (SELECT? dname? FROM? dept? GROUP BY? dname? HAVING? COUNT(1) > 1)??
AND? deptno? NOT? IN(SELECT? MIN(deptno)? FROM? dept? GROUP? BY? dname? HAVING? COUNT(1) > 1);
上面這種寫(xiě)法正確缎浇,但是查詢的速度太慢扎拣,可以試一下下面這種方法:
b. 第二種方法:
☆根據(jù)dname分組,查找出deptno最小的华畏。然后再查找deptno不包含剛才查出來(lái)的鹏秋。這樣就查詢出了所有的重復(fù)數(shù)據(jù)(除了deptno最小的那行)
SELECT? ?*? FROM? dept? WHERE? deptno? NOT? IN (SELECT? dt.minno? FROM? (SELECT? MIN(deptno)? AS? minno? FROM? dept??
GROUP? BY? dname)? dt);
c. 補(bǔ)充第三種方法(根據(jù)評(píng)論區(qū)給的刪除總結(jié)出來(lái)的):
SELECT? *? FROM? table_name AS ta? WHERE? ta.唯一鍵 <> (SELECT? max( tb.唯一鍵 )? FROM? table_name AS tb? WHERE ta.判斷重復(fù)的列 = tb.判斷重復(fù)的列 );
4. 刪除表中多余重復(fù)試題并且只留1條:
a. 第一種方法:
DELETE? FROM? dept? WHERE? dname IN (SELECT? t.dname? FROM? (SELECT? dname? FROM? dept? GROUP BY? dname? HAVING??
count(1) > 1) t)? AND? deptno NOT? IN? (SELECT? dt.mindeptno? FROM (SELECT? min(deptno)? AS? mindeptno? FROM? dept??
GROUP? BY? dname? HAVING? count(1) > 1)? dt);
b. ☆第二種方法(與上面查詢的第二種方法對(duì)應(yīng),只是將select改為delete):
DELETE? FROM? dept? WHERE? deptno NOT IN (SELECT? dt.minno? FROM (SELECT? MIN(deptno) AS minno? FROM? dept? GROUP BY
dname) dt);
c. 補(bǔ)充第三種方法(評(píng)論區(qū)推薦的一種方法):
DELETE? FROM? table_name AS ta? WHERE? ta.唯一鍵 <> (SELECT? t.maxid? FROM? ( SELECT max( tb.唯一鍵 ) AS maxid FROM table_name AS tb WHERE ta.判斷重復(fù)的列 = tb.判斷重復(fù)的列 ) t);
二亡笑、多個(gè)字段的操作:
單個(gè)字段的如果會(huì)了,多個(gè)字段也非常簡(jiǎn)單横朋。就是將group by 的字段增加為你想要的即可仑乌。
此處只寫(xiě)一個(gè),其他方法請(qǐng)仿照一個(gè)字段的寫(xiě)即可。
DELETE? FROM? dept? WHERE? (dname, db_source) IN (SELECT? t.dname, t.db_source? FROM (SELECT? dname,? db_source? FROM??
dept? GROUP? BY? dname,? db_source? HAVING? count(1) > 1) t)? AND? deptno? NOT? IN? (SELECT? dt.mindeptno? FROM? (SELECT??
min(deptno)? AS? mindeptno? FROM? dept? GROUP BY? dname, db_source? HAVING? count(1) > 1) dt);
總結(jié):
其實(shí)上面的方法還有很多需要優(yōu)化的地方晰甚,如果數(shù)據(jù)量太大的話衙传,執(zhí)行起來(lái)很慢,可以考慮加優(yōu)化一下:
在經(jīng)常查詢的字段上加上索引
將*改為你需要查詢出來(lái)的字段厕九,不要全部查詢出來(lái)
小表驅(qū)動(dòng)大表用IN蓖捶,大表驅(qū)動(dòng)小表用EXISTS。IN適合的情況是外表數(shù)據(jù)量小的情況扁远,而不是外表數(shù)據(jù)大的情況俊鱼,因?yàn)镮N會(huì)遍歷外表的全部數(shù)據(jù),假設(shè)a表100條畅买,b表10000條那么遍歷次數(shù)就是100*10000次并闲,而exists則是執(zhí)行100次去判斷a表中的數(shù)據(jù)是否在b表中存在,它只執(zhí)行了a.length次數(shù)谷羞。至于哪一個(gè)效率高是要看情況的帝火,因?yàn)閕n是在內(nèi)存中比較的,而exists則是進(jìn)行數(shù)據(jù)庫(kù)查詢操作的湃缎。