摘要
- 本文內(nèi)容基于10.4.8-MariaDB
Q: 假設(shè)一個訂單的編號規(guī)則是AAAAOrder2020-0000001客税,AAAAOrder2020-0000002…后面的數(shù)字是自增長,如果訂單號碼達到AAAAOrder2020-1000000(100萬)秉馏,數(shù)據(jù)庫中應(yīng)該有100萬條數(shù)據(jù),此時我隨機刪除2條數(shù)據(jù)(物理刪除脱羡,且不考慮日志和備份)萝究,請問怎么找到刪掉的數(shù)據(jù)的編號免都?給出解題思路即可,答案需要在1秒內(nèi)運行得到帆竹。
思路
- 其實查找丟失數(shù)據(jù)方法有很多绕娘,不過這里重點強調(diào)1秒內(nèi)運行得到,就限制成了只能在數(shù)據(jù)庫層面完成馆揉,實現(xiàn)方法基本上就是sql或者存儲過程业舍,而且要盡量減少表的掃描次數(shù)和掃描范圍、盡可能的使用索引升酣。
- 訂單編號的序號是自增長,所以可以認(rèn)為數(shù)據(jù)是按照編號自增的順序插入數(shù)據(jù)表的,如果數(shù)據(jù)表存在單獨的自增主鍵Id态罪,則可以基于錯位法得到缺失的主鍵Id噩茄,這樣可以使用主鍵索引,而且主鍵自增索引的查詢效率是最高的复颈。
- 如果沒有獨立的主鍵Id绩聘,而是使用的orderId作為主鍵,則需要對orderId截斷后進行比較耗啦,查詢條件一定要使用完整的orderId凿菩,以便使用索引,也可以使用行號和截斷后的訂單編號進行比較的方法帜讲。具體實現(xiàn)見下文衅谷。
參考答案,作者水平有限似将,僅供學(xué)習(xí)交流
- 存在自增主鍵Id获黔,與orderId序號一致,可以通過錯位法計算不存在的Id值在验。
運行時間基本符合要求玷氏。
select id-1 as id_del from test_order where id not in (select id+1 from test_order);
+--------+
| id_del |
+--------+
| 233490 |
| 943220 |
| 0 | #這里去除0號記錄,因為不存在id為0的記錄
+--------+
3 rows in set (1.063 sec)
- 如果不存在自增主鍵ID腋舌,而是使用orderId作為主鍵盏触,則也可以通過錯位法計算不存在的Id值,主要比較時一定要用完整的orderId块饺,否則不能使用索引赞辩。
運行時間超過要求扩灯。
SELECT RIGHT(orderId, 7) - 1 AS id_del FROM test_order
WHERE orderId NOT in(
SELECT concat( left(orderId, 14), LPAD( RIGHT(orderId, 7) + 1, 7, 0))
FROM test_order);
+--------+
| id_del |
+--------+
| 0 | #這里去除0號記錄篓叶,因為不存在id為0的記錄
| 233490 |
| 943220 |
+--------+
3 rows in set (2.999 sec)
- 如果不存在自增主鍵ID衍锚,而是使用orderId作為主鍵李剖,也可以比較orderId的序號與行號仿野,獲取到不一致的第一條記錄纪吮,然后在從該記錄開始繼續(xù)比較獲得下一條不一致的記錄当船,也可以通過存儲過程實現(xiàn)沦零。
運行時間符合要求。
sql示例1
-- 查詢第一條丟失記錄
select substring(a.orderId,15)-1 as orderId from (
SELECT @rownum:=@rownum+1 AS rownum, test_order.orderId FROM (SELECT @rownum:=0) r, test_order
) a where substring(a.orderId,15) != a.rownum limit 1;
+---------+
| orderId |
+---------+
| 233490 |
+---------+
1 row in set (0.341 sec)
-- 將上面查詢結(jié)果作為第二條sql的參數(shù)篮灼,這里上一條查詢結(jié)果為 233490忘古,所以下面手工填寫對應(yīng)的值,
-- 這里因為limit不能寫變量诅诱,所以需要手工填寫髓堪,如果希望一次執(zhí)行,可以參考后面的sql示例2和存儲過程示例
select substring(a.orderId,15)-1 as orderId from (
SELECT @rownum:=@rownum+1 AS rownum, test_order.orderId FROM (SELECT @rownum:=0) r, test_order
limit 233489,1000000
) a where substring(a.orderId,15)-233490 != a.rownum limit 1;
+---------+
| orderId |
+---------+
| 943220 |
+---------+
1 row in set (0.372 sec)
-- 驗證是否準(zhǔn)確
select orderId from test_order limit 233488,2;
+----------------------+
| orderId |
+----------------------+
| aaaaorder2020-233489 |
| aaaaorder2020-233491 |
+----------------------+
2 rows in set (0.036 sec)
sql示例2
-- 可以直接粘貼到mysql終端執(zhí)行娘荡,也可以保存到文件干旁,然后在mysql終端執(zhí)行source /xx/xx/xx.sql
-- 這里修改定界符,就是為了方便看到一個總的執(zhí)行時間
delimiter ;;
select substring(a.orderId,15)-1 as orderId into @first_order_id from (
SELECT @rownum:=@rownum+1 AS rownum, test_order.orderId FROM (SELECT @rownum:=0) r, test_order
) a where substring(a.orderId,15) != a.rownum limit 1;
SET @second_limit = @first_order_id - 1;
SET @limitsql = CONCAT( 'SELECT @rownum:=@rownum+1 AS rownum, test_order.orderId FROM (SELECT @rownum:=0) r, test_order
limit ', @second_limit, ',1000000' );
SET @SQL = CONCAT( 'select substring(a.orderId,15)-1 as orderId into @second_order_id from (', @limitsql, '
) a where substring(a.orderId,15)-', @first_order_id, ' != a.rownum limit 1' );
PREPARE stmt FROM @SQL;
EXECUTE stmt;
SET @del_ids = concat( @first_order_id, ',', @second_order_id );
;;
delimiter ;
SELECT @del_ids;
source /Users/hanqf/Desktop/exec.sql
Query OK, 1 row affected (0.710 sec)
Query OK, 0 rows affected (0.710 sec)
Query OK, 0 rows affected (0.710 sec)
Query OK, 0 rows affected (0.710 sec)
Query OK, 0 rows affected (0.710 sec)
Statement prepared
Query OK, 1 row affected (0.710 sec)
Query OK, 0 rows affected (0.710 sec)
+---------------+
| @del_ids |
+---------------+
| 233490,943220 |
+---------------+
1 row in set (0.000 sec)
存儲過程示例
DROP PROCEDURE IF EXISTS find_delete_order_id;
CREATE PROCEDURE `find_delete_order_id`(OUT del_ids varchar(255))
BEGIN
DECLARE first_order_id INT;
DECLARE second_order_id INT;
DECLARE second_limit INT;
select substring(a.orderId,15)-1 as orderId into first_order_id from (
SELECT @rownum:=@rownum+1 AS rownum, test_order.orderId FROM (SELECT @rownum:=0) r, test_order
) a where substring(a.orderId,15) != a.rownum limit 1;
set second_limit = first_order_id-1;
select substring(a.orderId,15)-1 as orderId into second_order_id from (
SELECT @rownum:=@rownum+1 AS rownum, test_order.orderId FROM (SELECT @rownum:=0) r, test_order
limit second_limit,1000000
) a where substring(a.orderId,15)-first_order_id != a.rownum limit 1;
set del_ids = concat(first_order_id,',',second_order_id);
-- 此處為了方便測試炮沐,所以在存儲過程中就打印了結(jié)果
select del_ids;
END;
CALL `find_delete_order_id`(@del_ids);
+---------------+
| del_ids |
+---------------+
| 233490,943220 |
+---------------+
1 row in set (0.704 sec)
Query OK, 2 rows affected (0.704 sec)
select @del_ids;
+---------------+
| @del_ids |
+---------------+
| 233490,943220 |
+---------------+
1 row in set (0.000 sec)
表結(jié)構(gòu)
DROP TABLE IF EXISTS `test_order`;
CREATE TABLE `test_order` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`orderId` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index_order_id` (`orderId`) USING HASH
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
數(shù)據(jù)初始化
說明:
直接將100w的數(shù)據(jù)插入數(shù)據(jù)表有點慢争群,實測130多秒吧,這里可以先將數(shù)據(jù)初始化到內(nèi)存表中大年,然后再從內(nèi)存表導(dǎo)入即可换薄,總時間大概只需要12秒左右。
-
因為數(shù)據(jù)都是寫入內(nèi)存中翔试,所以寫入數(shù)據(jù)時可能會報內(nèi)存不足轻要,解決方法如下:
a.永久修改,在my.cnf中增加如下配置:
tmp_table_size=1G
max_heap_table_size=1Gb.臨時修改,mysql終端執(zhí)行:
set tmp_table_size = 1073741824;
set max_heap_table_size = 1073741824;
show variables like "%table_size%";
-- 創(chuàng)建內(nèi)存表垦缅,重啟數(shù)據(jù)庫后冲泥,內(nèi)存表中的數(shù)據(jù)會被清空,但是表結(jié)構(gòu)依舊存在失都,不需要時可以drop掉
DROP TABLE IF EXISTS `test_order_memory`;
CREATE TABLE `test_order_memory` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`orderId` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index_order_id` (`orderId`) USING HASH
) ENGINE=MEMORY AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
-- 初始化內(nèi)存表
DROP PROCEDURE IF EXISTS add_test_order_memory;
CREATE PROCEDURE `add_test_order_memory`(IN n int)
BEGIN
DECLARE i INT DEFAULT 1;
WHILE (i <= n ) DO
INSERT INTO test_order_memory (orderId) VALUES (concat('aaaaorder2020-',LPAD(i, 7, 0)));
set i=i+1;
END WHILE;
END;
-- 創(chuàng)建1000000條數(shù)據(jù)
call add_test_order_memory(1000000);
Query OK, 1000000 rows affected (7.354 sec)
-- 將內(nèi)存表中數(shù)據(jù)導(dǎo)入實際表中
INSERT into test_order SELECT * from test_order_memory;
Query OK, 1000000 rows affected (5.035 sec)
Records: 1000000 Duplicates: 0 Warnings: 0
-- 查看結(jié)果
select * from test_order limit 5;
+----+-----------------------+
| id | orderId |
+----+-----------------------+
| 1 | aaaaorder2020-0000001 |
| 2 | aaaaorder2020-0000002 |
| 3 | aaaaorder2020-0000003 |
| 4 | aaaaorder2020-0000004 |
| 5 | aaaaorder2020-0000005 |
+----+-----------------------+
5 rows in set (0.000 sec)
-- 刪除兩條數(shù)據(jù)柏蘑,這里就隨便填兩個id號
delete from test_order where id in (233490,943220);
Query OK, 2 rows affected (0.001 sec)