A note on a problem I recently ran into in e-commerce development: a database deadlock, and how I tracked it down.
Let's start with an error log from the business application.
The screenshot above is the error alert I received: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction. The message says that one of the database operations in our business code hit a deadlock. Now let's trace it.
1. Run “show engine innodb status” to get the current state of the InnoDB engine
------------------------
LATEST DETECTED DEADLOCK
------------------------
2017-01-04 09:25:17 7f553477d700
*** (1) TRANSACTION:
TRANSACTION 124378994, ACTIVE 0.007 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1184, 8 row lock(s), undo log entries 7
LOCK BLOCKING MySQL thread id: 11573556 block 11572504
MySQL thread id 11572504, OS thread handle 0x7f56342fb700, query id 3368968901 10.44.182.0 shzfstore updating
UPDATE `sku` SET `quantity`=quantity-'1',`lock_stock`=lock_stock+'1',`sys_version`=sys_version+1 WHERE `id` = '15608' AND `quantity` >= '1' limit 1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 393 page no 45 n bits 248 index `PRIMARY` of table `store_product`.`sku` trx id 124378994 lock_mode X locks rec but not gap waiting
Record lock, heap no 19 PHYSICAL RECORD: n_fields 19; compact format; info bits 0
......
*** (2) TRANSACTION:
TRANSACTION 124378995, ACTIVE 0.004 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 11573556, OS thread handle 0x7f553477d700, query id 3368968902 10.172.221.117 shzfstore updating
UPDATE `sku` SET `quantity`=quantity-'1',`lock_stock`=lock_stock+'1',`sys_version`=sys_version+1 WHERE `id` = '15504' AND `quantity` >= '1' limit 1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 393 page no 45 n bits 248 index `PRIMARY` of table `store_product`.`sku` trx id 124378995 lock_mode X locks rec but not gap
Record lock, heap no 19 PHYSICAL RECORD: n_fields 19; compact format; info bits 0
......
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 393 page no 43 n bits 240 index `PRIMARY` of table `store_product`.`sku` trx id 124378995 lock_mode X locks rec but not gap waiting
Record lock, heap no 81 PHYSICAL RECORD: n_fields 19; compact format; info bits 0
......
*** WE ROLL BACK TRANSACTION (2)
The LATEST DETECTED DEADLOCK section records the most recent deadlock.
Its timestamp, 2017-01-04 09:25:17, matches the time of the error log we received.
It also shows part of the data left behind when the two transactions contended for locks:
Transaction 1
UPDATE `sku`
SET `quantity`=quantity-'1',
    `lock_stock`=lock_stock+'1',
    `sys_version`=sys_version+1
WHERE `id` = '15608'
  AND `quantity` >= '1' limit 1
Transaction 2
UPDATE `sku`
SET `quantity`=quantity-'1',
    `lock_stock`=lock_stock+'1',
    `sys_version`=sys_version+1
WHERE `id` = '15504'
  AND `quantity` >= '1' limit 1
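Both resources involved in the deadlock are locked with lock_mode X, i.e. exclusive record locks on the PRIMARY index.
Finally, MySQL gives us one very important piece of information: “WE ROLL BACK TRANSACTION (2)”. MySQL rolled back transaction 2, so it was transaction 2's statement that hit the deadlock and was rolled back, which should be the one recorded in our error log. Transaction 1, on the other hand, went on to execute, so its SQL statements must have been recorded in the binlog.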
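Reading the deadlock section by eye works for a single incident, but the same facts can be pulled out mechanically. Below is a minimal Python sketch, assuming the section has the layout shown above; the function name `summarize_deadlock` and the sample constant are made up for illustration, and the sample is an abridged copy of the output in this post.

```python
import re

# Abridged copy of the LATEST DETECTED DEADLOCK output shown above.
SAMPLE = """\
*** (1) TRANSACTION:
TRANSACTION 124378994, ACTIVE 0.007 sec starting index read
LOCK BLOCKING MySQL thread id: 11573556 block 11572504
MySQL thread id 11572504, OS thread handle 0x7f56342fb700, query id 3368968901 10.44.182.0 shzfstore updating
UPDATE `sku` SET `quantity`=quantity-'1',`lock_stock`=lock_stock+'1',`sys_version`=sys_version+1 WHERE `id` = '15608' AND `quantity` >= '1' limit 1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
*** (2) TRANSACTION:
TRANSACTION 124378995, ACTIVE 0.004 sec starting index read
MySQL thread id 11573556, OS thread handle 0x7f553477d700, query id 3368968902 10.172.221.117 shzfstore updating
UPDATE `sku` SET `quantity`=quantity-'1',`lock_stock`=lock_stock+'1',`sys_version`=sys_version+1 WHERE `id` = '15504' AND `quantity` >= '1' limit 1
*** (2) HOLDS THE LOCK(S):
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
*** WE ROLL BACK TRANSACTION (2)
"""


def summarize_deadlock(section):
    """Collect thread id and statement per transaction, plus the rolled-back one."""
    transactions = {}
    current = None
    for raw in section.splitlines():
        line = raw.strip()
        header = re.match(r"\*\*\* \((\d+)\) TRANSACTION:", line)
        if header:
            current = int(header.group(1))
            transactions[current] = {"thread_id": None, "statement": None}
            continue
        if current is None:
            continue
        thread = re.match(r"MySQL thread id (\d+),", line)
        if thread:
            transactions[current]["thread_id"] = int(thread.group(1))
        elif line.upper().startswith(("UPDATE", "INSERT", "DELETE", "SELECT")):
            transactions[current]["statement"] = line
    victim = re.search(r"WE ROLL BACK TRANSACTION \((\d+)\)", section)
    return {
        "transactions": transactions,
        "rolled_back": int(victim.group(1)) if victim else None,
    }


summary = summarize_deadlock(SAMPLE)
survivor = [n for n in summary["transactions"] if n != summary["rolled_back"]][0]
print("rolled back:", summary["rolled_back"])
print("look for thread", summary["transactions"][survivor]["thread_id"], "in the binlog")
```

The thread id of the surviving transaction is the value to search for in the binlog in the next step.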
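2. Check the binlog to find the statements executed by transaction 1
Search criteria:
- SQL statement: the statement recorded at the time of the deadlock in LATEST DETECTED DEADLOCK.
- Thread ID (uniquely identifies the MySQL connection): MySQL thread id 11572504
- Execution time (timeline): 2017-01-04 09:25:17 7f553477d700
With these three identifiers, plus the “BEGIN” and “COMMIT” markers that delimit a transaction in the binlog, we can pin down the SQL statements in transaction 1 with near certainty.
The binlog content looks roughly like this: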
#170104 9:25:17 server id 3194178605 end_log_pos 137170469 CRC32 0x1b6559de Query thread_id=11572504 exec_time=0 error_code=0
SET TIMESTAMP=1483493117/*!*/;
BEGIN
......
### UPDATE `store_product`.`sku`
### WHERE
### @1=15504
### @2=11516
### @3=0.01
### @4=120065
### @5=109433
### @6=19
### SET
### @1=15504
### @2=11516
### @3=0.01
### @4=120065
### @5=109432
### @6=20
# at 137172557
......
### UPDATE `store_product`.`sku`
### WHERE
### @1=15608
### @2=11551
### @3=0.01
### @4=120077
### @5=109426
### @6=19
### SET
### @1=15608
### @2=11551
### @3=0.01
### @4=120077
### @5=109425
### @6=20
......
COMMIT/*!*/;
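The search described above can also be scripted against the decoded binlog text. Here is a rough Python sketch, assuming the text has the shape shown in this post (a Query event header carrying thread_id, then BEGIN, row images prefixed with ###, then COMMIT) and that column @1 is the primary key `id`. The function name `updates_for_thread` is hypothetical, and the sample is an abridged copy of the dump above.

```python
import re

# Abridged copy of the decoded binlog shown above.
BINLOG_SAMPLE = """\
#170104 9:25:17 server id 3194178605 end_log_pos 137170469 CRC32 0x1b6559de Query thread_id=11572504 exec_time=0 error_code=0
SET TIMESTAMP=1483493117/*!*/;
BEGIN
### UPDATE `store_product`.`sku`
### WHERE
### @1=15504
### @5=109433
### SET
### @1=15504
### @5=109432
# at 137172557
### UPDATE `store_product`.`sku`
### WHERE
### @1=15608
### @5=109426
### SET
### @1=15608
### @5=109425
COMMIT/*!*/;
"""


def updates_for_thread(binlog_text, thread_id):
    """List (table, @1 value) for row UPDATEs in transactions opened by thread_id."""
    results = []
    last_thread = None   # thread_id seen in the most recent event header
    in_target = False    # inside a BEGIN..COMMIT block of the wanted thread
    in_where = False     # inside the WHERE (before-image) part of a row event
    table = None
    for raw in binlog_text.splitlines():
        line = raw.strip()
        header = re.search(r"\bthread_id=(\d+)", line)
        if line.startswith("#") and header:
            last_thread = int(header.group(1))
        elif line == "BEGIN":
            in_target = last_thread == thread_id
        elif line.startswith("COMMIT"):
            in_target = False
        elif in_target and line.startswith("### UPDATE"):
            table = line[len("### UPDATE"):].strip()
            in_where = False
        elif in_target and line == "### WHERE":
            in_where = True
        elif in_target and line == "### SET":
            in_where = False
        elif in_target and in_where and table:
            column = re.match(r"###\s+@1=(.+)", line)
            if column:
                results.append((table, column.group(1)))
    return results


rows = updates_for_thread(BINLOG_SAMPLE, 11572504)
for table, key in rows:
    print(table, key)
```

The order of the returned rows is the order in which the thread touched them, which is exactly what the reconstruction in the following steps depends on.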
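3. Reconstruct the statements in transaction 2
With transaction 1's statements in hand, next we look for transaction 2's. Remember the error log from the beginning? It also recorded the request parameters in detail (not shown in the screenshot). From those parameters and our business code, we can reconstruct the statements transaction 2 was going to execute: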
BEGIN
......
### UPDATE `store_product`.`sku`
### WHERE
### @1=15608
### @2=11516
### @3=0.01
### @4=120065
### @5=109433
### @6=19
### SET
### @1=15608
### @2=11516
### @3=0.01
### @4=120065
### @5=109432
### @6=20
......
### UPDATE `store_product`.`sku`
### WHERE
### @1=15504
### @2=11551
### @3=0.01
### @4=120077
### @5=109426
### @6=19
### SET
### @1=15504
### @2=11551
### @3=0.01
### @4=120077
### @5=109425
### @6=20
......
COMMIT/*!*/;
With the SQL statements of both transactions, we can now reconstruct how the deadlock arose.
4. Look at the order in which the two transactions executed their statements:
Order | Transaction 1 | Transaction 2 | Notes |
---|---|---|---|
1 | begin | | |
2 | | begin | |
3 | UPDATE sku SET quantity=quantity-'1', lock_stock=lock_stock+'1', sys_version=sys_version+1 WHERE id='15504' AND quantity>='1' limit 1 | | Transaction 1 takes an X lock on the sku row with id 15504 |
4 | | UPDATE sku SET quantity=quantity-'1', lock_stock=lock_stock+'1', sys_version=sys_version+1 WHERE id='15608' AND quantity>='1' limit 1 | Transaction 2 takes an X lock on the sku row with id 15608 |
5 | UPDATE sku SET quantity=quantity-'1', lock_stock=lock_stock+'1', sys_version=sys_version+1 WHERE id='15608' AND quantity>='1' limit 1 | | This is the key step: transaction 1 wants an X lock on the sku row with id 15608, finds it already locked by someone else (transaction 2), and waits for the lock to be released |
6 | | UPDATE sku SET quantity=quantity-'1', lock_stock=lock_stock+'1', sys_version=sys_version+1 WHERE id='15504' AND quantity>='1' limit 1 | Transaction 2 wants an X (exclusive) lock on the sku row with id 15504, finds it held by another transaction, and that transaction is itself waiting for transaction 2 to commit. MySQL immediately rolls back transaction 2. (PHP sees the deadlock error returned by MySQL, writes it to the error log and sends a rollback, but MySQL has already done the rollback for it.) |
7 | executes successfully | | Transaction 1 sees the other lock released, obtains the X lock, and its statement succeeds |
8 | commit | | Transaction 1 commits and is written to the binlog |
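The sequence in the table can be replayed as a tiny wait-for graph to show why step 6 is the moment the deadlock appears. The Python sketch below is only an illustration, not how InnoDB is implemented: `find_deadlock` is a made-up helper, each transaction waits for at most one other, and it simply reports the transaction whose request closes the cycle. InnoDB picks its victim by its own criteria (it tries to roll back the smaller transaction), which in this incident also happened to be transaction 2.

```python
def find_deadlock(schedule):
    """Replay (transaction, row) X-lock requests in order.

    Returns the transaction whose request closes a wait cycle, or None.
    """
    owner = {}    # row id -> transaction currently holding its X lock
    waiting = {}  # transaction -> transaction it is waiting for
    for txn, row in schedule:
        holder = owner.get(row)
        if holder is None or holder == txn:
            owner[row] = txn  # lock granted immediately
            continue
        waiting[txn] = holder
        # Walk the wait-for chain from the holder; arriving back at txn is a cycle.
        current = holder
        while current in waiting:
            current = waiting[current]
            if current == txn:
                return txn
    return None


# Steps 3 to 6 of the table: each transaction locks one row, then asks for the other's.
incident = [("T1", 15504), ("T2", 15608), ("T1", 15608), ("T2", 15504)]
print(find_deadlock(incident))
```

At step 5 the chain is just T1 waiting for T2, which is an ordinary lock wait. Only when T2 asks for row 15504 does the chain loop back on itself.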
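Solutions
- Reduce the number of statements in each transaction.
- Adjust the order in which the business code issues its statements. For example, sort the rows by the value of the field used in the WHERE condition and execute the updates in that sorted order, so that every transaction locks rows in the same order and a deadlock becomes a plain lock wait.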
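The second suggestion can be sketched as follows. This is a minimal Python illustration, not the original PHP business code: the helper names `ordered_updates` and `build_statements` are invented, the `%s` placeholders assume a DB-API driver that uses that parameter style, and nothing here talks to a database.

```python
def ordered_updates(items):
    """Sort (sku_id, quantity) pairs by sku_id so every transaction locks rows in the same order."""
    return sorted(items, key=lambda item: item[0])


def build_statements(items):
    """Build parameterized stock-deduction statements in a deterministic id order."""
    sql = (
        "UPDATE `sku` SET `quantity`=quantity-%s, `lock_stock`=lock_stock+%s, "
        "`sys_version`=sys_version+1 WHERE `id`=%s AND `quantity`>=%s LIMIT 1"
    )
    return [(sql, (qty, qty, sku_id, qty)) for sku_id, qty in ordered_updates(items)]


# Two requests containing the same SKUs in opposite order, as in the incident.
request_a = [(15504, 1), (15608, 1)]
request_b = [(15608, 1), (15504, 1)]

for name, request in (("A", request_a), ("B", request_b)):
    ids = [params[2] for _, params in build_statements(request)]
    print(name, ids)
```

Both requests now touch id 15504 first, so whichever transaction gets there second waits on that row and never holds a lock the other one needs.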
Additional notes
- InnoDB row locks are placed on index records: the entries of the index used by the query condition, plus the corresponding primary key records.