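In one development task, the project's data was backed by several PostgreSQL databases. The business requirement was to read Table A (User) from DB-1, run some calculations on the data, bulk-insert the results into Table B (UserInfo) in DB-1 and Table C (Customer) in DB-2, and then delete the processed rows from Table A (User).
The connection configuration for DB-1 is in db.php: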
return [
'class' => 'yii\db\Connection',
'dsn' => 'pgsql:host=127.0.0.1;dbname=mydb1',
'username' => 'postgres',
'password' => '123456',
'charset' => 'utf8',
// Schema cache options (for production environment)
'enableSchemaCache' => true,
];
The connection configuration for DB-2 is in params.php:
return [
'backend_db' => [
'dsn' => 'pgsql:host=127.0.0.1;dbname=mydb2',
'username' => 'postgres',
'password' => '123456',
'charset' => 'utf8',
'enableSchemaCache' => true,
]
];
The rough implementation flow was as follows:
$datas = User::find()->where(['type' => $type])->with('user_type')->all();
if ($datas){
...
$transaction = User::getDb()->beginTransaction();
$transaction->setIsolationLevel(\yii\db\Transaction::SERIALIZABLE);
try {
$transaction2 = Customer::getDb()->beginTransaction();
$transaction2->setIsolationLevel(\yii\db\Transaction::SERIALIZABLE);
try {
foreach ($datas as $key => $user) {
...
$data = $data2 = $user->getAttributes();
...
$model = new UserInfo();
$model2 = new Customer();
$model->load($data, "");
$model2->load($data2, "");
if($model->save() && $model2->save() && $user->delete()){
$transaction2->commit();
$transaction->commit();
}
}
} catch(\Throwable $e) {
$transaction2->rollBack();
throw $e;
}
} catch(\Throwable $e) {
$transaction->rollBack();
throw $e;
}
}
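This flow works fine with a small amount of data, but execution time becomes very long once there is a lot of data.
So I started looking for the cause:
At first I suspected the transactions. After removing them, performance barely improved, so I ruled that out.
I added Yii::warning calls at every important step and compared the log timestamps. Each model save took nearly 1 second, so the problem was in saving the models.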
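The timing check described above can be sketched as a small helper. This is only an illustration, not the code from the project: the `timed` function and its parameter names are hypothetical, and the logger is passed in as a callable so that `Yii::warning` (or anything else) can be plugged in.

```php
<?php
/**
 * Hypothetical helper: runs $step, reports how long it took through $log,
 * and returns whatever $step returned.
 */
function timed(string $label, callable $step, callable $log)
{
    $start = microtime(true);
    $result = $step();
    $elapsed = microtime(true) - $start;
    $log(sprintf('%s took %.3f s', $label, $elapsed));

    return $result;
}

// Stand-alone demonstration with a logger that collects messages in an array.
$messages = [];
$value = timed('demo step', function () {
    usleep(1000); // stand-in for a slow call such as $model->save()
    return true;
}, function (string $message) use (&$messages) {
    $messages[] = $message;
});
```

Inside a Yii2 application the call could look like `timed('save UserInfo', function () use ($model) { return $model->save(); }, ['Yii', 'warning']);`, which writes one warning-level log entry per save.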
I enabled the Yii debug module and looked at the Database panel, which showed a large number of queries like this one:
SELECT d.nspname AS table_schema,
c.relname AS table_name,
a.attname AS column_name,
COALESCE(td.typname, tb.typname, t.typname) AS data_type,
COALESCE(td.typtype, tb.typtype, t.typtype) AS type_type,
a.attlen AS character_maximum_length,
pg_catalog.col_description(c.oid, a.attnum) AS column_comment,
a.atttypmod AS modifier,
a.attnotnull = false AS is_nullable,
CAST(pg_get_expr(ad.adbin, ad.adrelid) AS varchar) AS column_default,
coalesce(pg_get_expr(ad.adbin, ad.adrelid) ~ 'nextval',false) AS is_autoinc,
CASE WHEN COALESCE(td.typtype, tb.typtype, t.typtype) = 'e'::char THEN array_to_string((SELECT array_agg(enumlabel) FROM pg_enum WHERE enumtypid = COALESCE(td.oid, tb.oid, a.atttypid))::varchar[], ',')
ELSE NULL END AS enum_values,
CASE atttypid
WHEN 21 /*int2*/ THEN 16 WHEN 23 /*int4*/ THEN 32 WHEN 20 /*int8*/ THEN 64 WHEN 1700 /*numeric*/ THEN CASE WHEN atttypmod = -1 THEN null ELSE ((atttypmod - 4) >> 16) & 65535 END WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/ WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/ ELSE null END AS numeric_precision,
CASE WHEN atttypid IN (21, 23, 20) THEN 0 WHEN atttypid IN (1700) THEN CASE WHEN atttypmod = -1 THEN null ELSE (atttypmod - 4) & 65535 END ELSE null END AS numeric_scale,
CAST(
information_schema._pg_char_max_length(information_schema._pg_truetypid(a, t), information_schema._pg_truetypmod(a, t))
AS numeric ) AS size,
a.attnum = any (ct.conkey) as is_pkey,
COALESCE(NULLIF(a.attndims, 0), NULLIF(t.typndims, 0), (t.typcategory='A')::int) AS dimension
FROM pg_class c
LEFT JOIN pg_attribute a ON a.attrelid = c.oid
LEFT JOIN pg_attrdef ad ON a.attrelid = ad.adrelid AND a.attnum = ad.adnum
LEFT JOIN pg_type t ON a.atttypid = t.oid
LEFT JOIN pg_type tb ON (a.attndims > 0 OR t.typcategory='A') AND t.typelem > 0 AND t.typelem = tb.oid OR t.typbasetype > 0 AND t.typbasetype = tb.oid
LEFT JOIN pg_type td ON t.typndims > 0 AND t.typbasetype > 0 AND tb.typelem = td.oid
LEFT JOIN pg_namespace d ON d.oid = c.relnamespace
LEFT JOIN pg_constraint ct ON ct.conrelid = c.oid AND ct.contype = 'p'
WHERE a.attnum > 0 AND t.typname != '' AND c.relname = 'current_user' AND d.nspname = 'public'
ORDER BY a.attnum;
and
select ct.conname as constraint_name,
a.attname as column_name,
fc.relname as foreign_table_name,
fns.nspname as foreign_table_schema,
fa.attname as foreign_column_name
from (SELECT ct.conname, ct.conrelid, ct.confrelid, ct.conkey, ct.contype, ct.confkey, generate_subscripts(ct.conkey, 1) AS s
FROM pg_constraint ct
) AS ct
inner join pg_class c on c.oid=ct.conrelid
inner join pg_namespace ns on c.relnamespace=ns.oid
inner join pg_attribute a on a.attrelid=ct.conrelid and a.attnum = ct.conkey[ct.s]
left join pg_class fc on fc.oid=ct.confrelid
left join pg_namespace fns on fc.relnamespace=fns.oid
left join pg_attribute fa on fa.attrelid=ct.confrelid and fa.attnum = ct.confkey[ct.s]
where ct.contype='f' and c.relname='current_user' and ns.nspname='public'
order by fns.nspname, fc.relname, a.attnum
My first guess was that caching was the cause, so I set enableSchemaCache to false. The debug panel still showed the queries above.
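For context, Yii2 only caches table schemas between requests when `enableSchemaCache` is true and the cache component named by the connection's `schemaCache` property (`cache` by default) is actually configured. The excerpt below is a hypothetical application config, not the project's real one; the `FileCache` choice and the file name are assumptions.

```php
<?php
// config/web.php (hypothetical excerpt)
return [
    'components' => [
        // Schema caching needs a working cache component.
        'cache' => [
            'class' => 'yii\caching\FileCache',
        ],
        'db' => [
            'class' => 'yii\db\Connection',
            'dsn' => 'pgsql:host=127.0.0.1;dbname=mydb1',
            'username' => 'postgres',
            'password' => '123456',
            'charset' => 'utf8',
            'enableSchemaCache' => true,
            'schemaCacheDuration' => 3600, // seconds
            'schemaCache' => 'cache',      // ID of the cache component above
        ],
    ],
];
```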
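I copied part of the query text and searched the whole project for it; the statements come from yii2/db/pgsql/Schema.php. I suspected that the model has to look up the table structure, constraints and so on when inserting and validating data, so I first skipped validation on save with $model->save(false). Running it again, performance improved only slightly, and the debug panel still showed several hundred queries...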
So I dropped the model altogether and switched to plain SQL commands:
Customer::getDb()->createCommand()->insert('customer', $data)->execute();
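Running it again, performance still improved only slightly, and the debug panel still showed several hundred queries...
Why not use a batch insert from the start? Because inserting into Table B needed the id generated by the Table A insert as a foreign key, and at first I could not think of a good way to handle that. Once I had worked it out, I switched to batch insert: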
Customer::getDb()->createCommand()->batchInsert('customer', $columns, $data)->execute();
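At this point, processing 50 rows finished in about 2 seconds.
What about 1,000 rows?
I tested it right away and the result was disappointing: it took more than 10 seconds, which did not meet the customer's requirements.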
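As a sketch of what the batch insert call expects: `batchInsert` takes a table name, a list of column names, and a list of rows whose values follow the same column order. The column names (`name`, `email`, `user_id`) and both helper functions below are hypothetical, since the post does not show the real columns.

```php
<?php
/** Hypothetical column list for the customer table. */
const CUSTOMER_COLUMNS = ['name', 'email', 'user_id'];

/**
 * Turns associative user rows into positional rows in CUSTOMER_COLUMNS order.
 */
function buildCustomerRows(array $users): array
{
    $rows = [];
    foreach ($users as $user) {
        $rows[] = [$user['name'], $user['email'], $user['id']];
    }

    return $rows;
}

/**
 * Inserts all rows with a single statement and returns the number of rows
 * inserted. Requires Yii2; $db is any yii\db\Connection.
 */
function insertCustomers(\yii\db\Connection $db, array $users): int
{
    $rows = buildCustomerRows($users);
    if ($rows === []) {
        return 0; // nothing to insert
    }

    return $db->createCommand()
        ->batchInsert('customer', CUSTOMER_COLUMNS, $rows)
        ->execute();
}
```

Building the rows in a separate function keeps the per-row processing away from the database call, so the insert itself happens once rather than once per record.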
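With batch insert there were already far fewer queries than before. Looking carefully at the queries in the debug panel, it suddenly occurred to me that, because two DB connections were in use, Yii2 might only keep the table schema cache for the first connection, so that after switching databases for an insert it has to fetch the schema of the current table again. I then restructured the code along these lines: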
$datas = (new \yii\db\Query())->select('*')->from('user')->where('...')->all();
if ($datas){
...
$transaction = User::getDb()->beginTransaction();
$transaction->setIsolationLevel(\yii\db\Transaction::SERIALIZABLE);
try {
foreach ($datas as $key => $user) {
...
// process $data
...
}
if (User::getDb()->createCommand()->batchInsert(/* ... */)->execute() && User::getDb()->createCommand()->delete(/* ... */)->execute()) {
$transaction2 = Customer::getDb()->beginTransaction();
$transaction2->setIsolationLevel(\yii\db\Transaction::SERIALIZABLE);
try {
Customer::getDb()->createCommand()->batchInsert(/* ... */)->execute();
$transaction2->commit();
$transaction->commit();
} catch(\Throwable $e) {
$transaction2->rollBack();
throw $e;
}
}
} catch(\Throwable $e) {
$transaction->rollBack();
throw $e;
}
}
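The change is to finish all of the DB-1 work first and only then handle DB-2.
Now the extra queries are gone. There is only one schema query per insert, so two queries in total for the two inserts, and the debug panel reports a little over 200 ms of DB time for 1,200 records.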
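I thought that was the end of it, but during a later test I found that when an error occurred at a certain point, $transaction2 was not rolled back. I assumed Yii2 did not support nested transactions (strictly speaking these are not nested, just transactions open on two databases at the same time) and searched Baidu for an answer, but found nothing useful.
Going back to the debug panel, I noticed that after $transaction2 began, another DB open operation was executed. That suggested the connection used by transaction 2 had been replaced by a newly created connection after the transaction started. I kept checking the code and eventually found this: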
$transaction2 = Customer::getDb()->beginTransaction();
$transaction2->setIsolationLevel(\yii\db\Transaction::SERIALIZABLE);
try {
// getDb() is called again here
Customer::getDb()->createCommand()->batchInsert(/* ... */)->execute();
$transaction2->commit();
$transaction->commit();
} catch(\Throwable $e) {
$transaction2->rollBack();
throw $e;
}
Customer's getDb():
public static function getDb()
{
return new \yii\db\Connection(Yii::$app->params['backend_db']);
}
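Written this way there are definitely two connections: every call to getDb() creates a new database connection, so the insert runs on a different connection from the one that holds the transaction, and the transaction has no effect on it. Why does User::getDb() not break transaction 1 in the same way? Because transaction 1 uses the connection defined in db.php, which is Yii's default connection. Yii reuses that single connection for every DB operation, so the problem does not occur there.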
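One way to keep a single connection while still going through `getDb()` is to create the connection once and hand back the same object on every call. The `BackendDb` class below is a hypothetical helper, not part of Yii2 or of the original project; the optional `$factory` argument exists only so the sketch can be tried without a database.

```php
<?php
/**
 * Hypothetical holder for the DB-2 connection: the connection is created on
 * the first call and the same object is returned on every later call.
 */
final class BackendDb
{
    /** @var object|null The single shared connection, once created. */
    private static $connection = null;

    public static function get(?callable $factory = null)
    {
        if (self::$connection === null) {
            if ($factory === null) {
                // Default: build the connection from params.php (requires Yii2).
                $factory = function () {
                    return new \yii\db\Connection(\Yii::$app->params['backend_db']);
                };
            }
            self::$connection = $factory();
        }

        return self::$connection;
    }
}
```

With this in place, `Customer::getDb()` could simply `return BackendDb::get();`, so that `beginTransaction()` and the later `createCommand()` calls run on the same connection.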
The final version:
$datas = (new \yii\db\Query())->select('*')->from('user')->where('...')->all();
if ($datas){
...
$transaction = User::getDb()->beginTransaction();
$transaction->setIsolationLevel(\yii\db\Transaction::SERIALIZABLE);
try {
foreach ($datas as $key => $user) {
...
// process $data
...
}
if (User::getDb()->createCommand()->batchInsert(/* ... */)->execute() && User::getDb()->createCommand()->delete(/* ... */)->execute()) {
// The same connection must be used from here on, otherwise the transaction has no effect
$backend_conn = new \yii\db\Connection(Yii::$app->params['backend_db']);
$transaction2 = $backend_conn->beginTransaction();
$transaction2->setIsolationLevel(\yii\db\Transaction::SERIALIZABLE);
try {
$backend_conn->createCommand()->batchInsert(/* ... */)->execute();
$transaction2->commit();
$transaction->commit();
} catch(\Throwable $e) {
$transaction2->rollBack();
throw $e;
}
}
} catch(\Throwable $e) {
$transaction->rollBack();
throw $e;
}
}
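This write-up is rather long-winded, since it records hitting one pitfall after another, but I hope it gives a hint to anyone who runs into a similar problem.
Summary:
- When you hit a problem in day-to-day development, first check carefully whether the flow of your own code logic is at fault.
- Printing variable values step by step while debugging is a great help in locating errors.
- Yii2's debug module is really powerful; its logs reveal a lot of problems.
- For bulk inserts use batchInsert; saving through models is not recommended.
- When running transactions against multiple databases, avoid working on both databases at the same time. Finish the work for database 1 first and then handle database 2, so that Yii does not have to keep switching DB connections and re-querying table schemas.
- For any database connection other than the default one (db.php), check that the code shares a single connection instance.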