Our production Kafka cluster recently needed more nodes. Unlike Elasticsearch, however, Kafka does not detect a newly added node and automatically rebalance data onto it, so the redistribution has to be done by hand. After some trial and error, we managed to add nodes to the Kafka cluster and spread the existing data evenly across the expanded cluster. The rest of this post walks through the whole process with an example.
I. Environment
1. Cluster state
Assume the current cluster has only three Kafka nodes (101, 102, 103) and a topic named think_tank. The topic has 2 replicas, spread evenly across the three nodes.
2. Goal
After adding two new nodes to the cluster (104 and 105), we want to spread the think_tank data evenly across all 5 nodes of the expanded cluster.
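II. Procedure
Deploying the new Kafka nodes is not the focus of this post, so it is not covered here.
The "Expanding your cluster" section of the official documentation explains cluster expansion in detail. The process has three steps: get the reassignment plan that Kafka proposes, execute that plan, and check the progress and status of the reassignment. These three steps correspond to the three modes of the partition reassignment tool shipped with the Kafka scripts: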
--generate: In this mode, given a list of topics and a list of brokers, the tool generates a candidate reassignment to move all partitions of the specified topics to the new brokers. This option merely provides a convenient way to generate a partition reassignment plan given a list of topics and target brokers.
--execute: In this mode, the tool kicks off the reassignment of partitions based on the user provided reassignment plan. (using the --reassignment-json-file option). This can either be a custom reassignment plan hand crafted by the admin or provided by using the --generate option
--verify: In this mode, the tool verifies the status of the reassignment for all partitions listed during the last --execute. The status can be either of successfully completed, failed or in progress
Here is how this works for our example:
1. Generate a reassignment plan for the topic
The script takes its parameters as a JSON file, so first create a JSON file listing the topics to reassign, think_tank-to-move.json:
{
"topics":[
{
"topic":"think_tank"
},
{
"topic":"more topics can be listed here..."
}
],
"version":1
}
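If the list of topics is long, the file can be generated instead of typed by hand. The following is a minimal sketch, assuming only the JSON layout shown above; the function name build_topics_to_move is made up for illustration.

```python
import json


def build_topics_to_move(topics):
    """Return JSON text in the layout expected by --topics-to-move-json-file.

    `topics` is a list of topic names. The helper name is hypothetical;
    only the JSON structure comes from the example file above.
    """
    document = {
        "topics": [{"topic": name} for name in topics],
        "version": 1,
    }
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    # Print the document for a single topic; redirect it into
    # think_tank-to-move.json to use it with the script.
    print(build_topics_to_move(["think_tank"]))
```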
Use the kafka-reassign-partitions.sh script in the bin directory to request a proposed assignment:
./bin/kafka-reassign-partitions.sh --zookeeper your_zk_address:2181 --topics-to-move-json-file think_tank-to-move.json --broker-list "101,102,103,104,105" --generate
The --broker-list argument "101,102,103,104,105" lists the id of every broker that should hold data. Because we want the listed topics spread evenly across the 5 machines of the expanded cluster, all five ids must be given. By the same logic, if the requirement were instead to migrate all data smoothly from the old nodes (101, 102, 103) to the new nodes (104, 105), the argument would be "104,105".
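The only difference between the "expand" and "migrate" cases is the broker list, which a small helper makes explicit. This is a sketch for illustration only: generate_command is a hypothetical name, and it only assembles the same flags used in the command above without running anything.

```python
def generate_command(zk_address, topics_file, broker_ids):
    """Assemble the argument list for the --generate step.

    Uses only the flags that appear in the command above. The result can
    be passed to subprocess.run() on a machine with Kafka installed.
    """
    return [
        "./bin/kafka-reassign-partitions.sh",
        "--zookeeper", zk_address,
        "--topics-to-move-json-file", topics_file,
        "--broker-list", ",".join(str(broker) for broker in broker_ids),
        "--generate",
    ]


if __name__ == "__main__":
    # Expansion: spread data over the old and new brokers.
    expand = generate_command("your_zk_address:2181",
                              "think_tank-to-move.json",
                              [101, 102, 103, 104, 105])
    # Migration: move everything onto the new brokers only.
    migrate = generate_command("your_zk_address:2181",
                               "think_tank-to-move.json",
                               [104, 105])
    print(" ".join(expand))
    print(" ".join(migrate))
```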
The script returns the following output:
Current partition replica assignment
{"version":1,"partitions":[{"topic":"think_tank","partition":2,"replicas":[101,102]},{"topic":"think_tank","partition":4,"replicas":[103,102]},{"topic":"think_tank","partition":3,"replicas":[102,101]},{"topic":"think_tank","partition":0,"replicas":[102,103]},{"topic":"think_tank","partition":1,"replicas":[103,101]}]}
Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"think_tank","partition":2,"replicas":[103,101]},{"topic":"think_tank","partition":4,"replicas":[105,103]},{"topic":"think_tank","partition":3,"replicas":[104,102]},{"topic":"think_tank","partition":0,"replicas":[101,104]},{"topic":"think_tank","partition":1,"replicas":[102,105]}]}
The current assignment places all think_tank replicas on nodes 101, 102, and 103, while the proposed assignment spreads the replicas evenly across the 5 nodes of the expanded cluster.
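The balance is easier to see by counting replicas per broker than by reading the raw JSON. The sketch below does that for the two plans printed above; replicas_per_broker is a helper written for this post, not part of Kafka.

```python
import json
from collections import Counter

# The two JSON documents printed by --generate above.
CURRENT_PLAN = '{"version":1,"partitions":[{"topic":"think_tank","partition":2,"replicas":[101,102]},{"topic":"think_tank","partition":4,"replicas":[103,102]},{"topic":"think_tank","partition":3,"replicas":[102,101]},{"topic":"think_tank","partition":0,"replicas":[102,103]},{"topic":"think_tank","partition":1,"replicas":[103,101]}]}'
PROPOSED_PLAN = '{"version":1,"partitions":[{"topic":"think_tank","partition":2,"replicas":[103,101]},{"topic":"think_tank","partition":4,"replicas":[105,103]},{"topic":"think_tank","partition":3,"replicas":[104,102]},{"topic":"think_tank","partition":0,"replicas":[101,104]},{"topic":"think_tank","partition":1,"replicas":[102,105]}]}'


def replicas_per_broker(plan_json):
    """Count how many partition replicas each broker holds in a plan."""
    plan = json.loads(plan_json)
    counts = Counter()
    for partition in plan["partitions"]:
        counts.update(partition["replicas"])
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(replicas_per_broker(CURRENT_PLAN))   # {101: 3, 102: 4, 103: 3}
    print(replicas_per_broker(PROPOSED_PLAN))  # {101: 2, 102: 2, 103: 2, 104: 2, 105: 2}
```

With 5 partitions and 2 replicas each there are 10 replicas in total, so 2 per broker is a perfectly even split over 5 brokers.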
2. Execute the reassignment plan
Copy the proposed plan generated in the previous step into a new file, think_tank_reassignment.json:
{"version":1,"partitions":[{"topic":"think_tank","partition":2,"replicas":[103,101]},{"topic":"think_tank","partition":4,"replicas":[105,103]},{"topic":"think_tank","partition":3,"replicas":[104,102]},{"topic":"think_tank","partition":0,"replicas":[101,104]},{"topic":"think_tank","partition":1,"replicas":[102,105]}]}
Run the script:
./bin/kafka-reassign-partitions.sh --zookeeper your_zk_address:2181 --reassignment-json-file think_tank_reassignment.json --execute
The script returns:
Current partition replica assignment
{"version":1,"partitions":[{"topic":"think_tank","partition":2,"replicas":[101,102]},{"topic":"think_tank","partition":4,"replicas":[103,102]},{"topic":"think_tank","partition":3,"replicas":[102,101]},{"topic":"think_tank","partition":0,"replicas":[102,103]},{"topic":"think_tank","partition":1,"replicas":[103,101]}]}
Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.
As shown above, the reassignment has started successfully. The output also reminds you to save the previous assignment so that you can roll back to it if necessary.
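Saving the rollback plan can be scripted if you capture the output of the execute step. The following sketch assumes the output looks like the text shown above (a header line followed by one line of JSON); extract_rollback_plan is a hypothetical helper, and other Kafka versions may print the output differently.

```python
import json

HEADER = "Current partition replica assignment"


def extract_rollback_plan(execute_output):
    """Return the pre-reassignment plan from captured --execute output.

    Assumes the JSON document sits on the first non-empty line after the
    header, as in the output shown above.
    """
    lines = execute_output.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == HEADER:
            for candidate in lines[index + 1:]:
                if candidate.strip():
                    return json.loads(candidate)
    raise ValueError("no current assignment found in the output")


if __name__ == "__main__":
    sample = "\n".join([
        "Current partition replica assignment",
        '{"version":1,"partitions":[{"topic":"think_tank","partition":0,"replicas":[102,103]}]}',
        "Save this to use as the --reassignment-json-file option during rollback",
        "Successfully started reassignment of partitions.",
    ])
    # Print the plan; write it to a file to keep it for a rollback.
    print(json.dumps(extract_rollback_plan(sample)))
```

The saved file can then be passed to --reassignment-json-file with --execute to return to the original layout.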
3. Check reassignment progress
Check progress as follows. Note that the JSON file passed here must be the same file used in the execute step:
./bin/kafka-reassign-partitions.sh --zookeeper your_zk_address:2181 --reassignment-json-file think_tank_reassignment.json --verify
Output:
Reassignment of partition [think_tank,2] completed successfully
Reassignment of partition [think_tank,1] completed successfully
Reassignment of partition [think_tank,3] is still in progress
Reassignment of partition [think_tank,4] completed successfully
Reassignment of partition [think_tank,0] is still in progress
"is still in progress" means the partition is still being moved; once the whole migration succeeds, every partition shows "completed successfully". Note that for a topic with a lot of data this can take a while. Do not restart nodes casually during the reassignment, since that may lead to inconsistent data!
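When there are many partitions, it helps to reduce the verify output to a list of partitions that are not finished yet. This is a minimal sketch that assumes the exact line format shown above; the function names are made up here, and other Kafka versions may word these lines differently.

```python
import re

# Matches lines such as:
#   Reassignment of partition [think_tank,3] is still in progress
LINE = re.compile(r"^Reassignment of partition \[(.+),(\d+)\] (.+)$")


def parse_verify_output(verify_output):
    """Map (topic, partition) to the status text reported by --verify."""
    statuses = {}
    for line in verify_output.splitlines():
        match = LINE.match(line.strip())
        if match:
            topic, partition, status = match.groups()
            statuses[(topic, int(partition))] = status
    return statuses


def pending_partitions(verify_output):
    """List partitions whose status is anything but a successful finish."""
    return [
        key
        for key, status in parse_verify_output(verify_output).items()
        if status != "completed successfully"
    ]


if __name__ == "__main__":
    sample = """Reassignment of partition [think_tank,2] completed successfully
Reassignment of partition [think_tank,1] completed successfully
Reassignment of partition [think_tank,3] is still in progress
Reassignment of partition [think_tank,4] completed successfully
Reassignment of partition [think_tank,0] is still in progress"""
    print(pending_partitions(sample))  # [('think_tank', 3), ('think_tank', 0)]
```

An empty list means every partition listed in the file has finished moving.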
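III. Other notes
The partition reassignment tool can also be used to manually place a specific partition of a topic on specific brokers as needed. All you have to do is map the partition to brokers in the format shown in step 1. For example, to place partition 0 of think_tank on nodes 101 and 105: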
{
"version":1,
"partitions":[
{
"topic":"think_tank",
"partition":0,
"replicas":[
101,
105
]
}
]
}
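A hand-written plan like this is easy to get wrong, so it can be worth generating it with a small check. The sketch below builds the same document and rejects a replica list that names the same broker twice; manual_reassignment is a hypothetical helper, not a Kafka API.

```python
import json


def manual_reassignment(topic, partition, replicas):
    """Build a one-partition plan for --reassignment-json-file.

    `replicas` is the list of broker ids that should hold the partition.
    Each replica of a partition has to live on a different broker, so
    duplicate ids are rejected before the file is written.
    """
    if len(set(replicas)) != len(replicas):
        raise ValueError("replicas must be on distinct brokers")
    plan = {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        ],
    }
    return json.dumps(plan, indent=2)


if __name__ == "__main__":
    # Reproduces the plan above: partition 0 of think_tank on 101 and 105.
    print(manual_reassignment("think_tank", 0, [101, 105]))
```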
If you also need to increase the number of replicas, the same script can do that; see the official documentation.
One more thing
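One takeaway: once you have pinned down the problem, the official documentation should be one of the first sources you turn to. Material found online is often dated or written for a different version, so its solutions may need small adjustments before they work. These pitfalls can be avoided by going to the official, first-hand source.
Criticism and discussion are welcome.
Note: please credit the source when reposting.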