Writing Storm start/stop scripts:
Step 1: On the master node, create the start-supervisor.sh script, then distribute it to every server; running the script starts the supervisor service.
start-supervisor.sh script
#!/bin/bash
# Make the configured Storm environment variables take effect
source /home/hadoop/.bashrc
# Run the supervisor in the background
nohup storm supervisor >/dev/null 2>&1 &
Step 2: On the master node, create the supervisor-hosts file, which holds the supervisor host names
supervisor-hosts
hadoop02
hadoop03
hadoop04
Step 3: On the master node, create start-all.sh to start all supervisors
start-all.sh
#!/bin/bash
source /home/hadoop/.bashrc
# Set the bin directory and the path of the supervisor-hosts file
bin=/home/hadoop/apps/apache-storm-0.9.7/bin
supervisors=/home/hadoop/apps/apache-storm-0.9.7/bin/supervisor-hosts
# Start the master node (nimbus)
nohup storm nimbus >/dev/null 2>&1 &
# For each node listed in supervisor-hosts, run start-supervisor.sh to start the supervisor service
# The variable "supervisor" after "while read" receives each line that is read
# ssh -n keeps ssh from reading stdin, which would otherwise swallow the remaining host names
cat "$supervisors" | while read -r supervisor
do
echo "$supervisor"
ssh -n "$supervisor" "$bin/start-supervisor.sh"
done
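One thing to watch in a loop like this: ssh reads from standard input by default, so inside "while read" it can consume the rest of the host list and the loop may stop after the first host. A minimal sketch of one way around it, reading the host list on a separate file descriptor; the helper name run_on_hosts is made up for illustration and is not part of Storm:

```shell
#!/bin/bash
# run_on_hosts <hosts_file> <command>
# Calls "<command> <host>" once for every non-empty line of <hosts_file>.
# (Hypothetical helper, shown only to illustrate the stdin handling.)
run_on_hosts() {
    local hosts_file="$1"
    shift
    local host
    # Read the host list on file descriptor 3, so the command's stdin
    # is never the host list itself.
    while read -r host <&3; do
        [ -n "$host" ] || continue
        # Give the command an empty stdin, similar to what ssh -n does.
        "$@" "$host" < /dev/null
    done 3< "$hosts_file"
}
```

With a small wrapper such as start_remote() { ssh "$1" "$bin/start-supervisor.sh"; }, the call would be: run_on_hosts "$supervisors" start_remote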
Step 4: Write the stop script
stop-all.sh
#!/bin/bash
source /home/hadoop/.bashrc
# Set the bin directory and the path of the supervisor-hosts file
bin=/home/hadoop/apps/apache-storm-0.9.7/bin
supervisors=/home/hadoop/apps/apache-storm-0.9.7/bin/supervisor-hosts
# Kill all nimbus-related processes
kill -9 `ps -ef | grep java | grep nimbus | awk '{print $2}'`
# Stop all supervisors
cat "$supervisors" | while read -r supervisor
do
echo "$supervisor"
ssh -n "$supervisor" "$bin/stop-supervisor.sh" &
done
Step 5: On each node, write the stop-supervisor.sh script in Storm's bin directory
stop-supervisor.sh
#!/bin/bash
source /home/hadoop/.bashrc
# Kill the supervisor process
kill -9 `ps -ef | grep java | grep supervisor | awk '{print $2}'`
Step 6: Upload all scripts to the storm/bin directory, and distribute start-supervisor.sh and stop-supervisor.sh to all Storm cluster nodes: Hadoop02, Hadoop04
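The distribution in Step 6 could be sketched roughly as follows. This assumes passwordless SSH between the nodes (which the start/stop scripts need anyway) and the same install path on every node; the function name print_scp_commands is invented for this example. It only prints the scp commands so they can be checked before anything is copied.

```shell
#!/bin/bash
# Install path taken from the scripts above; assumed identical on all nodes.
bin=/home/hadoop/apps/apache-storm-0.9.7/bin

# print_scp_commands <host>...
# Prints one scp command per host (hypothetical helper, dry run only).
print_scp_commands() {
    local host
    for host in "$@"; do
        echo "scp $bin/start-supervisor.sh $bin/stop-supervisor.sh $host:$bin/"
    done
}
```

To actually copy the files, the printed commands can be piped to a shell: print_scp_commands hadoop02 hadoop04 | sh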
Step 7: To avoid name clashes, rename the scripts to storm-start-all.sh and storm-stop-all.sh
Step 8: Make all scripts executable: chmod 755 *.sh
Run the stop-storm-all.sh script from the home directory
Errors:
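Error 1: -bash: /home/hadoop/apps/apache-storm-0.9.7/bin/start-storm-all.sh: /bin/bash^M: bad interpreter: No such file or directory
[hadoop@hadoop03 bin]$ sh start-storm-all.sh
: No such file or directory /home/hadoop/.bashrc
The ^M is a carriage return, which means the script was saved with Windows (CRLF) line endings. Workaround used at the time: comment out the .bashrc line first, and load the global environment variables manually outside the script.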
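Since the ^M points at CRLF line endings, removing the carriage returns from the scripts is probably the more direct fix. A small sketch, assuming GNU sed (for the -i option and the \r escape); the function name strip_cr is made up here, and the dos2unix tool does the same job where it is installed:

```shell
#!/bin/bash
# strip_cr <file>...
# Removes a trailing carriage return from every line, editing files in place.
# (Hypothetical helper; relies on GNU sed.)
strip_cr() {
    sed -i 's/\r$//' "$@"
}
```

For example, run strip_cr *.sh inside the storm/bin directory before distributing the scripts.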
Error 2: running the stop-supervisor.sh script on its own: arguments must be process or job IDs
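bash prints "arguments must be process or job IDs" when kill receives an argument that is not a numeric PID; a stray carriage return from CRLF line endings is one possible source, although the notes do not confirm the cause. A more defensive version of the kill step might look like the sketch below. The function names storm_pids and stop_daemon are invented for illustration.

```shell
#!/bin/bash
# storm_pids <name>
# Reads `ps -ef` style lines on stdin and prints the PID (second column)
# of every java process whose command line mentions <name>.
# Only purely numeric PIDs are printed.
storm_pids() {
    grep java | grep "$1" | awk '$2 ~ /^[0-9]+$/ {print $2}'
}

# stop_daemon <name>
# Kills the matching processes, and does nothing if there are none.
stop_daemon() {
    local pids
    pids=$(ps -ef | storm_pids "$1")
    if [ -z "$pids" ]; then
        echo "no $1 process found"
        return 0
    fi
    # $pids is left unquoted on purpose so each PID is a separate argument.
    kill -9 $pids
}
```

In stop-supervisor.sh the kill line would then become: stop_daemon supervisor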
Storm setup:
Official site: storm.apache.org
Version: apache-storm-1.1.1.tar.gz
Changes to the configuration file storm.yaml
See http://storm.apache.org/releases/1.1.1/Setting-up-a-Storm-cluster.html
1)storm.zookeeper.servers: This is a list of the hosts in the Zookeeper cluster for your Storm cluster. It should look something like:
storm.zookeeper.servers:
- "hadoop02"
- "hadoop03"
- "hadoop04"
2)storm.local.dir
storm.local.dir: "/home/hadoop/log/storm"
3)nimbus.seeds: the master node(s)
nimbus.seeds: ["hadoop03"]
4)supervisor.slots.ports:
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
Distribute the installation package to the other nodes
Run the Storm cluster
Run Nimbus on the master node: [Hadoop03] bin/storm nimbus
In the background: nohup storm nimbus >/dev/null 2>&1 &
Start the UI in the background: nohup storm ui >/dev/null 2>&1 &
Run Supervisor on the other nodes: bin/storm supervisor
In the background: nohup storm supervisor >/dev/null 2>&1 &
View Storm in the UI: http://{ui host}:8080.
Force kill: kill -9 <PID>
Problems encountered while setting up the cluster:
1. The supervisor.slots.ports parameter sets how many worker processes a node can run: each listed port is one worker slot, so the four ports configured above allow at most four workers per node
2. Before starting Storm you can configure the environment variables first; the commands can then be run from the home directory
nohup storm nimbus >/dev/null 2>&1 &
nohup storm supervisor >/dev/null 2>&1 &
3. To view the Storm UI, the UI service must first be started in the background
nohup storm ui >/dev/null 2>&1 &
4. Choice of nimbus and supervisor nodes: nimbus runs on Hadoop03 (the active HDFS NameNode; purely a personal preference), and the supervisors run on Hadoop02, Hadoop03, Hadoop04
5. What are nimbus and supervisor?
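The 8 core concepts in Storm:
1)Topologies: the topology job, like the job of a subway carrying passengers; it contains multiple spouts and bolts
2)Streams: subway Line 5, carrying the passengers (the data)
3)Spouts: the origin station
4)Bolts: the intermediate stations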
5)Stream groupings
6)Reliability
7)Tasks
8)Workers
Storm architecture:
By default, a supervisor node can start at most 4 worker processes; each topology uses one worker process by default; each worker process starts one or more executors, and each executor runs 1 task by default.
The most important points are parallelism/high concurrency and how thread safety is achieved.
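The limit of 4 workers per supervisor corresponds to the four ports listed under supervisor.slots.ports. As a rough sketch, the number of worker slots on a node can be read back from storm.yaml. The function name count_worker_slots is made up for this example, and the awk patterns assume the simple one-port-per-line layout used in the notes above, not arbitrary YAML.

```shell
#!/bin/bash
# count_worker_slots <storm.yaml>
# Counts the "- <port>" entries that follow the supervisor.slots.ports key.
# (Hypothetical helper; handles only the one-port-per-line layout.)
count_worker_slots() {
    awk '
        /^supervisor\.slots\.ports:/ { in_list = 1; next }
        in_list && /^[ \t]*-[ \t]*[0-9]+/ { n++; next }
        # Any other top-level key ends the port list.
        in_list && /^[^ \t#-]/ { in_list = 0 }
        END { print n + 0 }
    ' "$1"
}
```

With three supervisor nodes sharing the same configuration, the total number of slots in the cluster would be: echo $(( $(count_worker_slots conf/storm.yaml) * 3 ))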