Hadoop Streaming is the interface through which Python can run MapReduce jobs on Hadoop. I stumbled into quite a few pitfalls while learning it and spent a while untangling them, so this post records the basics of using the tool.
一擂错、介紹
? ? ? ?hadoop streaming 是Hadoop的一個(gè)工具,可以用其創(chuàng)建和運(yùn)行map\reduce作業(yè)樱蛤,程序只要遵循標(biāo)準(zhǔn)輸入钮呀、輸出(stdin讀、stdout寫(xiě))即可昨凡。mapper和reducer步驟可以是文件或者可執(zhí)行腳本爽醋。
基本格式如下:
hadoop command [genericOptions] [streamingOptions]
? ? ? ?注意:普通選項(xiàng)一定要寫(xiě)在streaming選項(xiàng)前面
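The stdin/stdout contract is easiest to see in a minimal word-count mapper (an illustrative sketch, not from the original post): it reads raw text lines and emits tab-separated key/value pairs, which is all Streaming requires of a mapper.

```python
#!/usr/bin/env python
# Minimal word-count mapper for Hadoop Streaming (illustrative sketch).
# It reads raw text lines from stdin and writes tab-separated
# key/value pairs to stdout -- the whole Streaming mapper contract.
import sys

def map_lines(lines):
    """Yield one 'word<TAB>1' record per word in the input lines."""
    for line in lines:
        for word in line.strip().split():
            yield "%s\t%d" % (word, 1)

if __name__ == "__main__":
    for record in map_lines(sys.stdin):
        print(record)
```

A matching reducer is symmetric: it reads these tab-separated records back from stdin and writes aggregated records to stdout.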
2. Generic Options
Parameter | Optional/Required | Description |
---|---|---|
-conf configuration_file | Optional | Specify an application configuration file |
-D property=value | Optional | Use value for given property |
-fs host:port or local | Optional | Specify a namenode |
-files | Optional | Specify comma-separated files to be copied to the Map/Reduce cluster |
-libjars | Optional | Specify comma-separated jar files to include in the classpath |
-archives | Optional | Specify comma-separated archives to be unarchived on the compute machines |
Of these, -D property=value is the one you will use most.
※ Setting the number of map/reduce tasks:
-D mapred.reduce.tasks=2
Sets the number of reducers; with 0, the job runs mappers only (a map-only job).
※ Setting the mapper output separators:
-D stream.map.output.field.separator=.
Sets the separator between key and value in each mapper output line.
-D stream.num.map.output.key.fields=4
Everything up to the 4th "." is the key; the rest is the value.
-D map.output.key.field.separator=.
Sets the separator between fields *inside* the key of the map output.
※ Choosing which key fields to partition on:
-D num.key.fields.for.partition=1
Partition on the first key field only.
-D num.key.fields.for.partition=2
Partition on the first two key fields (fields 1 and 2).
-D mapred.text.key.partitioner.options=-k2,3
Partition on key fields 2 through 3.
-D mapred.text.key.partitioner.options=-k2,2
Partition on key field 2 only.
※ When using the -D options above, you must also add:
-partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner
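To build intuition for what these options do, here is a pure-Python illustration (not Hadoop's actual code): with the key separator "." and num.key.fields.for.partition=1, only the first dot-separated field of the key decides which reducer a record goes to.

```python
# Pure-Python illustration of KeyFieldBasedPartitioner's behavior
# (hypothetical helper, not Hadoop's implementation).

def partition(key, num_reducers, sep=".", num_fields=1):
    """Bucket a key by hashing its first `num_fields` fields."""
    prefix = sep.join(key.split(sep)[:num_fields])
    # Python's hash() stands in for Java's hashCode(), so the actual
    # bucket numbers differ from Hadoop's -- but the invariant holds:
    # keys sharing the selected fields land in the same bucket.
    return hash(prefix) % num_reducers

# Same first field => same reducer, regardless of the remaining fields:
assert partition("2021.01.05.u1", 4) == partition("2021.12.31.u2", 4)
```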
3. Streaming Command Options
Parameter | Optional/Required | Description |
---|---|---|
-input directoryname or filename | Required | Input location for mapper |
-output directoryname | Required | Output location for reducer |
-mapper executable or JavaClassName | Optional | Mapper executable. If not specified, IdentityMapper is used as the default |
-reducer executable or JavaClassName | Optional | Reducer executable. If not specified, IdentityReducer is used as the default |
-file filename | Optional | Make the mapper, reducer, or combiner executable available locally on the compute nodes |
-inputformat JavaClassName | Optional | Class you supply should return key/value pairs of Text class. If not specified, TextInputFormat is used as the default |
-outputformat JavaClassName | Optional | Class you supply should take key/value pairs of Text class. If not specified, TextOutputFormat is used as the default |
-partitioner JavaClassName | Optional | Class that determines which reduce a key is sent to |
-combiner streamingCommand or JavaClassName | Optional | Combiner executable for map output |
-cmdenv name=value | Optional | Pass environment variable to streaming commands |
-inputreader | Optional | For backwards-compatibility: specifies a record reader class (instead of an input format class) |
-verbose | Optional | Verbose output |
-lazyOutput | Optional | Create output lazily. For example, if the output format is based on FileOutputFormat, the output file is created only on the first call to Context.write |
-numReduceTasks | Optional | Specify the number of reducers |
-mapdebug | Optional | Script to call when map task fails |
-reducedebug | Optional | Script to call when reduce task fails |
(The most commonly used options are the ones that appear in the example below.)
Example:
hadoop jar /usr/hadoop/hadoop-2.5.1/share/hadoop/tools/lib/hadoop-streaming-2.5.1.jar \
-D stream.num.map.output.key.fields=4 \
-D stream.map.output.field.separator=. \
-D mapred.text.key.partitioner.options=-k1,2 \
-D map.output.key.field.separator=. \
-partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner \
-input /user/input/in.txt \
-output /user/output \
-mapper mapper.py -file mapper.py \
-reducer reducer.py -file reducer.py
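The post does not show the contents of mapper.py. Given the -D options in the example, a plausible mapper (assumed, not from the post) is a simple pass-through: if each input line already begins with a dotted key such as 11.12.1.2, the mapper just forwards the line and lets stream.num.map.output.key.fields=4 split key from value, with fields 1-2 used for partitioning.

```python
#!/usr/bin/env python
# A plausible mapper.py for the example job (assumed -- the post does
# not show it). Lines are forwarded unchanged; the -D options then
# treat the first 4 dot-separated fields as the key and partition on
# fields 1-2.
import sys

def map_passthrough(lines):
    """Forward non-empty lines unchanged, stripped of the newline."""
    for line in lines:
        line = line.rstrip("\n")
        if line:
            yield line

if __name__ == "__main__":
    for record in map_passthrough(sys.stdin):
        print(record)
```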
Summary:
1. The map stage sorts its output by key by default; values are not sorted.
2. You must specify the key fields for partitioning yourself so that records are routed to the intended reducers.
Follow-up posts will cover how MapReduce works internally and the shuffle phase, plus two projects: friend recommendation and search autocomplete.
References:
http://hadoop.apache.org/docs/current/hadoop-streaming/HadoopStreaming.html
https://www.cnblogs.com/shay-zhangjin/p/7714868.html