Raw data on HDFS:
hello a
hello b
Map phase [map each line to key-value pairs]:
Input data:
<0,"hello a">
<8,"hello b">
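The keys 0 and 8 above are the byte offsets at which each line starts ("hello a" is 7 bytes plus 1 newline byte). The following is a minimal plain-Java sketch of that bookkeeping, not Hadoop's own reader; the class name OffsetRecords and the method toRecords are hypothetical, and it assumes UTF-8 text with "\n" line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class OffsetRecords {
    // Splits text into <byteOffset, line> records, similar in spirit to what a
    // line-oriented input format hands to map(). Assumption: '\n' line endings, UTF-8.
    static Map<Long, String> toRecords(String text) {
        Map<Long, String> records = new LinkedHashMap<>();
        long offset = 0;
        for (String line : text.split("\n")) {
            records.put(offset, line);
            // +1 accounts for the newline byte that terminates the line
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        // prints <0,"hello a"> then <8,"hello b">
        toRecords("hello a\nhello b\n").forEach(
            (offset, line) -> System.out.println("<" + offset + ",\"" + line + "\">"));
    }
}
```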
Output data:
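map(key,value,context) {
    String line = value; //hello a
    //the words in the sample data are separated by a space, not a tab
    String[] words = line.split(" ");
    for(String word : words) {
        //1st call (key 0) emits: hello, a
        //2nd call (key 8) emits: hello, b
        context.write(word,1);
    }
}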
<hello,1>
<a,1>
<hello,1>
<b,1>
Reduce phase (grouping and sorting):
Input data:
<a,{1}>
<b,{1}>
<hello,{1,1}>
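The grouped, key-sorted input above is produced between the map and reduce phases. As a rough illustration only, the grouping could be sketched in plain Java like this; ShuffleSketch and group are hypothetical names, and a TreeMap stands in for the framework's sort by key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ShuffleSketch {
    // Collects all values that share a key; TreeMap keeps the keys in sorted order
    static Map<String, List<Integer>> group(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The four pairs emitted by the map phase, in emission order
        List<Map.Entry<String, Integer>> mapOutput = List.of(
            Map.entry("hello", 1), Map.entry("a", 1),
            Map.entry("hello", 1), Map.entry("b", 1));
        System.out.println(group(mapOutput)); // {a=[1], b=[1], hello=[1, 1]}
    }
}
```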
Output data:
reduce(key,values,context) {
    int sum = 0;
    String word = key;
    //values holds every 1 emitted for this word, e.g. {1,1} for hello
    for(int i : values) {
        sum += i;
    }
    context.write(word,sum);
}
<a,1>
<b,1>
<hello,2>
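To check the whole walkthrough without a cluster, the three steps can be imitated locally with Java streams. This is only a compact sketch under the assumption that words are separated by a single space; it does not use the Hadoop API, and LocalWordCount and count are hypothetical names.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class LocalWordCount {
    static Map<String, Integer> count(String text) {
        return Arrays.stream(text.split("\n"))                  // one record per line
            .flatMap(line -> Arrays.stream(line.split(" ")))    // map: emit each word
            .filter(word -> !word.isEmpty())                    // skip blanks from empty lines
            .collect(Collectors.groupingBy(
                word -> word,                                   // shuffle: group by word
                TreeMap::new,                                   // keep keys sorted
                Collectors.summingInt(word -> 1)));             // reduce: add up the 1s
    }

    public static void main(String[] args) {
        System.out.println(count("hello a\nhello b\n")); // {a=1, b=1, hello=2}
    }
}
```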