Install Java
Remove the bundled Java (xxx stands for each package name printed by the first command)
rpm -qa|grep java
yum -y remove xxx
Extract Java into /opt and configure the environment variables
vi /etc/profile
Set the JAVA_HOME variable
export JAVA_HOME=/opt/jdk1.8.0_161
Set the JRE_HOME variable
export JRE_HOME=/opt/jdk1.8.0_161/jre
Set the PATH variable
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
Reload the environment variables
source /etc/profile
Verify
java -version
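As an extra check, a small Java program can report which JDK the JVM actually runs from and whether JAVA_HOME is visible to it. This is only a sketch; the class name JavaEnvCheck is made up for illustration.

```java
class JavaEnvCheck {
    // Returns a one-line summary of the running JVM and the JAVA_HOME variable.
    static String describe() {
        String version = System.getProperty("java.version");
        String home = System.getProperty("java.home");
        String javaHomeEnv = System.getenv("JAVA_HOME"); // null when not exported
        return "java.version=" + version
                + ", java.home=" + home
                + ", JAVA_HOME=" + (javaHomeEnv == null ? "(not set)" : javaHomeEnv);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If java.home does not sit under /opt/jdk1.8.0_161, the PATH change has probably not taken effect in the current shell.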
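Install Hadoop
Preparation
Extract the archive
tar -zxvf hadoop-2.7.5.tar.gz
Enter the directory
cd hadoop-2.7.5
Configure the environment variable
vi etc/hadoop/hadoop-env.sh
Append this line at the end
export JAVA_HOME=/opt/jdk1.8.0_161
Test; this should print the hadoop usage/help text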
bin/hadoop
Pseudo-distributed installation
Starting Hadoop
vi etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
vi etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
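Both files use the same layout: a configuration element holding property entries, each with a name and a value. The sketch below reads that layout with the JDK's built-in DOM parser, just to show the structure. It is not Hadoop's own Configuration class, and the name SiteXmlReader is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SiteXmlReader {
    // Parses <property><name>..</name><value>..</value></property> entries into a map.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            // Malformed XML or a property without name/value ends up here.
            throw new IllegalArgumentException("not a valid site file", e);
        }
    }

    public static void main(String[] args) {
        String coreSite = "<configuration><property>"
                + "<name>fs.defaultFS</name><value>hdfs://localhost:9000</value>"
                + "</property></configuration>";
        System.out.println(parse(coreSite));
    }
}
```

A typo in a tag name makes the parse fail, which is a quick way to catch a broken site file before starting the daemons.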
Passwordless SSH login
Create the key pair
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Append the public key to authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Restrict the permissions
chmod 0600 ~/.ssh/authorized_keys
Test
ssh localhost
Start HDFS
Format the file system
bin/hdfs namenode -format
Start the HDFS daemons
sbin/start-dfs.sh
Verify
Open http://localhost:50070/ in a browser
Or check from the command line
jps
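When jps lists the daemons but a client still cannot connect, it helps to test the ports directly. A minimal sketch, assuming the ports used in these notes (9000 from fs.defaultFS, 50070 for the NameNode web UI); the class name PortCheck is made up for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, or timed out
        }
    }

    public static void main(String[] args) {
        for (int port : new int[] {9000, 50070}) {
            System.out.println("localhost:" + port + " open=" + isOpen("localhost", port, 1000));
        }
    }
}
```

Running the same check from another machine against the server's address shows whether the port is reachable remotely or only on the loopback interface.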
Starting YARN
Configure mapred-site.xml
Copy the template, then edit the copy
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
vi etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Configure yarn-site.xml
vi etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Start YARN
sbin/start-yarn.sh
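Verify
Connect remotely and run a job
Save the input files to DFS
Create the directory
bin/hdfs dfs -mkdir /sort
Upload the sort directory from the current directory (the put command follows the file listing)
The sort directory contains two files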
file1
5
6
4
7
8
9
22
524
8783
876
546512
546541
57
755
4781
file2
89
5
412
589
84
4841
11
5532
11
881
12
111
2222
45
21
123
5
8238
55953
bin/hdfs dfs -put sort /sort/sort_input
Prepare the program
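import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Sort {
    // Emits each input number as the key; the shuffle phase sorts the keys.
    public static class Map extends
            Mapper<Object, Text, IntWritable, IntWritable> {
        private static IntWritable data = new IntWritable();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            if (line.isEmpty()) {
                return; // skip blank lines instead of failing in parseInt
            }
            data.set(Integer.parseInt(line));
            context.write(data, new IntWritable(1));
        }
    }

    // Writes (rank, number) once per occurrence; keys arrive in ascending order.
    // The rank counter is per reducer, so ranks are global only with one reducer.
    public static class Reduce extends
            Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
        private static IntWritable linenum = new IntWritable(1);

        @Override
        public void reduce(IntWritable key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            for (IntWritable val : values) {
                context.write(linenum, key);
                linenum = new IntWritable(linenum.get() + 1);
            }
        }
    }

    public static class Partition extends Partitioner<IntWritable, IntWritable> {
        /**
         * Assigns each number to a partition by value range.
         * @param key the key
         * @param value the value
         * @param numPartitions number of partitions
         * @return partition id
         */
        @Override
        public int getPartition(IntWritable key, IntWritable value,
                int numPartitions) {
            int maxNumber = 65223;
            int bound = maxNumber / numPartitions + 1;
            int index = key.get() / bound;
            // The original loop never reached the last range and sent those keys,
            // and any key above maxNumber, to partition 0. Clamp instead:
            // negative keys go to the first partition, oversized keys to the last.
            return Math.max(0, Math.min(index, numPartitions - 1));
        }
    }

    public static void main(String[] args) throws Exception {
        String inputPath = "hdfs://127.0.0.1:9000/sort/sort_input";
        String outputPath = "hdfs://127.0.0.1:9000/sort/sort_output";
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Sort");
        job.setJarByClass(Sort.class);
        job.setMapperClass(Map.class);
        job.setPartitionerClass(Partition.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(inputPath));
        FileOutputFormat.setOutputPath(job, new Path(outputPath));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}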
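The job sorts globally because the partitioner sends smaller keys to lower-numbered partitions and each reducer receives its keys already sorted. The plain-Java sketch below imitates that idea without any Hadoop dependency. The class and method names are made up for illustration, and clamping out-of-range keys into the first or last partition is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RangePartitionDemo {
    // Maps a key to a partition so that lower partitions hold smaller keys.
    static int partitionFor(int key, int maxNumber, int numPartitions) {
        int bound = maxNumber / numPartitions + 1;
        int index = key / bound;
        return Math.max(0, Math.min(index, numPartitions - 1));
    }

    // Buckets the keys by range, sorts each bucket, then concatenates the buckets in order.
    static List<Integer> sortByPartitions(List<Integer> keys, int maxNumber, int numPartitions) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int key : keys) {
            buckets.get(partitionFor(key, maxNumber, numPartitions)).add(key);
        }
        List<Integer> result = new ArrayList<>();
        for (List<Integer> bucket : buckets) {
            Collections.sort(bucket); // stands in for the per-reducer key sort
            result.addAll(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        // A few values taken from file1 and file2.
        List<Integer> keys = Arrays.asList(5, 6, 4, 8783, 546512, 57, 89, 5, 55953);
        System.out.println(sortByPartitions(keys, 65223, 3));
    }
}
```

Because partitionFor never decreases as the key grows, reading the buckets in partition order yields one sorted sequence, which is the same property the job's output files need when more than one reducer is used.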
Pitfalls
The cluster fails to start
Does not contain a valid host:port authority: _ 9000
My hosts entry was
127.0.0.1 hadoop_node1
vim /etc/hosts
Remove the underscore
127.0.0.1 hadoopnode1
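The likely cause is that an underscore is not a legal hostname character for java.net.URI, so the host cannot be parsed out of the address. The sketch below shows this with the JDK alone; that Hadoop hits the same parsing path is an assumption here, and the class name HostnameCheck is made up for illustration.

```java
import java.net.URI;

class HostnameCheck {
    // Returns the host parsed from the URI, or null when java.net.URI cannot extract one.
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    public static void main(String[] args) {
        System.out.println(hostOf("hdfs://hadoop_node1:9000")); // prints null
        System.out.println(hostOf("hdfs://hadoopnode1:9000"));  // prints hadoopnode1
    }
}
```

A hyphen (for example hadoop-node1) is accepted, so it is a safe replacement when a separator is wanted.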
Local Hadoop development: connection to port 9000 refused
http://blog.csdn.net/yjc_1111/article/details/53817750
vi copy commands
Task: copy lines 9 through 15 so the copy starts at line 16
:9,15 copy 15
or
:9,15 co 15
Likewise:
:9,15 move 16 or :9,15 m 16 moves lines 9 through 15 to after line 16