Notes on this example: 1. Two PCs are used as the test machines;
2. The master's IP is 192.168.1.103, and slave2's IP is 192.168.1.102.
I. Create a new user
1. Create a new user on Linux:
sudo useradd -m hadoop -s /bin/bash
2. Set a password for the new user:
sudo passwd hadoop
3. Grant the user administrator (sudo) privileges:
sudo adduser hadoop sudo
4. Update apt:
sudo apt-get update
5. Install vim:
sudo apt-get install vim
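The remaining steps are performed as the hadoop user created above. One way to switch to it without logging out (you can also simply log out and log back in as hadoop):
su - hadoop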
II. Network configuration
1. Check the IP address:
ifconfig
2. Change the hostname:
sudo vim /etc/hostname
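For reference, /etc/hostname contains a single line with the node's name: Master on the master machine and slave2 on the slave machine (an assumption based on the hostnames used in the configuration files below).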
3. Edit the IP-to-hostname mapping on all nodes (slave2 must be edited as well):
ifconfig # check the master's IP
sudo vim /etc/hosts
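With the IPs used in this example, /etc/hosts on every node should contain mappings like the following (in addition to the default 127.0.0.1 localhost entry):
192.168.1.103 Master
192.168.1.102 slave2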
4. Test that the mapping works:
ping slave2 -c 3 # ping only 3 times; you can also ping the IP directly
III. Install and configure passwordless SSH login
1. Install ssh:
sudo apt-get install openssh-server
2. Log in to the local machine:
ssh localhost
3. Exit the ssh session to localhost, generate a key pair with ssh-keygen, and add the public key to the authorized keys:
exit
cd ~/.ssh/
ssh-keygen -t rsa # if this fails, prefix the command with sudo
cat ./id_rsa.pub >> ./authorized_keys # if a permission error occurs, fix the permissions on ~/.ssh (see below)
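If ssh reports a permission problem, the usual fix is to tighten the permissions on ~/.ssh and the authorized_keys file (a standard fix, not specific to this setup):
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys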
4. Copy the public key to the slave2 node:
scp ~/.ssh/id_rsa.pub hadoop@slave2:/home/hadoop/
5. On the slave2 node, add the ssh public key to the authorized keys:
mkdir ~/.ssh # only needed if the directory does not exist yet
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys # add the key
rm ~/id_rsa.pub # optional; the copy is no longer needed
6. Log out of the user (choose "Log Out"), then log back in.
7. From the master, log in to the slave2 node:
ssh slave2
(If problems arise that you really cannot solve, you can reinstall.) To uninstall ssh:
sudo apt-get --purge remove openssh-server
IV. Install and configure Java
Do not install OpenJDK; this was learned the hard way.
1. Download jdk-8u131-linux-x64.tar.gz and extract it:
sudo tar -zxf ~/Downloads/jdk-8u131-linux-x64.tar.gz -C /usr/local # extract the archive
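A quick check that the JDK landed where the environment variables below expect it (the directory name matches this JDK release):
ls /usr/local/jdk1.8.0_131 # should list bin, lib, and other JDK files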
2. Add the Java environment variables:
sudo vim ~/.bashrc
Add the following lines:
export JAVA_HOME=/usr/local/jdk1.8.0_131
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/rt.jar
3. Make the environment variables take effect:
source ~/.bashrc
4. Verify that the Java environment is configured correctly:
echo $JAVA_HOME
java -version
$JAVA_HOME/bin/java -version # should print the same result
V. Install Hadoop
1. Download Hadoop and extract it to /usr/local:
sudo tar -zxf ~/Downloads/hadoop-2.7.3.tar.gz -C /usr/local
cd /usr/local/
sudo mv ./hadoop-2.7.3/ ./hadoop # rename the directory
sudo chown -R hadoop ./hadoop # give the hadoop user ownership
2. Check that Hadoop is usable:
cd /usr/local/hadoop
./bin/hadoop version
3. Configure the Hadoop environment variables:
sudo vim ~/.bashrc
Add the following lines:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_ROOT_LOGGER=INFO,console
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Then run:
source ~/.bashrc
VI. Configure the Hadoop cluster
cd /usr/local/hadoop/etc/hadoop
1. Edit hadoop-env.sh:
sudo vim hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_131
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:/usr/local/hadoop/bin
2. Edit slaves: remove localhost and add slave2 (the resulting file is shown below):
sudo vim slaves
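After editing, the slaves file should list only the worker hostnames, one per line; in this example it contains just:
slave2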
3. Edit core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://Master:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
</property>
</configuration>
4. Edit hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>Master:50090</value>
</property>
<property>
<name>dfs.replication</name>
<!-- number of HDFS replicas -->
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hdfs/data</value>
</property>
</configuration>
5. Edit mapred-site.xml:
cp mapred-site.xml.template mapred-site.xml
Then edit mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>Master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>Master:19888</value>
</property>
</configuration>
6. Edit yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>Master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>Master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>Master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>Master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>Master:8088</value>
</property>
</configuration>
7. Copy the master node's Hadoop directory to slave2:
scp -r /usr/local/hadoop hadoop@slave2:/usr/local
If this fails (usually a permission problem prevents writing to /usr/local), you can run instead:
scp -r /usr/local/hadoop hadoop@slave2:/home/hadoop
ssh slave2
sudo cp -r ~/hadoop /usr/local
sudo chown -R hadoop /usr/local/hadoop
8. On slave2, install Java and configure the Java and Hadoop environment variables in ~/.bashrc (see above).
9. On the first start, format the NameNode on the Master node:
hdfs namenode -format
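Note: format the NameNode only once. If you ever need to re-format (e.g. after changing the configuration), stop the cluster first and clear the old storage directories on all nodes, otherwise DataNodes may refuse to start because of mismatched cluster IDs. A sketch, using the directories configured above:
rm -r /usr/local/hadoop/tmp /usr/local/hadoop/hdfs/name /usr/local/hadoop/hdfs/data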
10. Test whether Hadoop was installed successfully:
start-dfs.sh
start-yarn.sh
To check whether the cluster started, run jps on the master; it should show:
SecondaryNameNode
ResourceManager
NameNode
Running jps on slave2 should show:
NodeManager
DataNode
In addition, on the Master node run
hdfs dfsadmin -report
to check that the DataNode has started.
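To further verify that HDFS, YARN and MapReduce work end to end, you can run one of the example jobs bundled with Hadoop, e.g. the pi estimator (the jar path assumes the Hadoop 2.7.3 layout installed above):
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10
The NameNode web UI (http://Master:50070) and the ResourceManager web UI (http://Master:8088) can also be used to inspect the cluster.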
Problems encountered during installation:
1. With OpenJDK installed, Hadoop would not start; OpenJDK had to be removed and the Oracle (Sun) JDK installed instead.
2. The DataNode would not start because the hadoop user lacked permission on the /usr/local/hadoop directory and could not read it; you must set ownership of the hadoop directory on the slave machine with sudo chown -R hadoop /usr/local/hadoop. A more privileged user such as root does not hit this problem.
3. The output of hdfs dfsadmin -report shows zero capacity:
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: ?%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Solutions:
Leave safe mode:
<Hadoop install path>/bin/hadoop dfsadmin -safemode leave
Edit Hadoop's core-site.xml, because the DataNode cannot resolve the Master hostname:
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.1.103:9000</value>
</property>
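After changing core-site.xml, copy the updated file to slave2 as well and restart the cluster before re-checking with hdfs dfsadmin -report (a sketch, assuming the same paths as above):
scp /usr/local/hadoop/etc/hadoop/core-site.xml hadoop@slave2:/usr/local/hadoop/etc/hadoop/
stop-yarn.sh
stop-dfs.sh
start-dfs.sh
start-yarn.sh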