I. Background
Hadoop is installed on a single CentOS 7.4 server as a test environment for day-to-day development work. The installation packages used are hadoop-3.2.2.tar.gz and spark-3.0.1-bin-hadoop3.2.tgz.
二构挤、遇到問題
問題1:進(jìn)入hadoop目錄執(zhí)行 sbin/start-all.sh 時(shí)報(bào)錯(cuò):localhost Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
做本機(jī)信任
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
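Before moving on, it can help to confirm the trust setup actually took effect. The helper below is a hypothetical sketch (not part of the original steps): it checks that the key pair exists and that the public key landed in authorized_keys.

```shell
# Hypothetical sanity check for the passwordless-SSH setup above:
# verifies the key pair exists and the public key is authorized.
check_ssh_dir() {
  local d="$1"
  [ -f "$d/id_rsa" ] || { echo "missing private key"; return 1; }
  [ -f "$d/id_rsa.pub" ] || { echo "missing public key"; return 1; }
  [ -f "$d/authorized_keys" ] || { echo "missing authorized_keys"; return 1; }
  # the public key must appear verbatim as a line in authorized_keys
  grep -qxF "$(cat "$d/id_rsa.pub")" "$d/authorized_keys" \
    || { echo "public key not in authorized_keys"; return 1; }
  echo "OK"
}

# Example: check_ssh_dir ~/.ssh
```

A final end-to-end check is simply `ssh -o BatchMode=yes localhost true`, which must succeed without a password prompt.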
Problem 2: running sbin/start-all.sh from the hadoop directory fails with: ERROR: JAVA_HOME is not set and could not be found.
Fix: edit hadoop/etc/hadoop/hadoop-env.sh and set JAVA_HOME explicitly, for example:
export JAVA_HOME=/opt/software/jdk/java
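Applying this edit by hand works fine; for repeatable installs, a small sketch like the following (the function name is mine, the paths are this article's example values) sets JAVA_HOME idempotently: it replaces an existing export line if present, otherwise appends one.

```shell
# Hypothetical helper: set JAVA_HOME in a hadoop-env.sh-style file,
# replacing an existing "export JAVA_HOME=..." line or appending a new one.
set_java_home() {
  local file="$1" jdk="$2"
  if grep -q '^export JAVA_HOME=' "$file"; then
    sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=$jdk|" "$file"
  else
    echo "export JAVA_HOME=$jdk" >> "$file"
  fi
}

# Example: set_java_home hadoop/etc/hadoop/hadoop-env.sh /opt/software/jdk/java
```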
III. Summary
Taking an installation under /opt/software/big_data_env, with source_hadoop_env.sh placed inside the hadoop directory, as an example:
1. Before installation
- Install the required dependencies (Java, etc.) and prepare the environment (passwordless SSH to localhost, etc.)
- Fill in the four configuration files (core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml) ahead of time to avoid common startup problems
- Write an environment-variable loading script, e.g. source_hadoop_env.sh
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:19000</value>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>0.0.0.0:19870</value>
<description>
The address and the base port where the dfs namenode web ui will listen on.
</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///opt/software/dfs/data</value>
</property>
</configuration>
yarn-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>-1</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>-1</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>98.5</value>
</property>
<property>
<name>yarn.nodemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:18088</value>
<description>
The http address of the RM web application.
If only a host is provided as the value,
the webapp will be served on a random port.
</description>
</property>
</configuration>
mapred-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
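A malformed XML file in any of the four configs makes the daemons fail at startup with parse errors, so it can save time to check well-formedness up front. This is a sketch of my own (not an official Hadoop tool) that leans on python3's stdlib XML parser, which is available on a stock CentOS 7 install with python3:

```shell
# Hypothetical check: report whether each given file is well-formed XML.
check_xml() {
  local f
  for f in "$@"; do
    if python3 -c 'import sys, xml.dom.minidom as m; m.parse(sys.argv[1])' "$f" 2>/dev/null; then
      echo "$f OK"
    else
      echo "$f INVALID"
    fi
  done
}

# Example, run from the Hadoop conf directory:
# check_xml core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml
```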
source_hadoop_env.sh
#!/usr/bin/bash
export JAVA_HOME=/opt/software/jdk/java
export HADOOP_HOME=/opt/software/big_data_env/hadoop
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
export HADOOP_CONF_DIR=/opt/software/big_data_env/hadoop/etc/hadoop
2汞幢、安裝中
- 執(zhí)行 sbin/start-all.sh 前,加載環(huán)境變量(使用自動(dòng)加載方法)并格式化namenode
sed -i '$a\source /opt/software/big_data_env/hadoop/source_hadoop_env.sh' ~/.bash_profile source ~/.bash_profile hadoop namenode -format sbin/start-all.sh
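One pitfall worth guarding against: re-running the format command on an already-formatted NameNode wipes the HDFS metadata and leaves existing DataNodes with a clusterID mismatch. A sketch of a guard (my own helper, not part of Hadoop; the directory argument is the dfs.namenode.name.dir location, and the real format command is left commented out because it touches the live cluster):

```shell
# Hypothetical guard: only format the NameNode if its metadata dir has not
# been formatted yet (a formatted dir contains current/VERSION).
safe_format() {
  local meta_dir="$1"
  if [ -f "$meta_dir/current/VERSION" ]; then
    echo "already formatted, skipping"
  else
    echo "formatting"
    # hdfs namenode -format   # uncomment on a real install
  fi
}
```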
3微谓、安裝后
- 查看進(jìn)程是否存在(包含NameNode森篷、DataNode、ResourceManager豺型、SecondaryNameNode仲智、NodeManager)
jps
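Eyeballing the jps output works, but a small sketch (my own helper) can scan it for the five daemons listed above and report exactly which ones are missing:

```shell
# Hypothetical helper: given jps output, report missing Hadoop daemons.
check_daemons() {
  local jps_out="$1" d missing=""
  for d in NameNode DataNode ResourceManager SecondaryNameNode NodeManager; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
    echo "$jps_out" | grep -qw "$d" || missing="$missing $d"
  done
  if [ -z "$missing" ]; then echo "all daemons running"; else echo "missing:$missing"; fi
}

# Example: check_daemons "$(jps)"
```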
- Check the log files under /opt/software/big_data_env/hadoop/logs for errors, for example:
tail -5000f /opt/software/big_data_env/hadoop/logs/hadoop-root-datanode-*.log
- Check that the web pages are reachable:
1) HDFS NameNode web interface: http://{host}:19870
2) HDFS file browser: http://{host}:19870/explorer.html
3) YARN ResourceManager web interface: http://{host}:18088/cluster/scheduler
Note: starting with Hadoop 3.0, files can be uploaded and downloaded directly through the HDFS file browser.
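The three page checks above can also be scripted. This is a sketch of my own using curl (the ports are the custom ones configured earlier in hdfs-site.xml and yarn-site.xml); it only probes reachability, not page content:

```shell
# Hypothetical probe: curl the three Hadoop web UIs and report status.
check_web_ui() {
  local host="$1" path
  for path in ":19870" ":19870/explorer.html" ":18088/cluster/scheduler"; do
    if curl -sf -o /dev/null "http://$host$path"; then
      echo "http://$host$path OK"
    else
      echo "http://$host$path UNREACHABLE"
    fi
  done
}

# Example: check_web_ui localhost
```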