Install HDFS
Because Hadoop depends on a specific version of snappy, uninstall snappy first to make sure the installation goes smoothly:
hawq ssh -f hostfile -e 'yum remove -y snappy'
HAWQ's HDFS is installed and configured in HA mode. Install the Hadoop executables:
hawq ssh -f hostfile -e 'yum install -y hadoop hadoop-hdfs'
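To confirm that both packages were installed on every node, you can optionally query rpm across the cluster (a quick sanity check, assuming the rpm-based install above):
hawq ssh -f hostfile -e 'rpm -q hadoop hadoop-hdfs'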
Configure the NameNode directories. Two nodes need to be configured: oushum1 and oushum2. Create nnhostfile, similar to the hostfile used earlier:
touch nnhostfile
Set the content of nnhostfile to the hostnames of the Hadoop NameNode nodes:
oushum1
oushum2
Create the DataNode host file dnhostfile, similar to nnhostfile above:
touch dnhostfile
Set the content of dnhostfile to the hostnames of the Hadoop DataNode nodes:
oushus1
oushus2
Create the NameNode directories:
hawq ssh -f nnhostfile -e 'mkdir -p /data1/hdfs/namenode'
hawq ssh -f nnhostfile -e 'chmod -R 755 /data1/hdfs'
hawq ssh -f nnhostfile -e 'chown -R hdfs:hadoop /data1/hdfs'
Create the DataNode directories:
hawq ssh -f dnhostfile -e 'mkdir -p /data1/hdfs/datanode'
hawq ssh -f dnhostfile -e 'mkdir -p /data2/hdfs/datanode'
hawq ssh -f dnhostfile -e 'chmod -R 755 /data1/hdfs'
hawq ssh -f dnhostfile -e 'chmod -R 755 /data2/hdfs'
hawq ssh -f dnhostfile -e 'chown -R hdfs:hadoop /data1/hdfs'
hawq ssh -f dnhostfile -e 'chown -R hdfs:hadoop /data2/hdfs'
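Optionally, verify that the directories were created with the expected ownership and permissions, for example:
hawq ssh -f nnhostfile -e 'ls -ld /data1/hdfs/namenode'
hawq ssh -f dnhostfile -e 'ls -ld /data1/hdfs/datanode /data2/hdfs/datanode'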
Copy the following files to /etc/hadoop/conf/ on oushum1:
● http://www.oushu.com/docs/ch/_downloads/908bee114673dff44292d2b51ed5a1ce/core-site.xml.
● http://www.oushu.com/docs/ch/_downloads/a57b214c41f418570548204fdf5089b3/hdfs-site.xml.
● http://www.oushu.com/docs/ch/_downloads/5caeda7d6d35f2ab18438c8994e855c1/hadoop-env.sh.
Modify the Hadoop configuration files according to each node's own setup; you can use the content below as a reference. The main files are core-site.xml, hdfs-site.xml, hadoop-env.sh and slaves under /etc/hadoop/conf.
Modify the configuration file /etc/hadoop/conf/core-site.xml on oushum1. First enable HA by removing the HA comment markers.
Remove the following content:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdfs-nn:9000</value>
</property>
Modify the following content:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://oushu</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>oushum1:2181,oushum2:2181,oushus1:2181</value>
    </property>
    ...
    <property>
        <name>ipc.server.listen.queue.size</name>
        <value>3300</value>
    </property>
    ...
</configuration>
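After editing, it is worth making sure the file is still well-formed XML; for example, assuming xmllint is available on oushum1:
xmllint --noout /etc/hadoop/conf/core-site.xml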
Modify the configuration file /etc/hadoop/conf/hdfs-site.xml on oushum1. First enable HA by removing the two HA comment lines.
After HA is enabled, modify the content as follows:
<configuration>
    <property>
        <name>dfs.name.dir</name>
        <value>file:/data1/hdfs/namenode</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>file:/data1/hdfs/datanode,file:/data2/hdfs/datanode</value>
        <final>true</final>
    </property>
    ...
    <property>
        <name>dfs.block.local-path-access.user</name>
        <value>gpadmin</value>
    </property>
    ...
    <property>
        <name>dfs.domain.socket.path</name>
        <value>/var/lib/hadoop-hdfs/dn_socket</value>
    </property>
    ...
    <property>
        <name>dfs.block.access.token.enable</name>
        <value>true</value>
        <description>If "true", access tokens are used as capabilities for accessing datanodes. If "false", no access tokens are checked on accessing datanodes.</description>
    </property>
    ...
    <property>
        <name>dfs.nameservices</name>
        <value>oushu</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.oushu</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.oushu.nn1</name>
        <value>oushum2:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.oushu.nn1</name>
        <value>oushum2:50070</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.oushu.nn2</name>
        <value>oushum1:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.oushu.nn2</name>
        <value>oushum1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://oushum1:8485;oushum2:8485;oushus1:8485/oushu</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled.oushu</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.oushu</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/data1/hdfs/journaldata</value>
    </property>
    ...
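A quick way to double-check that the HA-related properties made it into hdfs-site.xml is to grep for their names, for example:
grep -E 'dfs.nameservices|dfs.ha.namenodes|shared.edits' /etc/hadoop/conf/hdfs-site.xml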
Modify /etc/hadoop/conf/hadoop-env.sh on oushum1:
...
export JAVA_HOME="/usr/java/default"
...
export HADOOP_CONF_DIR="/etc/hadoop/conf"
...
export HADOOP_NAMENODE_OPTS="-Xmx6144m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70"
export HADOOP_DATANODE_OPTS="-Xmx2048m -Xss256k"
...
export HADOOP_LOG_DIR=/var/log/hadoop/$USER
...
Modify /etc/hadoop/conf/slaves on oushum1 and write the hostnames of all DataNodes into this file:
oushus1
oushus2
Copy the configuration files under /etc/hadoop/conf on oushum1 to all nodes:
hawq scp -r -f hostfile /etc/hadoop/conf =:/etc/hadoop/
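To make sure every node received identical configuration files, you can compare checksums across the cluster (an optional sanity check, not part of the original procedure):
hawq ssh -f hostfile -e 'md5sum /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml'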
On the oushum1 node, format the ZKFailoverController:
sudo -u hdfs hdfs zkfc -formatZK
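Note that this step requires the ZooKeeper quorum (oushum1, oushum2, oushus1) to be running. A quick, hedged way to check for the ZooKeeper server process across the cluster:
hawq ssh -f hostfile -e 'ps -ef | grep [Q]uorumPeerMain'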
Start the JournalNode on all nodes configured as journal nodes. Create jhostfile, similar to the hostfile above, with the hostnames of the journal nodes as its content:
oushum1
oushum2
oushus1
Start the JournalNodes with the following command:
hawq ssh -f jhostfile -e 'sudo -u hdfs /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start journalnode'
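To confirm that a JournalNode process is running on each journal node, a jps-based check can help (assuming the JDK's jps is on the PATH for the hdfs user):
hawq ssh -f jhostfile -e 'sudo -u hdfs jps | grep JournalNode'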
Format and start the NameNode on oushum1:
sudo -u hdfs hdfs namenode -format -clusterId ss
sudo -u hdfs /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode
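If the format succeeded, the NameNode metadata directory should now contain a current/ subdirectory with a VERSION file; a quick check on oushum1:
ls /data1/hdfs/namenode/current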
On the other NameNode, oushum2, perform the standby synchronization and start the NameNode:
hawq ssh -h oushum2 -e 'sudo -u hdfs hdfs namenode -bootstrapStandby'
hawq ssh -h oushum2 -e 'sudo -u hdfs /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode'
Start all DataNodes via hawq ssh:
hawq ssh -f dnhostfile -e 'sudo -u hdfs /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start datanode'
Start the zkfc process on oushum2 via hawq ssh, making it the active NameNode:
hawq ssh -h oushum2 -e 'sudo -u hdfs /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start zkfc'
Start the zkfc process on oushum1 via hawq ssh, making it the standby NameNode:
hawq ssh -h oushum1 -e 'sudo -u hdfs /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start zkfc'
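Once both zkfc processes are up, you can verify which NameNode is active and which is standby with the HA admin tool (nn1 and nn2 are the NameNode IDs defined in hdfs-site.xml above):
sudo -u hdfs hdfs haadmin -getServiceState nn1
sudo -u hdfs hdfs haadmin -getServiceState nn2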
Check whether HDFS is running correctly:
su - hdfs
hdfs dfsadmin -report
hdfs dfs -mkdir /testnode
hdfs dfs -put /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh /testnode/
hdfs dfs -ls -R /
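If these commands succeed, the test directory can be removed afterwards:
hdfs dfs -rm -r /testnode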
You can also check the HDFS web interface at: http://oushum1:50070/