I. Symptoms
- The YARN resource management URL is http://hadoop003:8088/cluster;
- The browser reports that the page cannot be reached.
II. Troubleshooting steps
1. Check the Java processes
[root@hadoop003 hadoop]# jps
2291 QuorumPeerMain
11907 JournalNode
11796 DataNode
20824 Worker
22057 Jps
4972 ResourceManager
12014 NodeManager
Seven processes are listed in total (including Jps itself), and both ResourceManager and NodeManager appear to have started normally.
2. Check network connectivity
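The check above can be scripted. The following is a minimal sketch, not part of the original procedure: `missing_daemons` is a hypothetical helper name, and it only assumes that `jps` prints one `<pid> <ClassName>` pair per line, as in the listing above.

```shell
# Print each expected daemon name that does not appear in jps-style
# output read from stdin ("<pid> <ClassName>" per line).
# missing_daemons is a hypothetical helper, not a Hadoop command.
missing_daemons() {
    listing=$(cat)
    for name in "$@"; do
        # Compare against the whole second field so that, for example,
        # "NodeManager" does not match some longer class name.
        if ! printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}

# Usage on a live node (assumes jps is on PATH):
#   jps | missing_daemons ResourceManager NodeManager
```

An empty result only means the processes exist; it does not show that the ResourceManager is actually serving requests.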
[root@hadoop003 hadoop]# ping hadoop003
PING hadoop003 (192.168.5.103) 56(84) bytes of data.
64 bytes from hadoop003 (192.168.5.103): icmp_seq=1 ttl=64 time=0.033 ms
64 bytes from hadoop003 (192.168.5.103): icmp_seq=2 ttl=64 time=0.071 ms
64 bytes from hadoop003 (192.168.5.103): icmp_seq=3 ttl=64 time=0.056 ms
64 bytes from hadoop003 (192.168.5.103): icmp_seq=4 ttl=64 time=0.044 ms
64 bytes from hadoop003 (192.168.5.103): icmp_seq=5 ttl=64 time=0.086 ms
Network connectivity is normal.
3. Check the state of port 8088
[root@hadoop003 hadoop]# lsof -i:8088
[root@hadoop003 hadoop]# netstat -nltup | grep 8088
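Neither command returns anything: no process is listening on port 8088, even though the ResourceManager process exists.
III. Solution
1. Locate the scripts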
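A plain `grep 8088` also matches other ports or PIDs that merely contain those digits (for example 18088). A stricter check could be sketched as follows; `port_listening` is a hypothetical helper, and it assumes the Linux `netstat -nlt` column layout, where field 4 is the local address and field 6 is the socket state.

```shell
# Succeed if netstat-style text on stdin contains a TCP socket in
# LISTEN state whose local port is exactly $1.
# port_listening is a hypothetical helper, not a system command.
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # Local address looks like 192.168.5.103:8088 or :::8088;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Usage on a live node:
#   netstat -nlt | port_listening 8088 && echo "8088 is listening"
```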
[root@hadoop003 sbin]# pwd
/opt/module/hadoop-2.7.3/sbin
[root@hadoop003 sbin]# ls -ltr
total 120
-rwxr-xr-x 1 root root 1718 Jan 31 14:54 stop-all.cmd
-rwxr-xr-x 1 root root 1727 Jan 31 14:54 start-all.cmd
-rwxr-xr-x 1 root root 6452 Jan 31 14:54 hadoop-daemon.sh
-rwxr-xr-x 1 root root 2145 Jan 31 14:54 slaves.sh
-rwxr-xr-x 1 root root 1360 Jan 31 14:54 hadoop-daemons.sh
-rwxr-xr-x 1 root root 1471 Jan 31 14:54 start-all.sh
-rwxr-xr-x 1 root root 1462 Jan 31 14:54 stop-all.sh
-rwxr-xr-x 1 root root 1597 Jan 31 14:54 hdfs-config.cmd
-rwxr-xr-x 1 root root 1414 Jan 31 14:54 stop-dfs.cmd
-rwxr-xr-x 1 root root 1360 Jan 31 14:54 start-dfs.cmd
-rwxr-xr-x 1 root root 2752 Jan 31 14:54 distribute-exclude.sh
-rwxr-xr-x 1 root root 1648 Jan 31 14:54 refresh-namenodes.sh
-rwxr-xr-x 1 root root 1427 Jan 31 14:54 hdfs-config.sh
-rwxr-xr-x 1 root root 1128 Jan 31 14:54 start-balancer.sh
-rwxr-xr-x 1 root root 1357 Jan 31 14:54 start-secure-dns.sh
-rwxr-xr-x 1 root root 3734 Jan 31 14:54 start-dfs.sh
-rwxr-xr-x 1 root root 1179 Jan 31 14:54 stop-balancer.sh
-rwxr-xr-x 1 root root 3206 Jan 31 14:54 stop-dfs.sh
-rwxr-xr-x 1 root root 1340 Jan 31 14:54 stop-secure-dns.sh
-rwxr-xr-x 1 root root 2291 Jan 31 14:54 httpfs.sh
-rwxr-xr-x 1 root root 1524 Jan 31 14:54 start-yarn.cmd
-rwxr-xr-x 1 root root 3128 Jan 31 14:54 kms.sh
-rwxr-xr-x 1 root root 1595 Jan 31 14:54 stop-yarn.cmd
-rwxr-xr-x 1 root root 1347 Jan 31 14:54 start-yarn.sh
-rwxr-xr-x 1 root root 4295 Jan 31 14:54 yarn-daemon.sh
-rwxr-xr-x 1 root root 1340 Jan 31 14:54 stop-yarn.sh
-rwxr-xr-x 1 root root 1353 Jan 31 14:54 yarn-daemons.sh
-rwxr-xr-x 1 root root 4080 Jan 31 14:54 mr-jobhistory-daemon.sh
2. Stop YARN
[root@hadoop003 sbin]# start-yarn.sh
starting yarn daemons
resourcemanager running as process 4972. Stop it first.
hadoop003: nodemanager running as process 12014. Stop it first.
hadoop002: nodemanager running as process 14176. Stop it first.
hadoop001: nodemanager running as process 20194. Stop it first.
[root@hadoop003 sbin]# stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
resourcemanager did not stop gracefully after 5 seconds: killing with kill -9
hadoop003: stopping nodemanager
hadoop002: stopping nodemanager
hadoop001: stopping nodemanager
no proxyserver to stop
3. Check the Java processes
[root@hadoop003 sbin]# jps
22833 Jps
2291 QuorumPeerMain
11907 JournalNode
11796 DataNode
20824 Worker
The ResourceManager and NodeManager processes have been stopped.
4. Start YARN
[root@hadoop003 sbin]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/module/hadoop-2.7.3/logs/yarn-hadoop-resourcemanager-hadoop003.out
hadoop002: starting nodemanager, logging to /opt/module/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop002.out
hadoop001: starting nodemanager, logging to /opt/module/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop001.out
hadoop003: starting nodemanager, logging to /opt/module/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop003.out
5. List the PIDs of all Java processes
[root@hadoop003 sbin]# jps
23169 Jps
2291 QuorumPeerMain
11907 JournalNode
11796 DataNode
20824 Worker
23017 NodeManager
22907 ResourceManager
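Steps 2 to 5 can be combined into one routine. This is only a sketch under stated assumptions: `wait_until_gone` and `restart_yarn` are hypothetical names, `YARN_SBIN` defaults to the sbin directory shown earlier, and the wait only inspects Java processes on the local node, not the NodeManagers on hadoop001 and hadoop002.

```shell
# Directory holding stop-yarn.sh and start-yarn.sh (path taken from
# the listing above; override it for a different installation).
YARN_SBIN="${YARN_SBIN:-/opt/module/hadoop-2.7.3/sbin}"

# Poll jps once per second until no local process named $1 is left.
# $2 is the maximum number of polls; returns 1 if the process survives.
wait_until_gone() {
    name=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        if ! jps | awk -v n="$name" '$2 == n { f = 1 } END { exit !f }'; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Stop YARN, confirm the local ResourceManager has exited, start it again.
restart_yarn() {
    "$YARN_SBIN/stop-yarn.sh"
    wait_until_gone ResourceManager 10 || return 1
    "$YARN_SBIN/start-yarn.sh"
}

# Usage: restart_yarn && jps
```

Waiting for the old process to disappear avoids the "running as process ... Stop it first" message seen when start-yarn.sh is run too early.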
6. Check the state of port 8088
[root@hadoop003 sbin]# lsof -i:8088
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 22907 root 200u IPv6 316494 0t0 TCP hadoop003:radan-http (LISTEN)
[root@hadoop003 sbin]# netstat -tunlp | grep 8088
tcp6 0 0 192.168.5.103:8088 :::* LISTEN 22907/java
[root@hadoop003 sbin]#
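With port 8088 listening again, the original symptom can be rechecked from the command line instead of a browser. The following is a minimal sketch: `ui_reachable` is a hypothetical helper, and treating any 2xx or 3xx status as "reachable" is an assumption rather than something the steps above verified.

```shell
# Succeed if the given URL answers with an HTTP 2xx or 3xx status.
# ui_reachable is a hypothetical helper built on curl.
ui_reachable() {
    # -s: no progress output, -o /dev/null: discard the body,
    # -w '%{http_code}': print only the status code,
    # --max-time 5: give up after 5 seconds instead of hanging.
    # curl prints 000 when it cannot connect at all.
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1") || code=000
    case "$code" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# Usage:
#   ui_reachable http://hadoop003:8088/cluster && echo "ResourceManager UI is up"
```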