HDFS on a Cloud Host Cannot Be Accessed from the Public Network

I. Problem Background
1. The cloud host runs Linux, with Hadoop set up in pseudo-distributed mode.
Public IP: 139.198.18.xxx
Internal IP: 192.168.137.2
Hostname: hadoop001
2. The local core-site.xml is configured as follows (note: hadoop.tmp.dir conventionally takes a local filesystem path rather than an HDFS URI, but the configuration is reproduced as it was):

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:9001</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>hdfs://hadoop001:9001/hadoop/tmp</value>
    </property>
</configuration>

3. The local hdfs-site.xml is configured as follows:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

4. The cloud host's hosts file:

[hadoop@hadoop001 ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

# hostname loopback address
  192.168.137.2   hadoop001

The cloud host maps the internal IP to the hostname hadoop001.
5. The local hosts file:

139.198.18.XXX     hadoop001

The local machine maps the public IP to the hostname hadoop001.
II. Symptoms
1. HDFS is started on the cloud host; jps shows no abnormal processes, and manipulating HDFS files through the shell works fine.
2. Accessing the web UI on port 50070 through a browser also works.
3. On the local machine, operating on remote HDFS files through the Java API works; the URI uses the public IP (via the hadoop001 hosts mapping). The code is as follows:

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf = new Configuration()
val uri = new URI("hdfs://hadoop001:9001")
val fs = FileSystem.get(uri, conf)
// Recursively list all files under /data
val listfiles = fs.listFiles(new Path("/data"), true)
while (listfiles.hasNext) {
  val nextfile = listfiles.next()
  println("get file path:" + nextfile.getPath().toString())
}
------------------------------ output ---------------------------------
get file path:hdfs://hadoop001:9001/data/infos.txt

4. On the local machine, reading the same file from HDFS with Spark SQL and converting it to a DataFrame:

import org.apache.spark.sql.SparkSession

object SparkSQLApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SparkSQLApp").master("local[2]").getOrCreate()
    // Resolved against the default FS from core-site.xml: hdfs://hadoop001:9001
    val info = spark.sparkContext.textFile("/data/infos.txt")
    import spark.implicits._
    val infoDF = info.map(_.split(",")).map(x => Info(x(0).toInt, x(1), x(2).toInt)).toDF()
    infoDF.show()
    spark.stop()
  }
  case class Info(id: Int, name: String, age: Int)
}

This fails with the following errors:

....
....
....
19/02/23 16:07:00 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
19/02/23 16:07:00 INFO HadoopRDD: Input split: hdfs://hadoop001:9001/data/infos.txt:0+17
19/02/23 16:07:21 WARN BlockReaderFactory: I/O error constructing remote block reader.
java.net.ConnectException: Connection timed out: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
.....
....
19/02/23 16:07:21 INFO DFSClient: Could not obtain BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 from any node: java.io.IOException: No live nodes contain block BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 after checking nodes = [DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK]], ignoredNodes = null No live nodes contain current block Block locations: DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK] Dead nodes:  DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK]. Will get new block locations from namenode and retry...
19/02/23 16:07:21 WARN DFSClient: DFS chooseDataNode: got # 1 IOException, will wait for 272.617680460432 msec.
19/02/23 16:07:42 WARN BlockReaderFactory: I/O error constructing remote block reader.
java.net.ConnectException: Connection timed out: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
...
...
19/02/23 16:07:42 WARN DFSClient: Failed to connect to /192.168.137.2:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection timed out: no further information
java.net.ConnectException: Connection timed out: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3499)
...
...
19/02/23 16:08:12 WARN DFSClient: Failed to connect to /192.168.137.2:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection timed out: no further information
java.net.ConnectException: Connection timed out: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
...
...
19/02/23 16:08:12 INFO DFSClient: Could not obtain BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 from any node: java.io.IOException: No live nodes contain block BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 after checking nodes = [DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK]], ignoredNodes = null No live nodes contain current block Block locations: DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK] Dead nodes:  DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK]. Will get new block locations from namenode and retry...
19/02/23 16:08:12 WARN DFSClient: DFS chooseDataNode: got # 3 IOException, will wait for 11918.913311370841 msec.
19/02/23 16:08:45 WARN BlockReaderFactory: I/O error constructing remote block reader.
java.net.ConnectException: Connection timed out: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
...
...
19/02/23 16:08:45 WARN DFSClient: Could not obtain block: BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 file=/data/infos.txt No live nodes contain current block Block locations: DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK] Dead nodes:  DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK]. Throwing a BlockMissingException
19/02/23 16:08:45 WARN DFSClient: Could not obtain block: BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 file=/data/infos.txt No live nodes contain current block Block locations: DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK] Dead nodes:  DatanodeInfoWithStorage[192.168.137.2:50010,DS-fb2e7244-165e-41a5-80fc-4bb90ae2c8cd,DISK]. Throwing a BlockMissingException
19/02/23 16:08:45 WARN DFSClient: DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 file=/data/infos.txt
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1001)
...
...
19/02/23 16:08:45 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 file=/data/infos.txt
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1001)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:648)
...
...
19/02/23 16:08:45 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
19/02/23 16:08:45 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
19/02/23 16:08:45 INFO TaskSchedulerImpl: Cancelling stage 0
19/02/23 16:08:45 INFO DAGScheduler: ResultStage 0 (show at SparkSQLApp.scala:30) failed in 105.618 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 file=/data/infos.txt
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1001)
...
...
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1358284489-192.168.137.2-1550394746448:blk_1073741840_1016 file=/data/infos.txt
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1001)
...
...

III. Analysis
1. Shell operations on the cloud host work fine, which rules out problems with the cluster setup or with processes not having started.
2. The cloud host has no firewall enabled, so a closed firewall is ruled out.
3. The cloud server's security rules open the DataNode data-transfer port (50010 by default).
4. I set up another VM on the same LAN as my local machine; the local machine can operate that VM's HDFS normally, which all but confirms the cause is the internal/external network split.
5. Reading up on this: HDFS keeps directory and file names on the NameNode, and such metadata operations do not require talking to DataNodes. Since creating directories and files works, local-to-NameNode communication is fine; the failure is therefore very likely in local-to-DataNode communication, which a quick connectivity check (sketched below) can confirm.
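A minimal way to verify this split from the developer machine is to test plain TCP reachability of the NameNode RPC port versus the DataNode transfer port. This is an illustrative sketch only, using the hosts and ports from this setup:

import java.net.{InetSocketAddress, Socket}

// Returns true if a TCP connection to host:port succeeds within timeoutMs
def reachable(host: String, port: Int, timeoutMs: Int = 5000): Boolean = {
  val socket = new Socket()
  try {
    socket.connect(new InetSocketAddress(host, port), timeoutMs)
    true
  } catch {
    case _: java.io.IOException => false
  } finally {
    socket.close()
  }
}

// NameNode RPC port: reachable (hadoop001 resolves to the public IP via hosts)
println("namenode 9001: " + reachable("hadoop001", 9001))
// DataNode transfer port at the internal IP the NameNode returned: times out
println("datanode 50010: " + reachable("192.168.137.2", 50010))
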
四孽江、問(wèn)題猜想
由于本地測(cè)試和云主機(jī)不在一個(gè)局域網(wǎng)讶坯,hadoop配置文件是以內(nèi)網(wǎng)ip作為機(jī)器間通信的ip。在這種情況下,我們能夠訪問(wèn)到namenode機(jī)器岗屏,namenode會(huì)給我們數(shù)據(jù)所在機(jī)器的ip地址供我們?cè)L問(wèn)數(shù)據(jù)傳輸服務(wù)辆琅,但是當(dāng)寫數(shù)據(jù)的時(shí)候,NameNode 和DataNode 是通過(guò)內(nèi)網(wǎng)通信的这刷,返回的是datanode內(nèi)網(wǎng)的ip,我們無(wú)法根據(jù)該IP訪問(wèn)datanode服務(wù)器婉烟。
我們來(lái)看一下其中一部分報(bào)錯(cuò)信息:

19/02/23 16:07:21 WARN BlockReaderFactory: I/O error constructing remote block reader.
java.net.ConnectException: Connection timed out: no further information
...
19/02/23 16:07:42 WARN DFSClient: Failed to connect to /192.168.137.2:50010 for block, add to deadNodes and continue....

The errors show the client cannot connect to 192.168.137.2:50010, i.e. the DataNode's internal address; from outside, the DataNode is only reachable at 139.198.18.XXX:50010.
To let the development machine reach HDFS, we can address the cluster by hostname and have the NameNode return DataNode hostnames instead of IPs. (The sketch below shows how to inspect the addresses the NameNode actually hands back.)
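To see exactly which DataNode endpoints the NameNode hands out for a file, you can ask for its block locations through the standard FileSystem API. A hedged illustration, reusing the URI and path from this article:

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(new URI("hdfs://hadoop001:9001"), new Configuration())
val status = fs.getFileStatus(new Path("/data/infos.txt"))
// Each BlockLocation lists the DataNode endpoints the client will try to dial
for (loc <- fs.getFileBlockLocations(status, 0, status.getLen)) {
  println("names (addr:port): " + loc.getNames.mkString(", "))
  println("hosts (hostnames): " + loc.getHosts.mkString(", "))
}

With dfs.client.use.datanode.hostname disabled, the names come back as internal IPs; with it enabled, the client dials the hostnames instead.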
五定鸟、問(wèn)題解決
1.嘗試一:
在開發(fā)機(jī)器的hosts文件中配置datanode對(duì)應(yīng)的外網(wǎng)ip和域名(上文已經(jīng)配置)而涉,并且在與hdfs交互的程序中添加如下代碼:

val conf = new Configuration()
conf.set("dfs.client.use.datanode.hostname", "true")

The error persists. (A likely explanation: this standalone Configuration object is not the one Spark's HDFS client reads, so the setting never reaches the Spark job.)
2. Attempt two:

val spark = SparkSession
  .builder()
  .appName("SparkSQLApp")
  .master("local[2]")
  .config("dfs.client.use.datanode.hostname", "true")
  .getOrCreate()

The error persists.
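A likely reason this attempt fails is that Hadoop client properties set through the Spark builder are only copied into the Hadoop Configuration when they carry the spark.hadoop. prefix. The sketch below shows the prefixed form; the author did not retest this exact variant, so treat it as an assumption:

import org.apache.spark.sql.SparkSession

// Properties prefixed with "spark.hadoop." are forwarded into the
// Hadoop Configuration that Spark's HDFS client reads
val spark = SparkSession
  .builder()
  .appName("SparkSQLApp")
  .master("local[2]")
  .config("spark.hadoop.dfs.client.use.datanode.hostname", "true")
  .getOrCreate()
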
3. Attempt three:
Add the following to the client-side hdfs-site.xml:

<property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
</property>

It runs successfully.
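As an equivalent programmatic route (my assumption, not something tested in the original article), the same client-side switch can be set directly on the Hadoop Configuration that the Spark job actually uses:

// Set the switch on the Hadoop Configuration Spark's HDFS client reads
spark.sparkContext.hadoopConfiguration.set("dfs.client.use.datanode.hostname", "true")
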
Further reading also suggests adding the dfs.datanode.use.datanode.hostname property to hdfs-site.xml, so that DataNode-to-DataNode communication goes through hostnames as well:

<property>
    <name>dfs.datanode.use.datanode.hostname</name>
    <value>true</value>
</property>

This makes changing internal IPs simple and convenient, and makes data exchange between specific DataNodes easier. The trade-off is that a DNS resolution failure can take the whole Hadoop cluster down, so DNS must be kept reliable.

Summary: switch from the default IP-based access to hostname-based access.

六客叉、參考資料
https://blog.csdn.net/vaf714/article/details/82996860
https://www.cnblogs.com/krcys/p/9146329.html
https://blog.csdn.net/dominic_tiger/article/details/71773656
https://rainerpeter.wordpress.com/2014/02/12/connect-to-hdfs-running-in-ec2-using-public-ip-addresses/
