DataX mongodb導(dǎo)出數(shù)據(jù)到mysql

Python版本要為2
cmd亂碼解決:輸入CHCP 65001
數(shù)據(jù)庫(kù)中的數(shù)據(jù)中文亂碼解決:在json文件中jdbcUrl項(xiàng)加上:?characterEncoding=utf8

DataX介紹

安裝DataX

DataX下載地址
下載完成解壓至某個(gè)路徑下即可

查看配置模板

python datax.py -r {YOUR_READER} -w {YOUR_WRITER}

例如mysql:

C:\DataX\bin>python datax.py -r mysqlreader -w mysqlwriter

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.


Please refer to the mysqlreader document:
     https://github.com/alibaba/DataX/blob/master/mysqlreader/doc/mysqlreader.md

Please refer to the mysqlwriter document:
     https://github.com/alibaba/DataX/blob/master/mysqlwriter/doc/mysqlwriter.md

Please save the following configuration as a json file and  use
     python {DATAX_HOME}/bin/datax.py {JSON_FILE_NAME}.json
to run the job.

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": [],
                        "connection": [
                            {
                                "jdbcUrl": [],
                                "table": []
                            }
                        ],
                        "password": "",
                        "username": "",
                        "where": ""
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "column": [],
                        "connection": [
                            {
                                "jdbcUrl": "",
                                "table": []
                            }
                        ],
                        "password": "",
                        "preSql": [],
                        "session": [],
                        "username": "",
                        "writeMode": ""
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": ""
            }
        }
    }
}

配置mongodb2mysql.json文件

{
    "job": {
        "setting": {
            "speed": {
                "channel": 1
            }
        },
        "content": [{
            "reader": {
                "name": "mongodbreader",
                "parameter": {
                    "address": ["*.*.*.*:27017"],
                    "userName": "root",
                    "userPassword": "123456",
                    "dbName": "weixin",
                    "collectionName": "fileids_wxpy",
                    "column": [{
                        "index":0,
                        "name": "_id",
                        "type": "string"
                    }, {
                        "index":1,
                        "name": "crawler_time",
                        "type": "string"
                    }, {
                        "index":2,
                        "name": "file_url",
                        "type": "string"
                    }, {
                        "index":3,
                        "name": "flag",
                        "type": "string"
                    }, {
                        "index":4,
                        "name": "logo_url",
                        "type": "string"
                    }, {
                        "index":5,
                        "name": "source",
                        "type": "string"
                    }, {
                        "index":6,
                        "name": "update_date",
                        "type": "string"
                    }, {
                        "index":7,
                        "name": "update_time",
                        "type": "long"
                    }, {
                        "index":8,
                        "name": "wx_id",
                        "type": "string"
                    }, {
                        "index":9,
                        "name": "wx_name",
                        "type": "string"
                    }]
                }
            },
             "writer": {
                    "name": "mysqlwriter", 
                    "parameter": {
                        "column": [
                        "id",
                        "crawler_time",
                        "file_url",
                        "flag",
                        "logo_url",
                        "source",
                        "update_date",
                        "update_time",
                        "wx_id",
                        "wx_name"
                        ], 
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://*.*.*.*:3306/weixin?characterEncoding=utf8", 
                                "table": ["fileids_wxpy"]
                            }
                        ], 
                        "password": "123456", 
                        "username": "root"
                    }
                }
        }]
    }

}

mysql新建數(shù)據(jù)庫(kù) 官疲、表

create DATABASE weixin;
use weixin;
DROP TABLE IF EXISTS `fileids_wxpy`;
CREATE TABLE `fileids_wxpy` (
  `id` bigint(20) unsigned NOT NULL,
  `crawler_time` int(10) unsigned NOT NULL,
  `file_url` varchar(255) NOT NULL,
  `flag` varchar(255) NOT NULL,
  `logo_url` varchar(255) NOT NULL,
  `source` varchar(255) NOT NULL,
  `update_date` int(10) unsigned NOT NULL,
  `update_time` int(10) unsigned NOT NULL,
  `wx_id` varchar(255) NOT NULL,
  `wx_name` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

啟動(dòng)

C:\DataX\bin>python datax.py mongodb2mysql.json

報(bào)錯(cuò)

Caused by: com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address=*.*.*.*:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSecurityException: Exception authenticating MongoCredential{mechanism=null, userName='root', source='weixin', password=<hidden>, mechanismProperties={}}}, caused by {com.mongodb.MongoCommandException: Command failed with error 18: 'Authentication failed.' on server*.*.*.*:27017. The full response is { "ok" : 0.0, "errmsg" : "Authentication failed.", "code" : 18, "codeName" : "AuthenticationFailed" }}}]
        at com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
        at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
        at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
        at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
        at com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
        at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:201)
        at com.mongodb.operation.CountOperation.execute(CountOperation.java:206)
        at com.mongodb.operation.CountOperation.execute(CountOperation.java:53)
        at com.mongodb.Mongo.execute(Mongo.java:772)
        at com.mongodb.Mongo$2.execute(Mongo.java:759)
        at com.mongodb.MongoCollectionImpl.count(MongoCollectionImpl.java:185)
        at com.mongodb.MongoCollectionImpl.count(MongoCollectionImpl.java:165)
        at com.alibaba.datax.plugin.reader.mongodbreader.util.CollectionSplitUtil.doSplitInterval(CollectionSplitUtil.java:55)
        at com.alibaba.datax.plugin.reader.mongodbreader.util.CollectionSplitUtil.doSplit(CollectionSplitUtil.java:37)
        at com.alibaba.datax.plugin.reader.mongodbreader.MongoDBReader$Job.split(MongoDBReader.java:37)
        at com.alibaba.datax.core.job.JobContainer.doReaderSplit(JobContainer.java:732)
        at com.alibaba.datax.core.job.JobContainer.split(JobContainer.java:393)
        at com.alibaba.datax.core.job.JobContainer.start(JobContainer.java:117)
        ... 3 more

原因

MongoDB中每個(gè)數(shù)據(jù)庫(kù)之間是相互獨(dú)立的去团,都有獨(dú)立的權(quán)限,正確的做法是使用root賬號(hào)在【將要操作的數(shù)據(jù)庫(kù)】中創(chuàng)建一個(gè)【子賬號(hào)】,在用這個(gè)子賬號(hào)連接mongo

解決辦法

>use admin

switched to db admin

>db.auth("root","******")

1

>show dbs

admin

local

weixin

>use weixin

switched to db weixin

>db.createUser(
        {
            user:"DataXTest",
            pwd:"123456",
            roles:[{role:"dbOwner",db:"weixin"}]
        }
)

Successfully added user: {
    "user" : "DataXTest",
    "roles" : [
        {
            "role" : "dbOwner",
            "db" : "weixin"
        }
    ]

}

使用DataXTest來(lái)替換jsono配置文件中mongodb的賬號(hào)root后,再次運(yùn)行

C:\DataX\bin>python datax.py mongodb2mysql.json
2019-06-13 14:39:40.218 [job-0] INFO  JobContainer - PerfTrace not enable!
2019-06-13 14:39:40.219 [job-0] INFO  StandAloneJobContainerCommunicator - Total 50115 records, 17716504 bytes | Speed 36.04KB/s, 104 records/s | Error 12 records, 3513 bytes |  All Task WaitWriterTime 259.684s |  All Task WaitReaderTime 207.041s | Percentage 100.00%
2019-06-13 14:39:40.221 [job-0] INFO  JobContainer -
任務(wù)啟動(dòng)時(shí)刻                    : 2019-06-13 14:31:33
任務(wù)結(jié)束時(shí)刻                    : 2019-06-13 14:39:40
任務(wù)總計(jì)耗時(shí)                    :                487s
任務(wù)平均流量                    :           36.04KB/s
記錄寫入速度                    :            104rec/s
讀出記錄總數(shù)                    :               50115
讀寫失敗總數(shù)                    :                  12

注: 此處錯(cuò)誤的12條記錄是由于id 長(zhǎng)度超過(guò)19位

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
  • 序言:七十年代末,一起剝皮案震驚了整個(gè)濱河市敞映,隨后出現(xiàn)的幾起案子沸伏,更是在濱河造成了極大的恐慌糕珊,老刑警劉巖,帶你破解...
    沈念sama閱讀 216,997評(píng)論 6 502
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件毅糟,死亡現(xiàn)場(chǎng)離奇詭異红选,居然都是意外死亡,警方通過(guò)查閱死者的電腦和手機(jī)姆另,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 92,603評(píng)論 3 392
  • 文/潘曉璐 我一進(jìn)店門喇肋,熙熙樓的掌柜王于貴愁眉苦臉地迎上來(lái),“玉大人迹辐,你說(shuō)我怎么就攤上這事苟蹈。” “怎么了右核?”我有些...
    開(kāi)封第一講書人閱讀 163,359評(píng)論 0 353
  • 文/不壞的土叔 我叫張陵,是天一觀的道長(zhǎng)渺绒。 經(jīng)常有香客問(wèn)我贺喝,道長(zhǎng),這世上最難降的妖魔是什么宗兼? 我笑而不...
    開(kāi)封第一講書人閱讀 58,309評(píng)論 1 292
  • 正文 為了忘掉前任躏鱼,我火速辦了婚禮,結(jié)果婚禮上殷绍,老公的妹妹穿的比我還像新娘染苛。我一直安慰自己,他們只是感情好主到,可當(dāng)我...
    茶點(diǎn)故事閱讀 67,346評(píng)論 6 390
  • 文/花漫 我一把揭開(kāi)白布茶行。 她就那樣靜靜地躺著,像睡著了一般登钥。 火紅的嫁衣襯著肌膚如雪畔师。 梳的紋絲不亂的頭發(fā)上,一...
    開(kāi)封第一講書人閱讀 51,258評(píng)論 1 300
  • 那天牧牢,我揣著相機(jī)與錄音看锉,去河邊找鬼。 笑死塔鳍,一個(gè)胖子當(dāng)著我的面吹牛伯铣,可吹牛的內(nèi)容都是我干的。 我是一名探鬼主播轮纫,決...
    沈念sama閱讀 40,122評(píng)論 3 418
  • 文/蒼蘭香墨 我猛地睜開(kāi)眼腔寡,長(zhǎng)吁一口氣:“原來(lái)是場(chǎng)噩夢(mèng)啊……” “哼!你這毒婦竟也來(lái)了蜡感?” 一聲冷哼從身側(cè)響起蹬蚁,我...
    開(kāi)封第一講書人閱讀 38,970評(píng)論 0 275
  • 序言:老撾萬(wàn)榮一對(duì)情侶失蹤恃泪,失蹤者是張志新(化名)和其女友劉穎,沒(méi)想到半個(gè)月后犀斋,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體贝乎,經(jīng)...
    沈念sama閱讀 45,403評(píng)論 1 313
  • 正文 獨(dú)居荒郊野嶺守林人離奇死亡,尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點(diǎn)故事閱讀 37,596評(píng)論 3 334
  • 正文 我和宋清朗相戀三年叽粹,在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了览效。 大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
    茶點(diǎn)故事閱讀 39,769評(píng)論 1 348
  • 序言:一個(gè)原本活蹦亂跳的男人離奇死亡虫几,死狀恐怖锤灿,靈堂內(nèi)的尸體忽然破棺而出,到底是詐尸還是另有隱情辆脸,我是刑警寧澤但校,帶...
    沈念sama閱讀 35,464評(píng)論 5 344
  • 正文 年R本政府宣布,位于F島的核電站啡氢,受9級(jí)特大地震影響状囱,放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜倘是,卻給世界環(huán)境...
    茶點(diǎn)故事閱讀 41,075評(píng)論 3 327
  • 文/蒙蒙 一亭枷、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧搀崭,春花似錦叨粘、人聲如沸。這莊子的主人今日做“春日...
    開(kāi)封第一講書人閱讀 31,705評(píng)論 0 22
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽(yáng)。三九已至轰传,卻和暖如春冻晤,著一層夾襖步出監(jiān)牢的瞬間,已是汗流浹背绸吸。 一陣腳步聲響...
    開(kāi)封第一講書人閱讀 32,848評(píng)論 1 269
  • 我被黑心中介騙來(lái)泰國(guó)打工鼻弧, 沒(méi)想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留,地道東北人锦茁。 一個(gè)月前我還...
    沈念sama閱讀 47,831評(píng)論 2 370
  • 正文 我出身青樓攘轩,卻偏偏與公主長(zhǎng)得像,于是被迫代替她去往敵國(guó)和親码俩。 傳聞我的和親對(duì)象是個(gè)殘疾皇子度帮,可洞房花燭夜當(dāng)晚...
    茶點(diǎn)故事閱讀 44,678評(píng)論 2 354

推薦閱讀更多精彩內(nèi)容