Doris Series 4: A Simple Comparison of Doris, Hive, and Spark

1. Test Environment Setup

Test environment:
Doris: 4 virtual machines, each with 4 cores, 8 GB RAM, and a 150 GB ordinary disk.
Hive: 4 virtual machines with the same specs (4 cores, 8 GB RAM, 150 GB ordinary disk).
Hive on Spark: same cluster as Hive.

Test data:
Roughly 700 million rows (the exact count, per the queries below, is 767,830,000).

hive> desc ods_fact_sale_orc;
OK
id                      bigint                                      
sale_date               string                                      
prod_name               string                                      
sale_nums               int                                         
Time taken: 0.202 seconds, Fetched: 4 row(s)
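
For reference, here is a minimal sketch of the Hive DDL implied by the schema above. Only the column list comes from the desc output; STORED AS ORC is an assumption based on the table name.

-- Hypothetical DDL; column list taken from the desc output above,
-- ORC storage assumed from the table name.
CREATE TABLE ods_fact_sale_orc (
    id        BIGINT,
    sale_date STRING,
    prod_name STRING,
    sale_nums INT
)
STORED AS ORC;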

Test queries:

select * from ods_fact_sale_orc where id = 100;
select count(*) from ods_fact_sale_orc;

Note:
Because the tests ran on low-spec virtual machines and every setting was left at its default, the results should be taken only as a rough reference.

2. Test Results

2.1 Hive

hive> select * from ods_fact_sale_orc where id = 100;
Query ID = root_20211208104736_0c2ab392-2f47-4b42-94fc-c2daf14a2f7d
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
21/12/08 10:47:38 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm69
Starting Job = job_1638236643110_0032, Tracking URL = http://hp3:8088/proxy/application_1638236643110_0032/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1638236643110_0032
Hadoop job information for Stage-1: number of mappers: 9; number of reducers: 0
2021-12-08 10:47:45,948 Stage-1 map = 0%,  reduce = 0%
2021-12-08 10:47:55,366 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 5.48 sec
2021-12-08 10:47:58,483 Stage-1 map = 56%,  reduce = 0%, Cumulative CPU 44.13 sec
2021-12-08 10:48:05,773 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 54.31 sec
2021-12-08 10:48:07,831 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 63.71 sec
2021-12-08 10:48:08,867 Stage-1 map = 89%,  reduce = 0%, Cumulative CPU 73.08 sec
2021-12-08 10:48:09,896 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 82.58 sec
MapReduce Total cumulative CPU time: 1 minutes 22 seconds 580 msec
Ended Job = job_1638236643110_0032
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 9   Cumulative CPU: 82.58 sec   HDFS Read: 2150344344 HDFS Write: 830 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 22 seconds 580 msec
OK
100     2012-07-09 00:00:00.0   PROD9   38
Time taken: 34.411 seconds, Fetched: 1 row(s)
hive> 
    > select count(*) from ods_fact_sale_orc;
Query ID = root_20211208104925_0b4ece78-2735-4ebb-bab9-431aae074e11
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
21/12/08 10:49:25 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm69
Starting Job = job_1638236643110_0033, Tracking URL = http://hp3:8088/proxy/application_1638236643110_0033/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1638236643110_0033
Hadoop job information for Stage-1: number of mappers: 9; number of reducers: 1
2021-12-08 10:49:32,158 Stage-1 map = 0%,  reduce = 0%
2021-12-08 10:49:39,433 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 3.87 sec
2021-12-08 10:49:40,465 Stage-1 map = 22%,  reduce = 0%, Cumulative CPU 9.17 sec
2021-12-08 10:49:41,497 Stage-1 map = 56%,  reduce = 0%, Cumulative CPU 25.21 sec
2021-12-08 10:49:45,624 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 30.72 sec
2021-12-08 10:49:46,653 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 35.9 sec
2021-12-08 10:49:47,685 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 46.3 sec
2021-12-08 10:49:53,847 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 48.87 sec
MapReduce Total cumulative CPU time: 48 seconds 870 msec
Ended Job = job_1638236643110_0033
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 9  Reduce: 1   Cumulative CPU: 48.87 sec   HDFS Read: 1992485 HDFS Write: 109 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 48 seconds 870 msec
OK
767830000
Time taken: 29.842 seconds, Fetched: 1 row(s)
hive> 

2.2 Hive on Spark

hive> 
    > set hive.execution.engine=spark;
hive> select count(*) from ods_fact_sale_orc;
Query ID = root_20211208105129_f4675518-560e-43e7-b4d7-c4db0a5d52c4
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Running with YARN Application = application_1638236643110_0034
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/yarn application -kill application_1638236643110_0034
Hive on Spark Session Web UI URL: http://hp4:10421

Query Hive on Spark job[0] stages: [0, 1]
Spark job[0] status = RUNNING
--------------------------------------------------------------------------------------
          STAGES   ATTEMPT        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
--------------------------------------------------------------------------------------
Stage-0 ........         0      FINISHED      9          9        0        0       0  
Stage-1 ........         0      FINISHED      1          1        0        0       0  
--------------------------------------------------------------------------------------
STAGES: 02/02    [==========================>>] 100%  ELAPSED TIME: 11.12 s    
--------------------------------------------------------------------------------------
Spark job[0] finished successfully in 11.12 second(s)
Spark Job[0] Metrics: TaskDurationTime: 25265, ExecutorCpuTime: 15802, JvmGCTime: 469, BytesRead / RecordsRead: 1979283 / 749902, BytesReadEC: 0, ShuffleTotalBytesRead / ShuffleRecordsRead: 522 / 9, ShuffleBytesWritten / ShuffleRecordsWritten: 522 / 9
OK
767830000
Time taken: 30.556 seconds, Fetched: 1 row(s)
hive> 
    > select * from ods_fact_sale_orc where id = 100;
Query ID = root_20211208105212_f5968f86-89ab-4e44-b712-4a687a7bec7f
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Running with YARN Application = application_1638236643110_0034
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/yarn application -kill application_1638236643110_0034
Hive on Spark Session Web UI URL: http://hp4:10421

Query Hive on Spark job[1] stages: [2]
Spark job[1] status = RUNNING
--------------------------------------------------------------------------------------
          STAGES   ATTEMPT        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
--------------------------------------------------------------------------------------
Stage-2 ........         0      FINISHED      9          9        0        0       0  
--------------------------------------------------------------------------------------
STAGES: 01/01    [==========================>>] 100%  ELAPSED TIME: 15.08 s    
--------------------------------------------------------------------------------------
Spark job[1] finished successfully in 15.08 second(s)
Spark Job[1] Metrics: TaskDurationTime: 49170, ExecutorCpuTime: 42275, JvmGCTime: 2493, BytesRead / RecordsRead: 2150340258 / 749902, BytesReadEC: 0, ShuffleTotalBytesRead / ShuffleRecordsRead: 0 / 0, ShuffleBytesWritten / ShuffleRecordsWritten: 0 / 0
OK
100     2012-07-09 00:00:00.0   PROD9   38
Time taken: 15.237 seconds, Fetched: 1 row(s)
hive> 

2.3 Doris

mysql> select * from table3 where id = 100;
+------+-----------------------+-----------+-----------+
| id   | sale_date             | prod_name | sale_nums |
+------+-----------------------+-----------+-----------+
|  100 | 2012-07-09 00:00:00.0 | PROD9     |        38 |
+------+-----------------------+-----------+-----------+
1 row in set (0.03 sec)

mysql> select count(*) from table3;
+-----------+
| count(*)  |
+-----------+
| 767830000 |
+-----------+
1 row in set (17.92 sec)
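
The post does not show the DDL for table3. Below is a hypothetical Doris definition that matches the schema and the author's Aggregate-model remark in section 2.4; the key columns, bucket count, and replication number are all assumptions.

-- Hypothetical DDL; the Aggregate model is inferred from the remark in 2.4,
-- and the distribution/replication settings are illustrative defaults.
CREATE TABLE table3 (
    id        BIGINT,
    sale_date VARCHAR(32),
    prod_name VARCHAR(64),
    sale_nums INT SUM
)
AGGREGATE KEY(id, sale_date, prod_name)
DISTRIBUTED BY HASH(id) BUCKETS 32
PROPERTIES ("replication_num" = "3");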

2.4 Results Summary

Product          Single-row lookup    Full-table count(*)
Hive             34 s                 30 s
Hive on Spark    15 s                 11 s
Doris            0.03 s               18 s

(Note: the 11 s figure for Hive on Spark is the Spark job's elapsed time; the wall-clock "Time taken" for that query, the first on a fresh Spark session, was 30.6 s including session startup.)

Two simple conclusions can be drawn:

  1. For single-row lookups, Doris outperforms Hive and Hive on Spark by orders of magnitude.
  2. For the full-table count(*), Doris is slightly slower than Hive on Spark.

As noted in the official documentation, count(*) against an Aggregate-model table is a somewhat unfair test for Doris.


[Screenshot from the Doris documentation describing count(*) behavior on Aggregate-model tables.]
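
The documentation's point, in short: in the Aggregate model, rows sharing the same key may not yet be merged on disk, so count(*) cannot be answered from stored row counts and must read and aggregate the key columns at query time. The workaround it suggests is to add a SUM value column that is always loaded as 1 and to query its sum instead of count(*). A minimal sketch under that assumption (the column name cnt is illustrative):

-- Add a value column that every load sets to 1 per source row.
ALTER TABLE table3 ADD COLUMN cnt BIGINT SUM DEFAULT "1";

-- Once loaded data carries cnt = 1, the row count becomes a cheap pre-aggregated sum:
SELECT SUM(cnt) FROM table3;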

后面的測試,有待更新......
