Environment preparation
- jdk1.8.0_301
- scala-2.11.8
- spark-2.4.8-bin-hadoop2.7
- hadoop-2.7.6 (required for Spark on YARN)
- Working directory: /root/***/packages/
- Current machine: bigdata112
1. Local mode
Install JDK
wget https://download.oracle.com/otn/java/jdk/8u301-b09/d3c52aa6bfa54d3ca74e617f18309292/jdk-8u301-linux-x64.tar.gz?AuthParam=1631169458_b753f63069d375ab0a6a52e1d9cd9013
(Note: the AuthParam token in Oracle's download link is session-bound and expires, so fetch a fresh link from Oracle's site if the download fails.)
tar xzvf jdk-8u301-linux-x64.tar.gz -C ../software/
- Configure the environment variables:
vim ~/.profile
and add:
export JAVA_HOME=/root/***/software/jdk1.8.0_301
export PATH=$PATH:$JAVA_HOME/bin
- Apply the environment variables:
source ~/.profile
- Verify the installation:
java -version
Output like the following indicates a successful installation:
java version "1.8.0_301"
Java(TM) SE Runtime Environment (build 1.8.0_301-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.301-b09, mixed mode)
Install Scala
wget https://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz
tar xzvf scala-2.11.8.tgz -C ../software/
- Configure the environment variables:
vim ~/.profile
and add:
export SCALA_HOME=/root/***/software/scala-2.11.8
export PATH=$PATH:$SCALA_HOME/bin
- Apply the environment variables:
source ~/.profile
- Verify the installation:
scala
Output like the following indicates a successful installation (type :q to exit the REPL):
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_301).
Type in expressions for evaluation. Or try :help.
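As a quick non-interactive check, you can also evaluate a one-liner straight from the shell with the scala runner's -e flag:
scala -e 'println(s"Scala is working: 2 + 2 = ${2 + 2}")'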
Install Spark
wget https://archive.apache.org/dist/spark/spark-2.4.8/spark-2.4.8-bin-hadoop2.7.tgz
tar xzvf spark-2.4.8-bin-hadoop2.7.tgz -C ../software/
- Configure the environment variables:
vim ~/.profile
and add:
export SPARK_HOME=/root/***/software/spark-2.4.8-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
- Apply the environment variables:
source ~/.profile
- Verify the installation:
spark-shell
Output like the following indicates a successful installation (type :q to exit):
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.8
      /_/
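To confirm that local execution works end to end, you can also run the SparkPi example that ships in the distribution's examples directory; run-example falls back to local mode when no master is configured:
$SPARK_HOME/bin/run-example SparkPi 10
A line like "Pi is roughly 3.14..." near the end of the output means jobs are executing correctly.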
2. Standalone mode
| hostname | role |
| --- | --- |
| bigdata112 | master |
| bigdata113 | worker |
| bigdata114 | worker |
| bigdata115 | worker |
- Building on the Local mode setup (current machine: bigdata112):
cd ../software/spark-2.4.8-bin-hadoop2.7/conf/
cp spark-env.sh.template spark-env.sh
vim spark-env.sh
export JAVA_HOME=/root/***/software/jdk1.8.0_301
export SCALA_HOME=/root/***/software/scala-2.11.8
export SPARK_HOME=/root/***/software/spark-2.4.8-bin-hadoop2.7
export SPARK_EXECUTOR_MEMORY=5G   # memory allocated to each executor
export SPARK_EXECUTOR_CORES=2     # cores used by each executor
export SPARK_WORKER_CORES=2       # cores each worker offers to applications
cp slaves.template slaves
vim slaves
bigdata113
bigdata114
bigdata115
- Copy the Spark directory to the other machines (the environment variables must stay consistent as well; see the sketch after these commands):
scp -r /root/***/software/spark-2.4.8-bin-hadoop2.7 bigdata113:/root/***/software/
scp -r /root/***/software/spark-2.4.8-bin-hadoop2.7 bigdata114:/root/***/software/
scp -r /root/***/software/spark-2.4.8-bin-hadoop2.7 bigdata115:/root/***/software/
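One way to keep the environment variables in sync is to push ~/.profile to each node as well; a minimal sketch, assuming every machine uses the same user and home directory layout:
for h in bigdata113 bigdata114 bigdata115; do
  scp ~/.profile $h:~/
done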
- Start the master (its web UI listens on port 8080 by default):
/root/***/software/spark-2.4.8-bin-hadoop2.7/sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /root/***/software/spark-2.4.8-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-***2021.out
- Start the slaves (each worker's web UI listens on port 8081 by default):
/root/***/software/spark-2.4.8-bin-hadoop2.7/sbin/start-slaves.sh
bigdata113: starting org.apache.spark.deploy.worker.Worker, logging to /root/***/software/spark-2.4.8-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-***2021.out
bigdata114: starting org.apache.spark.deploy.worker.Worker, logging to /root/***/software/spark-2.4.8-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-2-***2021.out
bigdata115: starting org.apache.spark.deploy.worker.Worker, logging to /root/***/software/spark-2.4.8-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-3-***2021.out
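You can verify the daemons with jps (a JDK tool); a Master process should be running on bigdata112 and a Worker on each of the other nodes:
jps                  # on bigdata112: expect Master in the list
ssh bigdata113 jps   # expect Worker (likewise for bigdata114 and bigdata115)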
- Run:
spark-shell --master spark://***2021:7077
and the output should report master = spark://***2021:7077:
Spark context available as 'sc' (master = spark://***2021:7077, app id = app-20210909163213-0001).
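To exercise the workers rather than just the shell, you can submit the bundled SparkPi example to the standalone master; a sketch, assuming the examples jar is at its default location inside the distribution:
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://***2021:7077 \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.8.jar 100
While it runs, the master web UI (port 8080) should list the application together with its executors on the workers.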
3. Spark on YARN mode
- For installing Hadoop/YARN, see my other post: Hadoop三種模式的安裝與配置 (installing and configuring Hadoop in its three modes)
- Building on the Standalone setup, add the locations of the Hadoop and YARN configuration files to spark-env.sh:
export HADOOP_CONF_DIR=/root/***/software/hadoop-2.7.6/etc/hadoop
export YARN_CONF_DIR=/root/***/software/hadoop-2.7.6/etc/hadoop
- Copy the updated spark-env.sh to the other machines
- Start Hadoop and YARN (there is no need to start Spark's Master or slaves: in this mode YARN takes over resource management and launches executors in their place)
- Run:
spark-shell --master yarn --deploy-mode client
and the output should report master = yarn:
Spark context available as 'sc' (master = yarn, app id = application_1560334779290_0001).
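For a non-interactive check, the same SparkPi example can be submitted to YARN in cluster mode; again a sketch, assuming the default examples jar path:
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.8.jar 100
The application then appears in the YARN ResourceManager UI (port 8088 by default), and SparkPi's output ends up in the driver container's log.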