Source: Keduo Big Data
Many students who have some understanding of big data are still unfamiliar with its commonly used commands. The Keduo Big Data instructors have therefore compiled the commands most frequently used in Hadoop training; follow along below.
Disable the firewall on every server
systemctl daemon-reload   (master node)
systemctl stop firewalld
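If the firewall should stay off after a reboot as well, the service can be disabled permanently; a small sketch using standard systemctl subcommands:
systemctl disable firewalld   # do not start firewalld at boot
systemctl status firewalld    # confirm the service is inactive (dead)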
Delete and recreate the data folders
mkdir /opt/tmp
rm -fr /usr/hadoop/name
rm -fr /usr/hadoop/data
mkdir /usr/hadoop/name
mkdir /usr/hadoop/data
Format the NameNode
hdfs namenode -format
Start HDFS
/usr/hadoop/sbin/start-dfs.sh
/usr/hadoop/sbin/start-yarn.sh
Stop HDFS
/usr/hadoop/sbin/stop-yarn.sh
/usr/hadoop/sbin/stop-dfs.sh
cd /usr/hadoop/sbin
Leave safe mode
hdfs dfsadmin -safemode leave   # run only after Hadoop has started; running it on the master node is enough
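To confirm that HDFS has really left safe mode, the current state can be queried with the same dfsadmin tool:
hdfs dfsadmin -safemode get   # prints "Safe mode is OFF" once the NameNode has left safe mode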
sbin/hadoop-daemon.sh start secondarynamenode
Start ZooKeeper
4.1 Start ZooKeeper on every server
/usr/zookeeper/bin/zkServer.sh start
cd /usr/zookeeper
4.2 After starting it on every server, check the status
/usr/zookeeper/bin/zkServer.sh status
bin/zkServer.sh status   # check whether startup succeeded; the three machines elect one leader, the other two become followers
cd /usr/zookeeper
/usr/zookeeper/bin/zkServer.sh status
Start HBase
start-hbase.sh   # no full path needed because the environment variable is configured; run on the master node only
Start Spark
cd /usr/spark/sbin
./start-all.sh
Start the Hive metastore
hive --service metastore   # must be running before launching hive (important); then open a new session window
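Because the metastore occupies the current terminal, it is often run in the background instead; a minimal sketch (the log path is only an example):
nohup hive --service metastore > /tmp/metastore.log 2>&1 &   # keeps the metastore running after the shell is closed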
Log in to MySQL
mysql -u root -p   # enter the password Mysql5718% at the prompt
Force-delete a folder
rm -fr /opt/spark
Change the hostname
[root@bogon ~]# hostname slave1
[root@bogon ~]# hostname slave2
[root@bogon ~]# hostname slave3
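Note that hostname only changes the name until the next reboot; on a systemd host (the servers above already use systemctl) it can be made permanent, for example:
hostnamectl set-hostname slave1   # persists across reboots; use slave2/slave3 on the other machines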
hive> set -v;
Change the system time
date -s 14:24:00
6. View logs
cat /usr/hadoop/logs/hadoop-root-datanode-slave1.log
7. MySQL password: Mysql5718%
Connect: mysql -u root -p
Grant remote access
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'Mysql5718%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'master' IDENTIFIED BY '12345678' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY '12345678' WITH GRANT OPTION;
schematool -dbType mysql -initSchema
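After initialization, the metastore schema can be checked; a quick verification, assuming the MySQL connection configured in hive-site.xml:
schematool -dbType mysql -info   # shows the metastore schema version recorded in MySQL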
# reboot            # restart the host
# shutdown -h now   # shut down immediately
# poweroff
*********************************
Important configuration files
vi /etc/hosts
cd /etc/sysconfig
vi network   ( i.e. vi /etc/sysconfig/network )
cd /usr/hadoop/etc/hadoop/
vi yarn-site.xml
vi hdfs-site.xml
vi core-site.xml
vi mapred-site.xml
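As an illustration of what these files carry, here is a minimal core-site.xml sketch; the host name master and port 9000 are assumptions, and a real file usually holds additional properties such as hadoop.tmp.dir:
<configuration>
  <property>
    <name>fs.defaultFS</name>         <!-- the NameNode URI that all clients connect to -->
    <value>hdfs://master:9000</value> <!-- assumed host/port; match your own cluster -->
  </property>
</configuration>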
******************************
1篡悟、查看mysql版本
方法一:status;
方法二:select version();
2、Mysql啟動匾寝、停止搬葬、重啟常用命令
service
mysqld start
service
mysqld stop
service
mysqld restart
*********************************
Remote copy
scp -r /usr/hadoop root@192.168.50.131:/usr/
*********************************
Upload local files to HDFS
[root@master bin]# hadoop fs -put /usr/hadoop/file/file1.txt /usr/hadoop/input
hadoop fs -put /usr/spark/lib/spark-* /spark-jars
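When several files are uploaded at once, hadoop fs -put expects the destination directory to already exist, so it is worth creating it first; a small sketch using the standard -mkdir flag:
hadoop fs -mkdir -p /spark-jars         # -p also creates missing parent directories
hadoop fs -mkdir -p /usr/hadoop/input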
*********************************
Run a MapReduce jar: jar path, class name, input, output
[root@master sbin]# hadoop jar /usr/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output
*********************************
View the output
[root@master sbin]# hadoop fs -cat /usr/hadoop/output1/*
hadoop fs -ls /spark-jars
hdfs dfs -ls /spark-jars
*********************************
Build Spark
./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided,-Dscala-2.11" -rf :spark-mllib-local_2.11
./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided,-Dscala-2.11" -rf :spark-hive_2.11
./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided,-Dscala-2.11" -rf :spark-repl_2.11
*********************************
Build Hive
mvn clean install -Phadoop-2,dist -DskipTests -Dhadoop-23.version=2.7.1 -Dspark.version=2.0.3
mvn clean install -Phadoop-2,dist -DskipTests
The following command generates hive_code_source/packaging/target/apache-hive-2.1.1-bin.tar.gz
mvn clean package -Pdist -Dmaven.test.skip=true
*********************************
Edit the settings.xml file under Maven's conf folder.
Set the Maven local repository:
<localRepository>/home/devil/maven_repos</localRepository>
mvn clean install -DskipTests -X
*********************************
Check the Hive and HBase versions
hive --version
hbase shell   # the shell banner shows the HBase version
*****************************
Copy files without prompting
yes | cp -fr /opt/hive211/conf/* /opt/hive2.1.1/conf
cp /usr/spark/jars/scala-* /opt/hive2.1.1/lib
***************
spark-shell
cd /usr/spark/bin
**************
netstat -tunlp | grep 4040
netstat -tunlp | grep java
*****************
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client lib/spark-examples-1.6.3-hadoop2.4.0.jar 10
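Once the job is submitted in yarn mode it can be tracked from the command line; a quick check using the standard YARN CLI:
yarn application -list             # lists running applications with their application IDs
yarn logs -applicationId <appId>   # fetches the logs after the job finishes (substitute the real ID)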
*****************************
Get the PID of a process by name with ps
The steps to get the PID of a named process with ps are as follows:
1. Print all processes with their names and PIDs
ps -ef
You will get output roughly like this:
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:01 ?        00:00:00 /sbin/init
root         2     0  0 09:01 ?        00:00:00 [kthreadd]
root         3     2  0 09:01 ?        00:00:00 [ksoftirqd/0]
root         5     2  0 09:01 ?        00:00:00 [kworker/u:0]
root         6     2  0 09:01 ?        00:00:00 [migration/0]
root         7     2  0 09:01 ?        00:00:00 [watchdog/0]
root         8     2  0 09:01 ?        00:00:00 [migration/1]
root        10     2  0 09:01 ?        00:00:00 [ksoftirqd/1]
root        12     2  0 09:01 ?        00:00:00 [watchdog/1]
2. Filter by the process name
ps -ef | grep mysqld
You will get output roughly like this:
mysql      841     1  0 09:01 ?        00:00:02 /usr/sbin/mysqld
xwsoul    4532  4205  0 11:16 pts/0    00:00:00 grep --color=auto mysqld
3. This also matches our own grep mysqld command, so we need to exclude that line
ps -ef | grep mysqld | grep -v 'grep '
You will get output roughly like this:
mysql      841     1  0 09:01 ?        00:00:02 /usr/sbin/mysqld
4. Use awk to print the PID
ps -ef | grep mysqld | grep -v 'grep ' | awk '{print $2}'
You will get output roughly like this:
841
Similarly, to get the parent process ID (PPID), run:
ps -ef | grep mysqld | grep -v 'grep ' | awk '{print $3}'
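On most modern Linux systems the same result is available in a single command through pgrep (shipped with the procps package, so its presence is an assumption):
pgrep mysqld       # prints the PIDs of processes named mysqld
pgrep -f mysqld    # matches against the full command line instead of only the process name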
****************************
Building hive2.1.1 and spark2.0.2
1) spark
# components: mvn-3.3.9, jdk-1.8
# wget http://mirror.bit.edu.cn/apache/spark/spark-2.0.2/spark-2.0.2.tgz   --- download the source (for Hive on Spark, hive2.1.1 pairs with spark1.6.0)
# tar zxvf spark-2.0.2.tgz   --- extract
# cd spark-2.0.2/dev
## change the MVN path in make-distribution.sh to $M2_HOME/bin/mvn   --- check and install the mvn version required by pom.xml
## cd ..   --- switch back to spark-2.0.2
# ./dev/change-scala-version.sh 2.11   --- change the Scala version (not needed below 2.11)
# ./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"   --- the tarball is generated in the project root
2) hive
wget http://mirror.bit.edu.cn/apache/hive/hive-2.3.2/apache-hive-2.3.2-src.tar.gz
tar -zxf apache-hive-2.1.1-src.tar.gz
mv apache-hive-2.1.1-src.tar.gz hive2_1_1
cd /opt/hive2_1_1
Build Hive:
mvn clean package -Pdist -Dmaven.test.skip=true
**************************
Common Hadoop commands
1. List the contents of a directory: hadoop fs -ls [directory]
[root@cdh01 tmp]# hadoop fs -ls -h /tmp
Found 2 items
drwxrwxrwx   - hdfs supergroup          0 2016-01-21 10:24 /tmp/.cloudera_health_monitoring_canary_files
drwx-wx-wx   - hive supergroup          0 2016-01-21 10:02 /tmp/hive
[root@cdh01 tmp]# hadoop fs -ls -h /
Found 2 items
drwxrwxrwx   - hdfs supergroup          0 2016-01-21 10:02 /tmp
drwxrwxr-x   - hdfs supergroup          0 2016-01-21 10:01 /user
2. Store a local folder in HDFS: hadoop fs -put [local directory] [hadoop directory]
[root@cdh01 /]# mkdir test_put_dir              # create the directory
[root@cdh01 /]# chown hdfs:hadoop test_put_dir  # give the directory to the hadoop user
[root@cdh01 /]# su hdfs                         # switch to the hdfs user
[hdfs@cdh01 /]$ ls
bin boot dev dfs dfs_bak etc home lib lib64 lost+found media misc mnt net opt proc root sbin selinux srv sys test_put_dir tmp usr var wawa.txt wbwb.txt wyp.txt
[hdfs@cdh01 /]$ hadoop fs -put test_put_dir /
[hdfs@cdh01 /]$ hadoop fs -ls /
Found 4 items
drwxr-xr-x   - hdfs supergroup          0 2016-01-21 11:07 /hff
drwxr-xr-x   - hdfs supergroup          0 2016-01-21 15:25 /test_put_dir
drwxrwxrwt   - hdfs supergroup          0 2016-01-21 10:39 /tmp
drwxr-xr-x   - hdfs supergroup          0 2016-01-21 10:39 /user
3. Create a new directory in HDFS: hadoop fs -mkdir [directory path]
[root@cdh01 /]# su hdfs
[hdfs@cdh01 /]$ hadoop fs -mkdir /hff
4. Create an empty file in an HDFS directory with the touchz command:
[hdfs@cdh01 /]$ hadoop fs -touchz /test_put_dir/test_new_file.txt
[hdfs@cdh01 /]$ hadoop fs -ls /test_put_dir
Found 1 items
-rw-r--r--   3 hdfs supergroup          0 2016-01-21 15:29 /test_put_dir/test_new_file.txt
5. Store a local file in HDFS: hadoop fs -put [local path] [hadoop directory]
[hdfs@cdh01 /]$ hadoop fs -put wyp.txt /hff                           # plain HDFS path
[hdfs@cdh01 /]$ hadoop fs -put wyp.txt hdfs://cdh01.cap.com:8020/hff  # full server URI
Note: the file wyp.txt sits in the / root directory, whose listing looks like:
bin   dfs_bak  lib64       mnt   root     sys           var
boot  etc      lost+found  net   sbin     test_put_dir  wawa2.txt
dev   home     media       opt   selinux  tmp           wbwb.txt
dfs   lib      misc        proc  srv      usr           wyp.txt
6. Display an existing file: hadoop fs -cat [file_path]
[hdfs@cdh01 /]$ hadoop fs -cat /hff/wawa.txt
1張三 男 135
2劉麗 女 235
3王五 男 335
7. Rename a file in HDFS: hadoop fs -mv [old name] [new name]
[hdfs@cdh01 /]$ hadoop fs -mv /tmp /tmp_bak   # rename the folder
8. Download a file from HDFS to an existing local directory: hadoop fs -get [hdfs path] [local directory]
[hdfs@cdh01 /]$ hadoop fs -get /hff/wawa.txt /test_put_dir
[hdfs@cdh01 /]$ ls -l /test_put_dir/
total 4
-rw-r--r-- 1 hdfs hdfs 42 Jan 21 15:39 wawa.txt
9. Delete a file from HDFS: hadoop fs -rm [file path]
[hdfs@cdh01 /]$ hadoop fs -ls /test_put_dir/
Found 2 items
-rw-r--r--   3 hdfs supergroup          0 2016-01-21 15:41 /test_put_dir/new2.txt
-rw-r--r--   3 hdfs supergroup          0 2016-01-21 15:29 /test_put_dir/test_new_file.txt
[hdfs@cdh01 /]$ hadoop fs -rm /test_put_dir/new2.txt
16/01/21 15:42:24 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://cdh01.cap.com:8020/test_put_dir/new2.txt' to trash at: hdfs://cdh01.cap.com:8020/user/hdfs/.Trash/Current
[hdfs@cdh01 /]$ hadoop fs -ls /test_put_dir/
Found 1 items
-rw-r--r--   3 hdfs supergroup          0 2016-01-21 15:29 /test_put_dir/test_new_file.txt
10. Delete a folder (including its subdirectories) from HDFS: hadoop fs -rm -r [directory path]
[hdfs@cdh01 /]$ hadoop fs -rmr /test_put_dir
16/01/21 15:50:59 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://cdh01.cap.com:8020/test_put_dir' to trash at: hdfs://cdh01.cap.com:8020/user/hdfs/.Trash/Current
[hdfs@cdh01 /]$ hadoop fs -ls /
Found 3 items
drwxr-xr-x   - hdfs supergroup          0 2016-01-21 11:07 /hff
drwxrwxrwt   - hdfs supergroup          0 2016-01-21 10:39 /tmp
drwxr-xr-x   - hdfs supergroup          0 2016-01-21 15:42 /user
11. Merge everything under an HDFS directory into one file and download it locally
hadoop dfs -getmerge /user /home/t
12. Kill a running Hadoop job
hadoop job -kill [job-id]
13. Sqoop: migrate data from Oracle to HDFS
sqoop list-tables --connect jdbc:oracle:thin:@192.168.78.221:1521:orcl --username scott --password=123456
sqoop list-tables --connect jdbc:oracle:thin:@192.168.90.122:1521:xdc --username pdca --password=XXXXXX
sqoop import --connect jdbc:oracle:thin:@192.168.78.221:1521:orcl --username scott --password=123456 --table EMP -m 1 --target-dir /sqoop --direct-split-size 67108864
sqoop import -m 1 --connect jdbc:mysql://master:3306/mysql --username root --password Mysql5718% --table user --target-dir /user/hdfs/testdata/
./sqoop import --connect jdbc:mysql://master:3306/hive --table TBLS --username root --password Mysql5718% -m 1
sqoop import --append --connect jdbc:oracle:thin:@192.168.78.221:1521:orcl --username scott --password=123456 --table EMP --columns ename --hbase-table hive_hbase_test9 --hbase-row-key id --column-family empinfo
sqoop import --connect jdbc:oracle:thin:@192.168.78.221:1521:orcl --username scott --password=123456 --table EMP --warehouse-dir /user/nanyue/oracletest -m 1
sqoop import --hive-import --connect jdbc:oracle:thin:@192.168.78.221:1521:orcl --username scott --password 123456 --table EMP --hive-database default --hive-table poke1 -m 1
sqoop import --hive-import --connect jdbc:oracle:thin:@192.168.90.122:1521:xdc --username pdca --password XXXXXX --table PDCA_PROJECT_T --hive-database default --hive-table poke1 -m 1
sqoop import --connect jdbc:oracle:thin:@192.168.90.122:1521:xdc --username pdca --password XXXXX --table PDCA_AAB_LOOKUP_T --fields-terminated-by '\t' --hive-drop-import-delims --map-column-java CONTENT=String --hive-import --hive-overwrite --create-hive-table --hive-table poke1 --delete-target-dir;
sqoop import --connect jdbc:oracle:thin:@10.3.60.123:1521:xdc --username pdca --password xxxxxx --hive-import --table poke1;
sqoop import --hive-import --connect jdbc:oracle:thin:@10.3.60.123:1521:xdc --username pdca --password xxxxxx --table PDCA_MES_LINE_T --hive-database default --hive-table poke1 -m 1
-- import all tables
sqoop import-all-tables --connect jdbc:oracle:thin:@10.3.60.123:1521:xdc --username PDCA --password xxxxxx --hive-database DEFAULT -m 1 --create-hive-table --hive-import --hive-overwrite
-- import a single table
sqoop import --hive-import --connect jdbc:oracle:thin:@10.3.60.123:1521:xdc --username pdca --password xxxxxx --table PDCA_MES_LINE_T --hive-database default -m 1 --create-hive-table --hive-import --hive-overwrite
-- specify columns, enter the password interactively
sqoop import --connect jdbc:oracle:thin:@10.3.60.123:1521:xdc --username pdca -P --table PDCA_MES_LINE_T --columns 'MES_LINE_CODE,MES_LINE_NAME' --create-hive-table --target-dir /opt/hive2.1.1//tmp -m 1 --hive-table PDCA_MES_LINE_T_Test --hive-import -- --default-character-set=utf-8
-- specify columns, password given on the command line
sqoop import --connect jdbc:oracle:thin:@10.3.60.123:1521:xdc --username pdca --password xxxxxx --table PDCA_MES_LINE_T --columns 'MES_LINE_CODE,MES_LINE_NAME' --create-hive-table --target-dir /opt/hive2.1.1//tmp -m 1 --hive-table PDCA_MES_LINE_T_Test1 --hive-import -- --default-character-set=utf-8
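After an import it is worth checking that the data actually landed; a quick verification, assuming the target paths and Hive table names used above:
hadoop fs -ls /user/hdfs/testdata/      # files written by the HDFS import
hive -e 'select count(*) from poke1;'   # row count of the Hive import (hive -e runs a single statement)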
***************************************
Import into HBase
sqoop import --connect jdbc:oracle:thin:@10.3.60.123:1521:xdc --table PDCA_MES_LINE_T --hbase-table A --column-family mesline --hbase-row-key MES_LINE_CODE --hbase-create-table --username 'pdca' -P
****************************
Find where a Hive table is stored on HDFS
1. Run hive to enter the Hive shell
2. Run show databases; to list all databases
3. Run use origin_ennenergy_onecard; to switch to the origin_ennenergy_onecard database
4. Run show create table M_BD_T_GAS_ORDER_INFO_H; to see the table's storage path on HDFS (see the example session below)
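A minimal example session, using the database and table from the steps above; the HDFS path appears in the LOCATION clause of the output:
hive
hive> show databases;
hive> use origin_ennenergy_onecard;
hive> show create table M_BD_T_GAS_ORDER_INFO_H;   -- the LOCATION line, e.g. hdfs://<namenode>/user/hive/warehouse/..., is the storage path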
*****************************
Check port binding status
Check whether port 10000 is bound:
sudo netstat -nplt | grep 10000
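On newer systems where netstat is not installed, ss gives the same information (it ships with iproute2, so its presence is an assumption about the host):
ss -nplt | grep 10000   # -n numeric, -p show the owning process, -l listening sockets, -t TCP only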
That wraps up the commands commonly used in big data Hadoop training. Have you mastered them? For more big data material, visit Keduo Big Data at www.keduox.com.