19.1 Environment
- OS version: Red Hat 6.5
- CM version: CM 5.11.1
- Kerberos and Sentry are enabled on the cluster
- All operations are performed as the ec2-user user, which has sudo privileges
19.2 Preparation
19.2.1 Creating the Parent Directory for External Table Data
- Log in to Kerberos as the hive user
[root@ip-172-31-8-141 1874-hive-HIVESERVER2]# kinit -kt hive.keytab hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
[root@ip-172-31-8-141 1874-hive-HIVESERVER2]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
Valid starting Expires Service principal
09/01/17 11:10:54 09/02/17 11:10:54 krbtgt/CLOUDERA.COM@CLOUDERA.COM
renew until 09/06/17 11:10:54
[root@ip-172-31-8-141 1874-hive-HIVESERVER2]#
- Create the HDFS directory
- Use the following commands to create the Hive external table data directory /extwarehouse under the HDFS root
[root@ip-172-31-8-141 ec2-user]# hadoop fs -mkdir /extwarehouse
[root@ip-172-31-8-141 ec2-user]# hadoop fs -ls /
drwxr-xr-x - hive supergroup 0 2017-09-01 11:27 /extwarehouse
drwxrwxrwx - user_r supergroup 0 2017-08-23 03:23 /fayson
drwx------ - hbase hbase 0 2017-09-01 02:59 /hbase
drwxrwxrwt - hdfs supergroup 0 2017-08-31 06:18 /tmp
drwxrwxrwx - hdfs supergroup 0 2017-08-30 03:48 /user
[root@ip-172-31-8-141 ec2-user]# hadoop fs -chown hive:hive /extwarehouse
[root@ip-172-31-8-141 ec2-user]# hadoop fs -chmod 771 /extwarehouse
[root@ip-172-31-8-141 ec2-user]# hadoop fs -ls /
drwxrwx--x - hive hive 0 2017-09-01 11:27 /extwarehouse
drwxrwxrwx - user_r supergroup 0 2017-08-23 03:23 /fayson
drwx------ - hbase hbase 0 2017-09-01 02:59 /hbase
drwxrwxrwt - hdfs supergroup 0 2017-08-31 06:18 /tmp
drwxrwxrwx - hdfs supergroup 0 2017-08-30 03:48 /user
[root@ip-172-31-8-141 ec2-user]#
- Configure ACL synchronization for the external table data parent directory
- Make sure Sentry is enabled for HDFS and ACL synchronization is turned on
- Add the Hive external table data directory created in 19.2.1 to the Sentry synchronization paths
- After the configuration is complete, restart the affected services (a sanity-check sketch follows this list).
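- A minimal shell sketch (not part of the original walkthrough) for sanity-checking the prerequisites; it assumes /extwarehouse has been added to the HDFS "Sentry Synchronization Path Prefixes" property in Cloudera Manager and that "Enable Sentry Synchronization" is turned on (property names as remembered for CDH 5.x, verify against your CM version):
# Run from a cluster gateway after kinit (e.g. as the hive user).
# Sentry HDFS synchronization requires HDFS ACLs; this should print "true".
hdfs getconf -confKey dfs.namenode.acls.enabled
# The parent directory created in 19.2.1 should exist and be owned by hive:hive.
hadoop fs -ls / | grep extwarehouse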
19.3 Creating the Hive External Table
- Connect to Hive with the beeline command line and create the Hive external table
- Table creation statement:
create external table if not exists student(
name string,
age int,
addr string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/extwarehouse/student';
- Terminal session:
[root@ip-172-31-8-141 1874-hive-HIVESERVER2]# beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
...
0: jdbc:hive2://localhost:10000/> create external table if not exists student(
. . . . . . . . . . . . . . . . > name string,
. . . . . . . . . . . . . . . . > age int,
. . . . . . . . . . . . . . . . > addr string
. . . . . . . . . . . . . . . . > )
. . . . . . . . . . . . . . . . > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
. . . . . . . . . . . . . . . . > LOCATION '/extwarehouse/student';
...
INFO : OK
No rows affected (0.236 seconds)
0: jdbc:hive2://localhost:10000/>
- Load data into the student table
- Prepare the test data
[root@ip-172-31-8-141 student]# pwd
/home/ec2-user/student
[root@ip-172-31-8-141 student]# ll
total 4
-rw-r--r-- 1 root root 39 Sep 1 11:37 student.txt
[root@ip-172-31-8-141 student]# cat student.txt
zhangsan,18,guangzhou
lisi,20,shenzhen
[root@ip-172-31-8-141 student]#
- Put the student.txt file into the HDFS directory /tmp/student
[root@ip-172-31-8-141 student]# hadoop fs -mkdir /tmp/student
[root@ip-172-31-8-141 student]# ll
total 4
-rw-r--r-- 1 hive hive 39 Sep 1 11:37 student.txt
[root@ip-172-31-8-141 student]# hadoop fs -put student.txt /tmp/student
[root@ip-172-31-8-141 student]# hadoop fs -ls /tmp/student
Found 1 items
-rw-r--r-- 3 hive supergroup 39 2017-09-01 11:57 /tmp/student/student.txt
[root@ip-172-31-8-141 student]#
- In the beeline command line, load the data into the student table
0: jdbc:hive2://localhost:10000/> load data inpath '/tmp/student' into table student;
...
INFO : Table default.student stats: [numFiles=1, totalSize=39]
INFO : Completed executing command(queryId=hive_20170901115858_5a76aa76-1b24-40ce-8254-42991856c05b); Time taken: 0.263 seconds
INFO : OK
No rows affected (0.41 seconds)
0: jdbc:hive2://localhost:10000/>
- After the load command completes, query the table data
0: jdbc:hive2://localhost:10000/> select * from student;
...
INFO : OK
+---------------+--------------+---------------+--+
| student.name | student.age | student.addr |
+---------------+--------------+---------------+--+
| zhangsan | 18 | guangzhou |
| lisi | 20 | shenzhen |
+---------------+--------------+---------------+--+
2 rows selected (0.288 seconds)
0: jdbc:hive2://localhost:10000/>
19.4 Granting the fayson User Read Access to the student Table
- Initialize a Kerberos ticket with the fayson user's principal
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ kinit fayson
Password for fayson@CLOUDERA.COM:
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: fayson@CLOUDERA.COM
Valid starting Expires Service principal
09/01/17 12:27:39 09/02/17 12:27:39 krbtgt/CLOUDERA.COM@CLOUDERA.COM
renew until 09/08/17 12:27:39
[ec2-user@ip-172-31-8-141 cdh-shell-master]$
- Access the HDFS directory
[ec2-user@ip-172-31-8-141 ~]$ hadoop fs -ls /extwarehouse/student
ls: Permission denied: user=fayson, access=READ_EXECUTE, inode="/extwarehouse/student":hive:hive:drwxrwx--x
[ec2-user@ip-172-31-8-141 ~]$
- Check from the beeline command line
[ec2-user@ip-172-31-8-141 ~]$ beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
...
INFO : OK
+-----------+--+
| tab_name |
+-----------+--+
+-----------+--+
No rows selected (0.295 seconds)
0: jdbc:hive2://localhost:10000/> select * from student;
Error: Error while compiling statement: FAILED: SemanticException No valid privileges
User fayson does not have privileges for QUERY
The required privileges: Server=server1->Db=default->Table=student->Column=addr->action=select; (state=42000,code=40000)
0: jdbc:hive2://localhost:10000/>
- Check from the impala-shell command line
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ impala-shell
...
[Not connected] > connect ip-172-31-10-156.ap-southeast-1.compute.internal:21000;
Connected to ip-172-31-10-156.ap-southeast-1.compute.internal:21000
Server version: impalad version 2.8.0-cdh5.11.1 RELEASE (build 3382c1c488dff12d5ca8d049d2b59babee605b4e)
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > show tables;
Query: show tables
ERROR: AuthorizationException: User 'fayson@CLOUDERA.COM' does not have privileges to access: default.*
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > select * from student;
Query: select * from student
Query submitted at: 2017-09-01 12:33:06 (Coordinator: http://ip-172-31-10-156.ap-southeast-1.compute.internal:25000)
ERROR: AuthorizationException: User 'fayson@CLOUDERA.COM' does not have privileges to execute 'SELECT' on: default.student
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] >
- For the external table created by the hive user, before read access to the student table is granted, the fayson user cannot access the table's HDFS data directory (/extwarehouse/student) and cannot query the student table from either beeline or impala-shell.
- Grant the fayson user read access to the student table
- Note: all of the following operations are performed as a Hive admin user
- Create the student_read role
0: jdbc:hive2://localhost:10000/> create role student_read;
...
INFO : Executing command(queryId=hive_20170901124848_927878ba-0217-4a32-a508-bf29fed67be8): create role student_read
...
INFO : OK
No rows affected (0.104 seconds)
0: jdbc:hive2://localhost:10000/>
- Grant SELECT on the student table to the student_read role
0: jdbc:hive2://localhost:10000/> grant select on table student to role student_read;
...
INFO : Executing command(queryId=hive_20170901125252_8702d99d-d8eb-424e-929d-5df352828e2c): grant select on table student to role student_read
...
INFO : OK
No rows affected (0.111 seconds)
0: jdbc:hive2://localhost:10000/>
- Grant the student_read role to the fayson group (a verification sketch follows the transcript below)
0: jdbc:hive2://localhost:10000/> grant role student_read to group fayson;
...
INFO : Executing command(queryId=hive_20170901125454_5f27a87e-2f63-46d9-9cce-6f346a0c415c): grant role student_read to group fayson
...
INFO : OK
No rows affected (0.122 seconds)
0: jdbc:hive2://localhost:10000/>
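- A minimal shell sketch (an optional check, not in the original walkthrough) for verifying the new grants, reusing the HiveServer2 JDBC URL shown earlier; SHOW GRANT ROLE and SHOW ROLE GRANT GROUP are Sentry statements that require Hive admin privileges:
# Run as the Hive admin user (kinit with the hive keytab first).
beeline -u "jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM" -e "show grant role student_read;"
beeline -u "jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM" -e "show role grant group fayson;"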
- Test again: log in to Kerberos as the fayson user and access the HDFS directory
- Access the HDFS directory /extwarehouse/student where the student data resides
[ec2-user@ip-172-31-8-141 ~]$ hadoop fs -ls /extwarehouse/student
Found 1 items
-rwxrwx--x+ 3 hive hive 39 2017-09-01 14:42 /extwarehouse/student/student.txt
[ec2-user@ip-172-31-8-141 ~]$
- Query the student table from beeline
[ec2-user@ip-172-31-8-141 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: fayson@CLOUDERA.COM
Valid starting Expires Service principal
09/01/17 12:58:59 09/02/17 12:58:59 krbtgt/CLOUDERA.COM@CLOUDERA.COM
renew until 09/08/17 12:58:59
[ec2-user@ip-172-31-8-141 ~]$
[ec2-user@ip-172-31-8-141 ~]$ beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
...
INFO : OK
+-----------+--+
| tab_name |
+-----------+--+
| student |
+-----------+--+
1 row selected (0.294 seconds)
0: jdbc:hive2://localhost:10000/> select * from student;
...
INFO : OK
+---------------+--------------+---------------+--+
| student.name | student.age | student.addr |
+---------------+--------------+---------------+--+
| zhangsan | 18 | guangzhou |
| lisi | 20 | shenzhen |
+---------------+--------------+---------------+--+
2 rows selected (0.241 seconds)
0: jdbc:hive2://localhost:10000/>
- Query the student table from impala-shell
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: fayson@CLOUDERA.COM
Valid starting Expires Service principal
09/01/17 12:58:59 09/02/17 12:58:59 krbtgt/CLOUDERA.COM@CLOUDERA.COM
renew until 09/08/17 12:58:59
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ impala-shell
...
[Not connected] > connect ip-172-31-10-156.ap-southeast-1.compute.internal:21000;
Connected to ip-172-31-10-156.ap-southeast-1.compute.internal:21000
Server version: impalad version 2.8.0-cdh5.11.1 RELEASE (build 3382c1c488dff12d5ca8d049d2b59babee605b4e)
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > show tables;
Query: show tables
+---------+
| name |
+---------+
| student |
+---------+
Fetched 1 row(s) in 0.02s
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > select * from student;
...
+----------+-----+-----------+
| name | age | addr |
+----------+-----+-----------+
| zhangsan | 18 | guangzhou |
| lisi | 20 | shenzhen |
+----------+-----+-----------+
Fetched 2 row(s) in 0.13s
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] >
- For the external table created by the hive user, once the fayson user has been granted read access to the student table, fayson can access the table's HDFS data directory (/extwarehouse/student) and can query the student table from both beeline and impala-shell.
- With ACL synchronization enabled on the external table data parent directory, there is no need to maintain the permissions of the external table data directory separately (a spot-check sketch follows).
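- A minimal shell sketch (an assumption for illustration) of how the effect of ACL synchronization can be spot-checked as fayson; the extended ACL entries behind the trailing '+' in the listing above are maintained by Sentry, so no manual chown/chmod is needed on the data directory:
# Run after kinit fayson and after the grants from 19.4 have been applied.
hadoop fs -getfacl /extwarehouse/student
# Reading the data file directly also works once the role grant has been synchronized.
hadoop fs -cat /extwarehouse/student/student.txt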