Create the table
create 'test1', 'lf', 'sf'
-- lf: column family of LONG values (binary values)
-- sf: column family of STRING values
Insert data
put 'test1', 'user1|ts1', 'sf:c1', 'sku1'
put 'test1', 'user1|ts2', 'sf:c1', 'sku188'
put 'test1', 'user1|ts3', 'sf:s1', 'sku123'
put 'test1', 'user2|ts4', 'sf:c1', 'sku2'
put 'test1', 'user2|ts5', 'sf:c2', 'sku288'
put 'test1', 'user2|ts6', 'sf:s1', 'sku222'
The rowkey is a user (userX) plus the time of the action (tsX);
the value is the product (skuXXX), and the column qualifier records the action, e.g. c1: click from homepage; c2: click from ad; s1: search from homepage; b1: buy
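The schema above can be sketched as plain Python data, purely to make the rowkey/column design concrete (this is an illustration, not the HBase client API):

```python
# Illustrative sketch of the rowkey/column design above (not HBase API).
# rowkey = "<user>|<timestamp>", column = "sf:<action>", value = "<sku>"
def make_rowkey(user, ts):
    return f"{user}|{ts}"

ACTIONS = {
    "c1": "click from homepage",
    "c2": "click from ad",
    "s1": "search from homepage",
    "b1": "buy",
}

# The six puts from the example as (rowkey, column, value) cells:
cells = [
    (make_rowkey("user1", "ts1"), "sf:c1", "sku1"),
    (make_rowkey("user1", "ts2"), "sf:c1", "sku188"),
    (make_rowkey("user1", "ts3"), "sf:s1", "sku123"),
    (make_rowkey("user2", "ts4"), "sf:c1", "sku2"),
    (make_rowkey("user2", "ts5"), "sf:c2", "sku288"),
    (make_rowkey("user2", "ts6"), "sf:s1", "sku222"),
]
```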
Query examples
Rows whose value equals sku188
scan 'test1', FILTER=>"ValueFilter(=,'binary:sku188')"
ROW                          COLUMN+CELL
user1|ts2                    column=sf:c1, timestamp=1409122354918, value=sku188
Rows whose value contains 88
scan 'test1', FILTER=>"ValueFilter(=,'substring:88')"
ROW                          COLUMN+CELL
user1|ts2                    column=sf:c1, timestamp=1409122354918, value=sku188
user2|ts5                    column=sf:c2, timestamp=1409122355030, value=sku288
Users who came in via an ad click (column c2) whose value contains 88
scan 'test1', FILTER=>"ColumnPrefixFilter('c2') AND ValueFilter(=,'substring:88')"
ROW                          COLUMN+CELL
user2|ts5                    column=sf:c2, timestamp=1409122355030, value=sku288
Users who came in via search (columns beginning with s) whose value contains 123 or 222
scan 'test1', FILTER=>"ColumnPrefixFilter('s') AND ( ValueFilter(=,'substring:123') OR ValueFilter(=,'substring:222') )"
ROW                          COLUMN+CELL
user1|ts3                    column=sf:s1, timestamp=1409122354954, value=sku123
user2|ts6                    column=sf:s1, timestamp=1409122355970, value=sku222
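The combined filter above can be simulated in a few lines of Python, purely to show the semantics of ColumnPrefixFilter and ValueFilter and how AND/OR compose (an illustration, not the HBase API):

```python
# Pure-Python simulation of the filter semantics above (not HBase API).
cells = [
    ("user1|ts1", "sf:c1", "sku1"),
    ("user1|ts2", "sf:c1", "sku188"),
    ("user1|ts3", "sf:s1", "sku123"),
    ("user2|ts4", "sf:c1", "sku2"),
    ("user2|ts5", "sf:c2", "sku288"),
    ("user2|ts6", "sf:s1", "sku222"),
]

def column_prefix(prefix):
    # ColumnPrefixFilter matches on the qualifier (the part after "cf:").
    return lambda c: c[1].split(":", 1)[1].startswith(prefix)

def value_substring(sub):
    # ValueFilter(=, 'substring:...') keeps cells whose value contains sub.
    return lambda c: sub in c[2]

matched = [c for c in cells
           if column_prefix("s")(c)
           and (value_substring("123")(c) or value_substring("222")(c))]
# matched keeps the sf:s1 cells of user1|ts3 (sku123) and user2|ts6 (sku222)
```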
Rowkeys beginning with user1
scan 'test1', FILTER => "PrefixFilter ('user1')"
ROW                          COLUMN+CELL
user1|ts1                    column=sf:c1, timestamp=1409122354868, value=sku1
user1|ts2                    column=sf:c1, timestamp=1409122354918, value=sku188
user1|ts3                    column=sf:s1, timestamp=1409122354954, value=sku123
FirstKeyOnlyFilter: a rowkey can have multiple versions, and the same column of the same rowkey can hold multiple values; this filter returns only the first version of the first column of each row
KeyOnlyFilter: return only the key, not the value
scan 'test1', FILTER=>"FirstKeyOnlyFilter() AND ValueFilter(=,'binary:sku188') AND KeyOnlyFilter()"
ROW                          COLUMN+CELL
user1|ts2                    column=sf:c1, timestamp=1409122354918, value=
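To make these two filters concrete, here is a minimal Python sketch of their semantics (not the HBase API); the `sf:c9`/`extra` cell is a hypothetical second column added only to show the "first column per row" behavior:

```python
# Sketch of FirstKeyOnlyFilter + KeyOnlyFilter semantics (not HBase API).
cells = [  # cells sorted by (rowkey, column), as HBase returns them
    ("user1|ts2", "sf:c1", "sku188"),
    ("user1|ts2", "sf:c9", "extra"),   # hypothetical second column, for illustration
    ("user1|ts3", "sf:s1", "sku123"),
]

def first_key_only(cells):
    # Keep only the first cell of each row.
    seen, out = set(), []
    for row, col, val in cells:
        if row not in seen:
            seen.add(row)
            out.append((row, col, val))
    return out

def key_only(cells):
    # Strip the value, keeping only the key part of each cell.
    return [(row, col, "") for row, col, _ in cells]
```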
Starting from user1|ts2, find all rowkeys beginning with user1
scan 'test1', {STARTROW=>'user1|ts2', FILTER => "PrefixFilter ('user1')"}
ROW                          COLUMN+CELL
user1|ts2                    column=sf:c1, timestamp=1409122354918, value=sku188
user1|ts3                    column=sf:s1, timestamp=1409122354954, value=sku123
Starting from user1|ts2, scan up to (but not including) rowkeys beginning with user2
scan 'test1', {STARTROW=>'user1|ts2', STOPROW=>'user2'}
ROW                          COLUMN+CELL
user1|ts2                    column=sf:c1, timestamp=1409122354918, value=sku188
user1|ts3                    column=sf:s1, timestamp=1409122354954, value=sku123
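The STARTROW/STOPROW behavior above follows from rows being stored in lexicographic rowkey order: STARTROW is inclusive, STOPROW is exclusive. A minimal Python sketch of that semantics (not the HBase API):

```python
# STARTROW/STOPROW sketch: rows live in lexicographic rowkey order;
# STARTROW is inclusive, STOPROW is exclusive (illustration, not HBase API).
rowkeys = sorted([
    "user1|ts1", "user1|ts2", "user1|ts3",
    "user2|ts4", "user2|ts5", "user2|ts6",
])

def scan_range(rowkeys, startrow, stoprow):
    return [r for r in rowkeys if startrow <= r < stoprow]

selected = scan_range(rowkeys, "user1|ts2", "user2")
# selected -> ['user1|ts2', 'user1|ts3']; "user2|ts4" sorts after "user2",
# so the exclusive STOPROW cuts it off.
```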
Rowkeys containing ts3
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.RowFilter
scan 'test1', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), SubstringComparator.new('ts3'))}
ROW                          COLUMN+CELL
user1|ts3                    column=sf:s1, timestamp=1409122354954, value=sku123
Rowkeys containing ts
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.RowFilter
scan 'test1', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), SubstringComparator.new('ts'))}
ROW                          COLUMN+CELL
user1|ts1                    column=sf:c1, timestamp=1409122354868, value=sku1
user1|ts2                    column=sf:c1, timestamp=1409122354918, value=sku188
user1|ts3                    column=sf:s1, timestamp=1409122354954, value=sku123
user2|ts4                    column=sf:c1, timestamp=1409122354998, value=sku2
user2|ts5                    column=sf:c2, timestamp=1409122355030, value=sku288
user2|ts6                    column=sf:s1, timestamp=1409122355970, value=sku222
Insert a test row
put 'test1', 'user2|err', 'sf:s1', 'sku999'
Rowkeys matching the user/ts pattern; the test row just added does not match the regular expression, so it is not returned
import org.apache.hadoop.hbase.filter.RegexStringComparator
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.RowFilter
scan 'test1', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),RegexStringComparator.new('^user\d+\|ts\d+$'))}
ROW                          COLUMN+CELL
user1|ts1                    column=sf:c1, timestamp=1409122354868, value=sku1
user1|ts2                    column=sf:c1, timestamp=1409122354918, value=sku188
user1|ts3                    column=sf:s1, timestamp=1409122354954, value=sku123
user2|ts4                    column=sf:c1, timestamp=1409122354998, value=sku2
user2|ts5                    column=sf:c2, timestamp=1409122355030, value=sku288
user2|ts6                    column=sf:s1, timestamp=1409122355970, value=sku222
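The same regex can be checked in Python, since `^user\d+\|ts\d+$` behaves the same under Python's `re` as under Java's regex engine for this pattern (illustration, not the HBase API):

```python
import re

# Sketch of the RegexStringComparator rule above (not HBase API).
pattern = re.compile(r'^user\d+\|ts\d+$')
rowkeys = ["user1|ts1", "user2|ts6", "user2|err"]
matched = [r for r in rowkeys if pattern.match(r)]
# "user2|err" fails the regex, so it is filtered out, just as in the scan.
```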
Insert test data
put 'test1', 'user1|ts9', 'sf:b1', 'sku1'
Columns beginning with b1 whose value is sku1
scan 'test1', FILTER=>"ColumnPrefixFilter('b1') AND ValueFilter(=,'binary:sku1')"
ROW                          COLUMN+CELL
user1|ts9                    column=sf:b1, timestamp=1409124908668, value=sku1
Using SingleColumnValueFilter: rows whose sf:b1 column has the value sku1
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
scan 'test1', {COLUMNS => 'sf:b1', FILTER => SingleColumnValueFilter.new(Bytes.toBytes('sf'), Bytes.toBytes('b1'), CompareFilter::CompareOp.valueOf('EQUAL'), Bytes.toBytes('sku1'))}
ROW                          COLUMN+CELL
user1|ts9                    column=sf:b1, timestamp=1409124908668, value=sku1
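A minimal Python sketch of what SingleColumnValueFilter checks (not the HBase API). Note one caveat of the real filter: by default it also passes rows that lack the column entirely unless setFilterIfMissing(true) is used; this sketch requires the column to be present and equal:

```python
# Sketch of SingleColumnValueFilter semantics: keep rows whose cell at a
# specific column (here sf:b1) equals a given value (not HBase API).
# The real filter, by default, also passes rows missing the column unless
# setFilterIfMissing(true) is set; this sketch requires an exact match.
rows = {
    "user1|ts9": {"sf:b1": "sku1"},
    "user1|ts2": {"sf:c1": "sku188"},
}

def single_column_value_eq(rows, column, value):
    return [rk for rk, cols in sorted(rows.items())
            if cols.get(column) == value]

hits = single_column_value_eq(rows, "sf:b1", "sku1")
# hits -> ['user1|ts9']
```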
Using hbase zkcli
hbase zkcli
ls /
[hbase, zookeeper]
[zk: hadoop000:2181(CONNECTED) 1] ls /hbase
[meta-region-server, backup-masters, table, draining, region-in-transition, running, table-lock, master, namespace, hbaseid, online-snapshot, replication, splitWAL, recovering-regions, rs]
[zk: hadoop000:2181(CONNECTED) 2] ls /hbase/table
[member, test1, hbase:meta, hbase:namespace]
[zk: hadoop000:2181(CONNECTED) 3] ls /hbase/table/test1
[]
[zk: hadoop000:2181(CONNECTED) 4] get /hbase/table/test1
?master:60000}l$??lPBUF
cZxid = 0x107
ctime = Wed Aug 27 14:52:21 HKT 2014
mZxid = 0x10b
mtime = Wed Aug 27 14:52:22 HKT 2014
pZxid = 0x107
cversion = 0
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 31
numChildren = 0
HBase table operation commands
1. Authenticate and enter: authenticate with the kinit command; enter the shell with: hbase shell; check the current user with whoami
2. List tables:
list
3. Describe a table:
describe "table.name"
4. Scan a table
scan 'table.name',{LIMIT=>5}
5. Value equals 888888 (exact binary match)
scan "table.name", FILTER=>"ValueFilter(=,'binary:888888')"
6. Value contains 888888 (substring match)
scan "table.name", FILTER=>"ValueFilter(=,'substring:888888')"
7. Columns beginning with c2 whose value contains 88
scan "table.name", FILTER=>"ColumnPrefixFilter('c2') AND ValueFilter(=,'substring:88')"
8. Columns beginning with s whose value contains 88 or 66
scan "table.name", FILTER=>"ColumnPrefixFilter('s') AND (ValueFilter(=,'substring:88') OR ValueFilter(=,'substring:66'))"
9. Rowkeys beginning with user1
scan 'test1', FILTER=>"PrefixFilter('user1')"
10. Usage of get (t = table name, r = rowkey, c = column)
hbase> get 't1', 'r1'
hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
hbase> get 't1', 'r1', {COLUMN => 'c1'}
hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> get 't1', 'r1', 'c1'
hbase> get 't1', 'r1', 'c1', 'c2'
hbase> get 't1', 'r1', ['c1', 'c2']
11、scan
hbase> scan '.META.'
hbase> scan '.META.', {COLUMNS => 'info:regioninfo'}
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
hbase> scan 't1', {FILTER => "(PrefixFilter('row2') AND (QualifierFilter(>=, 'binary:xyz'))) AND (TimestampsFilter(123, 456))"}
hbase> scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
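The last scan uses ColumnPaginationFilter(limit, offset), which, per row, skips `offset` columns and returns at most `limit` columns. A minimal Python sketch of that semantics (illustration, not the HBase API):

```python
# Sketch of ColumnPaginationFilter(limit, offset) semantics: per row, skip
# `offset` columns and return at most `limit` columns (not HBase API).
def paginate_columns(row_cells, limit, offset):
    return row_cells[offset:offset + limit]

row = [("c1", "v1"), ("c2", "v2"), ("c3", "v3")]
first_col = paginate_columns(row, 1, 0)
# ColumnPaginationFilter.new(1, 0) keeps just the first column of each row.
```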