Setting pre-splits when creating an HBase table

Date: 2024-04-23 08:40:42

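I. HBase rowkey design principles
A rowkey should be unique, should distribute evenly across Regions (hashed), and should not be overly long.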

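II. Common rowkey designs
1. Reverse: store the key reversed, so that its fast-changing part comes first
2. Salt: prepend a prefix to the key to spread writes across Regions
3. Hash: derive the key, or its prefix, from a hash of the original key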
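
The three designs above can be sketched in a few lines of Python. This is only a minimal illustration, not code from the original post: the function names, the bucket count, the prefix length, and the choice of MD5 are all assumptions. The salt here is derived from the key itself (rather than chosen at random) so that a reader can recompute it when looking a row up.

```python
import hashlib


def reverse_key(key: str) -> str:
    """Reverse: the fast-changing tail of the key becomes its head."""
    return key[::-1]


def salted_key(key: str, buckets: int = 4) -> str:
    """Salt: prepend a two-digit bucket number (hypothetical layout).

    The bucket is computed from the key, so the same key always
    gets the same salt.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}_{key}"


def hashed_key(key: str, prefix_len: int = 8) -> str:
    """Hash: prefix the key with part of its MD5 hex digest."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}_{key}"


if __name__ == "__main__":
    for original in ("13812345678", "13812345679"):
        print(reverse_key(original), salted_key(original), hashed_key(original))
```

Reversed keys remain readable and can be reversed back, while salted and hashed keys give up ordered range scans over the original key in exchange for a more even spread.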

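III. Pre-splitting an HBase table at creation: 3 split keys give 4 Regions
In HBase, pre-splitting is an optimization that plans the Region layout when the table is created. It improves write efficiency and query performance, and it avoids the hotspots caused by unevenly distributed data.
Why pre-split?
1. Fewer split operations: as data grows, a single Region that exceeds a size threshold triggers a split, which consumes resources and affects performance
2. Balanced data distribution: knowing the shape of the data in advance lets you assign Regions more evenly and avoid data skew and hotspots
3. Better write performance: with pre-splitting, the very first writes are already spread over several Regions, which increases write parallelism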
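
As a rough sketch of point 2, if rowkeys carry a two-digit salt prefix, the split keys can be generated from the bucket count instead of being written by hand. The helper names, the two-digit prefix format, and the table name `demo_table` below are hypothetical; only the `create ... SPLITS => [...]` form is taken from the example in this post.

```python
from typing import List


def salt_split_points(buckets: int) -> List[str]:
    """Return the split keys that separate `buckets` two-digit salt prefixes.

    N buckets need N - 1 split keys, which yields N Regions.
    """
    if not 2 <= buckets <= 100:
        raise ValueError("buckets must be between 2 and 100")
    return [f"{i:02d}" for i in range(1, buckets)]


def create_statement(table: str, family: str, splits: List[str]) -> str:
    """Render an HBase shell create command with explicit split keys."""
    quoted = ",".join(f"'{s}'" for s in splits)
    return f"create '{table}','{family}',SPLITS => [{quoted}]"


if __name__ == "__main__":
    # 'demo_table' is a made-up table name used only for illustration.
    print(create_statement("demo_table", "cf1", salt_split_points(4)))
```

The split keys only balance the load if the rowkeys themselves are spread evenly over the prefixes, so this pairs naturally with a salted or hashed rowkey design.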

Example:

create 'phoenix2','cf1',SPLITS => ['key1','key5','key8']

describe 'phoenix2'
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0170 seconds

put 'phoenix2','key0','cf1:name','key0'
put 'phoenix2','key1','cf1:name','key1'
put 'phoenix2','key2','cf1:name','key2'
put 'phoenix2','key3','cf1:name','key3'
put 'phoenix2','key4','cf1:name','key4'
put 'phoenix2','key5','cf1:name','key5'
put 'phoenix2','key6','cf1:name','key6'
put 'phoenix2','key7','cf1:name','key7'
put 'phoenix2','key8','cf1:name','key8'
put 'phoenix2','key9','cf1:name','key9'

In the HBase web UI you can see that 4 Regions were created:
Table Regions
Name                                                           Region Server                        Start Key  End Key  Locality  Requests
phoenix2,1713767154009.1e1a7e1962249ebb0419c0be83e884f0.       whtpiodscshd01t,21302,1710927618816  (empty)    key1     0.0       1
phoenix2,key1,1713767154009.bee445cc4e6c81de2a31f5b8cdf61aca.  whtpiodscshd02t,21302,1710927704067  key1       key5     0.0       4
phoenix2,key5,1713767154009.c92a61e074907b5bdab9e6619615ac27.  whtpiodscshd02t,21302,1710927704067  key5       key8     0.0       3
phoenix2,key8,1713767154009.0029739e798ac34f4f34b5a70d31a19c.  whtpiodscshd03t,21302,1710927771892  key8       (empty)  0.0       2
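
To see why the ten rows land where they do, the lookup can be simulated offline. The sketch below assumes that each Region covers the half-open range [Start Key, End Key), so a rowkey equal to a split key belongs to the Region that starts at it. It also assumes that plain string comparison matches HBase's byte-wise ordering, which holds for these ASCII keys. The function name `region_index` is made up for illustration.

```python
from bisect import bisect_right
from collections import Counter

# Split keys from the create statement in the example.
SPLITS = ["key1", "key5", "key8"]


def region_index(rowkey: str, splits=SPLITS) -> int:
    """Return the 0-based index of the Region whose range contains rowkey.

    bisect_right counts how many split keys are <= rowkey, which is the
    index of the Region under the [start, end) convention.
    """
    return bisect_right(splits, rowkey)


# The ten rowkeys written by the put commands: key0 .. key9.
rows = [f"key{i}" for i in range(10)]
counts = Counter(region_index(r) for r in rows)
rows_per_region = [counts[i] for i in range(len(SPLITS) + 1)]

if __name__ == "__main__":
    print(rows_per_region)  # [1, 4, 3, 2]
```

The result is 1, 4, 3, and 2 rows for the four Regions, which lines up with the Requests column in the listing, assuming those counters reflect only the ten puts.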