Getting Started with HBase

数据湖 (Data Lake) 2020-09-14

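HBase plays a pivotal role in the big data ecosystem. It is an open-source implementation of Google's Bigtable: a distributed NoSQL database that shards data automatically, fails over automatically, integrates tightly with HDFS, and is suited to efficient queries over massive data sets. The business scenarios I have used it for so far include:

1. Storing log data

2. Storing vehicle GPS data and data reported by devices

3. Storing Kafka topic offsets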

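HBase Architecture

We can get an intuitive feel for HBase's architecture from its web management UI:

1. HBase depends on ZooKeeper, which stores its metadata and provides distributed coordination for the Master and the RegionServers.

2. HDFS is the underlying file system that HBase runs on.

3. RegionServers are the worker nodes; they are the data nodes that store the data.

4. The Master is the primary node; each RegionServer must report its running status to the Master in real time.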
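
To make the division of labor above more concrete, here is a toy model of how a row key might be routed to the RegionServer that hosts its region, and of how regions could be handed to another server on failover. It is only a sketch and not HBase code: the class `RegionRouter`, its methods, and the server names `rs1` and `rs2` are all hypothetical, and it compares keys as Java strings rather than as raw bytes.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of region routing. NOT HBase code; every name here is hypothetical. */
final class RegionRouter {
    // Maps each region's start key to the server hosting that region.
    // The first region starts at the empty key, so every row key has a floor entry.
    private final TreeMap<String, String> regionStartToServer = new TreeMap<>();

    /** Assigns the region that begins at startKey to the given server. */
    void assign(String startKey, String server) {
        regionStartToServer.put(startKey, server);
    }

    /** Returns the server whose region covers rowKey (greatest start key <= rowKey). */
    String locate(String rowKey) {
        Map.Entry<String, String> region = regionStartToServer.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("no region covers " + rowKey);
        }
        return region.getValue();
    }

    /** Simulates failover: every region of the dead server moves to the live one. */
    void failover(String deadServer, String liveServer) {
        regionStartToServer.replaceAll(
                (start, server) -> server.equals(deadServer) ? liveServer : server);
    }

    public static void main(String[] args) {
        RegionRouter router = new RegionRouter();
        router.assign("", "rs1");   // region ["", "m")
        router.assign("m", "rs2");  // region ["m", end)
        System.out.println(router.locate("apple"));
        System.out.println(router.locate("zebra"));
        router.failover("rs2", "rs1");
        System.out.println(router.locate("zebra"));
    }
}
```

In the real system the region locations are kept in HBase's own metadata and clients find them through ZooKeeper; the sorted map here only illustrates the idea of range-based sharding.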

Common HBase Shell Commands

View the help information

    [root@cdh3 ~]# hbase shell
    HBase Shell
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
    Version 2.2.3.7.1.3.0-100, rUnknown, Wed Aug 5 10:49:56 UTC 2020
    Took 0.0012 seconds
    hbase(main):001:0> help
    HBase Shell, version 2.2.3.7.1.3.0-100, rUnknown, Wed Aug 5 10:49:56 UTC 2020
    Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
    Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.


    COMMAND GROUPS:
    Group name: general
    Commands: processlist, status, table_help, version, whoami


    Group name: ddl
    Commands: alter, alter_async, alter_status, clone_table_schema, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters


    Group name: namespace
    Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables


    Group name: dml
    Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

Create a table. You must specify the table name and a column family name.

    hbase(main):002:0> create 'test', 'cf'
    Created table test
    Took 1.8893 seconds
    => Hbase::Table - test

List table information

    hbase(main):003:0> list 'test'
    TABLE
    test
    1 row(s)
    Took 0.0231 seconds
    => ["test"]

View the table's details with the describe command

    hbase(main):004:0> describe 'test'
    Table test is ENABLED
    test
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
    1 row(s)
    QUOTAS
    0 row(s)
    Took 0.2339 seconds

Add data

    hbase(main):005:0> put 'test','row1','cf:a','value1'
    Took 0.1284 seconds
    hbase(main):006:0> put 'test','row2','cf:b','value2'
    Took 0.0048 seconds
    hbase(main):007:0> put 'test','row3','cf:c','value3'
    Took 0.0085 seconds

View all data in the table

    hbase(main):008:0> scan 'test'
    ROW COLUMN+CELL
    row1 column=cf:a, timestamp=1600003697537, value=value1
    row2 column=cf:b, timestamp=1600003711262, value=value2
    row3 column=cf:c, timestamp=1600003732296, value=value3
    3 row(s)
    Took 0.0235 seconds
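
The scan above returns rows sorted by row key. HBase compares row keys as raw bytes, so a key such as `row10` sorts before `row2`. The sketch below reproduces that ordering in plain Java without touching HBase; the class `RowKeyOrder` and its helpers are hypothetical names used only for illustration, and zero-padding is just one common way to make byte order agree with numeric order.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrates byte-wise row key ordering. Hypothetical helper, not part of HBase. */
final class RowKeyOrder {
    /** Compares byte arrays lexicographically, treating each byte as unsigned. */
    static final Comparator<byte[]> UNSIGNED_BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length; // a key that is a prefix of another sorts first
    };

    /** Returns the keys in the order a byte-sorted store would scan them. */
    static List<String> scanOrder(List<String> rowKeys) {
        List<byte[]> encoded = new ArrayList<>();
        for (String key : rowKeys) {
            encoded.add(key.getBytes(StandardCharsets.UTF_8));
        }
        encoded.sort(UNSIGNED_BYTES);
        List<String> ordered = new ArrayList<>();
        for (byte[] key : encoded) {
            ordered.add(new String(key, StandardCharsets.UTF_8));
        }
        return ordered;
    }

    /** Left-pads a non-negative number with zeros so byte order matches numeric order. */
    static String padded(String prefix, int n, int width) {
        return prefix + String.format("%0" + width + "d", n);
    }

    public static void main(String[] args) {
        System.out.println(scanOrder(Arrays.asList("row2", "row10", "row1")));
        System.out.println(scanOrder(Arrays.asList(
                padded("row", 10, 3), padded("row", 2, 3), padded("row", 1, 3))));
    }
}
```

This matters mostly when designing row keys for time series or numeric identifiers, where the scan order decides which rows are read together.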

Get a single row

    hbase(main):009:0> get 'test','row1'
    COLUMN CELL
    cf:a timestamp=1600003697537, value=value1
    1 row(s)
    Took 0.0139 seconds
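
The put, scan, and get commands above suggest a simple mental model: a table is a sorted map from row key to column (`family:qualifier`) to timestamp to value. The following sketch models that in memory with nested sorted maps. It is an illustration under stated assumptions, not how HBase stores data: `ToyTable` and its methods are hypothetical, keys are compared as Java strings, and every version is kept, whereas the `test` table above is configured with `VERSIONS => '1'`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy in-memory model of one table. Hypothetical; not HBase's storage format. */
final class ToyTable {
    // row key -> "family:qualifier" -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    /** Stores one cell version, like the shell's put with an explicit timestamp. */
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, NavigableMap<Long, String>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
            .put(timestamp, value);
    }

    /** Like the shell's get: the newest value of every column in one row. */
    Map<String, String> get(String row) {
        Map<String, String> latest = new TreeMap<>();
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return latest; // unknown row: empty result
        }
        for (Map.Entry<String, NavigableMap<Long, String>> e : columns.entrySet()) {
            latest.put(e.getKey(), e.getValue().firstEntry().getValue());
        }
        return latest;
    }

    /** Like the shell's scan: all row keys in sorted order. */
    List<String> scanRowKeys() {
        return new ArrayList<>(rows.keySet());
    }

    public static void main(String[] args) {
        ToyTable test = new ToyTable();
        test.put("row3", "cf:c", 1600003732296L, "value3");
        test.put("row1", "cf:a", 1600003697537L, "value1");
        test.put("row2", "cf:b", 1600003711262L, "value2");
        System.out.println(test.scanRowKeys());
        System.out.println(test.get("row1"));
    }
}
```

Keeping the timestamp level sorted newest-first is what lets a read return the latest version of a cell without looking at the older ones.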

Common HBase Java API Operations

The utility class below wraps the common operations; creating a table serves as the test case.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;
    import java.io.IOException;
    import java.util.*;


    public class HBaseUtil {
        private static Connection connection = null;


        /**
         * Initializes the HBase connection.
         *
         * @throws IOException if the connection cannot be created
         */
        private static synchronized void initConnection() throws IOException {
            if (connection == null || connection.isClosed()) {
                Configuration conf = HBaseConfiguration.create();
                // ZooKeeper quorum hosts and client port of the target cluster
                conf.set("hbase.zookeeper.quorum", "192.168.0.171,192.168.0.207,192.168.0.208");
                conf.set("hbase.zookeeper.property.clientPort", "2181");
                connection = ConnectionFactory.createConnection(conf);
            }
        }


        /**
         * Returns the shared connection, creating it first if necessary.
         *
         * @return an open HBase connection
         * @throws IOException if the connection cannot be created
         */
        public static Connection getConnection() throws IOException {
            if (connection == null || connection.isClosed()) {
                initConnection();
            }


            // The connection is usable, so return it directly
            return connection;
        }


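        /**
         * Creates a table with a single column family.
         * Warning: if the table already exists, it is disabled and deleted first,
         * so any data in it is lost.
         *
         * @param connection      an open HBase connection
         * @param tableNameString name of the table to create
         * @param columnFamily    name of the column family
         * @throws IOException if the admin operation fails
         */
        public static void createTable(Connection connection, String tableNameString, String columnFamily) throws IOException {
            TableName tableName = TableName.valueOf(tableNameString);
            // HTableDescriptor and HColumnDescriptor are deprecated in HBase 2.x
            // in favor of TableDescriptorBuilder and ColumnFamilyDescriptorBuilder.
            HTableDescriptor table = new HTableDescriptor(tableName);
            HColumnDescriptor family = new HColumnDescriptor(columnFamily);
            table.addFamily(family);
            // Admin is Closeable; try-with-resources releases it when done
            try (Admin admin = connection.getAdmin()) {
                // If the table already exists, drop it before recreating it
                if (admin.tableExists(tableName)) {
                    admin.disableTable(tableName);
                    admin.deleteTable(tableName);
                }
                admin.createTable(table);
            }
        }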


        /**
         * Checks whether an HBase table exists.
         *
         * @param tableName name of the table to check
         * @return true if the table exists
         * @throws Exception if the connection or the admin call fails
         */
        public static boolean tableExists(String tableName) throws Exception {
            // getConnection() creates the connection if needed; the Admin is closed after use
            try (Admin admin = getConnection().getAdmin()) {
                return admin.tableExists(TableName.valueOf(tableName));
            }
        }


        /**
         * Returns a handle to the named table. The caller should close it after use.
         */
        public static Table getTable(String tableName) throws Exception {
            Connection connection = getConnection();
            return connection.getTable(TableName.valueOf(tableName));
        }


        /**
         * Builds a Put that writes a single column to HBase.
         *
         * @param rowKeyString row key
         * @param familyName   column family
         * @param columnName   column qualifier
         * @param columnValue  value to store
         * @return the Put operation
         */
        public static Put createPut(String rowKeyString, byte[] familyName, String columnName, String columnValue) {
            // Bytes.toBytes encodes as UTF-8; String.getBytes() would use the platform default charset
            byte[] rowKey = Bytes.toBytes(rowKeyString);
            Put put = new Put(rowKey);
            put.addColumn(familyName, Bytes.toBytes(columnName), Bytes.toBytes(columnValue));
            return put;
        }


        /**
         * Builds a Put that writes several columns of one row to HBase.
         *
         * @param rowKeyString row key
         * @param familyName   column family
         * @param columns      map of column qualifier to value
         * @return the Put operation
         */
        public static Put createPut(String rowKeyString, byte[] familyName, Map<String, String> columns) {
            byte[] rowKey = Bytes.toBytes(rowKeyString);
            Put put = new Put(rowKey);
            for (Map.Entry<String, String> entry : columns.entrySet()) {
                put.addColumn(familyName, Bytes.toBytes(entry.getKey()), Bytes.toBytes(entry.getValue()));
            }
            return put;
        }

        /**
         * Prints an HBase query result.
         *
         * @param result the Result returned by a get or a scan
         */
        public static void print(Result result) {
            // Each cell is addressed by <row key, column family, column qualifier, timestamp> and holds a value
            NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> map = result.getMap();
            if (map == null) {
                return; // getMap() returns null for an empty Result
            }
            byte[] row = result.getRow(); // row key
            for (Map.Entry<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> familyEntry : map.entrySet()) {
                byte[] familyBytes = familyEntry.getKey(); // column family
                for (Map.Entry<byte[], NavigableMap<Long, byte[]>> entry : familyEntry.getValue().entrySet()) {
                    byte[] column = entry.getKey(); // column qualifier
                    for (Map.Entry<Long, byte[]> longEntry : entry.getValue().entrySet()) {
                        Long time = longEntry.getKey(); // timestamp
                        byte[] value = longEntry.getValue(); // value
                        System.out.println(String.format("rowKey=%s, columnFamily=%s, column=%s, timestamp=%d, value=%s", Bytes.toString(row), Bytes.toString(familyBytes), Bytes.toString(column), time, Bytes.toString(value)));
                    }
                }
            }
        }


        // Test: create the hbase_test table
        public static void main(String[] args) throws Exception {
            Connection connection = HBaseUtil.getConnection();
            boolean tableExists = HBaseUtil.tableExists("hbase_test");
            System.out.println(tableExists);
            HBaseUtil.createTable(connection, "hbase_test", "cf1");
            tableExists = HBaseUtil.tableExists("hbase_test");
            System.out.println(tableExists);
            connection.close(); // release the shared connection before exiting
        }
    }

Console output

    20/09/13 20:49:05 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.0.171:2181,192.168.0.207:2181,192.168.0.208:2181 sessionTimeout=90000 watcher=hconnection-0x704d6e830x0, quorum=192.168.0.171:2181,192.168.0.207:2181,192.168.0.208:2181, baseZNode=/hbase
    20/09/13 20:49:09 INFO zookeeper.ClientCnxn: Opening socket connection to server cdh3.macro.com/192.168.0.208:2181. Will not attempt to authenticate using SASL (unknown error)
    20/09/13 20:49:09 INFO zookeeper.ClientCnxn: Socket connection established to cdh3.macro.com/192.168.0.208:2181, initiating session
    20/09/13 20:49:09 INFO zookeeper.ClientCnxn: Session establishment complete on server cdh3.macro.com/192.168.0.208:2181, sessionid = 0x10014dbded00993, negotiated timeout = 60000
    false
    20/09/13 20:49:13 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
    20/09/13 20:49:17 INFO client.HBaseAdmin: Created hbase_test
    true

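On the first connection the table does not exist yet, so the check prints false; after the table has been created, the check prints true.

Listing the tables in the HBase shell confirms that hbase_test was created successfully:

    hbase(main):009:0> list
    TABLE
    Student
    blobstore
    hbase_test
    kylin_metadata
    40 row(s)
    Took 0.0595 seconds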
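
One detail from the console log is worth a second look: the `connectString` it reports is the `hbase.zookeeper.quorum` host list with the `hbase.zookeeper.property.clientPort` value appended to each host. The sketch below reproduces that composition in plain Java. It is only an illustration of what the log shows, not HBase's internal code, and `ZkConnectString.build` is a hypothetical helper.

```java
import java.util.StringJoiner;

/** Rebuilds the ZooKeeper connect string seen in the log. Hypothetical helper. */
final class ZkConnectString {
    /** Appends clientPort to every host in a comma-separated quorum list. */
    static String build(String quorum, String clientPort) {
        StringJoiner joined = new StringJoiner(",");
        for (String host : quorum.split(",")) {
            String trimmed = host.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore stray commas
            }
            joined.add(trimmed + ":" + clientPort);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("192.168.0.171,192.168.0.207,192.168.0.208", "2181"));
    }
}
```

If the client hangs while connecting, comparing this string with the actual ZooKeeper addresses of the cluster is a reasonable first check.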

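This article has briefly covered some of the principles and practices you need to get started with HBase. The official HBase documentation is very detailed and is the best resource for getting started, so readers are encouraged to read it closely.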

This article is reposted from 数据湖 (Data Lake).
