Introduction and use of HBase
1. Introduction
HBase is an open-source Java implementation modeled on Google's Bigtable. It is a NoSQL database built on top of HDFS that offers high reliability, high performance, column-oriented storage, scalability, and real-time reads and writes.
2. Differences from RDBMS
1. RDBMS
Structure:
- Data is organized as tables
- Runs on file systems such as FAT, NTFS, and EXT
- Uses a commit log to store logs
- The reference system is a coordinate system
- Uses primary keys
- Supports partitioning
- Uses rows, columns, and cells
Functions:
- Scales up (vertical expansion)
- Queried with SQL
- Row-oriented: each row is stored as a contiguous unit
- Total data capacity depends on the server's configuration
- Supports ACID
- Suited to structured data
- Generally centralized
- Supports transactions
- Supports joins
2. HBase
Structure:
- Data is organized into regions
- Runs on the HDFS file system
- Uses a WAL (write-ahead log) to store logs
- The reference (coordination) system is ZooKeeper
- Uses row keys
- Supports sharding
- Uses rows, columns, column families, and cells
Functions:
- Scales out (horizontal expansion)
- Table data is accessed through the API or MapReduce
- Column-oriented: the data of each column family is stored together as a contiguous unit
- Total data capacity does not depend on any single machine, but on the number of machines
- Does not provide full ACID (Atomicity, Consistency, Isolation, Durability) guarantees; atomicity is guaranteed only within a single row
- Suited to both structured and unstructured data
- Generally distributed
- Does not support multi-row transactions
- Does not support joins
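The "rows, columns, column families, and cells" structure is easier to see in code. The sketch below models a table as nested sorted maps (row key, then column family, then column qualifier, then value) in plain Java. It only illustrates the logical layout: the names `LogicalTable`, `put`, `get` and `firstRowKey` are made up for this example and are not part of the HBase API, and timestamps/versions are left out.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of HBase's logical layout: rowKey -> family -> qualifier -> value. */
public class LogicalTable {
    // Rows are kept sorted by row key, similar to how HBase orders rows.
    private final TreeMap<String, TreeMap<String, TreeMap<String, String>>> rows = new TreeMap<>();

    /** Stores a cell; a later put to the same coordinates replaces the value. */
    public void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(qualifier, value);
    }

    /** Returns the cell value, or null if the row, family or column is absent. */
    public String get(String rowKey, String family, String qualifier) {
        Map<String, TreeMap<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return null;
        }
        Map<String, String> columns = families.get(family);
        return columns == null ? null : columns.get(qualifier);
    }

    /** Returns the smallest row key, or null for an empty table. */
    public String firstRowKey() {
        return rows.isEmpty() ? null : rows.firstKey();
    }

    public static void main(String[] args) {
        LogicalTable user = new LogicalTable();
        user.put("rk0002", "info", "name", "lisi");
        user.put("rk0001", "info", "name", "zhangsan");
        System.out.println(user.get("rk0001", "info", "name"));
        System.out.println(user.firstRowKey());
    }
}
```

Note that there is no fixed schema below the column family level: any row can hold any set of qualifiers, which is one reason HBase also suits loosely structured data.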
3. Infrastructure
HMaster
Functions:
- Monitors the RegionServers
- Handles RegionServer failover
- Handles metadata changes
- Handles the assignment or removal of regions
- Balances load across RegionServers during idle time
- Publishes its own location to clients through ZooKeeper
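As a rough illustration of the assignment and failover duties listed above, here is a small plain-Java sketch. It is not how HMaster is implemented: `MasterSketch` and its methods are hypothetical, the "fewest regions" rule is a deliberately simple stand-in for HBase's real balancer, and ZooKeeper-based failure detection is reduced to an explicit `serverFailed` call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a master assigning regions and reassigning them after a server failure. */
public class MasterSketch {
    // server name -> regions it currently serves
    private final Map<String, List<String>> assignments = new TreeMap<>();

    public void addServer(String server) {
        assignments.putIfAbsent(server, new ArrayList<>());
    }

    /** Assigns a region to the live server that currently holds the fewest regions. */
    public String assign(String region) {
        String target = null;
        for (Map.Entry<String, List<String>> e : assignments.entrySet()) {
            if (target == null || e.getValue().size() < assignments.get(target).size()) {
                target = e.getKey();
            }
        }
        if (target == null) {
            throw new IllegalStateException("no live region servers");
        }
        assignments.get(target).add(region);
        return target;
    }

    /** Failover: forget the dead server and hand its regions to the survivors. */
    public void serverFailed(String server) {
        List<String> orphaned = assignments.remove(server);
        if (orphaned != null) {
            for (String region : orphaned) {
                assign(region);
            }
        }
    }

    /** Number of regions served by a server (0 if the server is unknown). */
    public int regionCount(String server) {
        List<String> regions = assignments.get(server);
        return regions == null ? 0 : regions.size();
    }
}
```

With three servers and six regions, each server ends up with two regions; after one server fails, its two regions are spread over the remaining two.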
RegionServer
Functions:
- Stores HBase's actual data
- Serves the regions assigned to it
- Flushes the MemStore cache to HDFS
- Maintains the HLog
- Performs compactions
- Handles region splits
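To make the region handling above more concrete, the following plain-Java sketch shows two ideas: a row is routed to the region whose start key is the greatest one not larger than the row key, and a region that grows too large is split at its middle key. Everything here is an assumption made for illustration: `RegionMap` and `MAX_ROWS` are hypothetical, and a row-count threshold is used only for simplicity (real HBase decides splits from store file sizes and a configurable split policy).

```java
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model: regions keyed by start row key, split when they hold too many rows. */
public class RegionMap {
    private static final int MAX_ROWS = 4; // hypothetical threshold for this sketch

    // start key -> row keys held by that region; "" is the start key of the first region
    private final TreeMap<String, TreeSet<String>> regions = new TreeMap<>();

    public RegionMap() {
        regions.put("", new TreeSet<>());
    }

    /** Returns the start key of the region responsible for a row key. */
    public String locate(String rowKey) {
        return regions.floorKey(rowKey);
    }

    /** Adds a row and splits the owning region if it grew past MAX_ROWS. */
    public void addRow(String rowKey) {
        String start = locate(rowKey);
        TreeSet<String> rows = regions.get(start);
        rows.add(rowKey);
        if (rows.size() > MAX_ROWS) {
            split(rows);
        }
    }

    private void split(TreeSet<String> rows) {
        // Pick the middle row key as the split point.
        String mid = rows.stream().skip(rows.size() / 2).findFirst().get();
        // The upper half (split point included) moves to a new daughter region.
        TreeSet<String> upper = new TreeSet<>(rows.tailSet(mid, true));
        rows.removeAll(upper);
        regions.put(mid, upper);
    }

    public int regionCount() {
        return regions.size();
    }

    public static void main(String[] args) {
        RegionMap table = new RegionMap();
        for (int i = 1; i <= 5; i++) {
            table.addRow(String.format("rk%04d", i));
        }
        System.out.println(table.regionCount());
        System.out.println(table.locate("rk0005"));
    }
}
```

Because regions are defined by sorted key ranges, the choice of row key decides which region (and therefore which RegionServer) receives each write.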
Components:
- Write-Ahead Log (WAL)
  The modification log of HBase. When data is written to HBase, it is not written straight to disk; it stays in memory for a while (the time and data-volume thresholds are configurable). Keeping data only in memory risks losing it, so every change is first appended to a file called the write-ahead log and only then written to memory. If the system fails, the data can be rebuilt from this log.
- HFile
  The physical file on disk that actually stores the data.
- Store
  HFiles are kept in a Store; each Store corresponds to one column family of an HBase table.
- MemStore
  As the name implies, this is the in-memory store that holds the most recent data operations. Once a change has been recorded in the WAL, the RegionServer stores its key-value pairs in the MemStore.
- Region
  An HBase table is divided into regions by row key range, and the regions are stored on RegionServers. One RegionServer can host many regions.
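The interplay of WAL, MemStore and HFile described above can be sketched in a few lines of plain Java. This is a simplified model, not HBase code: `WritePathSketch`, `FLUSH_THRESHOLD` and the in-memory "files" are assumptions for illustration, and clearing the log after a flush stands in for the way HBase retires old log files once their edits are persisted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of the write path: log first, then memory, then flush to an immutable file. */
public class WritePathSketch {
    private static final int FLUSH_THRESHOLD = 3; // hypothetical: flush after 3 cells

    private final List<String[]> wal = new ArrayList<>();                   // append-only log
    private TreeMap<String, String> memStore = new TreeMap<>();             // sorted in-memory buffer
    private final List<TreeMap<String, String>> hFiles = new ArrayList<>(); // flushed "files"

    public void put(String key, String value) {
        wal.add(new String[] {key, value}); // 1. record the change in the log
        memStore.put(key, value);           // 2. apply it in memory
        if (memStore.size() >= FLUSH_THRESHOLD) {
            flush();                        // 3. persist the buffer once it is full
        }
    }

    private void flush() {
        hFiles.add(memStore);               // a flushed map is never modified again
        memStore = new TreeMap<>();
        wal.clear();                        // those edits no longer need replaying
    }

    /** Simulates a crash: memory is lost, then rebuilt by replaying the log. */
    public void crashAndRecover() {
        memStore = new TreeMap<>();
        for (String[] entry : wal) {
            memStore.put(entry[0], entry[1]);
        }
    }

    /** Reads check the MemStore first, then flushed files from newest to oldest. */
    public String get(String key) {
        if (memStore.containsKey(key)) {
            return memStore.get(key);
        }
        for (int i = hFiles.size() - 1; i >= 0; i--) {
            String value = hFiles.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public int hFileCount() { return hFiles.size(); }

    public int memStoreSize() { return memStore.size(); }
}
```

Writing to the log before touching memory is what makes the crash case safe: anything acknowledged to the client is either in a flushed file or can be replayed from the log.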
4. Common Shell
- Enter the client command: hbase shell
- View help command: help
- View which tables are in the current database: list
- Create a table (use either form; the second also sets the number of versions kept for 'info'):
create 'user', 'info', 'data'
create 'user', {NAME => 'info', VERSIONS => '3'}, {NAME => 'data'}
- Add data operation:
put 'user', 'rk0001', 'info:name', 'zhangsan'
- Query data operation:
get 'user', 'rk0001'
- Query all data: scan 'user'
- Column family query: scan 'user', {COLUMNS => 'info'}
- Update data: same as insert (a put to an existing cell writes a new version)
- Delete data:
delete 'user', 'rk0001', 'info:name'
- Clear table data: truncate 'user'
- Delete a table (disable it first, then drop it):
disable 'user'
drop 'user'
- Count the number of rows in a table: count 'user'
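Two of the commands above are connected: updating is just another put, and `VERSIONS => '3'` controls how many of those puts a cell remembers. The plain-Java sketch below models a single cell with a bounded version history. `VersionedCell` is a hypothetical name, and trimming old versions immediately on put is a simplification (HBase hides excess versions on read and physically removes them later).

```java
import java.util.Collections;
import java.util.TreeMap;

/** Toy model of one cell that keeps a bounded number of timestamped versions. */
public class VersionedCell {
    private final int maxVersions;
    // Sorted with the newest timestamp first.
    private final TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());

    public VersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    /** A put never edits in place: it adds a new version and drops the oldest extras. */
    public void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry = oldest timestamp
        }
    }

    /** A plain get returns the newest version, or null if the cell is empty. */
    public String get() {
        return versions.isEmpty() ? null : versions.firstEntry().getValue();
    }

    public int versionCount() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedCell name = new VersionedCell(3);
        name.put(1L, "zhangsan");
        name.put(2L, "zhangsan2");
        System.out.println(name.get());
    }
}
```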
5. Java API
Create table
// Imports for the test class that holds the methods in this section
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Test;

@Test
public void createTable() throws IOException {
    // Create a configuration object and specify the ZooKeeper connection address
    Configuration configuration = HBaseConfiguration.create();
    configuration.set("hbase.zookeeper.property.clientPort", "2181");
    configuration.set("hbase.zookeeper.quorum", "node01,node02,node03");
    // Cluster configuration
    // configuration.set("hbase.zookeeper.quorum", "101.236.39.141,101.236.46.114,101.236.46.113");
    configuration.set("hbase.master", "node01:60000");
    Connection connection = ConnectionFactory.createConnection(configuration);
    Admin admin = connection.getAdmin();
    // HTableDescriptor holds the table settings: table name, column families, etc.
    HTableDescriptor hTableDescriptor = new HTableDescriptor(TableName.valueOf("myuser"));
    // Add column families f1 and f2
    hTableDescriptor.addFamily(new HColumnDescriptor("f1"));
    hTableDescriptor.addFamily(new HColumnDescriptor("f2"));
    // Create the table only if it does not exist yet
    boolean exists = admin.tableExists(TableName.valueOf("myuser"));
    if (!exists) {
        admin.createTable(hTableDescriptor);
    }
    // Close the admin handle and the client connection
    admin.close();
    connection.close();
}
Add data to table
@Test
public void addDatas() throws IOException {
    // Get a connection
    Configuration configuration = HBaseConfiguration.create();
    configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181");
    Connection connection = ConnectionFactory.createConnection(configuration);
    // Get the table
    Table myuser = connection.getTable(TableName.valueOf("myuser"));
    // Create a Put object for row key "0001"
    Put put = new Put(Bytes.toBytes("0001"));
    put.addColumn(Bytes.toBytes("f1"), Bytes.toBytes("id"), Bytes.toBytes(1));
    put.addColumn(Bytes.toBytes("f1"), Bytes.toBytes("name"), Bytes.toBytes("Zhang San"));
    put.addColumn(Bytes.toBytes("f1"), Bytes.toBytes("age"), Bytes.toBytes(18));
    put.addColumn(Bytes.toBytes("f2"), Bytes.toBytes("address"), Bytes.toBytes("Earthlings"));
    put.addColumn(Bytes.toBytes("f2"), Bytes.toBytes("phone"), Bytes.toBytes("15874102589"));
    // Insert the data
    myuser.put(put);
    // Close the table and the connection
    myuser.close();
    connection.close();
}
Query data
@Test
public void searchData() throws IOException {
    Configuration configuration = HBaseConfiguration.create();
    configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
    Connection connection = ConnectionFactory.createConnection(configuration);
    Table myuser = connection.getTable(TableName.valueOf("myuser"));
    // Fetch the row with row key "0003"
    Get get = new Get(Bytes.toBytes("0003"));
    Result result = myuser.get(get);
    Cell[] cells = result.rawCells();
    // Print every column name and value
    for (Cell cell : cells) {
        // Note: Bytes.toString only renders values that were stored as strings.
        // Values stored as int (such as id and age) are 4-byte binary and will not
        // print readably here; decode them with Bytes.toInt instead.
        System.out.println(Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength()));
        System.out.println(Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
    }
    // Close the table and the connection
    myuser.close();
    connection.close();
}
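The note in the query code about int columns not showing up deserves a closer look. A value written with `Bytes.toBytes(18)` is stored as four raw bytes, not as the text "18", so decoding it as a string yields unprintable characters. The sketch below reproduces this with the JDK only; it assumes the usual 4-byte big-endian layout, which is what `ByteBuffer` uses by default, and `IntCellDecoding` with its helpers is a hypothetical stand-in for the HBase `Bytes` utility.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Shows why an int cell value cannot be read back as a string. */
public class IntCellDecoding {
    /** Encodes an int as 4 big-endian bytes (ByteBuffer's default byte order). */
    public static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Decodes 4 big-endian bytes back into an int. */
    public static int decodeInt(byte[] raw) {
        return ByteBuffer.wrap(raw).getInt();
    }

    /** Interprets the raw bytes as UTF-8 text, as the query code does for every cell. */
    public static String decodeAsString(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] age = encodeInt(18);
        System.out.println(age.length);                       // 4
        System.out.println(decodeInt(age));                   // 18
        System.out.println("18".equals(decodeAsString(age))); // false
    }
}
```

In the query method this means the reader has to know each column's type: string columns go through `Bytes.toString`, int columns through `Bytes.toInt`.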
Filter query
Comparison filters:
- Row key filter: RowFilter
- Column family filter: FamilyFilter
- Column filter: QualifierFilter
- Column value filter: ValueFilter
Special filters:
- Single column value filter: SingleColumnValueFilter
- Column value exclusion filter: SingleColumnValueExcludeFilter
- Row key prefix filter: PrefixFilter
- Paging filter: PageFilter
Combining multiple filters in one query: FilterList
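How a filter list combines its members is the part that most often causes confusion, so here is a small model of it in plain Java. It does not use the HBase filter classes: `FilterListSketch`, `Row` and the helper methods are hypothetical, each filter is modeled as a `Predicate`, and the column-value check behaves like a single column value filter that also rejects rows lacking the column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Plain-Java model of combining row filters; not the HBase filter API. */
public class FilterListSketch {
    /** A row: its key plus "family:qualifier" -> value. */
    public static final class Row {
        final String key;
        final Map<String, String> cells = new TreeMap<>();

        public Row(String key) { this.key = key; }

        public Row with(String column, String value) {
            cells.put(column, value);
            return this;
        }
    }

    /** Like a prefix filter: keeps rows whose key starts with the prefix. */
    public static Predicate<Row> prefix(String p) {
        return row -> row.key.startsWith(p);
    }

    /** Like a single column value filter with EQUAL; rows without the column are rejected. */
    public static Predicate<Row> columnEquals(String column, String value) {
        return row -> value.equals(row.cells.get(column));
    }

    /** AND semantics: every filter must accept the row. */
    public static Predicate<Row> mustPassAll(List<Predicate<Row>> filters) {
        return row -> filters.stream().allMatch(f -> f.test(row));
    }

    /** OR semantics: at least one filter must accept the row. */
    public static Predicate<Row> mustPassOne(List<Predicate<Row>> filters) {
        return row -> filters.stream().anyMatch(f -> f.test(row));
    }

    /** Walks the rows in order and returns the keys the filter accepts. */
    public static List<String> scan(List<Row> rows, Predicate<Row> filter) {
        List<String> keys = new ArrayList<>();
        for (Row row : rows) {
            if (filter.test(row)) {
                keys.add(row.key);
            }
        }
        return keys;
    }
}
```

In the HBase client the same idea is expressed by passing the individual filters to a `FilterList` created with `FilterList.Operator.MUST_PASS_ALL` or `MUST_PASS_ONE`, and setting that list on a `Scan` with `setFilter` before calling `getScanner` on the table.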
Delete data according to rowkey
@Test
public void deleteByRowKey() throws IOException {
    // Get a connection
    Configuration configuration = HBaseConfiguration.create();
    configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
    Connection connection = ConnectionFactory.createConnection(configuration);
    Table myuser = connection.getTable(TableName.valueOf("myuser"));
    // Delete the whole row with row key "0001"
    Delete delete = new Delete(Bytes.toBytes("0001"));
    myuser.delete(delete);
    // Close the table and the connection
    myuser.close();
    connection.close();
}
Delete table
@Test
public void deleteTable() throws IOException {
    // Get a connection
    Configuration configuration = HBaseConfiguration.create();
    configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
    Connection connection = ConnectionFactory.createConnection(configuration);
    Admin admin = connection.getAdmin();
    // A table must be disabled before it can be deleted
    admin.disableTable(TableName.valueOf("myuser"));
    admin.deleteTable(TableName.valueOf("myuser"));
    // Close the admin handle and the connection
    admin.close();
    connection.close();
}