Phoenix installation and use
1. Background introduction
1.1 Phoenix definition
Phoenix is an open-source SQL skin for HBase: it lets you use the standard JDBC API, instead of the native HBase client API, to create tables, insert data, and query HBase data.
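As a sketch of that JDBC workflow (the quorum hostnames and the test table reuse this document's later examples; a running Phoenix cluster and the Phoenix client jar on the classpath are assumed):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixJdbcSketch {
    public static void main(String[] args) throws Exception {
        // The JDBC URL points at the ZooKeeper quorum, not at HBase directly.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:phoenix:hmcs030,hmcs031,hmcs032:2181");
             Statement stmt = conn.createStatement()) {
            // Standard JDBC calls; Phoenix translates them into HBase operations.
            stmt.executeUpdate("UPSERT INTO test VALUES('NY','NewYork',123)");
            conn.commit(); // Phoenix connections do not auto-commit by default
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM test")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " " + rs.getString(2));
                }
            }
        }
    }
}
```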
1.2 Phoenix features
- Easy integration: works with Spark, Hive, Pig, Flume, and MapReduce.
- Good performance: by using the HBase API directly, together with coprocessors and custom filters, Phoenix provides millisecond-level performance for small queries, or second-level performance for queries over tens of millions of rows.
- Simple operation: data is manipulated with DML commands, while tables are created and incremental schema changes versioned through DDL commands.
- Security: GRANT and REVOKE are supported.
- Full support for creating HBase secondary indexes.
2. Phoenix installation and deployment
2.1 Official website
http://phoenix.apache.org/index.html
Prerequisites: install ZooKeeper, Hadoop, and HBase in advance.
2.2 Upload the Phoenix tarball to the target location and extract it
Rename the extracted directory to phoenix.
2.3 Copy the jar packages to hbase/lib on every node
The two jars to copy:
phoenix-4.14.1-HBase-1.2-server.jar
phoenix-core-4.14.1-HBase-1.2.jar
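One possible way to distribute the jars (the paths and the extra hostnames hmcs031/hmcs032 are assumptions matching this document's examples; adjust to your layout):

```shell
# Run from the node where Phoenix was extracted.
cd /opt/module/phoenix
cp phoenix-4.14.1-HBase-1.2-server.jar phoenix-core-4.14.1-HBase-1.2.jar \
   /opt/module/hbase/lib/
# Copy to the remaining HBase nodes as well.
for host in hmcs031 hmcs032; do
  scp phoenix-4.14.1-HBase-1.2-server.jar phoenix-core-4.14.1-HBase-1.2.jar \
      $host:/opt/module/hbase/lib/
done
```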
2.4 Configure the environment
Add the following lines to /etc/profile with root permission:
#phoenix
export PHOENIX_HOME=/opt/module/phoenix
export PHOENIX_CLASSPATH=$PHOENIX_HOME
export PATH=$PATH:$PHOENIX_HOME/bin
Then run source /etc/profile to apply the changes.
2.5 Restart HBase
stop-hbase.sh
start-hbase.sh
2.6 Start Phoenix
Connect sqlline.py to the ZooKeeper quorum:
sqlline.py hmcs030,hmcs031,hmcs032:2181
3. Phoenix use
3.1 Phoenix table operations
3.1.1 display all tables
!table or !tables
3.1.2 create table
CREATE TABLE IF NOT EXISTS test (
    State CHAR(2) NOT NULL,
    City VARCHAR NOT NULL,
    Population BIGINT
    CONSTRAINT my_pk PRIMARY KEY (state, city));
In Phoenix, table names are automatically converted to uppercase by default. To keep a lowercase name, wrap it in double quotation marks, such as "test".
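The quoting rule above can be sketched as a small function (a simplified assumption for illustration; Phoenix's real parser also handles escaping and other identifier forms):

```python
# Sketch of Phoenix's identifier normalization rule:
# unquoted identifiers are folded to uppercase,
# double-quoted identifiers keep their exact case.
def normalize_identifier(name: str) -> str:
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1]   # quoted: case preserved, quotes stripped
    return name.upper()     # unquoted: folded to uppercase

print(normalize_identifier('test'))    # TEST
print(normalize_identifier('"test"'))  # test
```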
3.1.3 insert record
upsert into test values('NY','NewYork',123);
3.1.4 query records
select * from test;
select * from test where state='NY';
3.1.5 deleting records
delete from test where state='NY';
3.1.6 delete table
drop table test;
3.1.7 exit command line
!quit
3.2 phoenix table mapping
3.2.1 Relationship between Phoenix and HBase tables
By default, tables created directly in HBase are not visible in Phoenix.
To operate in Phoenix on a table that was created directly in HBase, you must first map the table in Phoenix. There are two mapping methods: view mapping and table mapping.
3.2.2 create test table in HBase command line
The test table in HBase has the following structure: two column families, name and company.

| Rowkey | name | | company | |
|---|---|---|---|---|
| empid | firstname | lastname | name | address |
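How the Phoenix columns in the view of section 3.2.3 line up with HBase coordinates can be sketched as a mapping (a hypothetical illustration mirroring the table above, not Phoenix's internal representation):

```python
# Each Phoenix view column maps to the HBase row key or a "family:qualifier" cell.
phoenix_to_hbase = {
    "empid": "rowkey",                       # primary key -> HBase row key
    '"name"."firstname"': "name:firstname",
    '"name"."lastname"': "name:lastname",
    '"company"."name"': "company:name",
    '"company"."address"': "company:address",
}

# Every non-key column lives in one of the two column families.
families = {cell.split(":")[0] for cell in phoenix_to_hbase.values() if ":" in cell}
print(sorted(families))  # ['company', 'name']
```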
Start the HBase shell:
hbase shell
Create the HBase table test:
create 'test','name','company'
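To give the upcoming view something to return, a few sample rows can be inserted from the HBase shell (the row key and values here are made up for illustration):

```shell
put 'test','1001','name:firstname','San'
put 'test','1001','name:lastname','Zhang'
put 'test','1001','company:name','Apache'
put 'test','1001','company:address','Beijing'
scan 'test'
```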
3.2.3 view mapping
A view created by Phoenix is read-only, so it can only be used for queries; you cannot modify the source data through the view.
Create a view over the test table in Phoenix:
0: jdbc:phoenix:hmcs030> create view "test"(
    empid varchar primary key,
    "name"."firstname" varchar,
    "name"."lastname" varchar,
    "company"."name" varchar,
    "company"."address" varchar);
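Once the view exists, it can be queried like any Phoenix table (read-only; the quoted lowercase names are required because the HBase table and column names are lowercase):

```sql
0: jdbc:phoenix:hmcs030> select * from "test";
0: jdbc:phoenix:hmcs030> select "firstname", "lastname" from "test" where empid = '1001';
```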
Delete view
0: jdbc:phoenix:hmcs030> drop view "test";
3.2.4 table mapping
There are two ways to create a table map to HBase using Apache Phoenix:
1) When the table already exists in HBase, you can create the associated table much as you create a view: just change create view to create table.
0: jdbc:phoenix:hmcs030> create table "test"(
    empid varchar primary key,
    "name"."firstname" varchar,
    "name"."lastname" varchar,
    "company"."name" varchar,
    "company"."address" varchar);
2) When the table does not yet exist in HBase, you can create it directly with the create table statement: Phoenix automatically creates the person_infomation table in both Phoenix and HBase and initializes its structure according to the parameters in the statement.
0: jdbc:phoenix:hmcs030> create table "person_infomation"(
    empid varchar primary key,
    "name"."firstname" varchar,
    "name"."lastname" varchar,
    "company"."name" varchar,
    "company"."address" varchar);
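Unlike a view, a mapped table is writable: an upsert through Phoenix is persisted in the underlying HBase table (the values are illustrative):

```sql
0: jdbc:phoenix:hmcs030> upsert into "person_infomation"
    values('1002','Si','Li','Apache','Shanghai');
0: jdbc:phoenix:hmcs030> select * from "person_infomation";
```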
3.2.5 comparison and summary of view mapping and table mapping
Compared with a directly created mapping table, queries through a view are less efficient. The reason is that when Phoenix creates a mapping table, it also writes some empty key-value pairs into the table, and these empty key-value pairs are used to improve query efficiency.
For an associated table created with create table, modifying the table also changes the source data, and dropping the associated table also drops the underlying HBase table. A view does not behave this way: dropping the view leaves the source data unchanged.
4. Phoenix and HBase secondary indexes
4.1 Configure HBase so that Phoenix can create secondary indexes
4.1.1 Add the following configuration to hbase-site.xml on every HRegionServer node
<!-- phoenix regionserver configuration parameters -->
<property>
    <name>hbase.regionserver.wal.codec</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
    <name>hbase.region.server.rpc.scheduler.factory.class</name>
    <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
    <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
<property>
    <name>hbase.rpc.controllerfactory.class</name>
    <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
    <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
4.1.2 Add the following configuration to hbase-site.xml on the HMaster node
<!-- phoenix master configuration parameters -->
<property>
    <name>hbase.master.loadbalancer.class</name>
    <value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
</property>
<property>
    <name>hbase.coprocessor.master.classes</name>
    <value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
</property>
4.2 index creation
4.2.1 index classification of Phoenix
1) Global index is the default index type and suits read-heavy, write-light workloads. Writes are expensive because the index table must also be updated, and since the index table is distributed across different data nodes, the cross-node data transfer adds significant performance overhead. On reads, Phoenix will choose the index table to reduce query time; however, if a queried field is not an indexed field, the index table is not used and the query is not sped up.
CREATE INDEX my_index ON my_table (v1)
2) Local index suits write-heavy scenarios. The index data is stored on the same server as the table data, avoiding the overhead of writing index entries to index tables on other servers during write operations. Even when a queried field is not an indexed field, the index table is still used, so the query is sped up.
CREATE LOCAL INDEX my_index ON my_table (v1)
4.2.2 Three ways to make a query use the index
1) Covered index: include the queried columns in the index.
CREATE INDEX my_index ON my_table (v1) INCLUDE (v2)
2) Index hint: force the query to use the index.
SELECT /*+ INDEX(my_table my_index) */ v2 FROM my_table WHERE v1 = 'foo'
3) Local index:
CREATE LOCAL INDEX my_index ON my_table (v1)
4.2.3 how to delete an index
DROP INDEX my_index ON my_table