Today, let's talk about how to build a watch efficiently

Keywords: Programming encoding PostgreSQL SQL

"High efficiency" means the speed of table building is fast? no, it's 9012 years. What do you want? Efficient table creation refers to how to quickly query this table. Under the same resource conditions, the query performance is better than that of ordinary tables!
When we build a table, we usually attach attributes to the table, such as storage form, life cycle TTL, etc. at the same time, we also build some indexes, so that the query results can be more efficient. However, the programming languages used in different products are not the same, and the syntax of table building may be different. Today's editor will tell you how to build tables efficiently in interactive analysis. (I'll tell you secretly that after reading this article, your table building speed will be faster.)

1. field

First of all, the simplest table is composed of table name and field. The field can understand the table and the column name of a table. The field type is the row data type of each column field. The data type of the same column must be unique. For example, a column named age cannot have the row data tom below. A simple table is built in the example as follows. You can refer to the data types supported by the current interactive analysis Document data type.

CREATE TABLE  (
 id bigint ,
 name text ,
 time timestamptz ,
 c double precision 
);

2. Set field properties

1) primary key

The first is the PRIMARY KEY, which we often call pk (PRIMARY KEY). The PRIMARY KEY is the only index of the data table. For example, there are student ID and name in the student table, and the name may have the same name, but the student ID is the only one. If you want to search a record from the student table, such as looking for a person, you can only find the only one according to the student ID, which is the PRIMARY KEY.

CREATE TABLE my_table (
 id bigint NOT NULL,
 name text NOT NULL,
 time timestamptz NOT NULL,
 c double precision NOT NULL,
PRIMARY KEY (id)
);

2) available

Secondly, it is nullable, which means that the row data of this column can be null, that is, there is no data.

CREATE TABLE my_table (
 id bigint NOT NULL,
 name text NOT NULL,
 time timestamptz NOT NULL,
 c double precision, //This field can be empty
PRIMARY KEY (id)
);

3) array

Then there is array. For the usage of array, we suggest you refer to interactive analysis Document array type.

3. Set table storage properties

After a table is built, the server is required to store the table. There are two ways to store the table: row storage and column storage. Column store stores the data of the same field in all records; row store stores the data of all fields in each record. In interactive analysis, both kinds of storage support, but the default is column storage. Column storage is more friendly to olap scenarios and suitable for various complex queries; row storage is more friendly to kv scenarios and suitable for primary key based point query and scan.
To create a column table:

BEGIN;
CREATE TABLE my_table (
 id bigint NOT NULL,
 name text NOT NULL,
 time timestamptz NOT NULL,
 c double precision NOT NULL,
PRIMARY KEY (id)
);
CALL SET_TABLE_PROPERTY('my_table', 'orientation', 'column');
COMMIT;

To create a row storage table:

BEGIN;
CREATE TABLE my_table (
 id bigint NOT NULL,
 name text NOT NULL,
 time timestamptz NOT NULL,
 c double precision NOT NULL,
PRIMARY KEY (id)
);
CALL SET_TABLE_PROPERTY('my_table', 'orientation', 'row');
COMMIT;

4. Set table properties

In order to better query the table, some properties are usually set for the table.

1) life cycle

The life cycle of a table means that the table will be automatically recycled after the specified time has not changed since the last time the data is updated.

BEGIN;
CREATE TABLE my_table (
 id bigint NOT NULL,
 name text NOT NULL,
 time timestamptz NOT NULL,
 c double precision NOT NULL,
PRIMARY KEY (id)
);
CALL SET_TABLE_PROPERTY('my_table', 'orientation', 'column');
CALL SET_TABLE_PROPERTY('my_table', 'time_to_live_in_seconds', '36000');
COMMIT;

2) index

Index is built to speed up query table. In interactive analysis, there are several indexes that can be built as follows

  • clustering key

The clustered index is actually a sort index. The type of index is strictly related to the order of columns. Clustered index can speed up users' range and filter queries on index columns. Clustered index can help speed up queries with certain columns as where conditions.

BEGIN;
CREATE TABLE my_table (
 id bigint NOT NULL,
 name text NOT NULL,
 time timestamptz NOT NULL,
 c double precision NOT NULL,
PRIMARY KEY (id)
);
CALL SET_TABLE_PROPERTY('my_table', 'orientation', 'column');
CALL SET_TABLE_PROPERTY('my_table', 'clustering_key', 'id');
CALL SET_TABLE_PROPERTY('my_table', 'time_to_live_in_seconds', '36000');
COMMIT;
  • dictionary encoding columns

Dictionary encoding column, specifying the value of column to build dictionary map. Dictionary coding can turn string comparison into number comparison, and speed up group by, filter and other queries. By default, all text columns are implicitly set to dictionary  encoding  columns.

BEGIN;
CREATE TABLE my_table (
 id bigint NOT NULL,
 name text NOT NULL,
 time timestamptz NOT NULL,
 c double precision NOT NULL,
PRIMARY KEY (id)
);
CALL SET_TABLE_PROPERTY('my_table', 'orientation', 'column');
CALL SET_TABLE_PROPERTY('my_table', 'clustering_key', 'id');
CALL SET_TABLE_PROPERTY('my_table', 'dictionary_encoding_columns', 'name');
CALL SET_TABLE_PROPERTY('my_table', 'time_to_live_in_seconds', '36000');
COMMIT;
  • bitmap columns

The bitmap index builds bit encoding on these columns. Bitmap can filter the internal data of segment quickly, so it is recommended to build the equivalent query column of where condition into bit code. By default, all text columns are implicitly set to bitmap \ columns.

BEGIN;
CREATE TABLE my_table (
 id bigint NOT NULL,
 name text NOT NULL,
 time timestamptz NOT NULL,
 c double precision NOT NULL,
PRIMARY KEY (id)
);
CALL SET_TABLE_PROPERTY('my_table', 'orientation', 'column');
CALL SET_TABLE_PROPERTY('my_table', 'clustering_key', 'id');
CALL SET_TABLE_PROPERTY('my_table', 'bitmap_columns', 'name');
CALL SET_TABLE_PROPERTY('my_table', 'dictionary_encoding_columns', 'name');
CALL SET_TABLE_PROPERTY('my_table', 'time_to_live_in_seconds', '36000');
COMMIT;

For more information on the use of index, you can refer to the interactive analysis document setting table properties.

After learning so many contents, I'm sure you will still have some doubts. For example, interactive analysis is compatible with Postgresql, but some friends may not be particularly familiar with some of the syntax of pg, especially the part of building index. How can I use the call set to the extreme and a little confused? How can I build tables efficiently under such circumstances?
Don't worry, HoloStudio, the magic weapon of interactive analysis, is here The initial experience of dataworks holostudio in interactive analysis of six pulse sword )Using the UI of HoloStudio to create a table, you don't need to enter code, you just need to move your hand a little bit to build a table. No, look at the following picture:

With the UI, you can quickly create tables and indexes by clicking on them, which greatly saves time and grammar learning cost. In addition, you can immediately generate the SQL language for table creation in the edit box below, which is convenient for comparison and learning, and you don't need to worry about programming inferior to others! What are you waiting for? Open interactive analysis to use.
Click to enter Interactive analysis official website
After the table is built, how can we efficiently import data to the table? With table data, how to query quickly, or how to improve query performance? Please look forward to the next issue!


If you have any questions or are interested in interactive analysis in the process of using, welcome to consult with nail group!

Posted by birdie on Fri, 13 Dec 2019 00:37:57 -0800