data type
In order to improve performance, ClickHouse provides composite data types compared with traditional databases. ClickHouse's Upadate and Delete are implemented by the Alter variant.
1 integer
Integer includes signed integer and unsigned integer
1.1 signed integer
type | Range | byte |
---|---|---|
int8 | [-128,127] | 1 |
int16 | [-32768 : 32767] | 2 |
int32 | [-2147483648 : 2147483647] | 3 |
int64 | [-9223372036854775808 : 9223372036854775807] | 4 |
1.2 unsigned integer
type | Range | byte |
---|---|---|
UInt8 | [0 : 255] | 1 |
UInt6 | [0 : 65535] | 2 |
UInt32 | [0 : 4294967295] | 3 |
UInt64 | [0 : 18446744073709551615] | 4 |
2 floating point
type | Effective precision (digits) | byte |
---|---|---|
Float32 | 7 | 4 |
Float64 | 16 | 8 |
Note: floating point number calculation has errors, so use it with caution
gcw1 :) select 1-0.9 ; SELECT 1 - 0.9 ┌───────minus(1, 0.9)─┐ │ 0.09999999999999998 │ └─────────────────────┘ 1 rows in set. Elapsed: 0.002 sec. gcw1 :)
3 Decimal
If higher precision is required, you can choose the Decimal type
Format: Decimal (P,S)
P: Represents the precision, determines the total number of digits (positive part + decimal part), and the value range is 0-38
S: Represents the scale, determines the decimal places, and the value range is 0-P
ClickHouse provides three abbreviations for Decimal: Decimal32, Decimal64 and Decimal128
3.1 the accuracy of addition and subtraction is greater
If the two numbers are added and subtracted, the precision of the result is the higher
gcw1 :) SELECT toDecimal32(1, 3) + toDecimal32(2, 2) ; SELECT toDecimal32(1, 3) + toDecimal32(2, 2) ┌─plus(toDecimal32(1, 3), toDecimal32(2, 2))─┐ │ 3.000 │ └────────────────────────────────────────────┘
3.2 sum of multiplication accuracy
gcw1 :) SELECT toDecimal32(1, 3) * toDecimal32(2, 2) ; SELECT toDecimal32(1, 3) * toDecimal32(2, 2) ┌─multiply(toDecimal32(1, 3), toDecimal32(2, 2))─┐ │ 2.00000 │ └────────────────────────────────────────────────┘
3.3 division precision takes the divisor
gcw1 :) SELECT toDecimal32(1, 3) / toDecimal32(2, 2) ; SELECT toDecimal32(1, 3) / toDecimal32(2, 2) ┌─divide(toDecimal32(1, 3), toDecimal32(2, 2))─┐ │ 0.500 │ └──────────────────────────────────────────────┘
4 string
String string string can be any length
It can contain any byte set, including empty bytes.
FixedString(N) a string of fixed length N, n must be a strictly positive natural number.
When the server reads a string with a length less than N, the length of N bytes is reached by adding empty bytes at the end of the string.
When the server reads a string with a length greater than N, an error message will be returned.
gcw1 :) SELECT toFixedString('zzzz', 5),LENGTH(toFixedString('zzzz', 5)) AS LENGTH; SELECT toFixedString('zzzz', 5), LENGTH(toFixedString('zzzz', 5)) AS LENGTH ┌─toFixedString('zzzz', 5)─┬─LENGTH─┐ │ zzzz │ 5 │ └──────────────────────────┴────────┘
5 UUID
ClickHouse directly makes UUID, which acts as the primary key in the traditional database, a data type
CREATE TABLE UUID_TEST ( `s1` UUID, `s2` String ) ENGINE = TinyLog;
insert into UUID_TEST select generateUUIDv4(),'s1'; insert into UUID_TEST(s2)values('s2');
Query display, auto supplement 0
gcw1 :) select * from UUID_TEST; SELECT * FROM UUID_TEST ┌───────────────────────────────────s1─┬─s2─┐ │ 7da9a8b3-e655-4218-aac3-e66ed758e094 │ s1 │ │ 00000000-0000-0000-0000-000000000000 │ s2 │ └──────────────────────────────────────┴────┘
6 enumeration type
Include Enum8 and Enum16 types. Enum saves the correspondence of 'string'= integer.
Enum8 is described with 'String'= Int8 pairs. Enum16 is described with 'String'= Int16 pairs.
CREATE TABLE t_enum ( x Enum8('pc' = 1, 'mobile' = 2, 'pad' = 3) ) ENGINE = TinyLog
When inserting an enumeration table, you can only insert the values listed in the definition, such as PC, mobile and pad. Inserting unknown values will throw exceptions
gcw1 :) INSERT INTO t_enum VALUES ('pc'), ('pc'), ('pad'); INSERT INTO t_enum VALUES Ok.
Insert unknown value
gcw1 :) INSERT INTO t_enum VALUES ('pc1'); INSERT INTO t_enum VALUES Exception on client: Code: 36. DB::Exception: Unknown element 'pc1' for type Enum8('pc' = 1, 'mobile' = 2, 'pad' = 3) Connecting to database default at localhost:19000 as user default. Connected to ClickHouse server version 20.5.4 revision 54435.
7 array
Array(T): an array of elements of type T.
T can be any type, including array type.
However, multi-dimensional arrays are not recommended. ClickHouse has limited support for multi-dimensional arrays.
For example, you cannot store multidimensional arrays in the MergeTree table.
You can use the array function to create an array:
- array(T)
- Square brackets [] can also be used
ClickHouse can automatically infer data types
gcw1 :) select [1,2,3] as arr, toTypeName(arr); SELECT [1, 2, 3] AS arr, toTypeName(arr) ┌─arr─────┬─toTypeName([1, 2, 3])─┐ │ [1,2,3] │ Array(UInt8) │ └─────────┴───────────────────────┘
8 tuples
Tuple(T1, T2,...): a tuple in which each element has a separate type
gcw1 :) SELECT tuple(1,'a') AS x, (1,2,3) as y, toTypeName(x), toTypeName(y) ; SELECT (1, 'a') AS x, (1, 2, 3) AS y, toTypeName(x), toTypeName(y) ┌─x───────┬─y───────┬─toTypeName(tuple(1, 'a'))─┬─toTypeName(tuple(1, 2, 3))─┐ │ (1,'a') │ (1,2,3) │ Tuple(UInt8, String) │ Tuple(UInt8, UInt8, UInt8) │ └─────────┴─────────┴───────────────────────────┴────────────────────────────┘
9 Date,DateTime
Time types are divided into DateTime, DateTime64 and Date. It should be noted that ClickHouse currently has no timestamp type, that is, the highest precision of time type is seconds. Therefore, if you need to process time with millisecond and microsecond precision, you can only use UInt type
Date type
It is stored in two bytes, indicating the date value from 1970-01-01 (unsigned) to the current date. There is no time zone information stored in the date
CREATE TABLE t_date (x date) ENGINE=TinyLog; INSERT INTO t_date VALUES('2021-11-06'); SELECT x,toTypeName(x) FROM t_date;
DateTime
The Unix timestamp is stored in four bytes (unsigned). It is allowed to store values in the same range as the date type. The minimum value is 0000-00-00 00:00:00. The timestamp type value is accurate to seconds (no leap seconds). The time zone uses the system time zone when the client or server is started
CREATE TABLE t_datetime(`timestamp` DateTime) ENGINE = TinyLog; INSERT INTO t_datetime Values('2020-11-06 00:00:00'); SELECT * FROM t_datetime; ┌───────────timestamp─┐ │ 2020-11-06 00:00:00 │ └─────────────────────┘ ## DateTime type is time zone sensitive SELECT toDateTime(timestamp, 'Asia/Shanghai') AS column, toTypeName(column) AS x FROM t_datetime ┌──────────────column─┬─x─────────────────────────┐ │ 2020-11-06 00:00:00 │ DateTime('Asia/Shanghai') │ └─────────────────────┴───────────────────────────┘
10 boolean type
There is no separate type to store Booleans. You can use the UInt8 type with values limited to 0 or 1