1. Problem description
I saw a blank question from my colleagues in the group two days ago. The general phenomenon is as follows:
mysql> select @@version; +-----------+ | @@version | +-----------+ | 8.0.25 | +-----------+ 1 row in set (0.00 sec) mysql> create table t1( -> c1 int, -> c2 varchar(4) check(c2<>'') #No spaces between single quotes -> )engine=innodb; Query OK, 0 rows affected (0.21 sec) mysql> insert into t1 select 1,' '; #Insert two spaces in the c2 field ERROR 3819 (HY000): Check constraint 't1_chk_1' is violated.
Check defines c2 < > '', and inserts a space into the c2 field to prompt that the check constraint is violated.
Why is' '(there is one or more spaces between single quotes) in the insert statement judged as'' (there is no space between single quotes), resulting in insertion failure?
2. Involving knowledge
2.1,Stored and Retrieved
https://dev.mysql.com/doc/refman/8.0/en/char.html
When CHAR values are stored, they are right-padded with spaces to the specified length. When CHAR values are retrieved, trailing spaces are removed unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled.
VARCHAR values are not padded when they are stored. Trailing spaces are retained when values are stored and retrieved, in conformance with standard SQL.
CHAR(N): when the number of characters inserted is less than N, it will add spaces to the right of the string until the total number of characters reaches n; When the query returns data, the space at the end of the string will be removed by default, unless SQL_MODE setting pad_ CHAR_ TO_ FULL_ Length (the manual shows 8.0.13 deprecated and 8.0.25 can still be used).
VARCHAR(N): when the number of inserted characters is less than N, it will not supplement spaces on the right side of the string, and the insert content will be stored intact; If there is a space on the right of the original string, the space will be retained during storage and query return.
2.2,Collation Pad Attribute
https://dev.mysql.com/doc/refman/8.0/en/char.html
Values in CHAR, VARCHAR, and TEXT columns are sorted and compared according to the character set collation assigned to the column.
MySQL collations have a pad attribute of PAD SPACE, other than Unicode collations based on UCA 9.0.0 and higher, which have a pad attribute of NO PAD.
For CHAR, VARCHAR and TEXT fields, the sorting and comparison operations depend on the Collation on the field. The Pad attribute of Collation controls the processing of spaces at the end of the string.
You can use information_ In the schema.collections table, view the Pad attribute used by the collage:
mysql> select collation_name,pad_attribute from information_schema.collations; +----------------------------+---------------+ | collation_name | pad_attribute | +----------------------------+---------------+ | armscii8_general_ci | PAD SPACE | ... | utf8mb4_0900_bin | NO PAD | +----------------------------+---------------+ 272 rows in set (0.01 sec)
2.3,Trailing Space Handling in Comparisons
For nonbinary strings (CHAR, VARCHAR, and TEXT values), the string collation pad attribute determines treatment in comparisons of trailing spaces at the end of strings:
• For PAD SPACE collations, trailing spaces are insignificant in comparisons; strings are compared without regard to trailing spaces.
• NO PAD collations treat trailing spaces as significant in comparisons, like any other character.
"Comparison" in this context does not include the LIKE pattern-matching operator, for which trailing spaces are significant, regardless of collation.
PAD SPACE: space at the end of the string is ignored in sorting and comparison operations.
NO PAD: in sorting and comparison operations, spaces at the end of the string are regarded as ordinary characters and cannot be ignored.
3. Problem solving
The following operations are based on MySQL 8.0.25 Community Edition
3.1. Viewing collations used in fields
mysql> show full fields in t1; +-------+------------+--------------------+------+-----+---------+-------+---------------------------------+---------+ | Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment | +-------+------------+--------------------+------+-----+---------+-------+---------------------------------+---------+ | c1 | int | NULL | YES | | NULL | | select,insert,update,references | | | c2 | varchar(4) | utf8mb4_unicode_ci | YES | | NULL | | select,insert,update,references | | +-------+------------+--------------------+------+-----+---------+-------+---------------------------------+---------+ 2 rows in set (0.00 sec)
The collision of c2 column is utf8mb4_unicode_ci.
3.2. View the Pad attribute of collision
mysql> select COLLATION_NAME,PAD_ATTRIBUTE from INFORMATION_SCHEMA.COLLATIONS where COLLATION_NAME in('utf8mb4_unicode_ci','utf8mb4_0900_ai_ci'); +--------------------+---------------+ | COLLATION_NAME | PAD_ATTRIBUTE | +--------------------+---------------+ | utf8mb4_0900_ai_ci | NO PAD | | utf8mb4_unicode_ci | PAD SPACE | +--------------------+---------------+ 1 row in set (0.00 sec)
utf8mb4_ unicode_ The Pad attribute of Ci is PAD SPACE. It can be seen from 2.3 that c2 column ignores the space at the end of the string in sorting and comparison operations.
Therefore, during check comparison, the spaces in the inserted '' will be ignored. Obviously, ignoring the spaces conflicts with the check constraint, and the insertion fails.
mysql> SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci; Query OK, 0 rows affected (0.00 sec) mysql> select ' ' = ''; +--------+ | ' '='' | +--------+ | 1 | +--------+ 1 row in set (0.00 sec)
3.3. How to make a check constraint take effect according to normal logic
The general rule here is that spaces are spaces and should not be ignored. Just change the c2 field to the collision of NO PAD, and the space can be inserted normally:
mysql> insert into t1 select 1,' '; #Insert two spaces in the c2 field ERROR 3819 (HY000): Check constraint 't1_chk_1' is violated. mysql> alter table t1 modify c2 varchar(4) collate utf8mb4_0900_ai_ci; #Modified to collage of NO PAD Query OK, 0 rows affected (0.15 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> insert into t1 select 1,' '; #Insert two spaces in the c2 field Query OK, 1 row affected (0.12 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> insert into t1 select 1,''; #There is no space between '' ERROR 3819 (HY000): Check constraint 't1_chk_1' is violated. mysql> select c1,c2,hex(c2) from t1; +------+------+---------+ | c1 | c2 | hex(c2) | +------+------+---------+ | 1 | | 2020 | +------+------+---------+ 1 row in set (0.01 sec)
4. Expand
4.1. If c2 column is of CHAR type, is it the same as the previous question
Same. When sorting and comparing CHAR, VARCHAR and TEXT, the spaces at the end of the string are processed according to the Pad attribute of the column collision. At this time, the string used for comparison operation is the content of insert.
4.2 what is the expression form in WHERE condition
Create a new table and insert data
mysql> create table t3( -> c1 int, -> c2 char(4) collate utf8mb4_unicode_ci, -> c3 char(4) collate utf8mb4_0900_ai_ci, -> c4 varchar(4) collate utf8mb4_unicode_ci, -> c5 varchar(4) collate utf8mb4_0900_ai_ci -> )engine=innodb; Query OK, 0 rows affected (0.29 sec) mysql> insert into t3 select 1,'a','a','a','a'; Query OK, 1 row affected (0.09 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> insert into t3 select 2,'a ','a ','a ','a '; #Each column contains 1 space Query OK, 1 row affected (0.20 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> insert into t3 select 3,'a ','a ','a ','a '; #There are 3 spaces in the first two columns and 2 spaces in the last two columns Query OK, 1 row affected (0.17 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> insert into t3 select 4,'a ','a ','a ','a '; #There are 2 spaces in the first two columns and 3 spaces in the last two columns Query OK, 1 row affected (0.14 sec) Records: 1 Duplicates: 0 Warnings: 0
Observe the return result of WHERE condition. The return of char type is controlled by PAD_CHAR_TO_FULL_LENGTH impact (reference 2.1)
mysql> set sql_mode=''; Query OK, 0 rows affected (0.00 sec) mysql> select *,hex(c2),hex(c3),hex(c4),hex(c5) from t3 where c2='a'; +------+------+------+------+------+---------+---------+----------+----------+ | c1 | c2 | c3 | c4 | c5 | hex(c2) | hex(c3) | hex(c4) | hex(c5) | +------+------+------+------+------+---------+---------+----------+----------+ | 1 | a | a | a | a | 61 | 61 | 61 | 61 | | 2 | a | a | a | a | 61 | 61 | 6120 | 6120 | | 3 | a | a | a | a | 61 | 61 | 612020 | 612020 | | 4 | a | a | a | a | 61 | 61 | 61202020 | 61202020 | +------+------+------+------+------+---------+---------+----------+----------+ 4 rows in set (0.00 sec) c2 char->The space at the end of the string is removed from the returned data c2 utf8mb4_unicode_ci->PAD SPACE->Sort and compare operations, ignoring spaces at the end of the string mysql> select *,hex(c2),hex(c3),hex(c4),hex(c5) from t3 where c3='a'; +------+------+------+------+------+---------+---------+----------+----------+ | c1 | c2 | c3 | c4 | c5 | hex(c2) | hex(c3) | hex(c4) | hex(c5) | +------+------+------+------+------+---------+---------+----------+----------+ | 1 | a | a | a | a | 61 | 61 | 61 | 61 | | 2 | a | a | a | a | 61 | 61 | 6120 | 6120 | | 3 | a | a | a | a | 61 | 61 | 612020 | 612020 | | 4 | a | a | a | a | 61 | 61 | 61202020 | 61202020 | +------+------+------+------+------+---------+---------+----------+----------+ 4 rows in set (0.01 sec) c3 char->The space at the end of the string is removed from the returned data c3 utf8mb4_0900_ai_ci->NO PAD->Sorting and comparison operations, the spaces at the end of the string are regarded as ordinary characters mysql> select *,hex(c2),hex(c3),hex(c4),hex(c5) from t3 where c4='a'; +------+------+------+------+------+---------+---------+----------+----------+ | c1 | c2 | c3 | c4 | c5 | hex(c2) | hex(c3) | hex(c4) | hex(c5) | +------+------+------+------+------+---------+---------+----------+----------+ | 1 | a | a | a | a | 61 | 61 | 61 | 61 | | 2 | a | a | a | a | 61 | 61 | 6120 | 6120 | | 3 | a | a | a | a | 61 | 61 | 612020 | 612020 | | 4 | a | a | a | a | 61 | 61 | 61202020 | 61202020 | +------+------+------+------+------+---------+---------+----------+----------+ 4 rows in set (0.00 sec) c4 varchar->Returns the space at the time of data retention insertion c4 utf8mb4_unicode_ci->PAD SPACE->Sort and compare operations, ignoring spaces at the end of the string mysql> select *,hex(c2),hex(c3),hex(c4),hex(c5) from t3 where c5='a'; +------+------+------+------+------+---------+---------+---------+---------+ | c1 | c2 | c3 | c4 | c5 | hex(c2) | hex(c3) | hex(c4) | hex(c5) | +------+------+------+------+------+---------+---------+---------+---------+ | 1 | a | a | a | a | 61 | 61 | 61 | 61 | +------+------+------+------+------+---------+---------+---------+---------+ 1 row in set (0.00 sec) c5 varchar->Returns the space at the time of data retention insertion c5 utf8mb4_0900_ai_ci->NO PAD->Sorting and comparison operations, the spaces at the end of the string are regarded as ordinary characters mysql> set sql_mode='PAD_CHAR_TO_FULL_LENGTH'; Query OK, 0 rows affected, 1 warning (0.01 sec) mysql> select *,hex(c2),hex(c3),hex(c4),hex(c5) from t3 where c2='a'; +------+------+------+------+------+----------+----------+----------+----------+ | c1 | c2 | c3 | c4 | c5 | hex(c2) | hex(c3) | hex(c4) | hex(c5) | +------+------+------+------+------+----------+----------+----------+----------+ | 1 | a | a | a | a | 61202020 | 61202020 | 61 | 61 | | 2 | a | a | a | a | 61202020 | 61202020 | 6120 | 6120 | | 3 | a | a | a | a | 61202020 | 61202020 | 612020 | 612020 | | 4 | a | a | a | a | 61202020 | 61202020 | 61202020 | 61202020 | +------+------+------+------+------+----------+----------+----------+----------+ 4 rows in set (0.00 sec) c2 char->PAD_CHAR_TO_FULL_LENGTH->Return the supplementary space to the right of the data string c2 utf8mb4_unicode_ci->PAD SPACE->Sort and compare operations, ignoring spaces at the end of the string mysql> select *,hex(c2),hex(c3),hex(c4),hex(c5) from t3 where c3='a'; Empty set (0.00 sec) c3 char->PAD_CHAR_TO_FULL_LENGTH->Return the supplementary space to the right of the data string c3 utf8mb4_0900_ai_ci->NO PAD->Sorting and comparison operations, the spaces at the end of the string are regarded as ordinary characters 1~4 that 's ok c3 Column return values contain spaces, and c3 Columnar Collation yes NO PAD,Spaces at the end of the string cannot be ignored, where Filter cannot find record mysql> select *,hex(c2),hex(c3),hex(c4),hex(c5) from t3 where c4='a'; +------+------+------+------+------+----------+----------+----------+----------+ | c1 | c2 | c3 | c4 | c5 | hex(c2) | hex(c3) | hex(c4) | hex(c5) | +------+------+------+------+------+----------+----------+----------+----------+ | 1 | a | a | a | a | 61202020 | 61202020 | 61 | 61 | | 2 | a | a | a | a | 61202020 | 61202020 | 6120 | 6120 | | 3 | a | a | a | a | 61202020 | 61202020 | 612020 | 612020 | | 4 | a | a | a | a | 61202020 | 61202020 | 61202020 | 61202020 | +------+------+------+------+------+----------+----------+----------+----------+ 4 rows in set (0.00 sec) c4 varchar->Returns the space at the time of data retention insertion c4 utf8mb4_unicode_ci->PAD SPACE->Sort and compare operations, ignoring spaces at the end of the string mysql> select *,hex(c2),hex(c3),hex(c4),hex(c5) from t3 where c5='a'; +------+------+------+------+------+----------+----------+---------+---------+ | c1 | c2 | c3 | c4 | c5 | hex(c2) | hex(c3) | hex(c4) | hex(c5) | +------+------+------+------+------+----------+----------+---------+---------+ | 1 | a | a | a | a | 61202020 | 61202020 | 61 | 61 | +------+------+------+------+------+----------+----------+---------+---------+ 1 row in set (0.00 sec) c5 varchar->Returns the space at the time of data retention insertion c5 utf8mb4_0900_ai_ci->NO PAD->Sorting and comparison operations, the spaces at the end of the string are regarded as ordinary characters
At this time, the string used for comparison operation is the content of Retrieved. When char and VARCHAR return data, the space at the end of the string is handled differently, and PAD_CHAR_TO_FULL_LENGTH only affects char types.
4.3. Impact on unique index
https://dev.mysql.com/doc/refman/8.0/en/char.html
For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters results in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.
If there is a unique index (single column, character type) and the number of spaces in the tail of the inserted data is different, a duplicate key error may be reported:
mysql> select c1,c4,c5,hex(c4),hex(c5) from t3; +------+------+------+----------+----------+ | c1 | c4 | c5 | hex(c4) | hex(c5) | +------+------+------+----------+----------+ | 1 | a | a | 61 | 61 | | 2 | a | a | 6120 | 6120 | | 3 | a | a | 612020 | 612020 | | 4 | a | a | 61202020 | 61202020 | +------+------+------+----------+----------+ 4 rows in set (0.00 sec) mysql> alter table t3 add unique(c4); ERROR 1062 (23000): Duplicate entry 'a' for key 't3.c4' mysql> alter table t3 add unique(c5); Query OK, 0 rows affected (0.44 sec) Records: 0 Duplicates: 0 Warnings: 0
You can see that c4 column failed to create a unique index, and c5 column succeeded in creating a unique index.
c4 utf8mb4_ unicode_ CI - > pad space - > sorting and comparison operations, ignoring the spaces at the end of the string, and 4 lines of data are repeated.
c5 utf8mb4_ 0900_ ai_ CI - > no pad - > sorting and comparison operations, the spaces at the end of the string are regarded as ordinary characters, and the four lines of data are different.
5. Summary
Stored
- | CHAR(N) | VARCHAR(N) |
---|---|---|
Stored | If the character is less than N, fill in the space on the right | Keep the space at the time of insertion, and no additional space will be added on the right |
Retrieved
SQL_MODE | CHAR(N) | VARCHAR(N) |
---|---|---|
Default Value | Remove the spaces at the end of the string | Keep spaces when inserting |
PAD_CHAR_TO_FULL_LENGTH | Return the complete string, and fill in the space on the right when it is less than n | Keep spaces when inserting |
Comparison (excluding like)
Pad Attribute | CHAR(N)/VARCHAR(N) |
---|---|
PAD SPACE | Ignore trailing spaces in string |
NO PAD | The space at the end of the string is regarded as an ordinary character and cannot be ignored |
Enjoy GreatSQL 😃
Article recommendation:
Technology sharing | MGR best practice
https://mp.weixin.qq.com/s/66u5K7a9u8GcE2KPn4kCaA
Technology sharing | the way to repair MGR Bug in Wanli database
https://mp.weixin.qq.com/s/IavpeP93haOKVBt7eO8luQ
Macos system compiler percona and the difference of some functions on Macos system
https://mp.weixin.qq.com/s/jAbwicbRc1nQ0f2cIa_2nQ
Technology sharing | using systemd to manage single machine and multiple instances of MySQL
https://mp.weixin.qq.com/s/iJjXwd0z1a6isUJtuAAHtQ
Product | GreatSQL to create a better MGR ecology
https://mp.weixin.qq.com/s/ByAjPOwHIwEPFtwC5jA28Q
Product | GreatSQL MGR optimization reference
https://mp.weixin.qq.com/s/5mL_ERRIjpdOuONian8_Ow
About GreatSQL
GreatSQL is a MySQL branch maintained by Wanli database. It focuses on improving the reliability and performance of MGR and supports the parallel query feature of InnoDB. It is a branch version of MySQL suitable for financial applications.
Gitee:
https://gitee.com/GreatSQL/GreatSQL
GitHub:
https://github.com/GreatSQL/GreatSQL
Wechat & QQ group:
You can scan code to add GreatSQL community assistant wechat friends, send verification information "add group" to join GreatSQL/MGR exchange wechat group, or directly scan code to join GreatSQL/MGR exchange QQ group.
This article is composed of blog one article multi posting platform OpenWrite release!