PostgreSQL 10.0 preview performance improvement - radix tree improves character encoding conversion performance

Keywords: PostgreSQL encoding git github

Tags

PostgreSQL, 10.0, Radix tree, character encoding conversion

Background

PostgreSQL 10.0 uses a radix tree to speed up conversions between UTF-8 and other character encodings.

The encoding map files are now organized as radix trees, and a lookup in them is much faster than the binary search used with the old maps.

Use radix tree for character encoding conversions.  
  
author	Heikki Linnakangas <heikki.linnakangas@iki.fi>	  
Mon, 13 Mar 2017 18:46:39 +0000 (20:46 +0200)  
committer	Heikki Linnakangas <heikki.linnakangas@iki.fi>	  
Mon, 13 Mar 2017 18:46:39 +0000 (20:46 +0200)  
Replace the mapping tables used to convert between UTF-8 and other  
character encodings with new radix tree-based maps. Looking up an entry in  
a radix tree is much faster than a binary search in the old maps. As a  
bonus, the radix tree representation is also more compact, making the  
binaries slightly smaller.  
  
The "combined" maps work the same as before, with binary search. They are  
much smaller than the main tables, so it doesn't matter so much. However,  
the "combined" maps are now stored in the same .map files as the main  
tables. This seems more clear, since they're always used together, and  
generated from the same source files.  
  
Patch by Kyotaro Horiguchi, with lot of hacking by me at various stages.  
Reviewed by Michael Paquier and Daniel Gustafsson.  
  
Discussion: https://www.postgresql.org/message-id/20170306.171609.204324917.horiguchi.kyotaro%40lab.ntt.co.jp
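To make the difference between the two lookup styles concrete, here is a minimal sketch in C. It is not PostgreSQL's actual map layout or code: the two-level table, the 16-bit code width, the sample entries, and all names (`map_pair`, `sorted_map`, `radix_build`, `radix_lookup`, `bsearch_lookup`, `lookups_agree`) are assumptions made for illustration. The old style keeps sorted pairs and calls `bsearch()`, which takes about log2(n) comparisons; the radix style indexes one table level per input byte, so a lookup here costs two array reads regardless of map size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical map entry: 16-bit source code -> 16-bit target code. */
typedef struct { uint16_t src; uint16_t dst; } map_pair;

/* Old style: pairs sorted by src. Sample values chosen for illustration. */
static const map_pair sorted_map[] = {
    {0x8140, 0x3000}, {0x8141, 0x3001}, {0x8142, 0x3002},
    {0x829F, 0x3041}, {0x82A0, 0x3042},
};
#define MAP_LEN (sizeof(sorted_map) / sizeof(sorted_map[0]))

static int cmp_pair(const void *key, const void *elem)
{
    uint16_t k = *(const uint16_t *) key;
    uint16_t e = ((const map_pair *) elem)->src;
    return (k > e) - (k < e);
}

/* Binary search over the sorted pairs; returns 0 when there is no mapping. */
static uint16_t bsearch_lookup(uint16_t src)
{
    const map_pair *p = bsearch(&src, sorted_map, MAP_LEN,
                                sizeof(map_pair), cmp_pair);
    return p ? p->dst : 0;
}

/* Radix style: one table level per input byte. */
#define MAX_CHUNKS 4
static uint8_t  level1[256];             /* high byte -> chunk number */
static uint16_t level2[MAX_CHUNKS][256]; /* chunk, low byte -> target code */
static int      chunks_used = 1;         /* chunk 0 stays all-zero: "no mapping" */

/* Fill the two levels from the sorted pairs (safe to call more than once). */
static void radix_build(void)
{
    for (size_t i = 0; i < MAP_LEN; i++) {
        uint8_t hi = (uint8_t) (sorted_map[i].src >> 8);
        uint8_t lo = (uint8_t) (sorted_map[i].src & 0xFF);
        if (level1[hi] == 0) {
            assert(chunks_used < MAX_CHUNKS);
            level1[hi] = (uint8_t) chunks_used++;
        }
        level2[level1[hi]][lo] = sorted_map[i].dst;
    }
}

/* Two array reads, no comparisons; unmapped codes land in the empty chunk 0. */
static uint16_t radix_lookup(uint16_t src)
{
    return level2[level1[src >> 8]][src & 0xFF];
}

/* Returns 1 if both lookups give the same answer for every 16-bit code. */
static int lookups_agree(void)
{
    radix_build();
    for (uint32_t c = 0; c <= 0xFFFF; c++)
        if (radix_lookup((uint16_t) c) != bsearch_lookup((uint16_t) c))
            return 0;
    return 1;
}
```

After one call to `radix_build()`, `radix_lookup(0x82A0)` returns `0x3042` and an unmapped code such as `0x1234` returns `0`, matching `bsearch_lookup()`. In this sketch only high bytes that actually occur get a second-level chunk, which hints at how a radix layout can stay compact when the mapped codes cluster under a few prefixes.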

For the discussion of this patch, see the mailing list thread linked in the Discussion line of the commit message above; the commit itself is linked at the end of this article.

The PostgreSQL community works rigorously: a patch may be discussed on the mailing list for months or even years and revised repeatedly in response to review comments. By the time a patch is merged into master it is usually quite mature, which is one reason PostgreSQL is well known for its stability.

References

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=aeed17d00037950a16cc5ebad5b5592e5fa1ad0f

Posted by sysop on Wed, 13 Feb 2019 08:00:18 -0800
