Prefix Tree (Trie Tree)

Keywords: C++ data structure

Trie Tree

A trie, also known as a dictionary tree, word search tree, or key tree, is a tree structure that is sometimes described as a variant of a hash tree. Typical applications are counting and sorting large numbers of strings (but not only strings). Its advantage is that it minimizes unnecessary string comparisons, and for prefix-based queries it can be more efficient than a hash table.

The core idea of a trie is to trade space for time: it uses the common prefixes of strings to reduce the cost of queries.
It has three basic properties:

  1. The root node does not contain a character, and every node except the root contains exactly one character.
  2. Concatenating the characters along the path from the root to a node gives the string corresponding to that node.
  3. All child nodes of each node contain different characters.

Prefix Query

Consider, for example, a word for which we want to know whether its prefix has appeared before. A plain hash table handles this poorly, while a trie keeps it simple. Let's take a look at this prefix query:

Given n words made of lowercase letters, with an average length of 10, determine whether any string is a prefix of another. Here are three approaches to compare:

  1. Brute force: the most obvious approach is to take each string in turn and check whether it is a prefix of any other string in the set, which takes O(n^2) string comparisons.
  2. Hash: store every prefix of every string in a hash table. Building the table costs O(n*len), and querying all n strings costs O(n)*O(1) = O(n).
  3. Trie: when checking whether a string such as "abc" is a prefix of some other string, strings that do not start with "a" (those starting with b, c, d, ...) obviously never need to be examined. Building the trie costs O(n*len), so the overall complexity is O(n*len), while a single query costs only O(len). (Put plainly, the height of the trie is about len, so a query costs O(h) = O(len), just as a balanced binary tree of height log N gives an average insertion cost of O(log N).)

#include <iostream>
#include <string>
using namespace std;

const int branchNum = 26;   // one branch per lowercase letter 'a'..'z'

struct TrieNode{
    bool isStr = false;              // true if an inserted string ends at this node
    TrieNode* next[branchNum] = {};  // child pointers, all null initially
};

// Insert str into the trie. Only lowercase letters 'a'..'z' are supported.
void InsertTrie(TrieNode* root, const string& str){
    TrieNode* location = root;
    for(char c : str){
        int idx = c - 'a';
        if(!location->next[idx]){
            location->next[idx] = new TrieNode();   // create the missing child
        }
        location = location->next[idx];
    }
    location->isStr = true;   // mark the end of a complete string
}

// Return true only if str was inserted as a complete string.
bool SearchTrie(TrieNode* root, const string& str){
    TrieNode* location = root;
    for(char c : str){
        if(!location) return false;   // the path ended before the string did
        location = location->next[c - 'a'];
    }
    return location && location->isStr;
}

// Recursively free the subtree rooted at root.
void Delete(TrieNode* root){
    for(int i=0;i<branchNum;i++){
        if(root->next[i]) Delete(root->next[i]);
    }
    delete root;
}

int main(){
    TrieNode* root = new TrieNode();

    string s1 = "abcd";
    string s2 = "adg";
    InsertTrie(root, s1);
    InsertTrie(root, s2);

    if(SearchTrie(root,s2)){
        std::cout<<"adg exists!"<<std::endl;
    }

    string s3 = "bashk";   // never inserted, so nothing is printed for it
    if(SearchTrie(root,s3)){
        std::cout<<"bashk exists!"<<std::endl;
    }

    Delete(root);
    return 0;
}

Reference: Trie tree (prefix tree) - rossonchao - Blog Park (cnblogs.com)

Posted by savagenoob on Wed, 29 Sep 2021 10:00:10 -0700