Huffman coding (data structures: trees, C language version)
1, Experimental topic
1) Initialization. Read in the weight of each character and build the Huffman tree (HuffTree);
2) Encoding. Encode each character using the Huffman tree that was built;
3) Output. Display the Huffman tree and the corresponding code table.
1. Data structure
//----- Storage representation of the Huffman tree -----
typedef struct {
    int weight;                  // Node weight
    int parent, lchild, rchild;  // Array subscripts of the node's parent, left child and right child
} HTNode, *HuffmanTree;          // The Huffman tree is stored in a dynamically allocated array
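To make the subscript convention concrete, here is a minimal sketch, assuming cell 0 of the array is reserved so that a subscript of 0 can mean "no such node". The helper names IsRoot and IsLeaf are hypothetical and are not part of the original code.

```c
#include <assert.h>

typedef struct {
    int weight;                  /* Node weight */
    int parent, lchild, rchild;  /* Subscripts; 0 means "none" */
} HTNode, *HuffmanTree;

/* Hypothetical helper: a node whose parent field is 0 is the root of its tree. */
int IsRoot(const HTNode *HT, int i)
{
    assert(i >= 1);              /* cell 0 is never a real node */
    return HT[i].parent == 0;
}

/* Hypothetical helper: a leaf has neither a left nor a right child. */
int IsLeaf(const HTNode *HT, int i)
{
    assert(i >= 1);
    return HT[i].lchild == 0 && HT[i].rchild == 0;
}
```

For two leaves of weight 1 and 2 stored in cells 1 and 2, the merged node sits in cell 3 with weight 3, lchild 1 and rchild 2; cells 1 and 2 both record parent 3, and cell 3 keeps parent 0 because it is the root.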
2. Algorithm
Algorithm 1 constructs the Huffman tree
Algorithm 2 computes the Huffman codes from the Huffman tree
2.1 Constructing the Huffman tree
Algorithm steps
Constructing the Huffman tree can be divided into two parts.
1. Initialization: first dynamically allocate 2n cells (a tree with n leaves has 2n-1 nodes, and cell 0 is not used); then loop 2n-1 times, starting from cell 1, setting the parent, left-child and right-child subscripts of cells 1 to 2n-1 to 0; finally loop n times to read the weights of the leaf nodes into the first n cells.
2. Tree creation: loop n-1 times, building the Huffman tree through n-1 rounds of selection, deletion and merging. Selection picks, from the current forest, the two root nodes s1 and s2 whose parent is 0 and whose weights are smallest. Deletion sets the parent of s1 and s2 to a non-zero value. Merging stores the sum of the weights of s1 and s2 as the weight of a new node in the next free cell at position n+1 or later, and records s1 as the left child and s2 as the right child of the new node.
Algorithm description
// C++-style description (references, new, cin); the C version is given in section 3
void CreateHuffmanTree(HuffmanTree &HT, int n)
{ // Construct the Huffman tree
    if (n <= 1) return;
    int m = 2 * n - 1;
    HT = new HTNode[m + 1];  // Cell 0 is not used, so m+1 cells are allocated; HT[m] is the root node
    for (int i = 1; i <= m; ++i)  // Set the parent, left-child and right-child subscripts of cells 1..m to 0
    { HT[i].parent = 0; HT[i].lchild = 0; HT[i].rchild = 0; }
    for (int i = 1; i <= n; ++i)  // Read the weights of the leaf nodes into the first n cells
        cin >> HT[i].weight;
    /*---------- Initialization is done; now build the Huffman tree ----------*/
    for (int i = n + 1; i <= m; ++i)
    { // Build the Huffman tree through n-1 rounds of selection, deletion and merging
        int s1, s2;
        Select(HT, i - 1, s1, s2);  // Among HT[k] (1 <= k <= i-1), select the two nodes whose parent field is 0
                                    // and whose weights are smallest; return their subscripts in s1 and s2
        HT[s1].parent = i; HT[s2].parent = i;  // Create new node i and delete s1 and s2 from the forest
                                               // by changing their parent field from 0 to i
        HT[i].lchild = s1; HT[i].rchild = s2;  // s1 and s2 are the left and right children of i
        HT[i].weight = HT[s1].weight + HT[s2].weight;  // The weight of i is the sum of its children's weights
    } // for
}
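The description above calls Select without defining it. The following is a minimal sketch in C, assuming pointer out-parameters in place of the C++ references and a single linear pass over the candidates instead of a sort. It is one possible way to do the selection, not necessarily the textbook's.

```c
#include <assert.h>

typedef struct {
    int weight;
    int parent, lchild, rchild;
} HTNode, *HuffmanTree;

/* Sketch: among HT[1..k], find the two nodes with parent == 0 and the
   smallest weights. On return HT[*s1].weight <= HT[*s2].weight.
   Assumes at least two parentless nodes exist in HT[1..k]. */
void Select(const HTNode *HT, int k, int *s1, int *s2)
{
    int i;
    *s1 = 0;                      /* 0 means "not found yet" */
    *s2 = 0;
    for (i = 1; i <= k; ++i) {
        if (HT[i].parent != 0)
            continue;             /* already merged into a larger tree */
        if (*s1 == 0 || HT[i].weight < HT[*s1].weight) {
            *s2 = *s1;            /* old minimum becomes second minimum */
            *s1 = i;
        } else if (*s2 == 0 || HT[i].weight < HT[*s2].weight) {
            *s2 = i;
        }
    }
    assert(*s1 != 0 && *s2 != 0); /* precondition violated otherwise */
}
```

With the weights 5, 29, 7, 8, 14, 23, 3, 11 in cells 1 to 8, the first call returns s1 = 7 (weight 3) and s2 = 1 (weight 5). When several nodes share the same weight, a different tie-breaking rule can produce a differently shaped tree and different codes, but the total encoded length is the same.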
2.2 Computing Huffman codes from the Huffman tree
Algorithm steps
1. Allocate a code table HC of length n+1 to hold the codes of the n characters (cell 0 is not used); allocate a temporary array cd of n chars to hold one code at a time (a code has at most n-1 bits, so n chars are enough), and set cd[n-1] to '\0'.
2. Compute the codes of the n characters one by one; loop n times, performing the following operations each time:
2.1 Set the variable start, which records where the code begins in cd. Initially start points to the last position, that is, position n-1 of the code terminator;
2.2 Set the variable c, which records the subscript of the current node on the path from the leaf to the root. Initially c is the subscript i of the character being encoded, and f records the subscript of the parent of c;
2.3 Trace back from the leaf to the root to obtain the code of character i. While f is not 0 (that is, while c is not yet the root), repeat the following:
2.3.1 Each step back moves start one position forward, i.e. --start;
2.3.2 If node c is the left child of f, generate code 0, otherwise generate code 1, and store the generated code in cd[start];
2.3.3 Continue upward: set c to f and f to the parent of f.
2.4 Allocate space HC[i] for the code of the ith character according to the length of the string in cd, then copy the code from cd to HC[i].
3. Release the temporary space cd.
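Steps 2.1 to 2.4 can be sketched for a single character in plain C as follows. This is an illustration under assumptions: EncodeLeaf is a hypothetical helper name, and the caller is assumed to supply both the scratch array cd and the output buffer out, each holding at least n chars.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    int weight;
    int parent, lchild, rchild;
} HTNode, *HuffmanTree;

/* Hypothetical helper: write the code of leaf i (1 <= i <= n) into out.
   cd is scratch space of n chars; out must hold at least n chars. */
void EncodeLeaf(const HTNode *HT, int n, int i, char *cd, char *out)
{
    int start = n - 1;          /* step 2.1: start at the terminator */
    int c = i;                  /* step 2.2: current node */
    int f = HT[i].parent;       /*           and its parent */
    cd[n - 1] = '\0';
    while (f != 0) {            /* step 2.3: climb until c is the root */
        --start;                /* step 2.3.1 */
        assert(start >= 0);     /* a code has at most n-1 bits */
        cd[start] = (HT[f].lchild == c) ? '0' : '1';  /* step 2.3.2 */
        c = f;                  /* step 2.3.3 */
        f = HT[f].parent;
    }
    strcpy(out, &cd[start]);    /* step 2.4: copy the finished code */
}
```

As a small hand-built example, take three leaves with weights 1, 2 and 4 in cells 1 to 3. Cell 4 merges cells 1 and 2 (weight 3), and cell 5 is the root with left child 4 and right child 3. Leaves 1, 2 and 3 then receive the codes 00, 01 and 1 respectively.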
Algorithm description
void CreatHuffmanCode(HuffmanTree HT, HuffmanCode &HC, int n)
{ // Working from each leaf back to the root, compute every character's Huffman code and store it in the code table HC
    HC = new char*[n + 1];   // Allocate the code table for n character codes (cell 0 is not used)
    char *cd = new char[n];  // Temporary array that holds one character's code at a time
    cd[n - 1] = '\0';        // Code terminator
    for (int i = 1; i <= n; ++i)
    { // Compute the Huffman code character by character
        int start = n - 1;   // start initially points to the last position, that of the code terminator
        int c = i;
        int f = HT[i].parent;  // f is the parent of node c
        while (f != 0)       // Trace back from the leaf up to the root
        {
            --start;         // Each step back moves start one position forward
            if (HT[f].lchild == c) cd[start] = '0';  // Node c is the left child of f: generate code 0
            else cd[start] = '1';                    // Node c is the right child of f: generate code 1
            c = f; f = HT[f].parent;                 // Continue tracing upward
        } // The code of the ith character has been found
        HC[i] = new char[n - start];  // Allocate space for the code of the ith character
        strcpy(HC[i], &cd[start]);    // Copy the code from the temporary space cd to the current row of HC
    } // for
    delete[] cd;             // Release the temporary space
}
2, Tool environment
Windows 10 operating system, Microsoft Visual C++ 2010 Express Edition integrated development environment, C language
3, Experimental code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int weight;                  // Node weight
    int parent, lchild, rchild;  // Subscripts of the node's parent, left child and right child
} HTNode, *HuffmanTree;

typedef char **HuffmanCode;

void Select(HuffmanTree HT, int n, int *s1, int *s2)
{ // Among HT[1..n], select the two nodes whose parent field is 0 and whose weights are smallest,
  // and return their subscripts in HT through s1 and s2
    int num[100], index[100], i, j, k = 0, temp;  // Fixed-size buffers: at most 99 candidate nodes
    for (i = 1; i <= n; i++) {
        if (HT[i].parent == 0) { // Parent field is 0
            k++;
            num[k] = HT[i].weight;
            index[k] = i;  // Record the weight and subscript of each node whose parent field is 0
        }
    }
    for (i = 1; i < k; i++) { // Sort the candidates by ascending weight
        for (j = i + 1; j <= k; j++) {
            if (num[i] > num[j]) {
                temp = num[i];
                num[i] = num[j];
                num[j] = temp;
                temp = index[i];
                index[i] = index[j];
                index[j] = temp;  // Swap subscript and weight together
            }
        }
    }
    *s1 = index[1];  // Return the subscripts of the two nodes with the smallest weights
    *s2 = index[2];
}

void CreateHuffmanTree(HuffmanTree *HT, int n, int number[])
{ // Construct the Huffman tree
    int m = 2 * n - 1;
    int i, s1 = 1, s2 = 1;
    if (n <= 1) return;
    *HT = (HuffmanTree)malloc((m + 1) * sizeof(HTNode));  // Cell 0 is not used, so m+1 cells are allocated; HT[m] is the root node
    for (i = 1; i <= m; ++i) { // Set the parent, left-child and right-child subscripts of cells 1..m to 0
        (*HT)[i].parent = 0;
        (*HT)[i].lchild = 0;
        (*HT)[i].rchild = 0;
    }
    for (i = 1; i <= n; ++i) { // Store the weights of the leaf nodes in the first n cells
        (*HT)[i].weight = number[i - 1];
    }
    //---------------------------------
    for (i = n + 1; i <= m; ++i) { // Build the Huffman tree through n-1 rounds of selection, deletion and merging
        Select(*HT, i - 1, &s1, &s2);  // Among HT[k] (1 <= k <= i-1), select the two nodes whose parent field is 0
                                       // and whose weights are smallest; return their subscripts in s1 and s2
        (*HT)[s1].parent = i;
        (*HT)[s2].parent = i;  // Create new node i and delete s1 and s2 from the forest by changing their parent field from 0 to i
        (*HT)[i].lchild = s1;
        (*HT)[i].rchild = s2;  // s1 and s2 are the left and right children of i
        (*HT)[i].weight = (*HT)[s1].weight + (*HT)[s2].weight;  // The weight of i is the sum of its children's weights
    } // for
}

void CreateHuffmanCode(HuffmanTree HT, HuffmanCode *HC, int n)
{ // Working from each leaf back to the root, compute every character's Huffman code and store it in the code table HC
    int i, start, c, f;
    char *cd;
    *HC = (char **)malloc((n + 1) * sizeof(char *));  // Allocate the code table for n character codes
    cd = (char *)malloc(n * sizeof(char));            // Temporary array that holds one character's code at a time
    cd[n - 1] = '\0';                                 // Code terminator
    for (i = 1; i <= n; ++i) { // Compute the Huffman code character by character
        start = n - 1;         // start initially points to the last position, that of the code terminator
        c = i;
        f = HT[i].parent;      // f is the parent of node c
        while (f != 0) {       // Trace back from the leaf up to the root
            --start;           // Each step back moves start one position forward
            if (HT[f].lchild == c) cd[start] = '0';  // Node c is the left child of f: generate code 0
            else cd[start] = '1';                    // Node c is the right child of f: generate code 1
            c = f;
            f = HT[f].parent;  // Continue tracing upward
        } // The code of the ith character has been found
        (*HC)[i] = (char *)malloc((n - start) * sizeof(char));  // Allocate space for the code of the ith character
        strcpy((*HC)[i], &cd[start]);  // Copy the code from the temporary space cd to the current row of HC
    } // for
    free(cd);   // Release the temporary space first
    cd = NULL;  // Then clear the pointer (setting it to NULL before free would leak the buffer)
}

void ShowResults(HuffmanTree HT, HuffmanCode HC, int n)
{ // Display the Huffman tree that was built and the corresponding Huffman codes
    int i;
    printf("\n\t node i\tweight\tparent\tlchild\trchild\n");
    for (i = 1; i <= 2 * n - 1; ++i) {
        printf("\t%d\t%d\t%d\t%d\t%d\n", i, HT[i].weight, HT[i].parent, HT[i].lchild, HT[i].rchild);
    }
    printf("\n\n\n\t node i\tHuffmanCode\n");
    for (i = 1; i <= n; ++i) {
        printf("\t%d\t%s\n", i, HC[i]);
    }
}

int main()
{
    HuffmanTree HT = NULL;
    HuffmanCode HC = NULL;
    int n = 8, number[8] = {5, 29, 7, 8, 14, 23, 3, 11};  // number holds the weights; reading n and the weights from input is left to the reader
    CreateHuffmanTree(&HT, n, number);
    CreateHuffmanCode(HT, &HC, n);
    ShowResults(HT, HC, n);
    return 0;
}
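The program above never releases HT, HC, or the individual code strings. That is harmless for a program that exits immediately, but a cleanup routine may be wanted if the functions are reused elsewhere. A minimal sketch, assuming the allocation layout used by CreateHuffmanTree and CreateHuffmanCode (one malloc for the tree, one for the table, one per code in cells 1 to n); DestroyHuffman is a hypothetical name that does not appear in the original.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int weight;
    int parent, lchild, rchild;
} HTNode, *HuffmanTree;

typedef char **HuffmanCode;

/* Hypothetical helper: free the codes in (*HC)[1..n], then the table and
   the tree, and reset both pointers to NULL. Safe to call twice. */
void DestroyHuffman(HuffmanTree *HT, HuffmanCode *HC, int n)
{
    int i;
    assert(HT != NULL && HC != NULL);
    if (*HC != NULL) {
        for (i = 1; i <= n; ++i)   /* cell 0 was never allocated */
            free((*HC)[i]);
        free(*HC);
        *HC = NULL;
    }
    free(*HT);                     /* free(NULL) is a no-op */
    *HT = NULL;
}
```

In main, a call such as DestroyHuffman(&HT, &HC, n) placed after ShowResults would release everything the two Create functions allocated.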