# Huffman coding (tree data structure, C language version)

Keywords: C data structure

# 1, Experimental topic

1) Initialization. Read in the weight of each character and build the Huffman tree;
2) Encoding. Generate the code of each character from the Huffman tree that was built;
3) Output. Display the Huffman tree and the corresponding code table.

## 1. Data structure

```c
//----- storage representation of the Huffman tree -----
typedef struct {
    int weight;                 // Node weight
    int parent, lchild, rchild; // Array indices of the node's parent, left child and right child
} HTNode, *HuffmanTree;         // The Huffman tree is stored in a dynamically allocated array
```
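As a rough illustration of this layout, the sketch below allocates the array for n leaves and zeroes it, so that index 0 can serve as "no node" in every link field. The helper names `NewTreeArray` and `IsRoot` are hypothetical and do not appear in the original code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int weight;                  /* node weight */
    int parent, lchild, rchild;  /* array indices; 0 means "no such node" */
} HTNode, *HuffmanTree;

/* Hypothetical helper: allocate cells 0..2n-1 (cell 0 stays unused) and
   zero them, so every node starts as an isolated root with no children. */
HuffmanTree NewTreeArray(int n)
{
    assert(n >= 1);
    return (HuffmanTree)calloc((size_t)(2 * n), sizeof(HTNode));
}

/* Hypothetical helper: a node whose parent index is 0 is still a root
   in the current forest. */
int IsRoot(HuffmanTree HT, int i)
{
    return HT[i].parent == 0;
}
```

Because links are array indices rather than pointers, the whole tree lives in one allocation and can be released with a single `free`.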

## 2. Algorithm

Two algorithms are used: Algorithm 1 constructs the Huffman tree, and Algorithm 2 computes the Huffman codes from that tree.

### 2.1 constructing Huffman tree

Algorithm steps

Constructing the Huffman tree takes two steps.
1. Initialization: dynamically allocate 2n cells (cell 0 is not used). Loop 2n-1 times over cells 1 to 2n-1 and set the parent, left-child and right-child indices of each to 0. Then loop n times to read the weights of the leaf nodes into the first n cells.
2. Building the tree: loop n-1 times; each iteration performs one selection, one deletion and one merge. Selection picks from the current forest the two root nodes s1 and s2 that have parent 0 and the smallest weights. Deletion sets the parent of s1 and s2 to a non-zero value, which removes them from the forest. Merging stores the sum of the weights of s1 and s2 as the weight of a new node in the next free cell (index n+1 onward), and records s1 as its left child and s2 as its right child.

Algorithm description

```cpp
// C++-style description (reference parameter, new, cin); Select is assumed to be defined elsewhere
void CreateHuffmanTree(HuffmanTree &HT, int n)
{   // Construct the Huffman tree
    int m, i, s1, s2;
    if (n <= 1) return;
    m = 2 * n - 1;
    HT = new HTNode[m + 1];   // Cell 0 is not used, so m+1 cells are allocated; HT[m] is the root node
    for (i = 1; i <= m; ++i)  // Initialize the parent, left-child and right-child indices of cells 1..m to 0
    { HT[i].parent = 0; HT[i].lchild = 0; HT[i].rchild = 0; }
    for (i = 1; i <= n; ++i)  // Read the weights of the leaf nodes into the first n cells
        cin >> HT[i].weight;
    /* ----- initialization done; now build the Huffman tree ----- */
    for (i = n + 1; i <= m; ++i)
    {   // Build the tree through n-1 rounds of selection, deletion and merging
        Select(HT, i - 1, s1, s2);
        // In HT[k] (1 <= k <= i-1), select the two nodes whose parent field is 0 and whose
        // weights are the smallest, and return their indices in HT as s1 and s2
        HT[s1].parent = i; HT[s2].parent = i;
        // Create new node i and delete s1 and s2 from the forest by changing their parent field from 0 to i
        HT[i].lchild = s1; HT[i].rchild = s2;  // s1 and s2 are the left and right children of i
        HT[i].weight = HT[s1].weight + HT[s2].weight;  // The weight of i is the sum of its children's weights
    } // for
}
```
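The algorithm above calls `Select` without defining it. One possible way to write it is a single linear scan that tracks the smallest and second-smallest parentless nodes. This is a minimal sketch, not the version used in the experimental code later in this article (which sorts the candidates instead); it uses C pointer parameters rather than the C++ references of the description above, and it assumes at least two nodes in HT[1..n] still have parent 0.

```c
#include <assert.h>

typedef struct {
    int weight;
    int parent, lchild, rchild;
} HTNode, *HuffmanTree;

/* Sketch of Select: scan HT[1..n] once and report the indices of the two
   parentless nodes with the smallest weights (*s1 smallest, *s2 second).
   Assumes at least two nodes in HT[1..n] still have parent == 0. */
void Select(HuffmanTree HT, int n, int *s1, int *s2)
{
    int i;
    *s1 = 0;
    *s2 = 0;
    for (i = 1; i <= n; ++i) {
        if (HT[i].parent != 0) continue;       /* already merged into a subtree */
        if (*s1 == 0 || HT[i].weight < HT[*s1].weight) {
            *s2 = *s1;                         /* old minimum becomes the runner-up */
            *s1 = i;
        } else if (*s2 == 0 || HT[i].weight < HT[*s2].weight) {
            *s2 = i;
        }
    }
    assert(*s1 != 0 && *s2 != 0);
}
```

The strict `<` comparisons mean that, among nodes of equal weight, the one with the lower index is preferred. Any consistent tie-breaking rule yields a valid Huffman tree, although the individual codes may differ.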

### 2.2 Huffman coding according to Huffman tree

Algorithm steps

1. Allocate the code table HC, which stores the codes of n characters and has length n+1 (entry 0 is not used). Allocate a temporary array cd of n characters for building each code, and set cd[n-1] to '\0'. A code never exceeds n-1 bits, because a Huffman tree with n leaves has depth at most n-1.

2. Compute the codes of the n characters one at a time. Loop n times, and in each iteration perform the following operations:

2.1 Set the variable start to the position in cd where the code begins. Initially start is n-1, the position of the terminator;
2.2 Set the variable c to the index of the node currently visited on the path from the leaf to the root. Initially c is i, the index of the character being encoded, and f holds the index of the parent of c;
2.3 Trace back from the leaf towards the root to obtain the code of character i. While f is not 0 (that is, while c is not yet the root), repeat:

2.3.1 Move start back one position, i.e. --start;
2.3.2 If node c is the left child of f, generate code 0; otherwise generate code 1. Store the generated code in cd[start];
2.3.3 Move up one level: set c to f and f to the parent of f.

2.4 Allocate space HC[i] for the code of the i-th character according to the length of the string that begins at cd[start], then copy that string into HC[i].

3. Release the temporary space cd.

Algorithm description

```cpp
// C++-style description (reference parameter, new, delete[])
void CreateHuffmanCode(HuffmanTree HT, HuffmanCode &HC, int n)
{   // Derive the Huffman code of each character by walking from its leaf up to the root; store the codes in HC
    int i, start, c, f;
    HC = new char*[n + 1];    // Allocate the code table for the codes of n characters
    char *cd = new char[n];   // Allocate temporary space for building each code
    cd[n - 1] = '\0';         // Code terminator
    for (i = 1; i <= n; ++i)  // Compute the Huffman code character by character
    {
        start = n - 1;        // start initially points at the terminator
        c = i; f = HT[i].parent;  // f is the parent of node c
        while (f != 0)        // Trace back from the leaf up to the root
        {
            --start;          // Each step up moves start back one position
            if (HT[f].lchild == c) cd[start] = '0';  // c is the left child of f: generate code 0
            else cd[start] = '1';                    // c is the right child of f: generate code 1
            c = f; f = HT[f].parent;                 // Continue upward
        }                     // The code of the i-th character is now in cd[start..]
        HC[i] = new char[n - start];   // Allocate space for the code of the i-th character
        strcpy(HC[i], &cd[start]);     // Copy the code from the temporary space cd into HC[i]
    } // for
    delete[] cd;              // Release the temporary space (allocated with new[])
}
```
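To make the backtracking concrete, the sketch below applies the same leaf-to-root walk to a three-leaf tree written out by hand (weights 1, 2 and 3). The helper `CodeOfLeaf`, the table `demo` and the fixed 64-character scratch buffer are assumptions made for this illustration only.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    int weight;
    int parent, lchild, rchild;
} HTNode;

/* Hand-built tree: leaves 1..3, node 4 merges leaves 1 and 2,
   node 5 (the root) has leaf 3 on the left and node 4 on the right. */
static const HTNode demo[6] = {
    {0, 0, 0, 0},   /* cell 0 is unused */
    {1, 4, 0, 0},
    {2, 4, 0, 0},
    {3, 5, 0, 0},
    {3, 5, 1, 2},
    {6, 0, 3, 4}
};

/* Hypothetical helper: write the code of leaf `leaf` into out, which must
   hold at least n characters (n = number of leaves). */
void CodeOfLeaf(const HTNode *HT, int n, int leaf, char *out)
{
    char cd[64];                 /* scratch buffer; this sketch assumes n <= 64 */
    int start = n - 1;
    int c = leaf, f = HT[leaf].parent;

    assert(n >= 1 && n <= 64);
    cd[start] = '\0';            /* terminator goes in the last position */
    while (f != 0) {             /* stop once c is the root */
        cd[--start] = (HT[f].lchild == c) ? '0' : '1';
        c = f;
        f = HT[f].parent;
    }
    strcpy(out, &cd[start]);     /* the code reads root-to-leaf from cd[start] */
}
```

With this tree, leaf 3 sits directly under the root and gets the one-bit code "0", while leaves 1 and 2 get "10" and "11". Filling cd from the end is what turns the leaf-to-root walk into a root-to-leaf code without a separate reversal step.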

# 2, Tool environment

Windows 10 operating system, Microsoft Visual C++ 2010 Express (integrated development environment), C language

# 3, Experimental code

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int weight;                 // Node weight
    int parent, lchild, rchild; // Array indices of the node's parent, left child and right child
} HTNode, *HuffmanTree;
typedef char **HuffmanCode;

void Select(HuffmanTree HT, int n, int *s1, int *s2)
{   // Select the two nodes whose parent field is 0 and whose weights are the smallest,
    // and return their indices in HT through s1 and s2
    // num and index are sized for the worst case of n candidates (entry 0 is not used)
    int *num = (int *)malloc((n + 1) * sizeof(int));
    int *index = (int *)malloc((n + 1) * sizeof(int));
    int i, j, k = 0, temp;
    for (i = 1; i <= n; i++)
    {
        if (HT[i].parent == 0)
        {   // Parent field is 0: record the node's weight and index
            k++;
            num[k] = HT[i].weight;
            index[k] = i;
        }
    }
    for (i = 1; i < k; i++)
    {   // Sort the candidates by weight so that the two smallest come first
        for (j = i + 1; j <= k; j++)
        {
            if (num[i] > num[j])
            {   // Swap weight and index together
                temp = num[i];
                num[i] = num[j];
                num[j] = temp;
                temp = index[i];
                index[i] = index[j];
                index[j] = temp;
            }
        }
    }
    *s1 = index[1];  // Index of the node with the smallest weight
    *s2 = index[2];  // Index of the node with the second-smallest weight
    free(num);
    free(index);
}

void CreateHuffmanTree(HuffmanTree *HT, int n, int number[])
{   // Construct the Huffman tree
    int m = 2 * n - 1;
    int i, s1 = 1, s2 = 1;
    if (n <= 1) return;
    // Cell 0 is not used, so m+1 cells are allocated; (*HT)[m] is the root node
    *HT = (HuffmanTree)malloc((m + 1) * sizeof(HTNode));
    for (i = 1; i <= m; ++i)
    {   // Initialize the parent, left-child and right-child indices of cells 1..m to 0
        (*HT)[i].parent = 0;
        (*HT)[i].lchild = 0;
        (*HT)[i].rchild = 0;
    }
    for (i = 1; i <= n; ++i)
    {   // Store the weights of the leaf nodes in the first n cells
        (*HT)[i].weight = number[i - 1];
    }
    // ----- initialization done; now build the Huffman tree -----
    for (i = n + 1; i <= m; ++i)
    {   // Build the tree through n-1 rounds of selection, deletion and merging
        // In HT[k] (1 <= k <= i-1), select the two nodes whose parent field is 0 and whose
        // weights are the smallest, and return their indices in s1 and s2
        Select(*HT, i - 1, &s1, &s2);
        // Create new node i and delete s1 and s2 from the forest by changing their parent field from 0 to i
        (*HT)[s1].parent = i;
        (*HT)[s2].parent = i;
        // s1 and s2 are the left and right children of i
        (*HT)[i].lchild = s1;
        (*HT)[i].rchild = s2;
        // The weight of i is the sum of its children's weights
        (*HT)[i].weight = (*HT)[s1].weight + (*HT)[s2].weight;
    } // for
}

void CreateHuffmanCode(HuffmanTree HT, HuffmanCode *HC, int n)
{   // Derive the Huffman code of each character by walking from its leaf up to the root; store the codes in *HC
    int i, start, c, f;
    char *cd;
    *HC = (char **)malloc((n + 1) * sizeof(char *));  // Allocate the code table for the codes of n characters
    cd = (char *)malloc(n * sizeof(char));            // Allocate temporary space for building each code
    cd[n - 1] = '\0';                                 // Code terminator
    for (i = 1; i <= n; ++i)                          // Compute the Huffman code character by character
    {
        start = n - 1;        // start initially points at the terminator
        c = i;
        f = HT[i].parent;     // f is the parent of node c
        while (f != 0)        // Trace back from the leaf up to the root
        {
            --start;          // Each step up moves start back one position
            if (HT[f].lchild == c) cd[start] = '0';  // c is the left child of f: generate code 0
            else cd[start] = '1';                    // c is the right child of f: generate code 1
            c = f;
            f = HT[f].parent; // Continue upward
        }                     // The code of the i-th character is now in cd[start..]
        (*HC)[i] = (char *)malloc((n - start) * sizeof(char));  // Allocate space for the code of the i-th character
        strcpy((*HC)[i], &cd[start]);  // Copy the code from the temporary space cd into (*HC)[i]
    } // for
    free(cd);    // Release the temporary space first
    cd = NULL;   // Then clear the pointer (clearing it before free would leak the buffer)
}

void ShowResults(HuffmanTree HT, HuffmanCode HC, int n)
{   // Show the Huffman tree that was built and the corresponding Huffman codes
    int i;
    printf("\n\t node i\tweight\tparent\tlchild\trchild\n");
    for (i = 1; i <= 2 * n - 1; ++i)
    {
        printf("\t%d\t%d\t%d\t%d\t%d\n", i, HT[i].weight, HT[i].parent, HT[i].lchild, HT[i].rchild);
    }
    printf("\n\n\n\t node i\tHuffmanCode\n");
    for (i = 1; i <= n; ++i)
    {
        printf("\t%d\t%s\n", i, HC[i]);
    }
}

int main()
{
    HuffmanTree HT = NULL;
    HuffmanCode HC = NULL;
    // number holds the weights; readers can write their own function to read the weights and their count
    int n = 8, number[] = {5, 29, 7, 8, 14, 23, 3, 11};
    int i;
    CreateHuffmanTree(&HT, n, number);
    CreateHuffmanCode(HT, &HC, n);
    ShowResults(HT, HC, n);
    for (i = 1; i <= n; ++i) free(HC[i]);  // Release each code string
    free(HC);                              // Release the code table
    free(HT);                              // Release the tree array
    return 0;
}
```
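The comment in `main` leaves reading the weights to the reader. One way to approach it is sketched below: a helper that parses whitespace-separated integers from a string, which could be a line obtained with `fgets`. The name `ParseWeights` and its signature are hypothetical; the sketch assumes the weights are non-negative and fit in an `int`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse whitespace-separated non-negative integers
   from text into out[0..max-1] and return how many were stored. Parsing
   stops at the first token that is not a number, at a negative value,
   or once max values have been stored. Values are assumed to fit in int. */
int ParseWeights(const char *text, int *out, int max)
{
    int count = 0;
    assert(text != NULL && out != NULL);
    while (count < max) {
        char *end;
        long v = strtol(text, &end, 10);  /* skips leading whitespace */
        if (end == text) break;           /* nothing converted: end of input or junk */
        if (v < 0) break;                 /* weights are assumed non-negative */
        out[count++] = (int)v;
        text = end;                       /* continue after the parsed number */
    }
    return count;
}
```

The returned count and the filled array could then take the place of the hard-coded `n` and `number` in `main`, for example `n = ParseWeights(line, number, 8);`, keeping in mind that `CreateHuffmanTree` returns without building anything when n is 1 or less.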

Posted by akrocks_extreme on Tue, 16 Nov 2021 06:20:09 -0800