Huffman coding (data structures: trees, C language version)

Keywords: C data structure

1, Experimental topic

1) Initialization. Read in the weight of each character and build the Huffman tree;
2) Encoding. Generate the code of each character from the Huffman tree that was built;
3) Output. Display the Huffman tree and the corresponding coding table;

1. Data structure

//----- storage representation of Huffman tree -----
typedef struct{ 
int weight; //Node weight
int parent,lchild,rchild; //Subscript of parent, left child and right child of node
} HTNode,*HuffmanTree; //The Huffman tree is stored in a dynamically allocated array

2. Algorithm

Algorithm 1 constructs the Huffman tree
Algorithm 2 computes the Huffman codes from the Huffman tree

2.1 constructing Huffman tree

Algorithm steps

The algorithm for constructing the Huffman tree can be divided into two parts.
1. Initialization: first dynamically allocate 2n cells (cell 0 is not used); then loop 2n-1 times, starting from cell 1, setting the parent, left-child and right-child subscripts of cells 1 to 2n-1 to 0; finally loop n times to read the weights of the leaf nodes into the first n cells.
2. Tree creation: loop n-1 times, building the Huffman tree through n-1 rounds of selection, deletion and merging. Selection picks from the current forest the two root nodes s1 and s2 whose parent is 0 and whose weights are smallest; deletion sets the parent of s1 and s2 to a non-zero value; merging stores the sum of the weights of s1 and s2 as the weight of a new node in the next free cell from n+1 onward, and records s1 as the left child and s2 as the right child of the new node.
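The selection, deletion and merging steps above can be sketched in plain C as follows. This is only a minimal illustration, assuming the HTNode layout from the data structure section; the helper names select_two_min and build_tree are hypothetical, and the selection here uses a single linear scan rather than any particular method from the original listing.

```c
#include <stdlib.h>

typedef struct {
    int weight;                 /* node weight */
    int parent, lchild, rchild; /* array indices; 0 means "none" */
} HTNode;

/* Hypothetical helper: among HT[1..k], find the two roots (parent == 0)
   with the smallest weights. On return the weight of *s1 is <= the weight
   of *s2. Assumes at least two roots exist. */
static void select_two_min(const HTNode *HT, int k, int *s1, int *s2)
{
    int i;
    *s1 = 0;
    *s2 = 0;
    for (i = 1; i <= k; ++i) {
        if (HT[i].parent != 0)
            continue;                          /* already merged */
        if (*s1 == 0 || HT[i].weight < HT[*s1].weight) {
            *s2 = *s1;                         /* old minimum becomes second */
            *s1 = i;
        } else if (*s2 == 0 || HT[i].weight < HT[*s2].weight) {
            *s2 = i;
        }
    }
}

/* Hypothetical helper: build the tree for n >= 2 leaf weights w[0..n-1].
   Returns an array of 2n cells (cell 0 unused, root in cell 2n-1),
   or NULL if n <= 1 or allocation fails. The caller frees the array. */
static HTNode *build_tree(const int *w, int n)
{
    int m = 2 * n - 1, i, s1, s2;
    HTNode *HT;
    if (n <= 1)
        return NULL;
    HT = calloc((size_t)m + 1, sizeof *HT);    /* zeroes every field */
    if (HT == NULL)
        return NULL;
    for (i = 1; i <= n; ++i)
        HT[i].weight = w[i - 1];
    for (i = n + 1; i <= m; ++i) {
        select_two_min(HT, i - 1, &s1, &s2);   /* selection */
        HT[s1].parent = i;                     /* deletion */
        HT[s2].parent = i;
        HT[i].lchild = s1;                     /* merging */
        HT[i].rchild = s2;
        HT[i].weight = HT[s1].weight + HT[s2].weight;
    }
    return HT;
}
```

For the weights 5, 29, 7, 8, 14, 23, 3, 11, the first round selects the nodes with weights 3 and 5 and stores their sum, 8, in cell 9; the root ends up in cell 15 with weight 100, the sum of all leaf weights.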

Algorithm description

void CreateHuffmanTree(HuffmanTree &HT,int n) 
{//Constructing Huffman tree (textbook-style description using C++ references, new and cin; the C version is in section 3)
	int m,i,s1,s2;
	if(n<=1) return; 
	m=2*n-1; 
	HT=new HTNode[m+1]; //Cell 0 is not used, so m+1 cells are allocated dynamically; HT[m] is the root node
	for(i=1;i<=m;++i) //Initialize the parent, left child and right child subscripts in cells 1~m to 0
		{HT[i].parent=0;HT[i].lchild=0;HT[i].rchild=0;} 
	for(i=1;i<=n;++i) //Read the weights of the leaf nodes into the first n cells
		cin>>HT[i].weight; 
/*------------ Initialization done; now build the Huffman tree ------------*/
	for(i=n+1;i<=m;++i) 
	{//Build the Huffman tree through n-1 rounds of selection, deletion and merging
		Select(HT,i-1,s1,s2); 
		//Among HT[k] (1<=k<=i-1), select the two nodes whose parent field is 0 and whose weights are smallest, and return their indices s1 and s2
		HT[s1].parent=i;HT[s2].parent=i; 
		//Create new node i and delete s1 and s2 from the forest by changing their parent fields from 0 to i
		HT[i].lchild=s1;HT[i].rchild=s2; //s1 and s2 are the left and right children of i
		HT[i].weight=HT[s1].weight+HT[s2].weight; //The weight of i is the sum of its children's weights
	}//for
}

2.2 Huffman coding according to Huffman tree

Algorithm steps

1. Allocate the coding table HC, which stores the codes of n characters and has length n+1 (cell 0 is not used); allocate the temporary dynamic array cd that holds one character's code while it is being built, and set cd[n-1] to '\0'.

2. Compute the codes of the n characters one by one: loop n times, performing the following operations each time:

2.1 set the variable start to record where the code begins in cd. Initially start points to the last position, n-1, which holds the code terminator;
2.2 set the variable c to the index of the current node on the path from the leaf to the root. Initially c is the index i of the character being encoded, and f records the index of the parent of c;
2.3 trace back from the leaf node toward the root to obtain the code of character i. While f is not 0 (that is, c is not yet the root), repeat the following operations:

2.3.1 step back once: move start one position forward, i.e. --start;
2.3.2 if node c is the left child of f, generate code 0, otherwise generate code 1, and save the generated 0 or 1 in cd[start];
2.3.3 continue upward by setting c to f and f to the parent of f.

2.4 allocate space HC[i] for the code of the i-th character according to the length of the string in cd, then copy the code from cd to HC[i].

3. Release the temporary space cd.
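As an illustration of steps 2.1 to 2.4, the following sketch encodes a single leaf by walking from the leaf to the root. It assumes the HTNode layout from the data structure section; the function name encode_leaf and its signature are hypothetical and were chosen only for this example.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int weight;                 /* node weight */
    int parent, lchild, rchild; /* array indices; 0 means "none" */
} HTNode;

/* Hypothetical helper: return a newly allocated string holding the code of
   leaf i in a tree with n leaves, or NULL if allocation fails.
   The caller frees the result. */
static char *encode_leaf(const HTNode *HT, int n, int i)
{
    char *cd, *code;
    int start = n - 1;          /* position of the terminator in cd */
    int c = i;                  /* current node on the path */
    int f = HT[i].parent;       /* parent of the current node */

    cd = malloc((size_t)n);     /* a code has at most n-1 symbols */
    if (cd == NULL)
        return NULL;
    cd[n - 1] = '\0';
    while (f != 0) {            /* stop once c is the root */
        --start;
        cd[start] = (HT[f].lchild == c) ? '0' : '1';
        c = f;
        f = HT[f].parent;
    }
    code = malloc((size_t)(n - start)); /* symbols plus terminator */
    if (code != NULL)
        strcpy(code, &cd[start]);
    free(cd);                   /* release the temporary buffer */
    return code;
}
```

Because the symbols are produced from the leaf upward but written from the end of cd backward, the string starting at cd[start] already reads from root to leaf, so no reversal is needed. For example, in a three-leaf tree where node 4 has children 1 and 2 and the root, node 5, has children 3 and 4, leaf 1 is encoded as "10".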

Algorithm description

void CreateHuffmanCode(HuffmanTree HT,HuffmanCode &HC,int n) 
{//Derive the Huffman code of each character by walking from leaf to root, and store it in the coding table HC
	int i,start,c,f;
	char *cd;
	HC=new char*[n+1]; //Allocate the coding table that stores the codes of n characters
	cd=new char[n]; //Allocate the temporary array that holds one character's code
	cd[n-1]='\0'; //Code terminator
	for(i=1;i<=n;++i) //Compute the Huffman code character by character
	{
		start=n-1; //start initially points to the last position, that of the code terminator
		c=i; f=HT[i].parent; //f points to the parent node of node c
		while(f!=0) //Trace back from the leaf node up to the root node
		{
			--start; //Each step back moves start one position forward
			if(HT[f].lchild==c) cd[start]='0'; //If node c is the left child of f, code 0 is generated
			else cd[start]='1'; //If node c is the right child of f, code 1 is generated
			c=f;f=HT[f].parent; //Continue tracing upward
		} //The code of the i-th character has been found
		HC[i]=new char[n-start]; //Allocate space for the code of the i-th character
		strcpy(HC[i],&cd[start]); //Copy the code from the temporary space cd to the current row of HC
	}//for 
	delete [] cd; //Free the temporary space (array form of delete)
}

2, Tool environment

Windows 10 operating system, Microsoft Visual C++ 2010 Express Edition integrated development environment, C language

3, Experimental code

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

typedef struct{
	int weight; //Node weight
	int parent,lchild,rchild; //Subscript of parent, left child and right child of node
} HTNode,*HuffmanTree; 
typedef char **HuffmanCode;

void Select(HuffmanTree HT,int n,int *s1,int *s2)
{//Select the two nodes whose parent field is 0 and whose weights are smallest, and return their indices in HT through s1 and s2
	int num[100],index[100],i,j,k=0,max=0,temp; //Fixed-size buffers: at most 99 candidate nodes are supported
	for(i=1;i<=n;i++)
	{
		if(HT[i].parent==0)
		{//Parent domain is 0
			k++;
			num[k]=HT[i].weight;
			index[k]=i;	//Record the sequence number and weight of the node whose parent domain is 0
		}
	}
	for(i=1;i<k;i++)
	{//Sort the candidates by ascending weight so that the two smallest come first
		for(j=i+1;j<=k;j++)
		{
			if(num[i]>num[j])
			{
				temp=num[i];
				num[i]=num[j];
				num[j]=temp;
				temp=index[i];
				index[i]=index[j];
				index[j]=temp;//Serial number and weight are exchanged at the same time
			}
		}
	}
	*s1=index[1];//Return the indices of the two nodes with the smallest weights through s1 and s2
	*s2=index[2];
}

void CreateHuffmanTree(HuffmanTree *HT,int n,int number[])
{//Constructing Huffman tree
	int m=2*n-1;
	int i,s1=1,s2=1;
	if(n<=1)  return; 
    *HT=(HuffmanTree)malloc((m+1)*sizeof(HTNode)); //Cell 0 is not used, so m+1 cells are allocated dynamically; HT[m] is the root node
	for(i=1;i<=m;++i)    
	{//Initialize the parent, left child and right child subscripts in cells 1~m to 0
		(*HT)[i].parent=0;
	    (*HT)[i].lchild=0;
		(*HT)[i].rchild=0;
	} 
	for(i=1;i<=n;++i) 
    {//Input the weights of leaf nodes in the first n cells
		(*HT)[i].weight=number[i-1]; 
	}
	//---------------------------------
	for ( i=n+1; i<=m; ++i)
	{//Create Huffman tree through n-1 selection, deletion and merging
		Select(*HT, i-1, &s1, &s2); 
	//Among HT[k] (1<=k<=i-1), select the two nodes whose parent field is 0 and whose weights are smallest, and return their indices s1 and s2
		(*HT)[s1].parent=i;
		(*HT)[s2].parent=i; 
	//Create new node i and delete s1 and s2 from the forest by changing their parent fields from 0 to i
		(*HT)[i].lchild=s1;
		(*HT)[i].rchild=s2; 
	//s1 and s2 are the left and right children of i
		(*HT)[i].weight=(*HT)[s1].weight+(*HT)[s2].weight; //The weight of i is the sum of the weight of left and right children
    }//for	
} 

void CreateHuffmanCode(HuffmanTree HT,HuffmanCode *HC,int n) 
{//Derive the Huffman code of each character by walking from leaf to root, and store it in the coding table HC
	int i,start,c,f;
    char *cd;
	*HC=(char **)malloc((n+1)*sizeof(char *));  //Allocate an encoding table space that stores n character encodings
	cd=(char *)malloc(n*sizeof(char)); //Allocate dynamic array space for temporarily storing each character encoding
	cd[n-1]='\0';  //Encoding Terminator
	for(i=1;i<=n;++i)     //Huffman coding character by character
	{
		start=n-1;        //start points to the last at the beginning, that is, the position of the encoding terminator
        c=i; 
		f=HT[i].parent; //f points to the parent node of node c
		while(f!=0)       //Backtracking from the leaf node up to the root node
		{
			--start;//Backtracking once start points forward to a position
			if(HT[f].lchild==c) cd[start]='0'; //If node c is the left child of f, code 0 is generated
			else cd[start]='1'; //If node c is the right child of f, code 1 is generated
			c=f;
			f=HT[f].parent; //Continue to backtrack up
		}                       //Find the encoding of the ith character
		(*HC)[i]=(char *)malloc((n-start)*sizeof(char)); //Allocate space for the ith character encoding
		strcpy((*HC)[i],&cd[start]); //Copy the obtained code from the temporary space cd to the current line of HC
	}//for 
	free(cd);  //Free the temporary space first
	cd=NULL;   //Then clear the pointer so it does not dangle
}

void ShowResults(HuffmanTree HT,HuffmanCode HC,int n)
{//Show the created Huffman tree and the corresponding Huffman code
	int i;
	printf("\n\t node i\tweight\tparent\tlchild\trchild\n");
	for(i=1;i<=2*n-1;++i) 
    {
		printf("\t%d\t%d\t%d\t%d\t%d\n",i,HT[i].weight,HT[i].parent,HT[i].lchild,HT[i].rchild); 
	}
	printf("\n\n\n\t node i\tHuffmanCode\n");
	for(i=1;i<=n;++i)     
	{
		printf("\t%d\t%s\n",i,HC[i]);
	}
}

int main()
{
	HuffmanTree HT=NULL;
	HuffmanCode HC=NULL;
	int n=8,number[8]={5,29,7,8,14,23,3,11};//number holds the leaf weights; readers can write their own routine to read n and the weights
	CreateHuffmanTree(&HT,n,number);
	CreateHuffmanCode(HT,&HC,n);
	ShowResults(HT,HC,n);
	return 0;
}
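The program above exits without releasing HT or the coding table HC. That is harmless when the process ends right away, but if the functions are reused in a longer-running program, a cleanup routine along the following lines may be useful. This is a sketch only; the name FreeHuffman is hypothetical, and it assumes HC was filled for indices 1 to n as in CreateHuffmanCode.

```c
#include <stdlib.h>

typedef struct {
    int weight;                 /* node weight */
    int parent, lchild, rchild; /* array indices; 0 means "none" */
} HTNode, *HuffmanTree;
typedef char **HuffmanCode;

/* Hypothetical helper: free every code string HC[1..n], the coding table
   itself, and the tree array, then reset both pointers to NULL. */
static void FreeHuffman(HuffmanTree *HT, HuffmanCode *HC, int n)
{
    int i;
    if (*HC != NULL) {
        for (i = 1; i <= n; ++i)
            free((*HC)[i]);     /* cell 0 was never allocated */
        free(*HC);
        *HC = NULL;
    }
    free(*HT);                  /* free(NULL) is a no-op */
    *HT = NULL;
}
```

It would be called as FreeHuffman(&HT,&HC,n); just before the return statement in main.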

Posted by akrocks_extreme on Tue, 16 Nov 2021 06:20:09 -0800