LeetCode 820. Short Encoding of Words

Keywords: Java github encoding network

My LeetCode: https://leetcode-cn.com/u/ituring/

My LeetCode source code [GitHub]: https://github.com/izhoujie/Algorithmcii

Problem

Given a list of words, we can encode it as a reference string S and a list of indexes A.

For example, if the list is ["time", "me", "bell"], we can express it as S = "time#bell#" and indexes = [0, 2, 5].

For each index, we can recover the corresponding word by reading S from that index position until we reach a "#" character.

What is the length of the shortest reference string S that can encode the given word list?

Example:

Input: words = ["time", "me", "bell"]
Output: 10
Explanation: S = "time#bell#", indexes = [0, 2, 5].
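
To make the decoding rule concrete, here is a minimal sketch of how the words could be read back out of S. The class name DecodeSketch and the method decode are hypothetical names chosen for this illustration; they are not part of the problem or of the solution below.

```java
import java.util.ArrayList;
import java.util.List;

class DecodeSketch {
    // Recover each word by reading from its index up to (not including) the next '#'.
    // Assumes every index is followed by a '#' somewhere later in s.
    static List<String> decode(String s, int[] indexes) {
        List<String> words = new ArrayList<>();
        for (int start : indexes) {
            int end = s.indexOf('#', start);
            words.add(s.substring(start, end));
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [time, me, bell]
        System.out.println(decode("time#bell#", new int[] {0, 2, 5}));
    }
}
```

Because "me" is a suffix of "time", index 2 reuses the characters already stored for "time", which is why S needs only 10 characters.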

Constraints:

  • 1 <= words.length <= 2000
  • 1 <= words[i].length <= 7
  • Each word consists only of lowercase letters.

Source: LeetCode
Links: https://leetcode-cn.com/problems/short-encoding-of-words
Copyright belongs to LeetCode. For commercial reprints, please contact LeetCode for authorization; for non-commercial reprints, please cite the source.

Solution approach

  • Analysis shows that whenever one word is a suffix of another, only the longer word needs to be stored; it contributes its length plus 1 (the trailing "#" takes one character);
  • The question is then how to tell whether one word is a suffix of another. Two methods come to mind immediately, endsWith() and indexOf(); they work on the same principle, but both are inefficient;
  • endsWith() and indexOf() are slow because every check rescans the characters. If each word is scanned once and the result is recorded, the repeated scans are avoided. That is what building an index with a Trie (prefix tree / dictionary tree) provides;
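
The suffix observation above can be sketched as a direct pairwise check with endsWith(). This is only an illustration of why the naive approach is slow (every word is compared against every other word); the class and method names are hypothetical, and the handling of duplicate words is an assumption added here so that exactly one copy is kept.

```java
class SuffixBruteForceSketch {
    // For each word, check whether some other word already "covers" it as a suffix.
    // Uncovered words contribute length + 1 (the extra 1 is the '#').
    static int encodedLength(String[] words) {
        int total = 0;
        for (int i = 0; i < words.length; i++) {
            boolean covered = false;
            for (int j = 0; j < words.length && !covered; j++) {
                if (i == j) {
                    continue;
                }
                boolean sameWord = words[i].equals(words[j]);
                // A duplicate is covered only by an earlier copy, so one copy survives.
                if (words[j].endsWith(words[i]) && (!sameWord || j < i)) {
                    covered = true;
                }
            }
            if (!covered) {
                total += words[i].length() + 1;
            }
        }
        return total;
    }
}
```

With n words this performs on the order of n * n endsWith() calls, which is the repeated scanning that a Trie avoids.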

Idea 1 - use endsWith() or indexOf() to detect a shared suffix

Steps:

  1. Sort all words in descending order of length;
  2. Traverse the words once. Append a word plus a trailing "#" to the StringBuilder only if indexOf(current word + "#") on the StringBuilder returns -1;
  3. The length of the StringBuilder is the result;

Idea 2 - build a Trie, which requires defining a node class

Steps:

  1. Sort all words in descending order of length. Define a Node class that holds an array of 26 child nodes, one slot for each letter a to z;
  2. Take each word in turn and walk its characters in reverse order, building the Trie as you go. If a new node has to be created, the word is new: add its length plus 1 to the total. If no node has to be created, the word is a suffix of an earlier word and its length is not counted;
  3. The accumulated total for the new words is the result;

A Trie is simply a multi-way tree in which each node has up to 26 children. The logic is the same as for a binary tree, except that each node can have more children.
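
As a small illustration of why reversed insertion detects suffixes, the sketch below inserts words back to front and reports whether any new node was needed. ReverseTrieSketch, Node, and insertReversed are hypothetical names used only for this example; it assumes lowercase a to z input, as the problem states.

```java
class ReverseTrieSketch {
    // One child slot per letter 'a'..'z'.
    static class Node {
        Node[] next = new Node[26];
    }

    final Node root = new Node();

    // Insert a word from its last character to its first.
    // Returns true if at least one new node had to be created.
    boolean insertReversed(String word) {
        Node curr = root;
        boolean created = false;
        for (int i = word.length() - 1; i >= 0; i--) {
            int idx = word.charAt(i) - 'a';
            if (curr.next[idx] == null) {
                curr.next[idx] = new Node();
                created = true;
            }
            curr = curr.next[idx];
        }
        return created;
    }
}
```

Inserting "time" creates the path e -> m -> i -> t. Inserting "me" afterwards walks e -> m along that existing path and creates nothing, which signals that "me" is a suffix of a word already stored. This only works if longer words are inserted first, hence the sort in step 1.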

Algorithm source code example

package leetcode;

import java.util.Arrays;

/**
 * @author ZhouJie
 * @date March 28, 2020, 9:53:38 PM
 * @Description: 820. Short Encoding of Words
 *
 */
public class LeetCode_0820 {

}

class Solution_0820 {
	/**
	 * @author: ZhouJie
	 * @date: March 28, 2020, 9:54:12 PM
	 * @param words the words to encode
	 * @return length of the shortest encoding string
	 * @Description: 1 - Sort by length in descending order, then concatenate, using indexOf()
	 *
	 */
	public int minimumLengthEncoding_1(String[] words) {
		// Sort by length, descending, so longer words are appended first
		Arrays.sort(words, (s1, s2) -> s2.length() - s1.length());
		StringBuilder sb = new StringBuilder();
		for (String s : words) {
			// If "s#" already occurs, s is a suffix of (or equal to) an earlier word
			if (sb.indexOf(s + "#") == -1) {
				sb.append(s).append("#");
			}
		}
		return sb.length();
	}

	/**
	 * @author ZhouJie
	 * @date March 29, 2020, 1:33:22 AM
	 * @Description: Helper node for the Trie, one child slot per letter a to z
	 *
	 */
	class TailNode {
		TailNode[] next = new TailNode[26];
	}

	/**
	 * @author: ZhouJie
	 * @date: March 29, 2020, 1:33:24 AM
	 * @param words the words to encode
	 * @return length of the shortest encoding string
	 * @Description: 2 - Trie: build a dictionary index over the reversed words
	 *
	 */
	public int minimumLengthEncoding_2(String[] words) {
		// Sort by length, descending, so longer words are inserted first
		Arrays.sort(words, (s1, s2) -> s2.length() - s1.length());
		int minLen = 0;
		// Root of the Trie
		TailNode root = new TailNode();
		for (String s : words) {
			// Start from the root for every word
			TailNode currNode = root;
			// Whether a new node had to be created for the current word
			boolean createdNode = false;
			// Walk the word from its last character to its first
			for (int i = s.length() - 1; i > -1; i--) {
				int index = s.charAt(i) - 'a';
				// If this character has no node yet, create one and set the flag
				if (currNode.next[index] == null) {
					createdNode = true;
					currNode.next[index] = new TailNode();
				}
				// Move down to the child node
				currNode = currNode.next[index];
			}
			if (createdNode) {
				// New word: count its length plus 1 for the trailing '#'
				minLen += s.length() + 1;
			}
		}
		return minLen;
	}

}

Posted by bettydailey on Sun, 29 Mar 2020 08:11:11 -0700