My blog test

Keywords: Big Data Java

1. My blog test

Information retrieval is an important part of Internet applications.

1.1 Subtitle of my blog

Information retrieval is closely tied to everyday life, and text is still the dominant form of content on the web. Retrieving information effectively from a large volume of web text remains an open problem for many researchers. Automatic text classification is an important branch of natural language processing, and it is also a foundation of information retrieval and data mining.

private static void loadForwardDictionary() {
    if (null != forwardDicArr){
        return;
    }
    LOGGER.log(Level.CONFIG, "Start loading forward dictionary.");
    forwardDicArr = new Map[Unicode.MAX_CHINESE - Unicode.MIN_CHINESE];
    
    // Read the dictionary file; words are separated by newlines
    String dicStr = FileUtil.readText(DIC_PATH, DIC_ENCODING);
    String[] wordArr = dicStr.split("\n");
    // Start at index 1: the first line is skipped because it may begin with a BOM,
    // which Java's standard readers do not strip
    for (int i = 1; i < wordArr.length; i++) {
        // trim() also drops a trailing '\r' left by Windows line endings
        char[] chars = wordArr[i].trim().toCharArray();
        if (chars.length == 0) {
            continue;
        }
        // Words that do not start with a common Chinese character have no slot
        // in the root array, so skip them instead of reusing the previous word's map
        if (!Unicode.getCharType(chars[0]).equals(CharType.COMMON_CHINESE)) {
            continue;
        }
        // The first character selects the root map for this word
        int index = chars[0] - Unicode.MIN_CHINESE;
        Map<Character, Map> currMap = forwardDicArr[index];
        if (null == currMap) {
            currMap = new HashMap<>();
            forwardDicArr[index] = currMap;
        }
        // Walk the remaining characters, creating child maps as needed
        for (int j = 1; j < chars.length; j++) {
            Map<Character, Map> nextMap = currMap.get(chars[j]);
            if (null == nextMap) {
                nextMap = new HashMap<>();
                currMap.put(chars[j], nextMap);
            }
            currMap = nextMap;
        }
        // Add a mark at the end of a word to indicate that a word can be formed at this level
        currMap.put(SegmentArgs.END_MARK, null);
    }

    LOGGER.log(Level.INFO, "Load forward dictionary successfully.");
}
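
To make the structure built by loadForwardDictionary() easier to picture, here is a minimal, self-contained sketch of the same idea: a character trie made of nested maps, plus a longest-prefix lookup of the kind a forward-matching segmenter might run against it. This is not the code from the project above. The class ForwardTrieSketch, the Node type, and the methods insert and longestMatch are hypothetical names introduced for illustration, a boolean flag stands in for the SegmentArgs.END_MARK entry, a single root node replaces the array indexed by first character, and ASCII words are used only to keep the example readable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the original project.
class ForwardTrieSketch {

    /** One trie level: children keyed by character, plus an end-of-word flag. */
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean wordEnd; // assumption: plays the role of the END_MARK entry
    }

    /** Inserts a word, creating child nodes as needed. */
    static void insert(Node root, String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.wordEnd = true;
    }

    /** Returns the length of the longest dictionary word starting at start, or 0 if there is none. */
    static int longestMatch(Node root, String text, int start) {
        Node node = root;
        int best = 0;
        for (int i = start; i < text.length(); i++) {
            node = node.children.get(text.charAt(i));
            if (node == null) {
                break; // no dictionary word continues with this character
            }
            if (node.wordEnd) {
                best = i - start + 1; // remember the longest complete word so far
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Node root = new Node();
        insert(root, "in");
        insert(root, "inter");
        insert(root, "net");
        String text = "internet";
        System.out.println(longestMatch(root, text, 0)); // prints 5 ("inter")
        System.out.println(longestMatch(root, text, 5)); // prints 3 ("net")
    }
}
```

The lookup keeps walking after it finds a complete word, so "inter" wins over the shorter "in". That is why an explicit end-of-word marker is needed at every level rather than only at the leaves.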

Posted by youngloopy on Tue, 10 Dec 2019 07:34:21 -0800