NLP Course - Notes-01


Lesson-01

Pre-class TIPS

  • Be good at finding problems and scenarios
  • AI has developed step by step; learn the overall structure well, build a solid foundation, and don't be greedy in your study

AI Paradigm

  1. Rule Based (Language Generation - Lesson-01)
  2. Probability Based (Language Model - Lesson-01)
  3. Problem Solving: Search Based
  4. Mathematical or Analytic Based (Lesson-03)
  5. Machine Learning (Deep Learning) Based

Overview

1. Randomly generate sentences according to a defined grammatical structure
2. Collect a large amount of text data and build a 2-GRAM language model
3. Use the model from step 2 to judge how reasonable the sentences generated in step 1 are

1. Rule Based Model

Task Requirements:

Randomly generate sentences based on the defined grammatical structure

STEP1:

The grammar of a language is structured, and we can create a sentence from some well-defined structures. First, we define the following rules:

simple_grammar_frame = """
sentence => noun_phrase verb_phrase
noun_phrase => Article Adj* noun
Adj* => null | Adj Adj*
verb_phrase => verb noun_phrase
Article =>  One | this
noun =>   woman |  Basketball | Table | kitten
verb => Watch   |  Sit down |  Listen | seeing
Adj =>  Blue | Good-looking | Small
"""

STEP2:

We parse the STEP1 grammar string into a Python dictionary with the following code:

def Create_Grammer(gram_frame, split = '=>'):
    grammer = {}
    for line in gram_frame.split('\n'):
        if not line.strip(): continue
        exp, stmt = line.split(split)
        grammer[exp.strip()] = [word.split() for word in stmt.split('|')] #word.split() also removes spaces
    return grammer

STEP3:

By passing simple_grammar_frame from STEP1 into the function defined in STEP2, we get the following grammar dictionary:

{'sentence': [['noun_phrase', 'verb_phrase']],
 'noun_phrase': [['Article', 'Adj*', 'noun']],
 'Adj*': [['null'], ['Adj', 'Adj*']],
 'verb_phrase': [['verb', 'noun_phrase']],
 'Article': [['One'], ['this']],
 'noun': [['woman'], ['Basketball'], ['Table'], ['kitten']],
 'verb': [['Watch'], ['Sit down'], ['Listen'], ['seeing']],
 'Adj': [['Blue'], ['Good-looking'], ['Small']]}

STEP4:

Based on the grammar dictionary obtained in STEP3, we randomly generate sentences with the following code.
The target parameter indicates which grammar rule we want to expand; the default is sentence.

import random

def Create_Sentence(gram, target = 'sentence'):
    if target not in gram: return target

    expanded = random.choice(gram[target])
    #''.join suits the original Chinese words; use ' '.join to put spaces between English words
    return ''.join(Create_Sentence(gram, target=r) for r in expanded if r!='null')
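
For example, chaining STEP2 and STEP4 together (the name example_grammar is ours):

example_grammar = Create_Grammer(simple_grammar_frame)
print(Create_Sentence(example_grammar)) #prints one random sentence derived from the rules above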

STEP5:

Now we can use the above functions to generate sentences at random; the grammatical structure from STEP1 can be redefined to suit your needs, for example:

#In the Western world, a human language can be defined as:
human = """
human = Self Find Activity
Self = Me | I | We
Find = Find | Find a Point
Activity = Fun | Playing
"""

#The language of a receptionist can be defined as:
host = """
host = Greeting Report-Number Ask Business-Related End
Report-Number = I'm Number ,
Number = Single-Number | Number Single-Number
Single-Number = 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Greeting = Title Hello
Title = Person ,
Person = Mr | Ms | Kid
Hello = Hello | Hi
Ask = May I ask | You need
Business-Related = Play Business
Play = null
Business = Drinking | Playing Cards | Hunting | Gambling
End = ?
"""

STEP6:

Using the grammar defined above, we generate some sentences at random, as follows:

['A nice blue basketball listening to a basketball',
 'This table is sitting on a blue woman',
 'A basketball listens to a woman',
 'A table saw a small basketball',
 'A kitten sits in a little kitten',
 'This little table sits on this small, beautiful table',
 'A woman listens to this basketball',
 'A little beautiful woman sees a beautiful woman',
 'The woman saw a table',
 'The beautiful cat sits in a woman']

We can see that some of these sentences read like human speech, while others do not, e.g. 'A table saw a small basketball' (this table must have come to life). How can the machine automatically judge whether a sentence is reasonable? That is the job of the Probability Based Language Model we'll learn next.

2. Probability Based Language Model

Task Requirements:

Implement a model that predicts the probability of a sentence appearing

PRE:

We split a sentence into words because the frequency of each word can be counted from a large amount of text, and the probability of the whole sentence can then be decomposed by the chain rule of (Bayesian) probability. For example, 'go to dinner tonight' is split into **'Tonight', 'Go', 'Eat', 'Dinner'**, and:

$$Pr(w_1 w_2 w_3 w_4) = Pr(w_1 \mid w_2 w_3 w_4) \cdot Pr(w_2 \mid w_3 w_4) \cdot Pr(w_3 \mid w_4) \cdot Pr(w_4)$$

$$\text{or} \quad Pr(w_1 w_2 w_3 w_4) = Pr(w_4 \mid w_1 w_2 w_3) \cdot Pr(w_3 \mid w_1 w_2) \cdot Pr(w_2 \mid w_1) \cdot Pr(w_1) \tag{2.1}$$

Since formula (2.1) is too complex and every factor is cumbersome to compute, we assume

$$P(w_n \mid w_1 w_2 \ldots w_{n-1}) \approx P(w_n \mid w_{n-1})$$

so that (2.1) simplifies to:

$$Pr(w_1 w_2 w_3 w_4) \approx Pr(w_1 \mid w_2) \cdot Pr(w_2 \mid w_3) \cdot Pr(w_3 \mid w_4) \cdot Pr(w_4) \tag{2.2}$$

Similarly, for a 3-gram the assumption is $P(w_n \mid w_1 w_2 \ldots w_{n-1}) \approx P(w_n \mid w_{n-1} w_{n-2})$, and so on.
Our Task: use formula (2.2) to compute the probabilities of the sentences randomly generated by the Rule Based Model.

STEP1:

We need a function to split sentences into words, as follows:

import jieba
def cut(string): return list(jieba.cut(string)) #jieba.cut returns an iterator; jieba.lcut(string) is equivalent to list(jieba.cut(string))

cut('go to dinner tonight')
['Tonight','Go','Eat','Dinner']

STEP2:

We need a large amount of text as the basis of the language model. Here we use news articles collected from sina.com: first we clean the raw text (removing punctuation, spaces, etc.) and keep only the word characters (implemented with a regular expression, though other approaches also work).

import re

def token(string):
    return re.findall(r'\w+', string)

articles_clean = [''.join(token(str(a))) for a in articles]

The cleaned text looks like articles_clean[1]:
'Jiaolong 835, the only ARM processor certified by Windows10 Desktop Platform Qualcomm, emphasises that it will not block out small cores just because of performance considerations. Instead, they are working with Microsoft to find a perfect solution for desktop platforms that takes performance and power into account, they report that Microsoft has'
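
The notes do not show how articles is obtained; here is a minimal sketch, assuming the Sina news corpus is a CSV export with a content column (the filename and column name are hypothetical):

import pandas as pd

#Hypothetical file/column names; adjust to however the news dump is actually stored
news = pd.read_csv('sina_news.csv', encoding='utf-8')
articles = news['content'].tolist()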

STEP3:

All texts obtained in STEP2 are segmented (using the STEP1 cut function) and accumulated into TOKEN:

TOKEN = []
for i,article in enumerate(articles_clean):
    if i%10000 == 0 :print(i)
    TOKEN += cut(article)

Frequency statistics of the words in TOKEN (to get P(wn)) are stored in words_count:

from collections import Counter
words_count = Counter(TOKEN)

Use words_count.most_common(4) to view the four most frequent words:

[('Of', 328262),
 ('Yes', 102420),
 ('yes', 73106),
 ('I', 50338)]

Combine every two adjacent words in TOKEN and count their frequencies (to get P(wn-1wn)). The result is stored in words_count_2:

TOKEN_2_GRAM = [''.join(TOKEN[i:i+2]) for i in range(len(TOKEN)-1)] #len(TOKEN)-1 adjacent pairs in total
words_count_2 = Counter(TOKEN_2_GRAM)
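
As a quick sanity check, mirroring the unigram inspection above, we can peek at the most frequent pairs:

words_count_2.most_common(3) #the three most frequent adjacent-word combinations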

STEP4:

Build the 2-GRAM model: based on the STEP3 data, define the functions that compute P(wn) and P(wn-1wn), as follows:

def prob_1(word):
    if word in words_count: return words_count[word] / len(TOKEN)
    else: return 0.5 / len(TOKEN) #unseen word: fall back to a small constant count

def prob_2(word1, word2):
    if word1+word2 in words_count_2: return words_count_2[word1+word2] / len(TOKEN_2_GRAM)
    else: return 0.5 / len(TOKEN_2_GRAM) #unseen pair: same fallback

Here is a brief explanation of Laplacian (add-one) smoothing:

$$P(w_i) = \frac{c_i}{N} \quad \text{(general probability formula)}, \qquad P_{Laplace}(w_i) = \frac{c_i+1}{N+V} \quad \text{(Laplace smoothing formula)} \tag{2.3}$$

where $c_i$ is the frequency of an event, $N$ is the total frequency of all events, and $V$ is the number of event categories.

Add-one is the simplest and crudest smoothing method. The principle is to make every count at least 1: add 1 to each count in the numerator and add V to the denominator, so that probabilities which would otherwise be 0 become small positive values, which is reasonable. In practice, however, there is a problem: since the probabilities must still sum to 1, the added mass necessarily reduces the original probabilities, and experiments show that this reduction can be huge, making the results inaccurate. Additive (add-delta) smoothing, which is generally better, controls the value of delta so that high-probability events are not pulled down too much; see the sketch below.
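
Here is a minimal sketch of the add-delta variant just described (the name prob_1_smoothed and the default delta are ours, not from the course):

def prob_1_smoothed(word, delta=0.01):
    #add-delta smoothing: every count gets +delta, the denominator gets +delta*V
    V = len(words_count) #V: number of distinct word types; Counter returns 0 for unseen words
    return (words_count[word] + delta) / (len(TOKEN) + delta * V)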

With this foundation in place, let's test it.
A single word:
prob_1('Eat')
1.8730573415504168e-05
Two adjacent words, e.g. the probability of 'At Eat':
prob_2('At', 'Eat')
1.1135086188907629e-06
and the probability of 'on beating':
prob_2('on', 'beating')
5.6759319823555707e-08

STEP5:

Next, we define a function that calculates the probability of an entire sentence, using the 2-GRAM formula (2.2) we derived above.

def get_probability(sentence):
    words = cut(sentence)
    sentence_prob = 1
    for i, word in enumerate(words[:-1]):
        next_word = words[i+1]
        probability_1 = prob_1(next_word)
        probability_2 = prob_2(word, next_word)

        #P(word | next_word) = P(word next_word) / P(next_word), one factor of formula (2.2)
        sentence_prob *= (probability_2 / probability_1)
    #the final factor of (2.2): the unigram probability of the last word
    sentence_prob *= probability_1
    return sentence_prob

STEP6:

We have completed the construction of the whole language model above. In the next final step, we apply the model to our randomly generated sentences to filter out the reasonable sentences.

#Each entry holds two candidate sentences separated by a single space
#(the sentences are Chinese in the original, with no internal spaces, so s.split() yields exactly two)
need_compared = [
    "I'll treat you to a big meal this evening, let's eat Japanese food I'll treat you to a big meal tomorrow evening, let's eat apples",
    "Really a beautiful kitten What a beautiful kitten",
    "Tonight I'm going to eat hot pot Tonight hot pot is going to eat me",
]

for s in need_compared:
    s1, s2 = s.split()
    p1, p2 = get_probability(s1), get_probability(s2)

    better = s1 if p1 > p2 else s2

    print('{} is more possible'.format(better))
    print('-'*4 + ' {} with probability {}'.format(s1, p1))
    print('-'*4 + ' {} with probability {}'.format(s2, p2))

Let's see whether the results match our intuition:

I'll treat you to a big meal this evening, let's eat Japanese food is more possible
---- I'll treat you to a big meal this evening, let's eat Japanese food with probability 8.905905868517037e-68
---- I'll treat you to a big meal tomorrow evening, let's eat apples with probability 7.124724694813629e-68
What a beautiful kitten is more possible
---- Really a beautiful kitten with probability 3.952326410335282e-35
---- What a beautiful kitten with probability 1.2993812566532249e-27
Tonight I'm going to eat hot pot is more possible
---- Tonight I'm going to eat hot pot with probability 2.014937789658057e-20
---- Tonight hot pot is going to eat me with probability 1.6434861744230511e-28

Quite reasonable, isn't it? Haha.

FINAL:

The accuracy of language models depends on the following:

  1. The initial text corpus is very important: what scenario are you targeting? Base the model on material from that scenario
  2. The order of the language model: 2-GRAM, 3-GRAM, or higher (see the sketch below)
  3. The smoothing used in the probability calculation, adjusted to the actual conditions
  4. Nothing else comes to mind for now
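
For point 2, here is a minimal sketch of the 3-GRAM analogue of the STEP3/STEP4 statistics (the names TOKEN_3_GRAM, words_count_3 and prob_3 are ours, built the same way as the 2-GRAM versions):

TOKEN_3_GRAM = [''.join(TOKEN[i:i+3]) for i in range(len(TOKEN)-2)]
words_count_3 = Counter(TOKEN_3_GRAM)

def prob_3(word1, word2, word3):
    #P(w1 w2 w3), with the same 0.5 fallback used in prob_2
    key = word1 + word2 + word3
    if key in words_count_3: return words_count_3[key] / len(TOKEN_3_GRAM)
    else: return 0.5 / len(TOKEN_3_GRAM)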

Pattern Match

Task Requirements

Enable the machine to talk to people.
input: 'I need iPhone'
output: 'Image you will get iPhone soon'

STEP1

Define a grammar structure for single-word matching:

defined_patterns ={
    "I need ?X": ["Image you will get ?X soon", "Why do you need ?X ?"], 
    "My ?X told me something": ["Talk about more about your ?X", "How do you think about your ?X ?"] }

Here **?X** represents a placeholder; the sentences the machine recognizes and the answers it gives must conform to the grammatical structure above for the dialogue to work.

STEP2

Match the user input statements to the syntax structure defined in STEP1 with the following matching codes:

def is_variable(pat):
    #Determine whether pat is a placeholder variable ?X, i.e. whether this pattern element is a placeholder
    return pat.startswith('?') and all(s.isalpha() for s in pat[1:])

def pat_match(pattern, saying):
    #This simple version just reports a match once a placeholder is reached; it is improved below
    if is_variable(pattern[0]): return True
    else:
        if pattern[0] != saying[0]: return False
        else:
            return pat_match(pattern[1:], saying[1:])
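
Effect (a quick check of this version):
**IN:**pat_match('I need ?X'.split(), 'I need iPhone'.split())
OUT:True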

STEP3:

The function from STEP2 can only tell whether a pattern matches a saying. To find out what each placeholder represents, we improve pat_match as follows:

def pat_match(pattern, saying):
    if is_variable(pattern[0]):
        return pattern[0], saying[0]
    else:
        if pattern[0] != saying[0]: return False
        else:
            return pat_match(pattern[1:], saying[1:])

Effect:
**IN:**pat_match('I want ?X'.split(), "I want holiday ".split())
OUT:('?X', 'holiday')

STEP4:

However, if the pattern has two variables, or the placeholder is not at the end, the program above fails. We modify pat_match as follows:

#The return value is now a list of pairs; previously it was a single tuple
def pat_match(pattern, saying):
    if not pattern or not saying: return []
    if is_variable(pattern[0]):
        return [(pattern[0], saying[0])] + pat_match(pattern[1:], saying[1:])#Continue matching what follows
    else:
        if pattern[0] != saying[0]: return []
        else:
            return pat_match(pattern[1:], saying[1:])

Effect:
**IN:**pat_match("?X greater than ?Y".split(), "3 greater than 2".split())
OUT:[('?X', '3'), ('?Y', '2')]

STEP5:

To facilitate the replacement step that follows, we create two new functions:

def pat_to_dict(patterns):
    #Convert the STEP4 pair list into a dictionary: [('?X', '3')] -> {'?X': '3'}
    return {k: v for k, v in patterns}

def subsitite(rule, parsed_rules):
    #Replace placeholders in the input according to the matched pairs
    #@param rule: the reply grammar, already split into a list of words
    #@param parsed_rules: the pair dictionary produced by pat_to_dict
    if not rule: return [] #end condition
    return [parsed_rules.get(rule[0], rule[0])] + subsitite(rule[1:], parsed_rules)

Effect:
**IN:**got_patterns = pat_match("I want ?X".split(), "I want iPhone".split())
**IN:**subsitite("What if you mean if you got a ?X".split(), pat_to_dict(got_patterns))
OUT:['What', 'if', 'you', 'mean', 'if', 'you', 'got', 'a', 'iPhone']

STEP6:

We can already hold simple conversations in this form, but our patterns match word by word: 'I need iPhone' matches 'I need ?X', while 'I need an iPhone' does not. What can we do?
To solve this problem, we create a new variable type ?*X, whose asterisk (*) lets it match multiple words.

First, as in STEP2, we need a function that determines whether a pattern element is a ?*X segment variable, coded as follows:

def is_pattern_segment(pattern):
    return pattern.startswith('?*') and all(a.isalpha() for a in pattern[2:])
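
Effect (a quick check):
**IN:**is_pattern_segment('?*P')
OUT:True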

Similarly, we will modify the previous pat_match function as follows:

fail = (True, None) #default return value when matching fails
#Input: a pattern like '?*P is very good'
#       a saying (to be matched) like 'My dog and my cat is very good'
#Output: the list of matched pairs, here [('?P', ['My', 'dog', 'and', 'my', 'cat'])]
def pat_match_with_seg(pattern, saying):
    if not pattern or not saying: return [] #base case
    pat = pattern[0]
    if is_variable(pat): #pattern[0] is a single-word placeholder
        return [(pat, saying[0])] + pat_match_with_seg(pattern[1:], saying[1:])
    elif is_pattern_segment(pat): #pattern[0] is a multi-word placeholder
        match, index = segment_match(pattern, saying)
        if match[1] == False: #segment matching failed
            return [fail]
        else: #match succeeded; continue with the elements after the segment
            return [match] + pat_match_with_seg(pattern[1:], saying[index:])
    elif pat == saying[0]: #pattern[0] is a literal word equal to saying[0]
        return pat_match_with_seg(pattern[1:], saying[1:])
    else:
        return [fail]

The important new function here is segment_match: given a pattern that starts with a segment variable (?*X), it tries to match that variable against as much of saying as possible, returning the variable's pairing and the index at which matching should resume. The code is as follows:

#When pattern[0] is ?*X, determine whether the pattern matches saying
#Returns the matched pair (with False as its second element on failure) and the resume index
def segment_match(pattern, saying):
    seg_pat, rest = pattern[0], pattern[1:]
    seg_pat = seg_pat.replace('?*', '?')

    #If nothing follows ?* in the pattern, then ?* matches all remaining elements of saying
    if not rest: return (seg_pat, saying), len(saying)

    #Find the position, in saying, of the element that follows ?* in the pattern
    for i, token in enumerate(saying):
        if rest[0] == token and is_match(rest[1:], saying[(i + 1):]):
            return (seg_pat, saying[:i]), i

    #If no element after ?* can be found in saying, the ?* match fails
    #return (seg_pat, saying), len(saying)
    return (seg_pat, False), len(saying)

#Check whether rest matches saying. Why not just compare them with ==?
#Because an input like 'Really very ?X' should also match 'Really very match':
#?X can match any single word
def is_match(rest, saying):
    if not rest and not saying:
        return True
    #rest[0] contains a non-alphabetic character (e.g. '?'), so it is a variable and matches anything
    if not all(a.isalpha() for a in rest[0]):
        return True
    if rest[0] != saying[0]:
        return False
    return is_match(rest[1:], saying[1:])

Effect:
**IN:**segment_match('?*P is very good'.split(), "My dog and my cat is very good".split())
OUT:(('?P', ['My', 'dog', 'and', 'my', 'cat']), 5)

**IN:**pat_match_with_seg('?*P is very good and ?*X'.split(), "My dog is very good and my cat is very cute".split())
OUT:[('?P', ['My', 'dog']), ('?X', ['my', 'cat', 'is', 'very', 'cute'])]

STEP7:

Now that we have pat_match_with_seg, which matches multiple words, we must modify the pat_to_dict function from STEP5.
Before the change a pairing looks like: {'?X': ['an', 'iPhone']}
After the change: {'?X': 'an iPhone'}

def pat_to_dict(patterns):
    return {k: ' '.join(v) if isinstance(v, list) else v for k, v in patterns}

Okay, now let's see how the sentence replacement function subsitite works
**IN:**subsitite("Hi, ?X how do you do?".split(), pat_to_dict(pat_match_with_seg('?*X hello ?*Y'.split(), "I am mike, hello ".split())))
OUT:['Hi,', 'I am mike,', 'how', 'do', 'you', 'do?']

STEP8:

Now that most of the groundwork is done, it's time to implement an actual conversation. get_response(saying, rules) takes what we want to say plus the rules we define (such as the patterns written above) and outputs an answer:

def get_response(saying, rules):
    """
    >>> get_response('I need iPhone', defined_patterns)
    'Image you will get iPhone soon'
    >>> get_response('My mother told me something', defined_patterns)
    'Talk about more about your mother'
    """
    get_pattern = [(True, None)]
    for pattern in rules:
        get_pattern = pat_match_with_seg(pattern.split(), saying.split())
        if (True, None) not in get_pattern:
            break
    #no rule matched: the result is empty or still contains the fail marker
    if get_pattern == [] or (True, None) in get_pattern: return 'I dont understand'
    return ' '.join(subsitite(random.choice(rules[pattern]).split(), pat_to_dict(get_pattern)))

For example, let's now define the grammar:

defined_patterns = {
    "I need ?X": ["Image you will get ?X soon", "Why do you need ?X ?"], 
    '?*x I want ?*y': ['what would it mean if you got ?y', 'Why do you want ?y', 'Suppose you got ?y soon'],
    "My ?X told me something": ["Talk about more about your ?X", "How do you think about your ?X ?"]
}

**IN:**get_response('Leo I want a apple', defined_patterns)
OUT: 'what would it mean if you got a apple'

Similarly, as long as the sentences we enter conform to the grammar in defined_patterns, we will find the corresponding answers in the grammar library to achieve the dialog effect.

STEP9:

We have now completed the Pattern Match experiment, but the model only supports English input. Let's improve it so that it accepts Chinese:

English input is convenient because the built-in string method .split() divides a sentence at spaces. To support Chinese input, we have to design a cut function that segments Chinese but leaves ?X and ?*X intact.

  1. First we segment the input with the jieba tokenizer; this also splits ?X into '?' and 'X'.
  2. Post-process the result of step 1 so that '?' and 'X' are joined back together.
  3. Replace the .split() calls in the previous functions with our new cut function.

Code implementation:

def cut(string):
    cut_unclean = [word for word in jieba.lcut(string) if word != ' ']
    cut_clean = []
    i = 0
    while i < len(cut_unclean):
        #rejoin '?', '*', 'x' into the segment variable '?*x'
        if cut_unclean[i] == '?' and i < len(cut_unclean)-2 and cut_unclean[i+1] == '*':
            cut_clean.append(cut_unclean[i] + cut_unclean[i+1] + cut_unclean[i+2])
            i += 3
        #rejoin '?', 'x' into the single-word variable '?x'
        elif cut_unclean[i] == '?' and i < len(cut_unclean)-1:
            cut_clean.append(cut_unclean[i] + cut_unclean[i+1])
            i += 2
        else:
            cut_clean.append(cut_unclean[i])
            i += 1
    return cut_clean

Effect:
**IN:**cut("?*x I want?*yxx")
OUT: ['?*x','I','Want','?*yxx']

No problem. Now let's replace the .split() calls with our new cut function:

def get_response_chinese_support(saying, rules):
    get_pattern = [(True, None)]
    for pattern in rules:
        get_pattern = pat_match_with_seg(cut(pattern), cut(saying))
        if (True, None) not in get_pattern:
            break
    if get_pattern == [] or (True, None) in get_pattern: return 'what you input do not have pattern yet'
    return ' '.join(subsitite(cut(random.choice(rules[pattern])), pat_to_dict(get_pattern)))

We define the patterns in Chinese to try the effect:

#These rules were Chinese in the original; they are shown here in translation,
#and a few keys have been adjusted slightly so that no two dictionary keys collide
rule_responses = {
    '?*x hello ?*y': ['How do you do', 'Please state your problem'],
    '?*x I want ?*y': ['what would it mean if you got ?y', 'Why do you want ?y', 'Suppose you got ?y soon'],
    '?*x if ?*y': ['Do you really think its likely that ?y', 'Do you wish that ?y', 'What do you think about ?y', 'Really-- if ?y'],
    '?*x no ?*y': ['why not?', 'You are being a negative', "Are you saying 'No' just to be negative?"],
    '?*x I was ?*y': ['Were you really', 'Perhaps I already knew you were ?y', 'Why do you tell me you were ?y now?'],
    '?*x I feel ?*y': ['Do you often feel ?y ?', 'What other feelings do you have?'],
    '?*x Hello ?*y': ['How do you do', 'Please tell me your problem'],
    '?*x I think ?*y': ['What does ?y mean to you?', 'Why do you want ?y', 'Think about it: you may soon be able to ?y'],
    '?*x I would like ?*y': ['?x, let me ask you: what would ?y mean to you?', 'Why do you want ?y', '?x, think about it... you may soon have ?y', "Don't you think ?x looks like ?y?", 'I think you look like ?y'],
    '?*x like ?*y': ['What do you like about ?y?', 'What is so good about ?y?', 'Do you want ?y?'],
    '?*x hate ?*y': ['How could ?y be so annoying?', 'What do you hate about ?y?', 'What is wrong with ?y?', "Don't you want ?y?"],
    '?*x AI ?*y': ['Why do you mention AI?', 'Why do you think AI can solve your problem?'],
    '?*x robot ?*y': ['Why are you talking about robots?', 'Why do you think robots can solve your problem?'],
    "?*x I'm sorry ?*y": ["You don't have to apologize", 'Why do you think you need to apologize?'],
    '?*x I remember ?*y': ['Do you often think about this?', 'Besides ?y, what else do you remember?', 'Why do you mention ?y to me?'],
    '?*x If ?*y': ['Do you really think ?y will happen?', 'Do you wish that ?y?', 'Really? If ?y...', 'What is your opinion about ?y?'],
    '?*x I ?*z dream about ?*y': ['Really? --- ?y', 'Had you ever imagined ?y while awake?', 'Have you dreamed about ?y before?'],
    '?*x mom ?*y': ['Besides ?y, who else is in your family?', 'Well, tell me a little more about your family', 'Does she have a big influence on you?'],
    '?*x dad ?*y': ['Besides ?y, who else is in your family?', 'Well, tell me a little more about your family', 'Does he have a big influence on you?', 'Whenever you think of your father, do you also think of something else?'],
    '?*x I am willing to ?*y': ['Can I help you ?y?', 'Can you explain why you want ?y?'],
    "?*x I'm sorry because ?*y": ["I'm sorry to hear that", "You shouldn't be so sad about ?y"],
    '?*x sad ?*y': ["I'm sorry to hear that",
                 "You shouldn't be so sad. What do you think you have to be sad about?",
                 'What do you think has happened to make you sad?'],
    '?*x just like ?*y': ['What similarities do you think ?x and ?y have?', 'Are ?x and ?y really related?', 'How so?'],
    '?*x and ?*y all ?*z': ['Do you think there is anything wrong with ?z?', 'How does ?z affect you?'],
    '?*x and ?*y equally ?*z': ['Do you think there is anything wrong with ?z?', 'How does ?z affect you?'],
    '?*x I am ?*y': ['Really?', '?x, to tell the truth, I probably already knew you were ?y', 'Why do you tell me only now that you are ?y?'],
    '?*x am I ?*y?': ['What would happen if you were ?y?', 'Do you think you are ?y?', 'If you were ?y, what would that mean?'],
    '?*x are you ?*y?': ['Why are you interested in whether I am ?y?', 'Would you prefer me to be ?y?', "If you like, I'll be ?y"],
    '?*x you are ?*y': ['Why do you think I am ?y?'],
    '?*x because ?*y': ['Is ?y the real reason?', 'Do you think there are any other reasons?'],
    "?*x I can't ?*y": ['Perhaps you could ?*y now', 'What would happen if you could ?*y?'],
    '?*x I feel that ?*y': ['Do you often feel this way?', 'What other feelings do you have besides this?'],
    '?*x I ?*y you ?*z': ['Actually it is quite possible that we ?y each other'],
    "?*x why don't you ?*y": ["Why don't you ?y yourself?", "Do you think I won't ?y?", 'When I feel better, I will ?y'],
    '?*x well ?*y': ['Well', "You're a positive-energy person"],
    '?*x hmm ?*y': ['Well', "You're a positive-energy person"],
    '?*x not at all ?*y': ['Why not?', 'You are being a bit negative', "You said no; do you mean you don't want it?"],
    '?*x No ?*y': ['Why not?', 'You are being a bit negative', "You said no; do you mean you don't want it?"],
    '?*x some people ?*y': ['Who exactly?'],
    '?*x someone ?*y': ['Who exactly?'],
    '?*x everybody ?*y': ["I'm sure not everyone is", 'Can you think of a specific case?', 'Who, for example?', 'You are seeing only a handful of people'],
    '?*x all ?*y': ["I'm sure not everyone is", 'Can you think of a specific case?', 'Who, for example?', 'You are seeing only a handful of people'],
    '?*x always ?*y': ['Can you think of another example?', 'When, for example?', 'What exactly are you saying?', 'Really? --- always?'],
    '?*x perhaps ?*y': ['You sound uncertain'],
    '?*x probably ?*y': ['You sound uncertain'],
    '?*x are they ?*y?': ['Do you think they might not be ?y?'],
    '?*x': ['Very interesting', 'Please continue', "I'm not sure I understand what you said; could you explain in a little more detail?"]
}

The end result:

**IN:**get_response_chinese_support('Ming I remember bread', rule_responses)
OUT:'Besides bread, what else do you remember?'

**IN:**get_response_chinese_support("Xiao Ming I'm sorry because I have no money", rule_responses)
OUT:"You shouldn't be so sad about having no money"

**IN:**get_response_chinese_support('boring', rule_responses)
OUT:'Please continue'

**IN:**get_response_chinese_support('Are you a pig', rule_responses)
OUT:'Why are you interested in whether I am a pig?'
