Python self-study journal 9 -- choosing data structures

Keywords: Python IPython Attribute Programming

1. Write a program that reads a word list from a file and prints all the sets of words that are anagrams of each other

These are anagrams, which are different from the palindromes met earlier. For example:

['deltas','desalt','lasted','salted','slated','staled']

['retainers','ternaries']

All words made up of the same letters belong to the same anagram set.

Following the principle of picking out the key information and reducing the problem to methods I have already mastered, I see two general directions. One is to read the words one at a time, break each word down with the letter frequency counter (histogram) mentioned earlier, and put words with identical counts into the same set. The other is to generate arrangements of the 26 letters and look each one up in the word list. The second clearly violates the principle: it is beyond what I currently know, it is complicated, and it requires a huge amount of computation.
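The first direction can be sketched roughly as follows. This is only an illustration under some assumptions: the helper names letter_counts, letter_signature and group_by_signature are made up for this example, and a short in-memory list stands in for the word file. Because a dictionary cannot itself be used as a dictionary key, the counts are converted into a sorted tuple of (letter, count) pairs first.

```python
def letter_counts(word):
    """Count how often each letter occurs in word (a small histogram)."""
    counts = {}
    for c in word:
        counts[c] = counts.get(c, 0) + 1
    return counts

def letter_signature(word):
    """Turn the counts into a hashable key that ignores letter order."""
    return tuple(sorted(letter_counts(word).items()))

def group_by_signature(words):
    """Group words whose letter counts are identical."""
    groups = {}
    for word in words:
        key = letter_signature(word)
        if key not in groups:
            groups[key] = [word]
        else:
            groups[key].append(word)
    return groups

# Hypothetical sample instead of words.txt
sample = ['deltas', 'desalt', 'lasted', 'retainers', 'ternaries', 'python']
for group in group_by_signature(sample).values():
    if len(group) > 1:
        print(group)
```

This works, but it takes a counting step plus a conversion step for every word.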

Later I found a simpler way: use the sorting method already covered to split each word into letters and sort them. Words whose sorted letters are identical go into the same set.

def paixu_pinjie(word):
    t=list(word)
    t.sort()
    a=''.join(t)
    return a

fin=open('words.txt')
d=dict()
for line in fin:
    word=line.strip()
    b=paixu_pinjie(word)
    if b not in d:
        d[b]=word
    else:
        d[b].append(word)
return d

for key,val in d.items():
    if len(val)>1:
        print(val)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-c76bff3c7e52> in async-def-wrapper()
     17 return d
     18 
---> 19 for key,val in d.items():
     20     if len(val)>1:
     21         print(val)

AttributeError: 'str' object has no attribute 'append'

The problem is in d[b]=word: word is a string, so the value stored in the dictionary is a string, and append cannot be called on it. To be able to add more words to the set later, the word has to be wrapped in a list:

def paixu_pinjie(word):
    t=list(word)
    t.sort()
    a=''.join(t)
    return a

fin=open('words.txt')
d=dict()
for line in fin:
    word=line.strip()
    b=paixu_pinjie(word)
    if b not in d:
        d[b]=[word]
    else:
        d[b].append(word)
return d

for key,val in d.items():
    if len(val)>1:
        print(val)

This version raises no error, but the output shows only a single collection of words, so the final for loop clearly never takes effect. The cause is the return d placed above the for loop. Generally speaking, return can only be used inside a function. There are two fixes: delete the return, or wrap the code above it in a function.
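As a side check on the rule that return belongs inside a function, here is a minimal sketch using the built-in compile on a made-up two-line snippet. It only shows how standard Python treats such code when compiled as a module. The IPython session above evidently let the cell run, so this is not a reproduction of that session.

```python
# Hypothetical snippet with a return at the top level, outside any function
source_with_top_level_return = "d = dict()\nreturn d\n"

try:
    compile(source_with_top_level_return, "<example>", "exec")
    outcome = "compiled"
except SyntaxError:
    # Standard Python rejects 'return' outside a function at compile time
    outcome = "SyntaxError"

print(outcome)
```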

def paixu_pinjie(word):
    t=list(word)
    t.sort()
    a=''.join(t)
    return a

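def all_anagrams(filename):
    # Map each sorted-letter key to the list of words that share it
    d=dict()
    with open(filename) as fin:  # the file is closed automatically
        for line in fin:
            word=line.strip().lower()
            b=paixu_pinjie(word)
            if b not in d:
                d[b]=[word]
            else:
                d[b].append(word)
    return d

def print_anagram_set(d):
    # Only keys shared by more than one word form an anagram set
    for key,val in d.items():
        if len(val)>1:
            print(val)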

d=all_anagrams('words.txt')
print_anagram_set(d)

Here the code above has been turned into functions. In addition, .lower() is added after word=line.strip() to convert upper-case letters to lower case. The rule of thumb for deciding whether to write a piece of code as a function is whether the code will be reused and whether it can be cleanly separated from the rest.
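Since this entry is about choosing data structures, it may be worth noting that the "if key not in d ... else append" pattern can also be handled by collections.defaultdict from the standard library. The following is only a sketch of that alternative, not the version used in this journal: the names signature and group_anagrams are invented for the example, and it takes a list of words instead of a filename so that it runs without words.txt.

```python
from collections import defaultdict

def signature(word):
    """Sorted letters of word, joined back into a string."""
    return ''.join(sorted(word))

def group_anagrams(words):
    """Group words by their sorted-letter signature."""
    groups = defaultdict(list)  # a missing key starts out as an empty list
    for word in words:
        word = word.lower()
        groups[signature(word)].append(word)
    return groups

# Hypothetical sample instead of words.txt
sample = ['Deltas', 'desalt', 'lasted', 'retainers', 'ternaries', 'python']
for words in group_anagrams(sample).values():
    if len(words) > 1:
        print(words)
```

The trade-off is that a defaultdict also creates an entry whenever a missing key is merely looked up, so a plain dict with an explicit check is sometimes the safer choice.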

Modify the program from the previous problem so that it prints the anagram sets in descending order of the number of words in each set.

def print_anagram_sets_in_order(d):
    t=list() #Create an empty list to store data
    for key,val in d.items():
        if len(val)>1:
            t.append((len(val),val))
    t.sort(reverse=True) #sort
    #Print in reverse order
    for x in t:
        print(x)
print_anagram_sets_in_order(d)

Because some of the code above has to be reused later, it needs to be written as functions; otherwise it would have to be written out all over again.
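One detail of sorting (length, list) tuples is that ties in length are broken by comparing the word lists themselves. If only the size should matter, a possible variation is to sort with key=len. The sketch below assumes a small hand-made dictionary in the same shape as the one returned by all_anagrams; the function name anagram_sets_by_size is made up for this example.

```python
def anagram_sets_by_size(d):
    """Return the anagram sets (lists with more than one word), largest first."""
    sets = [words for words in d.values() if len(words) > 1]
    # sorted is stable, so sets of equal size keep their original order
    return sorted(sets, key=len, reverse=True)

# Hypothetical data in the same shape as the result of all_anagrams
d = {
    'aeeinrrst': ['retainers', 'ternaries'],
    'hnopty': ['python'],
    'adelst': ['deltas', 'desalt', 'lasted'],
}
for words in anagram_sets_by_size(d):
    print(len(words), words)
```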

2. Write a function choose_from_hist that takes a histogram as a parameter and returns a value chosen at random from the histogram, with probability proportional to its frequency

The histogram function was already written in "python self study diary 7 - Dictionary". This also made me think about what makes good Python learning material. One mark of good material, I decided, is that its exercises tie together knowledge points from earlier and later chapters. The other marks are a topic for another time.

With the histogram function already written, what remains is choosing a value at random according to probability. The official documentation shows that random.choices meets this requirement. Its key parameter, weights, is a list of relative weights, one for each element of the list being drawn from, and elements are chosen in proportion to those weights. Here, histogram returns a dictionary that maps each distinct letter to its frequency. The keys and values of the dictionary can be split into two lists: the list of keys is the population to draw from, and the list of values is passed to weights. This gives the following code:

import random

def histogram(s):
    d=dict()
    for c in s:
        # get takes a key and a default value: it returns the value stored
        # for the key if the key is in the dictionary, otherwise the default
        d[c]=d.get(c,0)+1
    return d

def choose_from_hist(s):
    d=histogram(s) #s is already a histogram, so histogram runs a second time here
    a=[]
    b=[]
    for key,val in d.items():
        a.append(key)
        b.append(val)
    print(a) #List of keys
    print(b) #List of values
    return random.choices(a,weights=b,k=20)

h=histogram('aaaaaabb')
choose_from_hist(h)
['a', 'b']
[1, 1]

The list of values should have been [6,2], but the result was [1,1]. The reason is that histogram ran twice: a histogram was already passed in as the argument, and the function then ran histogram on it again, which simply counts each key of the dictionary once. So the histogram call inside the function has to be removed:

import random

def histogram(s):
    d=dict()
    for c in s:
        d[c]=d.get(c,0)+1
    return d

def choose_from_hist(d):
    a=[]
    b=[]
    for key,val in d.items():
        a.append(key)
        b.append(val)
    print(a)
    print(b)
    return random.choices(a,weights=b,k=5) #k is the number of values drawn in one call

h=histogram('aaaaaabb')
choose_from_hist(h)
['a', 'b']
[6, 2]

Out[85]:

['a', 'a', 'a', 'a', 'b']
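Five draws are too few to tell whether the values really come out in proportion to their frequency. A rough way to check is to draw many values and look at the share of each one. The sketch below is a compact variant written just for this check: the k parameter on choose_from_hist, the fixed seed and the sample size of 10000 are all assumptions made for the example. With the histogram {'a': 6, 'b': 2}, the share of 'a' should land somewhere near 6/8 = 0.75.

```python
import random
from collections import Counter

def choose_from_hist(hist, k=1):
    """Draw k values from hist, weighted by their frequencies."""
    values = list(hist.keys())
    weights = list(hist.values())
    return random.choices(values, weights=weights, k=k)

random.seed(0)  # fixed seed only so that repeated runs give the same draws
hist = {'a': 6, 'b': 2}
draws = choose_from_hist(hist, k=10000)
share_a = Counter(draws)['a'] / len(draws)
print(round(share_a, 2))
```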

Posted by theoph on Mon, 04 Nov 2019 11:34:12 -0800