012: tkinter + crawler design couplet software

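Today, let's build a small couplet program with Python! You type in the first line of a couplet, and the program fetches matching second lines from a couplet website.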

It uses the tkinter module:

tkinter is Python's standard interface to the Tk GUI toolkit. It makes it easy to design graphical interfaces and write interactive programs, and it runs on Windows and Macintosh systems with a native window style.

Let's take a look at what the finished program looks like:

First of all, the program's interface is built with tkinter.
The first line of the couplet (the "uplink", shanglian) is typed into a tkinter Entry widget, and clicking a Button widget starts the design.
The button can trigger the rest of the work because a function is bound to it through its command option (the get_xialian function in the code).
Crawling the website and saving the couplets are both done through that bound function.
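To make the binding idea concrete, here is a minimal sketch, separate from the program in this post. The names build_ui and handle_first_line are made up for illustration. Unlike get_xialian, which reads the global entry itself, this sketch passes the typed text to the handler through a lambda.

```python
def handle_first_line(text):
    """Hypothetical stand-in for get_xialian: just echo what was typed."""
    return "You entered: " + text.strip()


def build_ui(on_click):
    """Create a window with one Entry and one Button bound to on_click."""
    import tkinter as tk  # imported here so the module loads without a display

    root = tk.Tk()
    entry = tk.Entry(root, width=15)
    entry.grid(row=0, column=0)
    # command= takes a callable (no parentheses after its name);
    # Tk calls it with no arguments each time the button is clicked.
    button = tk.Button(root, text="Design next line",
                       command=lambda: on_click(entry.get()))
    button.grid(row=1, column=0)
    return root
```

To try it on a machine with a display, run build_ui(lambda text: print(handle_first_line(text))).mainloop() and click the button.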

The interface design code is as follows:

if __name__ == '__main__':
    root = Tk()
    root.title("Couplet design")
    root.geometry("1200x500")   # window size
    root.geometry("+100+150")   # window position on the screen

    # Background decoration: load the picture and resize it to the window size
    pic1 = Image.open('1.png').resize((1200, 500))
    pic = ImageTk.PhotoImage(pic1)
    render = Label(root, image=pic, compound=tk.CENTER, justify=tk.LEFT)
    render.place(x=0, y=0)

    # Label and input box
    label = Label(root, text='Upper couplet (within 10 words)', font=('Microsoft YaHei ', 20), fg='black')
    label.grid(row=0, column=0, sticky=W)
    entry = Entry(root, font=('Song typeface', 25), width=15)
    entry.grid(row=0, column=1, sticky=W)

    # Button bound to get_xialian
    button = Button(root, text='Design next line', font=('Microsoft YaHei ', 20), command=get_xialian)
    button.grid(row=1, column=0, sticky=W, pady=10)

    root.mainloop()

The interface is now complete and the function bound to the button has been chosen. The next step is to make that function do the remaining work, that is, crawling and saving. First, analyze the web page: open the couplet design page and switch to the browser's developer tools, where the request we need is easy to find, as shown in the figure below:

The URL of that request is as follows:

http://mduilian.388g.com/c.php?in=%E6%98%A5%E9%A3%8E%E9%80%81%E6%9A%96&len=0000&time=Thu%20Jan%2031%202019%2018:23:16%20GMT&timestamp=1548930196082

This URL is fairly complex and needs careful analysis:

First, the URL has four parameters that we have to supply. Looking closely, in is the first line that we typed in.
time and timestamp are, as the names suggest, the time and the timestamp of the request.
len reflects the number of characters entered, written as that many 0s.
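The four parameters can be put together as in the sketch below. The helper name build_url is made up for illustration, and passing safe=":" to quote is my own choice so that the colons in the time stay unescaped, as they do in the captured URL.

```python
from urllib.parse import quote

BASE = "http://mduilian.388g.com/c.php"


def build_url(first_line, time_text, timestamp):
    """Assemble the request URL from the four parameters described above."""
    return "{0}?in={1}&len={2}&time={3}&timestamp={4}".format(
        BASE,
        quote(first_line),            # percent-encoded UTF-8 input
        "0" * len(first_line),        # one '0' per character entered
        quote(time_text, safe=":"),   # spaces become %20, colons are kept
        timestamp,
    )


url = build_url("春风送暖", "Thu Jan 31 2019 18:23:16 GMT", 1548930196082)
print(url)
```

With these inputs the result matches the captured URL character for character.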

Some readers may ask: where are the time and the input? All we can see are letters and symbols. That is because both are URL-encoded (percent-encoded); we can decode them with the unquote function in urllib.parse. To sum up, the URL above is fairly complex: it contains the URL-encoded input, a time in a specific format, a timestamp, and the puzzling len.
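As a quick check, the query string from the captured URL can be decoded like this. The variable names are my own, and parse_qs is used here only as a convenient way to split the parameters.

```python
from urllib.parse import unquote, parse_qs

# Query string copied from the captured request (timestamp left out for brevity)
captured = ("in=%E6%98%A5%E9%A3%8E%E9%80%81%E6%9A%96"
            "&len=0000"
            "&time=Thu%20Jan%2031%202019%2018:23:16%20GMT")

decoded = unquote(captured)   # percent-escapes back to readable text
params = parse_qs(captured)   # dict mapping each name to a list of decoded values

print(decoded)
print(params["in"][0], params["len"][0])
```

The in parameter decodes to a four-character first line, which is why len is 0000.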

Knowing how the URL is put together, we can easily imitate the browser's request and get back the content we need. The returned text then comes out garbled, which we solve by decoding the crawled data as "utf-8". Because the response could not be parsed with the json library, we use a regular expression to match the second lines (xialian) and return them.
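The decoding and matching steps can be tried offline with a made-up response. The sample text below is hypothetical and only imitates the "XialianCandidates" field that the regular expression expects; the real response may contain more than this. The helper name extract_second_lines is also my own.

```python
import re

# Same pattern as in the program, written as a raw string
PATTERN = re.compile(r'"XialianCandidates":\["(.*?)"+\]', re.S)


def extract_second_lines(text):
    """Return the candidate second lines found in the response text."""
    lines = []
    for group in PATTERN.findall(text):
        # group looks like: a","b","c  -> split, then drop stray quotes
        for item in group.split(','):
            lines.append(item.strip('"'))
    return lines


# Hypothetical response body, as the raw bytes a server would send
raw = '{"XialianCandidates":["秋雨迎寒","夏日生凉"]}'.encode("utf-8")
text = raw.decode("utf-8")   # decoding as UTF-8 avoids the garbled characters
print(extract_second_lines(text))
```

Setting response.encoding = 'utf-8' before reading response.text in requests has the same effect as the explicit decode here.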

The code is as follows:

def Get_couplet(text):
    if text:
        # Capture everything between the brackets of "XialianCandidates":[...]
        pattern = re.compile(r'"XialianCandidates":\["(.*?)"+\]', re.S)
        return re.findall(pattern, text)
    else:
        return ["Error occurred, please redesign!"]



def get_xialian():
    shanglian = entry.get()
    xialians = Get_couplet(Get_text(shanglian))
    # Left scrolling text box: repeats the first line once per candidate
    scr1 = scrolledtext.ScrolledText(root, width=10, height=10, font=("official script", 18))
    scr1.place(x=10, y=150)  # position of the text box in the window
    scr1.insert(END, 'Uplink:\t\t')
    # Right scrolling text box: lists the candidate second lines
    scr2 = scrolledtext.ScrolledText(root, width=10, height=10, font=("official script", 18))
    scr2.place(x=1050, y=150)  # position of the text box in the window
    scr2.insert(END, 'Next line:\n')

    # Each regex match holds several candidates joined as a","b","c
    for candidates in xialians:
        for xialian in candidates.split(','):
            scr1.insert(END, shanglian + '\t\n')
            scr2.insert(END, xialian.replace('"', '') + '\n')

Finally, the obtained data is saved in the root directory. Once the couplets are saved, the program pops up a dialog box to say that the design is finished and that the result can be viewed locally. The result of running the program is shown in the figure below.
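The listing in this post does not include the save step or the dialog box, so the following is only a sketch of how they might look. The function names, the file name couplets.txt and the tab-separated layout are all assumptions, not the author's code.

```python
import os


def save_couplets(first_line, second_lines, path="couplets.txt"):
    """Write one couplet per line as 'first<TAB>second'; return the absolute path."""
    with open(path, "w", encoding="utf-8") as f:
        for second in second_lines:
            f.write(first_line + "\t" + second + "\n")
    return os.path.abspath(path)


def notify_saved(path):
    """Show a confirmation dialog; needs a running Tk application."""
    from tkinter import messagebox  # imported lazily so saving works without a display
    messagebox.showinfo("Couplet design", "Couplets saved to " + path)
```

Inside get_xialian, one could call notify_saved(save_couplets(shanglian, lines)) after the text boxes are filled, where lines is the cleaned list of second lines.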

The complete code is as follows:

import requests
from urllib.parse import urlencode,quote,unquote
import time
import re

def Get_text(shanglian):
    GMT_FORMAT = '%a, %b %d %Y %H:%M:%S GMT'
    date = time.strftime(GMT_FORMAT, time.gmtime())
    # in: URL-encoded first line; len: one '0' per character; time and timestamp: current time
    url = 'http://mduilian.388g.com/c.php?in={0}&len={1}&time={2}+0800%20&timestamp={3}'.format(
        quote(shanglian), len(shanglian) * '0', quote(date), int(time.time()))
    headers = {
        'User-Agent':'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Mobile Safari/537.36',
        'Referer':'http://mduilian.388g.com/',
        'Host':'mduilian.388g.com',
        'Cookie':'PHPSESSID=llcj5u13als5fdqm3ps3pcbp47; __jsluid=0bb3b698bd1bcfe76b1190eada3bada8; BDTUJIAID=f47eb00df7502b6e56e052efda83cc57; Hm_lvt_b0a06229d110088000f04b20c9024b7d=1548900294,1548911084,1548911171,1548911511; Hm_lpvt_b0a06229d110088000f04b20c9024b7d=1548911511; Hm_lvt_696881765bc4548d58a559b742b5b6d0=1548900295,1548911084,1548911171,1548911511; Hm_lpvt_696881765bc4548d58a559b742b5b6d0=1548911511; Hm_lvt_3c8ecbfa472e76b0340d7a701a04197e=1548900300,1548911090,1548911177,1548911517; Hm_lpvt_3c8ecbfa472e76b0340d7a701a04197e=1548911517',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'Connection': 'keep-alive'
    }
    # print(url)
    # print(unquote(url))
    response = requests.get(url,headers=headers,timeout=50)
    response.encoding = 'utf-8'
    # print(response.text)
    return response.text


def Get_couplet(text):
    if text:
        # Capture everything between the brackets of "XialianCandidates":[...]
        pattern = re.compile(r'"XialianCandidates":\["(.*?)"+\]', re.S)
        return re.findall(pattern, text)
    else:
        return ["Error occurred, please redesign!"]



def get_xialian():
    shanglian = entry.get()
    xialians = Get_couplet(Get_text(shanglian))
    # Left scrolling text box: repeats the first line once per candidate
    scr1 = scrolledtext.ScrolledText(root, width=10, height=10, font=("official script", 18))
    scr1.place(x=10, y=150)  # position of the text box in the window
    scr1.insert(END, 'Uplink:\t\t')
    # Right scrolling text box: lists the candidate second lines
    scr2 = scrolledtext.ScrolledText(root, width=10, height=10, font=("official script", 18))
    scr2.place(x=1050, y=150)  # position of the text box in the window
    scr2.insert(END, 'Next line:\n')

    # Each regex match holds several candidates joined as a","b","c
    for candidates in xialians:
        for xialian in candidates.split(','):
            scr1.insert(END, shanglian + '\t\n')
            scr2.insert(END, xialian.replace('"', '') + '\n')


from tkinter import *
import tkinter as tk
from PIL import Image, ImageTk
from tkinter import scrolledtext

if __name__ == '__main__':
    root = Tk()
    root.title("LX Design")
    root.geometry("1200x500")
    root.geometry("+100+150")

    # Background decoration: load the picture and resize it to the window size
    pic1 = Image.open('1.png').resize((1200, 500))
    pic = ImageTk.PhotoImage(pic1)
    render = Label(root, image=pic, compound=tk.CENTER, justify=tk.LEFT)
    render.place(x=0, y=0)

    # Label and input box
    label = Label(root, text='Upper couplet (within 10 words)', font=('Microsoft YaHei ', 20), fg='black')
    label.grid(row=0, column=0, sticky=W)
    entry = Entry(root, font=('Song typeface', 25), width=15)
    entry.grid(row=0, column=1, sticky=W)

    # Button bound to get_xialian
    button = Button(root, text='Design next line', font=('Microsoft YaHei ', 20), command=get_xialian)
    button.grid(row=1, column=0, sticky=W, pady=10)

    root.mainloop()

Posted by RJDavison on Mon, 22 Nov 2021 00:20:54 -0800
