Summer Progress Plan 2: Crawl the details of the Maoyan Top 100 movies and store the results in a database

Keywords: Database MySQL Windows

Crawl the information for the Maoyan Top 100 movies and store it in a MySQL database

This one was hard for me to get working today. Sob.

The fields collected for each movie:

  1. Rank
  2. Poster image
  3. Full title
  4. Starring
  5. Release date
  6. Score

Here is the code:

import requests
from bs4 import BeautifulSoup
import pymysql

# Connect to the local MySQL server; the top100 table must already exist.
conn = pymysql.connect(host='localhost', user='root', password='523310', database='mysql')
cur = conn.cursor()
num = 0  # running rank across all pages (1..100)

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
    'host': 'maoyan.com'
}

# The board shows 10 movies per page, so the offset runs 0, 10, ..., 90.
for i in range(0, 10):
    url_top = 'https://maoyan.com/board/4?offset={}'.format(i * 10)

    html_top = requests.get(url_top, headers=headers, timeout=10)
    soup = BeautifulSoup(html_top.text, 'html5lib')

    # One list per field; index j in each list belongs to the same movie.
    name_li = soup.find_all('p', class_='name')
    actor = soup.find_all('p', class_='star')
    img_li = soup.find_all('img', class_='board-img')
    time_li = soup.find_all('p', class_='releasetime')
    score = soup.find_all('p', class_='score')
    # Integer and fractional parts of the score (<i> tags); unused below.
    score1_li = soup.find_all('i', class_='integer')
    score2_li = soup.find_all('i', class_='fraction')

    count = len(name_li)

    for j in range(0, count):
        num = num + 1
        number = str(num)
        img = img_li[j]['data-src']
        name = name_li[j].a.text.strip()
        # The page labels are in Chinese: '主演：' (starring), '上映时间：' (release time).
        # Note the item variable is `act`, not `actor`, so the list is not overwritten.
        act = actor[j].text.strip().split('主演：')[1]
        date = time_li[j].text.strip().split('上映时间：')[1]
        scor = score[j].text.strip()
        try:
            cur.execute('insert into top100(no, pic, name, actor, time, score) values(%s, %s, %s, %s, %s, %s)', (number, img, name, act, date, scor))
            conn.commit()
            print('inserted', number)
        except pymysql.MySQLError as e:
            # Report the error instead of silently dropping the row.
            conn.rollback()
            print('failed', number, e)

cur.close()
conn.close()
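The script above builds the page URLs and cuts the label off each field with split(...)[1], which raises an IndexError when the label is missing. Here is a small sketch of those two steps as helper functions. The names page_urls and strip_label are made up for illustration, the Chinese label '主演：' is assumed to be what the live page shows, and the sample text is a placeholder rather than real page data.

```python
def page_urls(pages=10, page_size=10):
    """Build the board URLs for offsets 0, 10, ..., 90 (hypothetical helper)."""
    base = 'https://maoyan.com/board/4?offset={}'
    return [base.format(i * page_size) for i in range(pages)]


def strip_label(text, label):
    """Return the part of text after label.

    If the label is absent, return the stripped text unchanged
    instead of raising IndexError like split(label)[1] would.
    """
    text = text.strip()
    _, sep, rest = text.partition(label)
    return rest.strip() if sep else text


urls = page_urls()
print(urls[0])
print(urls[-1])

# Placeholder text standing in for the content of a <p class="star"> tag.
print(strip_label('  主演：Actor A,Actor B  ', '主演：'))
print(strip_label('no label here', '主演：'))
```

Using str.partition instead of split means one oddly formatted entry does not stop the whole crawl.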

I turned out to be missing a lot of background for this, so I also asked someone more experienced for help.
The problems were as follows:
1. The line I originally wrote was:

actor = actor[j].text.strip().split('主演：')[1]

With that line, I kept getting the error "'str' object has no attribute 'text'", even though the same pattern worked fine in my other code blocks. I looked up how the various types and .text are used and got nowhere; at one point I was close to breaking down. In the end a classmate pointed out the problem: the variable I assigned to had the same name as the list. After the first pass through the loop, actor is no longer the list of tags but a string, so on the next pass actor[j] is a single character, which has no .text attribute. I renamed the variable (to act) and everything worked. I was stuck on this for a week. Aaaaargh.
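To make the cause concrete, here is a minimal reproduction that runs without BeautifulSoup. FakeTag is a hypothetical stand-in for a parsed tag (it only has a .text attribute), LABEL is the Chinese "starring" label assumed to be on the live page, and the actor names are placeholders.

```python
LABEL = '主演：'  # assumed page label for "starring"


class FakeTag:
    """Hypothetical stand-in for a BeautifulSoup tag; only .text is provided."""

    def __init__(self, text):
        self.text = text


def parse_buggy(tags):
    actor = list(tags)
    results = []
    for j in range(len(tags)):
        # Bug: this rebinds `actor` from the list of tags to a str.
        # On the next pass, actor[j] is a single character with no .text.
        actor = actor[j].text.strip().split(LABEL)[1]
        results.append(actor)
    return results


def parse_fixed(tags):
    actor = list(tags)
    results = []
    for j in range(len(tags)):
        # Fix: use a different name for the per-item value.
        act = actor[j].text.strip().split(LABEL)[1]
        results.append(act)
    return results


tags = [FakeTag(' 主演：Actor A '), FakeTag(' 主演：Actor B ')]

print(parse_fixed(tags))

try:
    parse_buggy(tags)
except AttributeError as exc:
    print('buggy version failed:', exc)
```

The first iteration of the buggy version succeeds, which is why the error message looks so unrelated to the line that caused it.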

2. Writing to the database went wrong.
Before I wrapped the database write in a try block, any failed insert crashed the script, so it never ran to completion. After I added the try block the script ran, but I could no longer see what the error was; the failed rows were simply missing from the database.
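A way to keep the script running and still see why a row was dropped is to catch the database error by name and print it. The sketch below uses sqlite3 from the standard library as a stand-in for MySQL so that it runs without a server; the table layout and the rows are placeholders. With pymysql the same pattern should apply with pymysql.MySQLError as the exception class and %s as the placeholder.

```python
import sqlite3

# In-memory database as a stand-in for the MySQL server.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('create table top100 (no integer primary key, name text not null)')

# Placeholder rows; the second one violates the NOT NULL constraint.
rows = [(1, 'Movie A'), (2, None), (3, 'Movie C')]

failed = []
for row in rows:
    try:
        cur.execute('insert into top100 (no, name) values (?, ?)', row)
        conn.commit()
    except sqlite3.Error as exc:
        # Roll back, but record and print the reason instead of hiding it.
        conn.rollback()
        failed.append((row, str(exc)))
        print('insert failed for', row, '->', exc)

cur.execute('select count(*) from top100')
inserted = cur.fetchone()[0]
print(inserted, 'rows inserted,', len(failed), 'failed')
conn.close()
```

A bare except with no message hides exactly the information needed to fix the table definition.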
Pay attention to the VARCHAR column lengths and the utf8 character set of the table here. Okay, that's all.
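For reference, here is one possible definition of the top100 table. The column names come from the insert statement in the script; every type and length is my own guess, and utf8mb4 (MySQL's full UTF-8 character set) is swapped in for plain utf8. create_table is a hypothetical helper that expects an already open pymysql connection.

```python
# Assumed schema: column names match the insert statement, types are guesses.
CREATE_TOP100_SQL = """
CREATE TABLE IF NOT EXISTS top100 (
    no    INT          NOT NULL PRIMARY KEY,
    pic   VARCHAR(255) NOT NULL,
    name  VARCHAR(100) NOT NULL,
    actor VARCHAR(255) NOT NULL,
    time  VARCHAR(50)  NOT NULL,
    score VARCHAR(10)  NOT NULL
) DEFAULT CHARSET = utf8mb4
"""


def create_table(conn):
    """Create the table on an open DB-API connection (hypothetical helper)."""
    cur = conn.cursor()
    try:
        cur.execute(CREATE_TOP100_SQL)
        conn.commit()
    finally:
        cur.close()


print(CREATE_TOP100_SQL)
```

If a VARCHAR column is too short for a long cast list, or the character set cannot store Chinese text, the insert can fail or the text can be stored garbled. Passing charset='utf8mb4' to pymysql.connect keeps the connection encoding in line with the table.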

Posted by amclean on Sun, 06 Oct 2019 04:50:19 -0700