Python Crawler Project in Practice: Crawling Maoyan (Cat's Eye) Movie Comments

Keywords: Python JSON Database

When learning to write Python crawlers, the hard part is not the crawler itself but the variety of anti-crawler measures that sites deploy. Here is a small case study to share the charm of Python with you.

The goal is to crawl the Maoyan (Cat's Eye) short comments for the film "Sadness rivers upstream" (movie ID 1217236). The project source code follows:

import json

import pymongo
import requests
from fake_useragent import UserAgent

# Save to database (replace the placeholder with your MongoDB host)
client = pymongo.MongoClient(host='Fill in the database IP')
db = client.The_cat_s_eye_essay
coll = db.eye_essay

# Create an object that generates random User-Agent strings
ua = UserAgent()


# Extract the short-comment fields we want
def parse_json(payload):
    if not payload:
        return
    # 'cmts' holds the list of comments; treat a missing key as an empty list
    for item in payload.get('cmts') or []:
        data = {
            'ID': item.get('nickName'),
            'Short commentary': item.get('content'),
            'score': item.get('score'),
            'User location': item.get('cityName'),
            'Commentary time': item.get('startTime'),
            'Reply number': item.get('reply'),
            'Gender': item.get('gender'),
        }
        # coll.insert_one(data)  # uncomment to save each comment to MongoDB
        print(data)


def Crawl_JSON():
    headers = {
        'User-Agent': ua.random,
        'Host': 'm.maoyan.com',
        'Referer': 'http://m.maoyan.com/movie/1217236/comments?_v_=yes',
    }

    # Maoyan short-comment interface.
    # The comments are loaded via AJAX and paged by offset: 0 for the first
    # request, 15 for the second, 30 for the third, and so on.
    # You can confirm these request parameters by watching the AJAX requests
    # in the browser's developer tools.
    page = 100  # number of requests to send
    offset = 0
    start_time = '2018-10-11'
    for _ in range(page):
        try:
            comment_api = 'http://m.maoyan.com/mmdb/comments/movie/1217236.json?_v_=yes&offset={0}&startTime={1}%2021%3A09%3A31'.format(offset, start_time)
            # Send the GET request and decode the JSON body
            response_comment = requests.get(url=comment_api, headers=headers)
            json_comments = json.loads(response_comment.text)
            parse_json(json_comments)
            # Advance to the next page only after a successful request
            offset += 15
        except Exception as e:
            print('Error occurred:', e.args)


# Crawl_JSON() parses and prints each page itself and returns nothing,
# so it is called directly rather than passed to parse_json().
Crawl_JSON()
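
The offset paging described in the code comments can also be written without hand-encoding the query string. The following is only a minimal sketch: `build_comment_urls`, `BASE_URL` and `PAGE_SIZE` are names made up for this illustration, and the step of 15 is simply the value used in the script above, not something confirmed by any Maoyan documentation.

```python
from urllib.parse import quote, urlencode

# Hypothetical helper names; the URL pattern is copied from the script above.
BASE_URL = 'http://m.maoyan.com/mmdb/comments/movie/{movie_id}.json'
PAGE_SIZE = 15  # offset step used in the original script (an assumption here)


def build_comment_urls(movie_id, start_time, pages):
    """Yield one comment-API URL per page, stepping the offset by PAGE_SIZE."""
    for page in range(pages):
        query = urlencode(
            {'_v_': 'yes', 'offset': page * PAGE_SIZE, 'startTime': start_time},
            quote_via=quote,  # encode the space as %20 rather than '+'
        )
        yield BASE_URL.format(movie_id=movie_id) + '?' + query


urls = list(build_comment_urls(1217236, '2018-10-11 21:09:31', pages=3))
for url in urls:
    print(url)
```

Letting `urlencode` escape the timestamp avoids typing `%20` and `%3A` by hand, which makes it easier to change the start time later.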

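The field extraction can be tried without touching the network by feeding it a hand-written payload. This is a sketch under assumptions: the key names (`cmts`, `nickName`, and so on) are taken from the script above, while the sample values and the names `SAMPLE_RESPONSE`, `FIELDS` and `extract_comments` are invented for the example.

```python
import json

# Key names come from the original script; the sample values are made up.
SAMPLE_RESPONSE = '''
{"cmts": [
  {"nickName": "user_a", "content": "Great film", "score": 4.5,
   "cityName": "Beijing", "startTime": "2018-10-11 20:00:00",
   "reply": 2, "gender": 1},
  {"nickName": "user_b", "content": "Too sad"}
]}
'''

# Output label -> key in the API response
FIELDS = {
    'ID': 'nickName',
    'Short commentary': 'content',
    'score': 'score',
    'User location': 'cityName',
    'Commentary time': 'startTime',
    'Reply number': 'reply',
    'Gender': 'gender',
}


def extract_comments(payload):
    """Return one dict per comment; missing keys become None."""
    if not payload:
        return []
    return [
        {label: item.get(key) for label, key in FIELDS.items()}
        for item in payload.get('cmts') or []
    ]


records = extract_comments(json.loads(SAMPLE_RESPONSE))
for record in records:
    print(record)
```

Returning a list instead of printing inside the loop keeps the parsing step separate from storage, so the same records could later be printed, written to a file, or passed to `coll.insert_one`.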
Posted by MaxD on Sat, 05 Oct 2019 11:15:13 -0700