Crawler requests Library

Keywords: Python encoding Selenium Vue Windows

If you want to use selenium to realize the functions of B station automatic login and click like, you can check how to solve the sliding unlocking. However, it's about the content of the crawler, and then you start to learn about the crawler. Before long, you want to make the website that records your life, so your friends recommend the layui framework. After a night, you think it's a framework for back-end programmers to get started. Vue feels too difficult, and starts to make Bo Otstrap, didn't come up with a reason. Because of idle mood, I began to learn reptiles again. Every time I write an article, I always read a paragraph in pieces. Someone in the comment area said that I was hypocritical, which is true.

http request returns response object property

Coding problems

import requests
r.encoding='gbk' or  r.encoding=r.apparent_encoding
 #The content of the page returned by Baidu is ISO-8859-1 encoded. If it is not set to gbk, it will be garbled

Library Exception Handling of requests


Main methods of requests Library

1 import requests
2 r = requests.get('')
3 r = requests.head('')
4 r ='',key='value')
5 r = requests.put('',key='value')
6 r = requests.patch('',key='value')
7 r = requests.options('')
8 r = requests.delete('')

General framework for crawling web pages

 1 import requests
 2 def get_Html(url):
 3     try:
 4         r = requests.get(url,timeout=30)
 5         r.raise_for_status()
 6         r.encoding=r.apparent_encoding
 7         return r.text
 8     except:
 9         return "raise an exception"
11 if __name__=="__main__":
12     url = ""
13     print(get_Html(url))

A few small cases

Case 1 Jingdong commodity crawling
import requests
url = ''
    r = requests.get(url)
    print(r.text[:1000]) #1000 Represents the first 1000 characters of interception
    print("Crawl failure")
//Case 2 Amazon
import requests
url = ''
    kv = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36'} #Simulation browser
    r= requests.get(url,headers = kv)
    print("Crawl failure")
//Case 300 degrees
import requests
kv = {'wd':'python'}
r = requests.get('',params=kv)

When I learn something new, I will update it. I remember when I first learned it, I started to post it. Now I look at the foundation honestly

To be continued!

Posted by JMJimmy on Sat, 14 Mar 2020 08:22:52 -0700