Basic use of Requests
Install
pip install requests
I. Requests module: sending requests
- Get web page (without parameters)
import requests
# Each HTTP method has a helper function of the same name
r = requests.get('http://www.chinahufei.com')
r = requests.post('http://www.chinahufei.com')
r = requests.delete('http://www.chinahufei.com')
r = requests.head('http://www.chinahufei.com')
r = requests.options('http://www.chinahufei.com')
- Get web page (with parameters)
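# GET: query parameters are passed through params
r = requests.get('http://api.chinahufei.com', params={'page': 1})
# POST: form fields are passed through data
r = requests.post('http://api.chinahufei.com', data={'kwd': 'hufei'})
# Generic form: the HTTP method is given as the first argument
r = requests.request('get', 'http://api.chinahufei.com/')
# A list value repeats its key in the query string (kwd=hufei&kwd=china)
payload = {'page': '1', 'kwd': ['hufei', 'china']}
r = requests.get('http://api.chinahufei.com', params=payload)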
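To check how params ends up in the URL without sending anything over the network, a request can be prepared and inspected. This is a minimal sketch: build_url is a hypothetical helper name, while requests.Request(...).prepare() is the library's own way to build a request without sending it.

```python
import requests


def build_url(url, params):
    """Return the final URL Requests would use, without sending the request."""
    prepared = requests.Request("GET", url, params=params).prepare()
    return prepared.url


# A scalar value becomes a single key=value pair
print(build_url("http://api.chinahufei.com", {"page": 1}))

# A list value repeats the key once per item
payload = {"page": "1", "kwd": ["hufei", "china"]}
print(build_url("http://api.chinahufei.com", payload))
```

Inspecting the prepared URL is a cheap way to confirm the encoding before pointing the code at a real endpoint.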
- Get web page (with headers and User-Agent)
# GET
kw = {'kwd': 'The Great Wall'}
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}
# params accepts the query parameters as a dictionary or a string.
# A dictionary is URL-encoded automatically, so calling urlencode() is not required.
response = requests.get("http://api.chinahufei.com", params=kw, headers=headers)
# POST
formdata = {
    "type": "AUTO",
    "i": "i love python",
    "doctype": "json",
    "xmlVersion": "1.8",
    "keyfrom": "fanyi.web",
    "ue": "UTF-8",
    "action": "FY_BY_ENTER",
    "typoResult": "true"
}
url = "http://api.chinahufei.com"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}
r = requests.post(url, data=formdata, headers=headers)
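What a POST with data and headers actually puts on the wire can be inspected by preparing the request instead of sending it. The sketch below uses a shortened form and a made-up User-Agent string ("example-client/1.0"), both assumptions for illustration only.

```python
import requests

# Hypothetical, shortened form data and User-Agent for illustration
formdata = {"i": "i love python", "doctype": "json"}
headers = {"User-Agent": "example-client/1.0"}

prepared = requests.Request(
    "POST", "http://api.chinahufei.com", data=formdata, headers=headers
).prepare()

print(prepared.method)                   # HTTP method
print(prepared.headers["User-Agent"])    # the custom header
print(prepared.headers["Content-Type"])  # set by Requests for dictionary data
print(prepared.body)                     # URL-encoded form body
```

A dictionary passed as data is form-encoded, which is why no manual urlencode() call is needed for POST either.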
- Get web page (through a proxy)
import requests
# Requests picks the proxy that matches the protocol of the target URL
proxies = {
    "http": "http://12.34.56.79:9527",
    "https": "http://12.34.56.79:9527"
}
response = requests.get("http://api.chinahufei.com", proxies=proxies)
print(response.text)
# Private proxy authentication
import requests
# If the proxy requires HTTP Basic Auth, put the credentials in the proxy URL as scheme://user:password@host:port
proxy = {"http": "http://mr_mao_hacker:sffqry9r@61.158.163.130:16816"}
response = requests.get("http://api.chinahufei.com", proxies=proxy)
print(response.text)
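The proxy selection and the credential format can be checked offline with two helpers from requests.utils. This is a sketch under assumptions: the proxy hosts and credentials below are made up, and it assumes the installed Requests version provides select_proxy and get_auth_from_url (recent releases do).

```python
import requests

# Hypothetical proxy addresses and credentials, for illustration only
proxies = {
    "http": "http://proxy.example.com:8080",
    "https": "http://user:secret@proxy.example.com:8443",
}

# Which proxy would be used for a given target URL
http_proxy = requests.utils.select_proxy("http://api.chinahufei.com", proxies)
https_proxy = requests.utils.select_proxy("https://api.chinahufei.com", proxies)
print(http_proxy)
print(https_proxy)

# Credentials embedded in a proxy URL are read as (username, password)
print(requests.utils.get_auth_from_url(https_proxy))
```

The key of the proxies dictionary is matched against the scheme of the target URL, not the scheme of the proxy itself.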
# Web client authentication (HTTP Basic Auth, given as a (username, password) tuple)
import requests
auth = ('test', '123456')
response = requests.get('http://192.168.199.107', auth=auth)
print(response.text)
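What the auth tuple turns into can be seen by preparing the request and decoding the Authorization header. A minimal sketch, reusing the address and credentials from the example above; nothing is sent over the network.

```python
import base64

import requests

prepared = requests.Request(
    "GET", "http://192.168.199.107", auth=("test", "123456")
).prepare()

# The tuple is sent as "Basic " followed by base64("username:password")
scheme, token = prepared.headers["Authorization"].split(" ", 1)
print(scheme)
print(base64.b64decode(token).decode("utf-8"))
```

Base64 is an encoding, not encryption, so Basic Auth credentials are only protected when the request goes over HTTPS.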
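- Get web page (redirect handling)
# Do not follow redirects: the response returned is the redirect itself.
# head() does not follow redirects by default; get() and post() do.
r = requests.head('http://github.com', allow_redirects=False)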
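The effect of allow_redirects can be tried without touching a real site by running a throwaway server on localhost. This is a sketch under assumptions: RedirectHandler, the /old and /new paths and the response body are all hypothetical, and a Session with trust_env disabled is used only so that proxy settings in the environment do not interfere with the local request.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /old redirects to /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(301)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"arrived"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


# Port 0 lets the operating system pick a free port
server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

session = requests.Session()
session.trust_env = False  # ignore proxy settings from the environment

not_followed = session.get(base + "/old", allow_redirects=False)
followed = session.get(base + "/old")  # get() follows redirects by default

print(not_followed.status_code, not_followed.headers["Location"])
print(followed.status_code, [r.status_code for r in followed.history])

server.shutdown()
server.server_close()
```

When redirects are followed, the intermediate responses are kept in response.history, so the original status code is not lost.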
- HTTPS requests and SSL certificate validation
# Certificates are validated by default. To skip validation for a site such as 12306, set verify to False.
# The request then completes normally, but Requests emits an InsecureRequestWarning.
r = requests.get("https://www.12306.cn/mormhweb/", verify=False)
II. Requests module: reading the response
- Response content as text: response.text (decoded to a Unicode string)
- Response content as bytes: response.content (raw byte stream)
- Response content as JSON: response.json() (parsed JSON data)
- URL: response.url (full address)
- Status code: response.status_code
- Response headers: response.headers
- Character encoding taken from the response headers: response.encoding
- Cookies: response.cookies
import requests
response = requests.get("http://www.baidu.com/")
# response.cookies is a CookieJar object
cookiejar = response.cookies
# Convert the CookieJar into a dictionary
cookiedict = requests.utils.dict_from_cookiejar(cookiejar)
print(cookiejar)
print(cookiedict)
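The response attributes listed in this section, cookies included, can be explored against a throwaway local server rather than a public site. This is a sketch under assumptions: DemoHandler, the /demo path, the JSON payload and the sid cookie are all made up for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint that returns JSON and sets one cookie."""

    def do_GET(self):
        body = json.dumps({"kwd": "hufei"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Set-Cookie", "sid=abc123; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy settings from the environment
response = session.get("http://127.0.0.1:%d/demo" % server.server_port)

print(response.status_code)              # integer status code
print(response.url)                      # full address that was requested
print(response.headers["Content-Type"])  # header lookup is case-insensitive
print(response.encoding)                 # taken from the charset in Content-Type
print(response.content)                  # raw bytes
print(response.text)                     # bytes decoded with response.encoding
print(response.json())                   # body parsed as JSON

cookiedict = requests.utils.dict_from_cookiejar(response.cookies)
print(cookiedict)

server.shutdown()
server.server_close()
```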
# Simulated login to Renren
import requests
# 1. Create a Session object, which stores cookie values between requests
session = requests.Session()
# 2. Prepare the headers
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}
# 3. User name and password used to log in
data = {"email": "mr_mao_hacker@163.com", "password": "alarmchime"}
# 4. Send the login request; the cookies returned after login are saved in the session
session.post("http://www.renren.com/PLogin.do", data=data, headers=headers)
# 5. The session now holds the user's login cookies, so pages that require login can be requested directly
response = session.get("http://www.renren.com/410043129/profile", headers=headers)
# 6. Print the response content
print(response.text)
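How a session carries cookies into later requests can be checked offline with Session.prepare_request, which merges session-level cookies and headers into a request without sending it. A minimal sketch: the token cookie stands in for whatever a real login response would set, and both its name and value are hypothetical, as is the User-Agent string.

```python
import requests

session = requests.Session()

# Stand-in for the cookie a real login response would set (hypothetical value)
session.cookies.set("token", "abc123")

# Headers set on the session are sent with every request made through it
session.headers.update({"User-Agent": "example-client/1.0"})

# prepare_request merges session cookies and headers into the request
request = requests.Request("GET", "http://www.renren.com/410043129/profile")
prepared = session.prepare_request(request)

print(prepared.headers["Cookie"])
print(prepared.headers["User-Agent"])
```

Setting the User-Agent once on session.headers avoids passing headers to each individual call.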
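III. Three ways to solve encoding and decoding problems in the Requests module
- response.content.decode(): decode the raw bytes as UTF-8, the default
- response.content.decode('gbk'): decode the raw bytes with an explicitly named encoding such as GBK
- response.text: let Requests decode the body using response.encoding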
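The first two options can be combined into a small fallback routine. This is a sketch, not part of Requests: decode_body is a hypothetical helper, and the byte strings below stand in for response.content from a UTF-8 page and a GBK page.

```python
def decode_body(content, encodings=("utf-8", "gbk")):
    """Try each encoding in turn and return the first successful decode.

    decode_body is a hypothetical helper, not part of Requests.
    """
    for encoding in encodings:
        try:
            return content.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: decode with the first encoding and mark undecodable bytes
    return content.decode(encodings[0], errors="replace")


# Stand-ins for response.content from a UTF-8 page and a GBK page
sample = "\u4e2d\u6587"  # the two characters of the Chinese word for "Chinese"
utf8_bytes = sample.encode("utf-8")
gbk_bytes = sample.encode("gbk")

# Different bytes on the wire, same text after decoding
print(utf8_bytes != gbk_bytes)
print(decode_body(utf8_bytes) == decode_body(gbk_bytes) == sample)
```

UTF-8 is tried first because its decoder rejects most non-UTF-8 input, whereas GBK accepts many byte sequences and would silently produce garbled text. With a real response, the call would be decode_body(response.content); alternatively, assigning response.encoding = 'gbk' before reading response.text has the same effect for a page known to be GBK.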