Detailed Description of requests Library

Keywords: JSON Session Java Python

Introduction to Requests

  • Requests is an HTTP library written in Python and built on top of urllib3, released under the Apache 2.0 open source license.
  • It is more convenient than urllib, saves a lot of work, and fully meets the needs of HTTP testing.
  • In short, Requests is a simple, easy-to-use HTTP library for Python.

Installing Requests

pip install requests

[1] Requests in practice

1. Initiate HTTP requests

import requests
response = requests.get('https://www.baidu.com/')
print(type(response))
print(response.status_code)
print(type(response.text))
print(response.text)
print(response.cookies)

Other request methods:

import requests
requests.post('http://httpbin.org/post')
requests.put('http://httpbin.org/put')
requests.delete('http://httpbin.org/delete')
requests.head('http://httpbin.org/get')
requests.options('http://httpbin.org/get')

2. Basic get requests

import requests
response = requests.get('http://httpbin.org/get')
print(response.text)

3. GET requests with parameters

import requests
response = requests.get("http://httpbin.org/get?name=germey&age=22")
print(response.text)

Building the query string by hand is inconvenient. The equivalent approach below passes the parameters as a dictionary through the params argument:

import requests
data = {
    'name': 'germey',
    'age': 22
}
response = requests.get("http://httpbin.org/get", params=data)
print(response.text)
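
To see what params does to the URL without sending anything, the request can be prepared locally. This is a minimal sketch that assumes only the Request and prepare() API of Requests; the variable names are illustrative.

```python
import requests

params = {'name': 'germey', 'age': 22}

# Build the request locally; nothing is sent over the network.
prepared = requests.Request('GET', 'http://httpbin.org/get', params=params).prepare()

# Requests has encoded the dictionary into the query string.
print(prepared.url)
```

The printed URL should match the hand-written one from the previous example.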

4. Parsing JSON

import requests
import json
response = requests.get("http://httpbin.org/get")
print(type(response.text))  # str
print(response.json())  # the body decoded from JSON
print(json.loads(response.text))  # same result as the previous line
print(type(response.json()))  # dict: a JSON object is decoded into a Python dictionary
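
response.json() raises an exception when the body is not valid JSON, for example when a server returns an HTML error page. The sketch below shows one way to guard against that. The helper names json_or_none and fake_response are hypothetical, and the Response objects are built by hand (through a private attribute) only so that the example runs offline.

```python
import requests


def json_or_none(response):
    """Return the decoded JSON body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:  # the JSON decode errors raised by Requests derive from ValueError
        return None


def fake_response(body):
    """Build a Response by hand so the demo runs offline (testing trick only)."""
    response = requests.Response()
    response.status_code = 200
    response._content = body  # private attribute, set here purely for illustration
    return response


good = json_or_none(fake_response(b'{"name": "germey"}'))
bad = json_or_none(fake_response(b'<html>not json</html>'))
print(good, bad)
```

In real code the response would come from requests.get(); only json_or_none is the part worth keeping.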

5. Getting binary data

import requests
response = requests.get("https://github.com/favicon.ico")
print(type(response.text), type(response.content))  # str and bytes
print(response.text)  # garbled characters, because binary data is decoded as text
print(response.content)  # the raw bytes

Save the picture:

import requests
response = requests.get("https://github.com/favicon.ico")
with open('favicon.ico', 'wb') as f:
    f.write(response.content)  # the with block closes the file automatically

6. Add headers

When writing crawlers, the server may refuse a request that carries no browser-like headers, for example:

import requests
response = requests.get("https://www.zhihu.com/explore")
print(response.text)  # the response status code is 400 (Bad Request)

With headers added:

import requests
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
}
response = requests.get("https://www.zhihu.com/explore", headers=headers)
print(response.text)  # the normal page is returned
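
One likely reason a server rejects the bare request is that Requests identifies itself with its own default User-Agent. The sketch below prints that default and shows a custom header taking its place on a request prepared through a Session. Nothing is sent, and the shortened browser string is only an illustrative value.

```python
import requests
from requests.utils import default_user_agent

# What Requests sends when no User-Agent is supplied.
print(default_user_agent())

headers = {'User-Agent': 'Mozilla/5.0 (example browser string)'}  # illustrative value
session = requests.Session()
request = requests.Request('GET', 'https://www.zhihu.com/explore', headers=headers)

# prepare_request merges the session defaults with the per-request headers.
prepared = session.prepare_request(request)

# The per-request header overrides the session default.
print(prepared.headers['User-Agent'])
```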

7. Basic POST requests

A POST request usually carries data in its body; a user login form is a typical example.

import requests
data = {'name': 'germey', 'age': '22'}
response = requests.post("http://httpbin.org/post", data=data)
print(response.text)
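
The data argument sends a form-encoded body. Requests also accepts a json argument for APIs that expect a JSON body. The sketch below prepares both variants locally, without sending them, so the difference in body and Content-Type can be inspected; the variable names are illustrative.

```python
import json
import requests

data = {'name': 'germey', 'age': '22'}
url = 'http://httpbin.org/post'

# data= sends a form-encoded body, as an HTML form would.
form = requests.Request('POST', url, data=data).prepare()
print(form.headers['Content-Type'], form.body)

# json= serializes the dictionary to JSON and sets the matching Content-Type.
as_json = requests.Request('POST', url, json=data).prepare()
print(as_json.headers['Content-Type'], json.loads(as_json.body))
```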

[2] Response

1. Main attributes of response

import requests
response = requests.get('http://www.jianshu.com')
print(type(response.status_code), response.status_code)
print(type(response.headers), response.headers)
print(type(response.cookies), response.cookies)
print(type(response.url), response.url)
print(type(response.history), response.history)

2. Checking the status code

import requests

response = requests.get('http://www.jianshu.com/hello.html')
# requests.codes.not_found is the named constant for 404
exit() if response.status_code != requests.codes.not_found else print('404 Not Found')

The same kind of check with a numeric status code:

import requests

response = requests.get('http://www.jianshu.com')
exit() if response.status_code != 200 else print('Request Successfully')
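
Instead of comparing status codes by hand, a response can raise an exception for any 4xx or 5xx code through raise_for_status(). The sketch below assumes hand-built Response objects so that it runs offline; fake_response and describe are hypothetical helper names.

```python
import requests
from requests.exceptions import HTTPError


def fake_response(status_code):
    """Build a bare Response by hand so the demo runs offline (illustration only)."""
    response = requests.Response()
    response.status_code = status_code
    return response


def describe(response):
    """Return 'ok' for a successful response, otherwise the HTTPError message."""
    try:
        response.raise_for_status()  # raises HTTPError for 4xx and 5xx codes
    except HTTPError as error:
        return str(error)
    return 'ok'


print(describe(fake_response(200)))
print(describe(fake_response(404)))
print(fake_response(404).ok)  # .ok is False whenever raise_for_status() would raise
```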

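3. Common status codes (a frequent interview question). The table below comes from the Requests source: each code maps to the names under which it is available on requests.codes.

Informational.

100: ('continue',),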
101: ('switching_protocols',),
102: ('processing',),
103: ('checkpoint',),
122: ('uri_too_long', 'request_uri_too_long'),

Success.

200: ('ok', 'okay', 'all_ok', 'all_okay', 'all_good', '\o/', '✓'),
201: ('created',),
202: ('accepted',),
203: ('non_authoritative_info', 'non_authoritative_information'),
204: ('no_content',),
205: ('reset_content', 'reset'),
206: ('partial_content', 'partial'),
207: ('multi_status', 'multiple_status', 'multi_stati', 'multiple_stati'),
208: ('already_reported',),
226: ('im_used',),

Redirection.

300: ('multiple_choices',),
301: ('moved_permanently', 'moved', '\o-'),
302: ('found',),
303: ('see_other', 'other'),
304: ('not_modified',),
305: ('use_proxy',),
306: ('switch_proxy',),
307: ('temporary_redirect', 'temporary_moved', 'temporary'),
308: ('permanent_redirect', 'resume_incomplete', 'resume',), # These 2 to be removed in 3.0

Client Error.

400: ('bad_request', 'bad'),
401: ('unauthorized',),
402: ('payment_required', 'payment'),
403: ('forbidden',),
404: ('not_found', '-o-'),
405: ('method_not_allowed', 'not_allowed'),
406: ('not_acceptable',),
407: ('proxy_authentication_required', 'proxy_auth', 'proxy_authentication'),
408: ('request_timeout', 'timeout'),
409: ('conflict',),
410: ('gone',),
411: ('length_required',),
412: ('precondition_failed', 'precondition'),
413: ('request_entity_too_large',),
414: ('request_uri_too_large',),
415: ('unsupported_media_type', 'unsupported_media', 'media_type'),
416: ('requested_range_not_satisfiable', 'requested_range', 'range_not_satisfiable'),
417: ('expectation_failed',),
418: ('im_a_teapot', 'teapot', 'i_am_a_teapot'),
421: ('misdirected_request',),
422: ('unprocessable_entity', 'unprocessable'),
423: ('locked',),
424: ('failed_dependency', 'dependency'),
425: ('unordered_collection', 'unordered'),
426: ('upgrade_required', 'upgrade'),
428: ('precondition_required', 'precondition'),
429: ('too_many_requests', 'too_many'),
431: ('header_fields_too_large', 'fields_too_large'),
444: ('no_response', 'none'),
449: ('retry_with', 'retry'),
450: ('blocked_by_windows_parental_controls', 'parental_controls'),
451: ('unavailable_for_legal_reasons', 'legal_reasons'),
499: ('client_closed_request',),

Server Error.

500: ('internal_server_error', 'server_error', '/o\', '✗'),
501: ('not_implemented',),
502: ('bad_gateway',),
503: ('service_unavailable', 'unavailable'),
504: ('gateway_timeout',),
505: ('http_version_not_supported', 'http_version'),
506: ('variant_also_negotiates',),
507: ('insufficient_storage',),
509: ('bandwidth_limit_exceeded', 'bandwidth'),
510: ('not_extended',),
511: ('network_authentication_required', 'network_auth', 'network_authentication'),
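
The names in this table are used through requests.codes, either as attributes or with dictionary-style lookup. A short sketch, using only aliases that appear in the table:

```python
import requests

# Each alias in the table is exposed as an attribute of requests.codes.
ok = requests.codes.ok
not_found = requests.codes.not_found
teapot = requests.codes.teapot

# Dictionary-style access works as well.
server_error = requests.codes['internal_server_error']

print(ok, not_found, teapot, server_error)
```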

[3] Advanced Operations

(1) File upload

import requests

files = {'file': open('favicon.ico', 'rb')}
response = requests.post("http://httpbin.org/post", files=files)
print(response.text)
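
The example above leaves the file open after the upload. A possible variant, sketched below, opens the file in a with block and passes a (filename, file object, content type) tuple. The request is only prepared, not sent, and a small placeholder file stands in for favicon.ico so that the sketch runs anywhere.

```python
import os
import tempfile
import requests

# Create a small stand-in file so the example does not depend on favicon.ico existing.
path = os.path.join(tempfile.mkdtemp(), 'favicon.ico')
with open(path, 'wb') as f:
    f.write(b'\x00\x00\x01\x00')  # placeholder bytes, not a real icon

# The with block closes the file once the request body has been built.
with open(path, 'rb') as f:
    files = {'file': ('favicon.ico', f, 'image/x-icon')}  # (filename, file object, content type)
    prepared = requests.Request('POST', 'http://httpbin.org/post', files=files).prepare()

print(prepared.headers['Content-Type'])  # multipart/form-data with a generated boundary
print(len(prepared.body))
```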

(2) Getting cookies

import requests

response = requests.get("https://www.baidu.com")
print(response.cookies)
for key, value in response.cookies.items():
    print(key + '=' + value)

(3) Maintaining session state with Session

import requests

s = requests.Session()
s.get('http://httpbin.org/cookies/set/number/123456789')  # httpbin test endpoint that sets a cookie
# The Session keeps the cookie for later requests, much as a browser would
response = s.get('http://httpbin.org/cookies')
print(response.text)

Output:

{
  "cookies": {
    "number": "123456789"
  }
}
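
What the Session does between the two calls can be made visible offline. In the sketch below the cookie is stored on the session by hand, as a stand-in for the one httpbin would set, and a follow-up request is prepared but not sent. The X-Demo header is a hypothetical name, added only to show that session-level headers are carried along in the same way.

```python
import requests

session = requests.Session()

# Stand-in for the cookie the server would set via /cookies/set/number/123456789.
session.cookies.set('number', '123456789', domain='httpbin.org', path='/')
# Headers stored on the session are also applied to every request it sends.
session.headers.update({'X-Demo': 'session-header'})  # hypothetical header name

# Prepare (but do not send) a follow-up request through the session.
prepared = session.prepare_request(requests.Request('GET', 'http://httpbin.org/cookies'))

print(prepared.headers.get('Cookie'))
print(prepared.headers.get('X-Demo'))
```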

(4) Certificate verification

Many websites now serve HTTPS, and a browser shows an error such as "Your connection is not private" when a site's certificate is invalid. For HTTPS URLs, Requests checks the certificate first and raises SSLError if it is not valid. There are two ways to deal with this:

[1] Disable certificate verification by passing verify=False. The request then returns status code 200, but a warning is still printed.

import requests
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)

To suppress the warning:

import requests
from requests.packages import urllib3
urllib3.disable_warnings()
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)
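
Disabling verification removes the protection that HTTPS provides, so it is best kept to testing. When the failure comes from a certificate authority that Requests does not know, verify can instead point at a CA bundle file. The sketch below only configures a session and sends nothing. It uses the default bundle path as a placeholder; a site-specific CA file would be an assumption about your environment.

```python
import requests
from requests import certs

session = requests.Session()

# Path of the CA bundle Requests uses by default; a site-specific CA file
# (hypothetical, e.g. one supplied by your organisation) could be used instead.
ca_bundle = certs.where()

# verify accepts a path as well as True/False; set on the session, it applies to every request.
session.verify = ca_bundle
print(session.verify)
```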

[2] Specify a certificate manually

import requests

response = requests.get('https://www.12306.cn', cert=('/path/server.crt', '/path/key'))
print(response.status_code)

(5) Proxy settings

import requests

proxies = {
  "http": "http://127.0.0.1:9743",
  "https": "https://127.0.0.1:9743",
}
response = requests.get("https://www.taobao.com", proxies=proxies)
print(response.status_code)
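
The keys of the proxies dictionary refer to the scheme of the target URL, not of the proxy. The sketch below uses select_proxy, a helper from requests.utils, purely as an inspection aid to show which entry would apply to a given URL; no request is sent.

```python
from requests.utils import select_proxy

proxies = {
    'http': 'http://127.0.0.1:9743',
    'https': 'https://127.0.0.1:9743',
}
http_only = {'http': 'http://127.0.0.1:9743'}

# The key is matched against the scheme of the target URL.
for_https = select_proxy('https://www.taobao.com', proxies)
for_http = select_proxy('http://httpbin.org/get', proxies)

# With only an "http" entry, an https:// URL is not routed through any proxy.
unmatched = select_proxy('https://www.taobao.com', http_only)

print(for_https, for_http, unmatched)
```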

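For proxies that require a username and password:

import requests

proxies = {
    "http": "http://user:password@127.0.0.1:9743/",  # user name and password go in the proxy URL
    "https": "http://user:password@127.0.0.1:9743/",  # also needed, because the request below uses https
}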
response = requests.get("https://www.taobao.com", proxies=proxies)
print(response.status_code)

SOCKS proxy: install the optional SOCKS support first

pip3 install 'requests[socks]'

Then configure the proxy:

import requests

proxies = {
    'http': 'socks5://127.0.0.1:9742',
    'https': 'socks5://127.0.0.1:9742'
}
response = requests.get("https://www.taobao.com", proxies=proxies)
print(response.status_code)

(6) Timeout settings

import requests
from requests.exceptions import ReadTimeout
try:
    response = requests.get("http://httpbin.org/get", timeout=0.5)
    print(response.status_code)
except ReadTimeout:
    print('Timeout')
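
The timeout argument also accepts a (connect, read) tuple, which limits the two phases separately. The sketch below demonstrates this against a deliberately slow local server, so it does not depend on httpbin. SlowHandler and fetch_status are hypothetical names, and the one-second delay is an arbitrary choice for the demo.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests
from requests.exceptions import ConnectTimeout, ReadTimeout


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler that waits one second before answering."""

    def do_GET(self):
        time.sleep(1)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b'ok')
        except OSError:
            pass  # the client already gave up and closed the connection

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_status(url, timeout):
    """Return the status code, or a label naming which timeout fired."""
    with requests.Session() as session:
        session.trust_env = False  # ignore proxy environment variables for this local demo
        try:
            return session.get(url, timeout=timeout).status_code
        except ConnectTimeout:
            return 'connect timeout'
        except ReadTimeout:
            return 'read timeout'


server = HTTPServer(('127.0.0.1', 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = 'http://127.0.0.1:%d/' % server.server_port

# timeout=(connect, read): the read limit is the one that matters here.
too_short = fetch_status(url, timeout=(3, 0.2))
long_enough = fetch_status(url, timeout=(3, 5))
server.shutdown()
print(too_short, long_enough)
```

A single number, as in the example above, applies the same limit to both phases.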

(7) Websites that require login authentication

import requests
from requests.auth import HTTPBasicAuth

r = requests.get('http://120.27.34.24:9001', auth=HTTPBasicAuth('user', '123'))
print(r.status_code)

The following is equivalent to the above:

import requests

r = requests.get('http://120.27.34.24:9001', auth=('user', '123'))
print(r.status_code)
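
Both forms produce the same HTTP Basic Authorization header. The sketch below prepares the request locally, without contacting the server, and compares the header with a value built by hand. Note that Basic auth is only base64-encoded, not encrypted.

```python
import base64
import requests

# Prepare the request locally to inspect the header that auth=('user', '123') produces.
prepared = requests.Request('GET', 'http://120.27.34.24:9001', auth=('user', '123')).prepare()
authorization = prepared.headers['Authorization']

# Basic auth is "Basic " followed by base64("user:password").
expected = 'Basic ' + base64.b64encode(b'user:123').decode('ascii')
print(authorization == expected)
```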

(8) Exception handling

import requests
from requests.exceptions import ReadTimeout, ConnectionError, RequestException
try:
    response = requests.get("http://httpbin.org/get", timeout = 0.5)
    print(response.status_code)
except ReadTimeout:
    print('Timeout')
except ConnectionError:
    print('Connection error')
except RequestException:
    print('Error')
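
The order of the except clauses matters because these exceptions form a hierarchy, so the most specific class has to come first. The sketch below mirrors that ordering in a hypothetical classify helper and checks a few of the relationships; it runs without any network access.

```python
from requests.exceptions import (
    ConnectionError, ConnectTimeout, ReadTimeout, RequestException, Timeout,
)


def classify(error):
    """Mirror the except ordering above: most specific class first."""
    if isinstance(error, ReadTimeout):
        return 'Timeout'
    if isinstance(error, ConnectionError):
        return 'Connection error'
    if isinstance(error, RequestException):
        return 'Error'
    return 'Not a Requests exception'


# ReadTimeout derives from Timeout, which derives from RequestException.
print(issubclass(ReadTimeout, Timeout), issubclass(Timeout, RequestException))

# ConnectTimeout is both a ConnectionError and a Timeout, so the
# ConnectionError branch is the one that catches it.
print(classify(ConnectTimeout()))
print(classify(ReadTimeout()))
```

Catching Timeout instead of ReadTimeout in the first clause would handle both kinds of timeout in one place.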

These are some of the most common ways to use Requests.


Posted by Sno on Sun, 19 May 2019 11:12:58 -0700