Python form traversal

Keywords: Session PHP Python Attribute

Get the information behind the login window

Most interaction with a web server uses the HTTP GET method to request information, while an HTML form is essentially a way for users to submit POST requests. Just as links on a website help users send GET requests, HTML forms help them send POST requests.

The Requests library is a third-party Python library that is good at handling complex HTTP requests, cookies, and headers (both request and response headers).

1 Submitting a form

Submitting a form requires only two pieces of information:
* The field names of the data you want to submit (in this case, firstname and lastname)
* The action attribute of the form, which is the URL the form data is sent to when the form is submitted (in this case, http://url1)

  • A basic form looks like this:

<form action="http://url1" id="example_form2" method="POST">
First name: <input type="text" name="firstname"><br>
Last name: <input type="text" name="lastname"><br>
<input type="submit" value="Submit">
</form>

It can be submitted with Requests as follows:

import requests

# Keys must match the name attributes of the form's input fields
params = {'firstname': 'Ryan', 'lastname': 'Mitchell'}
# Post to the URL given in the form's action attribute
r = requests.post("http://url1", data=params)
print(r.text)

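
The two pieces of information listed earlier (the field names and the action attribute) can also be read out of the page programmatically rather than by eye. The following is a minimal sketch using only the standard library's html.parser; the FormInspector class is a hypothetical helper written for this illustration, and it assumes the page contains a single form.

```python
from html.parser import HTMLParser

FORM_HTML = """
<form action="http://url1" id="example_form2" method="POST">
First name: <input type="text" name="firstname"><br>
Last name: <input type="text" name="lastname"><br>
<input type="submit" value="Submit">
</form>
"""


class FormInspector(HTMLParser):
    """Records the first form's action and method, and every named input."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.field_names = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action")
            # HTML forms default to GET when no method is given
            self.method = (attrs.get("method") or "GET").upper()
        elif tag == "input" and attrs.get("name"):
            # Inputs without a name (the submit button here) are skipped
            self.field_names.append(attrs["name"])


inspector = FormInspector()
inspector.feed(FORM_HTML)
print(inspector.action)
print(inspector.method)
print(inspector.field_names)
```

The collected field names become the keys of the dictionary passed as data, and the action becomes the URL passed to requests.post.
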
  • If the form uploads a file, such as an image, it looks like this:
<form action="processing2.php" method="post" enctype="multipart/form-data">
Submit a jpg, png, or gif: 
<input type="file" name="image"><br>
<input type="submit" value="Upload File">
</form>

The request that submits the file is constructed as follows:

# The key must match the name attribute of the file input ('image' in the form above)
files = {'image': open('../files/Python-logo.png', 'rb')}
r = requests.post("http://url2", files=files)

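
To see what a file upload actually puts on the wire, the request can be prepared without being sent. This is only a sketch: the example.com URL is a hypothetical stand-in for the absolute address of processing2.php, and a few placeholder bytes are used in place of a real image so that no file on disk is needed.

```python
import requests

# Placeholder bytes standing in for the contents of a real image file
fake_image = b"not really a PNG"

request = requests.Request(
    "POST",
    "http://example.com/processing2.php",  # hypothetical absolute URL
    # (filename, content, content type) for the form field named "image"
    files={"image": ("Python-logo.png", fake_image, "image/png")},
)
prepared = request.prepare()  # builds headers and body; sends nothing

# The header announces multipart/form-data plus a generated boundary
print(prepared.headers["Content-Type"])
# The body carries the field name, the filename, and the file bytes
print(prepared.body.decode("latin-1"))
```

The printed body shows the field name from the dictionary key, which is why that key has to match the name attribute of the file input.
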
  • For a GET request that passes parameters, looking at the URL is enough to work out the corresponding form. Take this URL:

http://domainname.com?thing1=foo&thing2=bar

This request corresponds to the following form:

<form method="GET" action="someProcessor.php">
<input type="someCrazyInputType" name="thing1" value="foo" />
<input type="anotherCrazyInputType" name="thing2" value="bar" />
<input type="submit" value="Submit" />
</form>


The corresponding Python parameters are:
{'thing1':'foo', 'thing2':'bar'}
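
The same dictionary can be recovered from the URL mechanically. A minimal sketch using only the standard library's urllib.parse follows; it assumes that no parameter name is repeated in the query string, since a plain dictionary keeps only one value per key.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

url = "http://domainname.com?thing1=foo&thing2=bar"

# Everything after the "?" is the query string
query = urlsplit(url).query
params = dict(parse_qsl(query))
print(params)  # {'thing1': 'foo', 'thing2': 'bar'}

# Encoding the dictionary again yields the original query string
print(urlencode(params) == query)  # True
```

A dictionary like this can be handed to requests.get through its params argument, which appends the encoded query string to the URL.
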

2 Handling logins and cookies

Most modern websites use cookies to track whether a user is logged in. Some sites modify their cookies frequently and without warning, and sometimes you would rather not manage cookies by hand at all. The Session object of the Requests library handles both cases.

import requests

# A Session stores the cookies from each response and sends them on later requests
session = requests.Session()
params = {'username': 'username', 'password': 'password'}
s = session.post("http://pythonscraping.com/pages/cookies/welcome.php", data=params)
print("Cookie is set to:")
print(s.cookies.get_dict())
print("-----------")
print("Going to profile page...")
# No cookies are passed explicitly here; the session supplies them
s = session.get("http://pythonscraping.com/pages/cookies/profile.php")
print(s.text)

A Session object (obtained by calling requests.Session()) keeps track of session information such as cookies, headers, and even information about the protocols being run on top of HTTP.
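
How a session carries state from one request to the next can be observed without touching the network by preparing a request through the session. In this sketch the cookie name and value and the User-Agent string are made up for illustration; on a real site the cookie would come from the login response.

```python
import requests

session = requests.Session()
# Pretend an earlier login response set this cookie (hypothetical name and value)
session.cookies.set("loggedin", "1", domain="pythonscraping.com", path="/")
# Headers stored on the session are sent with every request it makes
session.headers.update({"User-Agent": "form-demo/0.1"})

request = requests.Request(
    "GET", "http://pythonscraping.com/pages/cookies/profile.php"
)
# prepare_request merges the session's cookies and headers; nothing is sent
prepared = session.prepare_request(request)

print(prepared.headers.get("Cookie"))
print(prepared.headers["User-Agent"])
```

Neither the cookie nor the header was passed to the request itself; both came from the session, which is what lets the profile page request above succeed after the login request.
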

3 HTTP Basic Access Authentication

Before cookies were invented, the most common way to handle website logins was HTTP basic access authentication. It is still seen from time to time, especially on high-security or corporate websites. The Requests library has an auth module dedicated to HTTP authentication:

import requests
from requests.auth import HTTPBasicAuth

# The credentials are sent in the Authorization header of the request
auth = HTTPBasicAuth('ryan', 'password')
r = requests.post(url="http://pythonscraping.com/pages/auth/login.php", auth=auth)
print(r.text)
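
What HTTPBasicAuth adds to a request can be checked offline by preparing the request instead of sending it. The sketch below reuses the credentials from the example above and compares the generated header with one built by hand using the standard library's base64 module.

```python
import base64

import requests
from requests.auth import HTTPBasicAuth

request = requests.Request(
    "POST",
    "http://pythonscraping.com/pages/auth/login.php",
    auth=HTTPBasicAuth("ryan", "password"),
)
prepared = request.prepare()  # applies the auth handler; sends nothing
header = prepared.headers["Authorization"]

# Basic auth is the word "Basic" followed by base64("username:password")
expected = "Basic " + base64.b64encode(b"ryan:password").decode("ascii")
print(header == expected)  # True

# base64 is an encoding, not encryption: the credentials decode straight back
print(base64.b64decode(header.split(" ", 1)[1]).decode("ascii"))  # ryan:password
```

Because the header is only encoded, basic authentication protects the credentials in transit only when the request is made over HTTPS.
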

 

Posted by TurtleDove on Tue, 20 Aug 2019 20:06:26 -0700