Python crawler: Youdao Translate JS encryption cracking

Keywords: Python JSON encoding Windows Session

Youdao Translate: JS Encryption Cracking

The site being crawled is: http://fanyi.youdao.com/

1. Analyzing the Request

We enter a word meaning "fruit" on the page, and the English translation comes back as fruit. The request carries many parameters, so first save the parameter data as a record.

Then enter a new word, watermelon, and record that request as well.

i: watermelon
from: AUTO
to: AUTO
smartresult: dict
client: fanyideskweb
salt: 15681884266087
sign: 1ea84aac4a04982f4a775f361ae30351
ts: 1568188426608
bv: a4f4c82afd8bdba188e568d101be3f53
doctype: json
version: 2.1
keyfrom: fanyi.web
action: FY_BY_REALTlME

i: Fruits
from: AUTO
to: AUTO
smartresult: dict
client: fanyideskweb
salt: 15681879672603
sign: 0f711cd437e15430dc1df1dd0948fb66
ts: 1568187967260
bv: a4f4c82afd8bdba188e568d101be3f53
doctype: json
version: 2.1
keyfrom: fanyi.web
action: FY_BY_REALTlME

Comparing the two requests, it's easy to see that four parameters differ: i, salt, ts and sign. i is the text you want to translate. ts looks like a timestamp in milliseconds. salt is just ts with one extra digit appended. That leaves sign, and we don't yet know how it is produced. It is 32 hexadecimal characters long, so it is most likely an MD5 hash.
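The guesses above can be checked against the captured values before touching any JavaScript. This is only a small sketch: the two helper names are my own, and passing the MD5 shape check does not prove that sign really is an MD5 hash, only that it has the right length and alphabet.

```python
import re

# Values copied from the two captured requests above
captured = [
    {"i": "watermelon", "ts": "1568188426608", "salt": "15681884266087",
     "sign": "1ea84aac4a04982f4a775f361ae30351"},
    {"i": "Fruits", "ts": "1568187967260", "salt": "15681879672603",
     "sign": "0f711cd437e15430dc1df1dd0948fb66"},
]


def salt_is_ts_plus_digit(ts, salt):
    """True if salt equals ts followed by exactly one decimal digit."""
    return len(salt) == len(ts) + 1 and salt.startswith(ts) and salt[-1].isdigit()


def looks_like_md5(value):
    """True if value is 32 lowercase hex characters, the shape of an MD5 hex digest."""
    return re.fullmatch(r"[0-9a-f]{32}", value) is not None


for req in captured:
    print(req["i"], salt_is_ts_plus_digit(req["ts"], req["salt"]), looks_like_md5(req["sign"]))
```

Both captured requests pass both checks, which is consistent with the reading of ts, salt and sign given above.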

2. Cracking the Encrypted Parameters

Since we don't know how sign is generated, the best approach is to search the page's JavaScript with sign as the keyword and see where it appears.

After setting a breakpoint there, translate another word: banana.


I won't go through the encryption function line by line. Note that navigator.appVersion is the browser's user-agent string without the leading "Mozilla/" prefix.

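import hashlib, random, time


def get_encrypt_data(keyword):
    # navigator.appVersion: the user-agent string without the leading "Mozilla/"
    t = "5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
    bv = hashlib.md5(t.encode("utf-8")).hexdigest()
    # ts: current time in milliseconds
    ts = str(int(time.time() * 1000))
    # salt: ts plus a single random digit, matching the captured requests
    salt = ts + str(random.randint(0, 9))
    # sign: MD5 of client name + keyword + salt + fixed key
    sign = hashlib.md5(
        ("fanyideskweb" + keyword + salt + "n%A-rKaT5fb[Gy?;N5@Tj").encode("utf-8")).hexdigest()
    return ts, bv, salt, sign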
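Because bv in the function above is just the MD5 of the appVersion string, it can also be derived from whatever User-Agent you actually send, instead of hard-coding one Chrome string. A minimal sketch, assuming appVersion is always the user-agent minus the "Mozilla/" prefix; the helper names are my own and do not come from the site's JavaScript.

```python
import hashlib


def app_version_from_user_agent(user_agent):
    """Strip the leading "Mozilla/" so the result resembles navigator.appVersion."""
    prefix = "Mozilla/"
    if user_agent.startswith(prefix):
        return user_agent[len(prefix):]
    return user_agent


def compute_bv(user_agent):
    """MD5 hex digest of the appVersion-style string, used as the bv parameter."""
    app_version = app_version_from_user_agent(user_agent)
    return hashlib.md5(app_version.encode("utf-8")).hexdigest()


ua_string = ("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36")
print(compute_bv(ua_string))
```

For the Chrome user-agent shown here, this yields the same bv as the hard-coded t in get_encrypt_data, since the stripped string is identical.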

The complete script, which returns a successful response:

import requests, time, random, hashlib
from fake_useragent import UserAgent

ua = UserAgent()

url = "http://fanyi.youdao.com/translate_o"

headers = {
    "User-Agent": ua.random,
    "Referer": "http://fanyi.youdao.com/",
}
s = requests.Session()


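def get_encrypt_data(keyword):
    # navigator.appVersion: the user-agent string without the leading "Mozilla/"
    t = "5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
    bv = hashlib.md5(t.encode("utf-8")).hexdigest()
    # ts: current time in milliseconds
    ts = str(int(time.time() * 1000))
    # salt: ts plus a single random digit, matching the captured requests
    salt = ts + str(random.randint(0, 9))
    # sign: MD5 of client name + keyword + salt + fixed key
    sign = hashlib.md5(
        ("fanyideskweb" + keyword + salt + "n%A-rKaT5fb[Gy?;N5@Tj").encode("utf-8")).hexdigest()
    return ts, bv, salt, sign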


def param(keyword):
    # Build the form data; a trailing comma after a value would turn it into a tuple
    dic = {}
    dic["i"] = keyword
    dic["from"] = "AUTO"
    dic["to"] = "AUTO"
    dic["smartresult"] = "dict"
    dic["client"] = "fanyideskweb"
    dic["doctype"] = "json"
    dic["version"] = "2.1"
    dic["keyfrom"] = "fanyi.web"
    dic["action"] = "FY_BY_REALTlME"
    # ts, bv, salt and sign are regenerated for every request
    dic["ts"], dic["bv"], dic["salt"], dic["sign"] = get_encrypt_data(keyword)
    return dic


if __name__ == '__main__':
    # Visit the home page first so the session stores the cookies it sets
    s.get("http://fanyi.youdao.com/", headers=headers)
    keyword = input("Enter the content you want to translate>>>:").strip()
    response = s.post(url=url, data=param(keyword), headers=headers)
    msg = response.json().get("translateResult")[0][0]
    print("Translation content>>>:{}\nTranslation result>>>:{}".format(msg.get("src"), msg.get("tgt")))
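The script indexes translateResult directly, so it raises an exception if that key is missing from the response. A slightly more defensive sketch follows; I am assuming a missing key is what you would see when the request is rejected, for example because of a wrong sign. The function name and the sample payload are hypothetical; only the translateResult / src / tgt layout is taken from the script.

```python
def extract_translation(payload):
    """Return (src, tgt) from a parsed JSON response, or None if no result is present."""
    rows = payload.get("translateResult")
    if not rows or not rows[0]:
        return None
    first = rows[0][0]
    return first.get("src"), first.get("tgt")


# Hypothetical payload with the same shape the script reads
sample = {"translateResult": [[{"src": "banana", "tgt": "<translated text>"}]]}
print(extract_translation(sample))
print(extract_translation({}))
```

In the script, you could call extract_translation(response.json()) and print a short message when it returns None.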

Posted by spitfire_esquive on Sat, 05 Oct 2019 07:04:58 -0700