[100 JS Reverse-Engineering Examples] AES encryption analysis of CNKI academic translation

Keywords: Python Javascript crawler

WeChat official account: Brother K Crawler; QQ discussion group: 808574309. Regularly sharing advanced crawler techniques and practical JS/Android reverse-engineering content!

Disclaimer

Everything in this article is for learning and discussion only. The captured packets, sensitive URLs, and data interfaces have been desensitized. Commercial or illegal use is strictly prohibited, and the author bears no responsibility for any consequences of such use. If anything here infringes your rights, please contact me and it will be deleted immediately!

Reverse-engineering target

  • Objective: the AES encryption used by CNKI academic translation
  • Home page: aHR0cHM6Ly9kaWN0LmNua2kubmV0L2luZGV4
  • Interface: aHR0cHM6Ly9kaWN0LmNua2kubmV0L2Z5enMtZnJvbnQtYXBpL3RyYW5zbGF0ZS9saXRlcmFsdHJhbnNsYXRpb24=
  • Reverse parameter: Request Payload: words: "kufhG_UJw_k3Sfr3j0BLAA=="
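The addresses in the list above are Base64-encoded as part of the desensitization. A minimal sketch for decoding such a value with Python's standard library follows; the variable names are illustrative only.

```python
import base64

# Base64-encoded home page address, copied from the list above
encoded_home = "aHR0cHM6Ly9kaWN0LmNua2kubmV0L2luZGV4"

# b64decode returns bytes; decode them as UTF-8 to get the URL text
home_url = base64.b64decode(encoded_home).decode("utf-8")
print(home_url)
```

The interface address can be decoded the same way.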

Reverse-engineering process

The material for this installment comes from a request for help by a member of Brother K's crawler discussion group. The target is CNKI academic translation, and the member wanted two things: 1. get around the 1000-character limit on English input; 2. reverse the encryption process.

Open the translation home page and capture packets to locate the translation interface. In the Request Payload, the text to be translated is sent in encrypted form.

Searching directly for the keyword words returns too many results to sift through. The Payload also contains a translateType parameter, and since the two parameters usually sit next to each other in the code, searching for translateType is more effective. An XHR breakpoint would also work, but it is a little more troublesome. The search hits are in app.9fb42bb0.js. Note the last result, which contains encrypto; the name suggests encryption, so this is very likely where the encryption happens.

Evaluating (0, h.encrypto)(this.inputWord) in the console returns the encrypted result.

Step into h.encrypto. It is clearly AES encryption: n = "4e87183cfd3a45fe" is the key, the mode is ECB, and the padding is Pkcs7. At the end, some characters in the resulting Base64 string are replaced.
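The final replacements ("/" becomes "_", "+" becomes "-") turn standard Base64 into the URL-safe Base64 alphabet. The sketch below, using only Python's standard library, illustrates this and checks the sample words value from the request payload. The helper name to_url_safe is hypothetical and does not come from the site's code.

```python
import base64


def to_url_safe(b64_text: str) -> str:
    """Apply the same two character replacements described above."""
    return b64_text.replace("/", "_").replace("+", "-")


# Bytes chosen so that their standard Base64 form contains both "+" and "/"
raw = bytes([0xFB, 0xEF, 0xFF])
standard = base64.b64encode(raw).decode("ascii")
url_safe = to_url_safe(standard)

# The manual replacement matches the standard library's URL-safe encoder
assert url_safe == base64.urlsafe_b64encode(raw).decode("ascii")

# The sample payload value decodes to 16 bytes, i.e. exactly one AES block
sample = "kufhG_UJw_k3Sfr3j0BLAA=="
cipher_bytes = base64.urlsafe_b64decode(sample)
print(standard, url_safe, len(cipher_bytes))  # prints: ++// --__ 16
```

A 16-byte ciphertext is what a short input such as a single word would produce once it is padded to one AES block.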

Once the algorithm, the key, and the other important parameters are known, the encryption can be reproduced directly with the crypto-js module. The JavaScript code is as follows:

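// Load the crypto-js encryption module (npm install crypto-js)
var CryptoJS = require('crypto-js')

// Encrypt the text with AES (ECB mode, Pkcs7 padding),
// then make the Base64 output URL-safe
function s(t) {
    var key = CryptoJS.enc.Utf8.parse("4e87183cfd3a45fe")
    var options = {
        mode: CryptoJS.mode.ECB,
        padding: CryptoJS.pad.Pkcs7
    }
    var encrypted = CryptoJS.AES.encrypt(t, key, options)
    // toString() yields Base64; replace "/" with "_" and "+" with "-"
    return encrypted.toString().replace(/\//g, "_").replace(/\+/g, "-")
}

console.log(s("test"))

// kufhG_UJw_k3Sfr3j0BLAA==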
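The same routine can also be sketched directly in Python, which would avoid calling Node.js from the crawler. The version below assumes the third-party pycryptodome package (pip install pycryptodome), which the original article does not use; pkcs7_pad and encrypt_words are illustrative names. The padding step is written out by hand to show what Pkcs7 does.

```python
import base64

KEY = b"4e87183cfd3a45fe"  # 16-byte key found in h.encrypto
BLOCK_SIZE = 16            # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes of value N so the length is a multiple of block_size."""
    pad_len = block_size - len(data) % block_size
    return data + bytes([pad_len]) * pad_len


def encrypt_words(text: str) -> str:
    """AES-ECB encrypt text and return URL-safe Base64, like the JS version."""
    # Imported here so the padding helper works without pycryptodome installed
    from Crypto.Cipher import AES

    cipher = AES.new(KEY, AES.MODE_ECB)
    encrypted = cipher.encrypt(pkcs7_pad(text.encode("utf-8")))
    b64_text = base64.b64encode(encrypted).decode("ascii")
    # Same replacements as the JavaScript: "/" -> "_", "+" -> "-"
    return b64_text.replace("/", "_").replace("+", "-")
```

Assuming the port is faithful, encrypt_words("test") should return the same string that the Node.js script prints.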

A small translation demo in Python, which loads the JavaScript above from cnki_encrypt.js and calls it through execjs:

# ==================================
# --*-- coding: utf-8 --*--
# @Time    : 2021-11-05
# @Author: WeChat official account: K brother crawler
# @FileName: cnki.py
# @Software: PyCharm
# ==================================


import execjs
import requests


token_url = "https://dict.cnki.net/fyzs-front-api/getToken"
translation_api = "https://dict.cnki.net/fyzs-front-api/translate/literaltranslation"
UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"

session = requests.session()


def get_token():
    headers = {"User-Agent": UA}
    response = session.get(url=token_url, headers=headers).json()
    token = response["data"]
    return token


def get_encrypted_word(word):
    with open('cnki_encrypt.js', 'r', encoding='utf-8') as f:
        cnki_js = f.read()
    encrypted_word = execjs.compile(cnki_js).call('s', word)
    return encrypted_word


def get_translation_result(encrypted_word, token):
    payload = {
        "translateType": None,
        "words": encrypted_word
    }
    headers = {
        "Token": token,
        "User-Agent": UA
    }
    response = session.post(url=translation_api, headers=headers, json=payload).json()
    result = response["data"]["mResult"]
    return result


def main():
    word = input("Please enter the string to be translated: ")
    token = get_token()
    encrypted_word = get_encrypted_word(word)
    result = get_translation_result(encrypted_word, token)
    print("The translation result is: ", result)


if __name__ == "__main__":
    main()

The group member's other question was whether the character limit can be bypassed. The measured limits are 1000 characters for English and 500 characters for Chinese.

In all likelihood this limit is enforced not only by the front end but also by the server. To check, we can send a request carrying more than 500 Chinese characters, where the text starts with "test 1" and ends with "Test 2", so that the ending lies beyond the 500-character mark. Test 2 does not appear in the translation result, which indicates that the server truncates the input. To translate a long text, the only option is to split it into several parts and translate them separately.
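One possible way to do the splitting is sketched below. The function name split_text and the rule of cutting at sentence-ending punctuation are assumptions made for illustration; the limit argument would be the measured value from above (1000 for English, 500 for Chinese), counted here simply as Python string length.

```python
import re
from typing import List


def split_text(text: str, limit: int) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole where possible; a single sentence longer
    than the limit is cut into fixed-size pieces.
    """
    # Split right after Western or Chinese sentence-ending punctuation
    sentences = [s for s in re.split(r"(?<=[.!?。！？])", text) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Cut an oversized sentence into pieces that fit the limit
        pieces = [sentence[i:i + limit] for i in range(0, len(sentence), limit)]
        for piece in pieces:
            # Start a new chunk when this piece would overflow the current one
            if len(current) + len(piece) > limit:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks


print(split_text("One. Two. Three.", 10))  # prints: ['One. Two.', ' Three.']
```

Each chunk would then be encrypted and posted as its own request, and the translated parts joined in order.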


Posted by Enlightened on Sat, 04 Dec 2021 14:53:08 -0800