JS reverse - invoice inspection platform of State Administration of Taxation

Keywords: Javascript Java network

National VAT invoice inspection platform of State Administration of Taxation


Recently a friend came to me with a new requirement: a crawler for invoice verification. The site has some anti-crawler measures that are rough on novices, so I spent some time on Saturday working through it.
The difficulty is moderate. From the analysis, the protection is the enterprise edition of sojson. It may be the latest v6 or it may be v5, with a webdriver check added, because the enhanced v6 protection against headless browsers is a paid feature. Either way it doesn't matter: black cat or white cat, a cat is a cat 🐱



The site hands you a gift right at the start: a debugger trap. That's fine. There are already so many articles on getting past debugger that I won't add another. This post is mainly about the analysis process, so it's a little bit...

Once the debugger trap was out of the way, everything was clean and comfortable, and mom no longer has to worry about heartless interruptions while I'm debugging.

Identifying the JS obfuscation version

sojson = ob obfuscation + its own custom code.
Generally speaking, plain ob obfuscation doesn't ship with a debugger trap, so this looks very much like sojson, and that's what it is. After reading the sojson advertisement, you can be sure that it is a customized version.

Here is a screenshot of sojson v6 after deobfuscation: same world, same routine...

Simple js processing

In April and May I learned a little about AST processing of JS, so I did some simple processing with it.

Restore the strings that were extracted into method calls, inline the operators that were extracted into helper functions, and delete the helpers

Undo the control-flow flattening

Then swap the processed script in with Charles. I have written before about how to use Charles... I'm holding back the deobfuscation code here. I'm already familiar with handling sojson, so I'm just rambling...

Then refresh the page, repeat the debugger bypass, set the next breakpoint on the network request, and you will see the whole process.

Initialization phase

Get the server address, which may differ from province to province

General process

ajaxSetup is added here. I actually found this while tracing the process backwards. Then comes the URL signature I like to mention...



The only public RSA libraries I found for JS are JSEncrypt and a Node.js version redeveloped from JSEncrypt; I have not found any other standardized one. Both rely on the browser's built-in crypto or on the crypto provided by Node.js. I have not found an RSA library that is independent of the host environment, and these do not work properly in Java's built-in JS engine. So this step extracts this value so that the RSA encryption can be done in Java.

Java implementation of RSA algorithm

package cn.gov.chinatax.utils;

import javax.crypto.Cipher;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

/**
 * @Description RSA public-key encryption helper
 * @author Gouzai
 * @create 2020-06-05 18:44
 */
public class RSA {

    /**
     * Encrypts str with a Base64-encoded X.509 public key.
     * Returns the ciphertext as Base64, or null if encryption fails.
     */
    public static String encryp(String str, String key) {
        try {
            // java.util.Base64 replaces sun.misc.BASE64Decoder, which was removed in newer JDKs;
            // the MIME decoder tolerates line breaks inside the key
            X509EncodedKeySpec bobPubKeySpec = new X509EncodedKeySpec(Base64.getMimeDecoder().decode(key));
            // RSA is an asymmetric encryption algorithm
            KeyFactory keyFactory = KeyFactory.getInstance("RSA");
            // Build the public key object
            PublicKey publicKey = keyFactory.generatePublic(bobPubKeySpec);

            // PKCS#1 v1.5 padding, spelled out instead of relying on the provider default for "RSA"
            Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            cipher.init(Cipher.ENCRYPT_MODE, publicKey);
            byte[] bytes = cipher.doFinal(str.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(bytes);
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            // Bad key, bad Base64, or input too long for the key size
            e.printStackTrace();
            return null;
        }
    }
}
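
A quick way to check an RSA routine like the one above without touching the site is a local round trip. The sketch below is only an illustration: the class name RsaRoundTrip, the throwaway 2048-bit key pair and the sample plaintext are all assumptions made for the test, not values taken from the platform. It encrypts with a Base64 X.509 public key the same way as the routine above and decrypts with the matching private key.

```java
import javax.crypto.Cipher;
import java.nio.charset.StandardCharsets;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

public class RsaRoundTrip {

    // Base64 X.509 public key in, Base64 ciphertext out (PKCS#1 v1.5 padding).
    public static String encrypt(String plain, String base64PublicKey) throws Exception {
        byte[] der = Base64.getMimeDecoder().decode(base64PublicKey);
        PublicKey publicKey = KeyFactory.getInstance("RSA").generatePublic(new X509EncodedKeySpec(der));
        Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        cipher.init(Cipher.ENCRYPT_MODE, publicKey);
        return Base64.getEncoder().encodeToString(cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8)));
    }

    // Only used locally to verify the ciphertext; the real server holds the private key.
    public static String decrypt(String base64Cipher, PrivateKey privateKey) throws Exception {
        Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        cipher.init(Cipher.DECRYPT_MODE, privateKey);
        byte[] plain = cipher.doFinal(Base64.getDecoder().decode(base64Cipher));
        return new String(plain, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Throwaway key pair, generated only for this check (an assumption, not the site's key)
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();
        String publicKeyB64 = Base64.getEncoder().encodeToString(pair.getPublic().getEncoded());

        String cipherText = encrypt("hello", publicKeyB64);
        System.out.println(decrypt(cipherText, pair.getPrivate())); // prints "hello"
    }
}
```

Because PKCS#1 v1.5 padding is randomized, the ciphertext differs on every run, so comparing ciphertexts byte for byte against the browser's output will not work; a decrypt check like this one is the practical test.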



Get verification code

The verification code is bound to the invoice code and invoice number

Decrypting the verification code

After decryption you get the base64 image, the check time, and the type of verification code to enter
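
Turning the base64 image into something a recognizer can consume is a small step on the Java side. The following is a minimal sketch that assumes the decrypted field is a plain Base64 string, possibly carrying a `data:image/...;base64,` prefix; the class name CaptchaImage and the 90x35 sample size are made up for the illustration, and the decryption itself is not shown.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;

public class CaptchaImage {

    // Strips an optional data-URI prefix, decodes the Base64 payload and parses the image.
    public static BufferedImage decode(String base64) throws IOException {
        int comma = base64.indexOf(',');
        String payload = (base64.startsWith("data:") && comma >= 0)
                ? base64.substring(comma + 1)
                : base64;
        byte[] bytes = Base64.getMimeDecoder().decode(payload);
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(bytes));
        if (image == null) {
            throw new IOException("payload is not a recognized image format");
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        // Build a blank sample image locally instead of using real site data (size is arbitrary)
        BufferedImage sample = new BufferedImage(90, 35, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(sample, "png", out);
        String encoded = "data:image/png;base64," + Base64.getEncoder().encodeToString(out.toByteArray());

        BufferedImage decoded = decode(encoded);
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight()); // prints "90x35"
    }
}
```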

Verification code identification

The verification code here is recognized with the custom recognition method from fish guide, and there is a test interface below. To make testing easier, everyone gets up to 500 recognitions per day, which is enough for testing. One invoice can only be queried five times a day...
Since the recognition rate is above 98%, query failures caused by a wrong verification code are rarely seen.

Get invoice information

There is an fplx (invoice type) here, which can be found with a global search. It is in the same file as the JS that fetches the server configuration
Sign first

Organize the splicing code and then pass it in

Temporary storage

Dynamic generation of js code for splicing

Get initialization data

Parse data

Set to text

Show it's done

That's the whole process. One-stop service, in short...


Two kinds of invoices were used in the test: the first has no detailed list (magpie tower tea restaurant); the second has a detailed list (Wal Mart)

Easter egg

This website mainly checks the following

sojson's usual routine, which is passed directly...

Some libraries have been modified, such as Base64. There are several Base64 implementations in there. Never mix them up; if you do, GG

A check on window.navigator.webdriver

The product of the screen's available width and height is compared against a threshold

Code extraction tips

No matter where the button is used, mom doesn't have to worry about me anymore


Java's String.split behaves differently from the JS one...
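
Two differences tend to bite when porting extracted JS to Java: Java's String.split treats its argument as a regular expression, and by default it drops trailing empty strings, while JS keeps them and treats a string separator literally. The sketch below shows both; the helper name jsSplit is hypothetical, and it only mimics JS for a literal, non-empty separator. The outputs in the comments are for Java 8 and later.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {

    // Mimics JavaScript's str.split(separator) for a literal, non-empty separator:
    // Pattern.quote disables regex meaning, limit -1 keeps trailing empty strings.
    public static String[] jsSplit(String s, String separator) {
        return s.split(Pattern.quote(separator), -1);
    }

    public static void main(String[] args) {
        // Trailing empty strings: JS gives ["a", "b", "", ""]
        System.out.println(Arrays.toString("a,b,,".split(",")));    // [a, b]
        System.out.println(Arrays.toString(jsSplit("a,b,,", ","))); // [a, b, , ]

        // Regex metacharacters: JS gives ["a", "b"]
        System.out.println(Arrays.toString("a|b".split("|")));      // [a, |, b]
        System.out.println(Arrays.toString(jsSplit("a|b", "|")));   // [a, b]
    }
}
```

If the response fields are positional arrays, a dropped trailing empty field shifts nothing but shortens the array, so indexing the last columns can throw; keeping the empty strings with a negative limit avoids that.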

Are government programmers really this bored??? Everything is arrays. Don't they ever want to update it?... Updating a single field means changing the whole process...
The parameter names are actually the initials of the Chinese pinyin, such as fplx (invoice type), fpdm (invoice code), fphm (invoice number)...



Posted by jpratt on Wed, 17 Jun 2020 19:17:31 -0700