How I Went from Analysis to Hand-Writing a WeChat Robot: Login Chapter

Keywords: Python Session QRCode Mobile Mac

I believe you all use WeChat and know that it has a web version. Today we are going to talk about how to build a fully automated WeChat robot by simulating the web WeChat interface. This is the first part of this hands-on project, and it focuses on how to implement the login process. The whole analysis is essentially a reading of the file https://res.wx.qq.com/a/wx_fed/webwx/res/static/js/index_4f3487a.js.


0x00 Logic map

Before reproducing a site's logic, drawing a mind map gives a better understanding of the whole process and helps you organize your thoughts.

Map notation

To keep the image clean, we treat bracketed content, such as [verb], as a variable.

To make the logical route clearer, we also define the following categories of lines to distinguish what kind of operation comes next.

Login logic

Below is the web WeChat login flow, drawn with MindNode.

Session refers to the HTTP session used here; it automatically handles the Set-Cookie headers that occur throughout the process.

0x01 Process breakdown

We divide the login process into the following steps.

  • 1. Open the web WeChat home page
  • 2. Get the QR code image
  • 3. Simulate the scanning and validation process
  • 4. Complete login
  • 5. Fetch information after login

0x02 Detailed process description

1. Open the web WeChat home page

First of all, from source code analysis we know that appid is the fixed value wx782c26e4c19acffb, while deviceid is generated randomly by the JS code.

Here we recreate it.

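import random

def create_device_id():
    # "e" followed by 15 random decimal digits.
    # Fixed-point formatting avoids short or scientific-notation reprs of small floats.
    return "e" + f"{random.random():.15f}"[2:17]

create_device_id()

>>> 'e094388356198767'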

First, we need to get the QR code to scan. By watching the web requests, we find that https://login.weixin.qq.com/qrcode/[xxxx] is its actual address. The content of [xxxx] depends on the result of the request to /jslogin.

Searching for the keyword jslogin turns up something interesting: a block that contains the paths needed for all subsequent requests, including the one relevant here, API_jsLogin.

So we know the path is the following address; the trailing timestamp parameter (_) can be ignored.

https://login.wx.qq.com/jslogin?appid=wx782c26e4c19acffb&redirect_uri=https%3A%2F%2Fwx.qq.com%2Fcgi-bin%2Fmmwebwx-bin%2Fwebwxnewloginpage&fun=new&lang=zh_CN
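To make the pieces of that address easier to see, here is a minimal sketch that rebuilds it from its query parameters using only the standard library. The parameter values are copied from the URL above; the helper name build_jslogin_url is hypothetical and only for illustration.

```python
from urllib.parse import urlencode

# Endpoint and parameter values are taken from the URL shown in the article.
JSLOGIN_BASE = "https://login.wx.qq.com/jslogin"


def build_jslogin_url(appid="wx782c26e4c19acffb"):
    """Assemble the jslogin URL (hypothetical helper, for illustration only)."""
    params = {
        "appid": appid,
        "redirect_uri": "https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxnewloginpage",
        "fun": "new",
        "lang": "zh_CN",
    }
    # urlencode percent-encodes the redirect_uri (":" -> %3A, "/" -> %2F).
    return f"{JSLOGIN_BASE}?{urlencode(params)}"


print(build_jslogin_url())
```

Building the URL from a dict keeps the percent-encoding of redirect_uri out of hand-written strings, which is where typos tend to creep in.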

2. Get the QR code image

Now we simulate the process of fetching and printing the QR code.

import re
from imgcat import imgcat
import requests_html

session = requests_html.HTMLSession()
API_jsLogin = 'https://login.wx.qq.com/jslogin?appid=wx782c26e4c19acffb&redirect_uri=https%3A%2F%2Fwx.qq.com%2Fcgi-bin%2Fmmwebwx-bin%2Fwebwxnewloginpage&fun=new&lang=zh_CN'
QR_code = 'https://login.weixin.qq.com/qrcode/{}'

def get_qrcode_uid():
    resp = session.get(API_jsLogin)
    uid = re.split(r'"|";', resp.text)[1]
    print(f'uid is {uid}')
    return uid

def get_qrcode_img(uid):
    resp = session.get(QR_code.format(uid))
    return imgcat(resp.content)
    
uid = get_qrcode_uid()
get_qrcode_img(uid)

Here we use imgcat to display images directly on the command line. Note that this module only works on Mac. We save the above code as testimg.py and run it to get the following result. As you can see, the QR code is printed successfully.
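The uid extraction above relies on splitting the response on quote characters. As a slightly stricter alternative, here is a sketch that parses the response with a regular expression and checks the status code first. The sample response text is an assumption about the shape of the /jslogin reply (the uuid value is made up), and parse_jslogin is a hypothetical helper name.

```python
import re

# ASSUMPTION: the /jslogin body looks like this; the uuid value is invented.
SAMPLE_RESPONSE = 'window.QRLogin.code = 200; window.QRLogin.uuid = "oZwt_bFfRg==";'

QRLOGIN_RE = re.compile(
    r'window\.QRLogin\.code\s*=\s*(\d+);\s*window\.QRLogin\.uuid\s*=\s*"([^"]*)"'
)


def parse_jslogin(text):
    """Return the uuid from a jslogin response, or raise ValueError."""
    match = QRLOGIN_RE.search(text)
    if match is None:
        raise ValueError("unexpected jslogin response format")
    code, uuid = int(match.group(1)), match.group(2)
    if code != 200:
        raise ValueError(f"jslogin returned code {code}")
    return uuid


print(parse_jslogin(SAMPLE_RESPONSE))
```

Failing loudly on an unexpected body makes it obvious when the endpoint changes its format, instead of passing a wrong uid on to the QR code request.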

3. Simulated scanning and validation process

So are we logged in now, without scanning the code? Unfortunately not. We need to start a polling loop in which each request has a 25-second timeout and the total polling time does not exceed 5 minutes, repeating until it succeeds. Some readers may ask why. Let's look back at the earlier request animation: before the QR code is fetched, there is a request whose state is pending.
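The timing rule described above (short requests repeated under an overall time budget) can be sketched independently of WeChat. In this sketch, poll_once is a hypothetical callable that performs one request with its own 25-second timeout and returns None when nothing has happened yet; the clock parameter exists only so the loop can be exercised without waiting.

```python
import time


def poll_until(poll_once, total_seconds=300, clock=time.monotonic):
    """Call poll_once() until it returns a non-None result or the budget runs out.

    poll_once is a hypothetical callable: it should make a single request
    (with its own timeout) and return None if there is no result yet.
    """
    deadline = clock() + total_seconds
    while clock() < deadline:
        try:
            result = poll_once()
        except TimeoutError:
            # One request timing out is expected; keep polling until the deadline.
            continue
        if result is not None:
            return result
    return None  # overall budget (5 minutes by default) exhausted


# Usage with a stand-in poller that "succeeds" on the third attempt.
attempts = {"count": 0}


def fake_poll():
    attempts["count"] += 1
    return "scanned" if attempts["count"] == 3 else None


print(poll_until(fake_poll))
```

Using a deadline rather than a fixed attempt counter keeps the total wait bounded even when individual requests return early.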

The request completes the instant we scan the code with the WeChat mobile app, and it returns a base64-encoded image: your avatar.

The next thing we need to do is confirm the login on the client. By observing the requests, we found that listening for the QR code scan and listening for the login confirmation are almost the same request, so we wrote the following code to simulate them.

# Continues from the session and imports above
import execjs
import time
import base64

API_login = 'https://wx.qq.com/cgi-bin/mmwebwx-bin/login'
API_check_login = 'https://login.wx.qq.com/cgi-bin/mmwebwx-bin/login'

def get_timestamp(reverse=False):
    if reverse:
        return execjs.eval("~new Date")
    return int(time.time() * 1e3)
        
def login_wait(confirm = True):
    return session.get(
        API_check_login if confirm else API_login,
        params={
            "loginicon": "true",
            "uuid": uid,
            "tip": 0 if confirm else 1,
            "r": get_timestamp(True),
            "_": get_timestamp()
        },timeout=25)
        
nums = 10
while nums > 0:
    print("Waiting for client scan, attempts remaining:", nums)
    nums -= 1  # count every attempt, including ones that time out
    try:
        res = login_wait()
    except Exception:
        # A timeout just means nobody has scanned yet; poll again
        continue
    if "userAvatar" in res.text:
        print("Printing the avatar")
        imgcat(base64.b64decode(re.findall("base64,(.*?)';", res.text)[0]))
        break

print("Waiting for client confirmation")
redirect_uri = re.findall('redirect_uri="(.*?)"', login_wait(True).text)[0]
print('About to redirect to', redirect_uri)

After running the above code, we scan the code with the phone and confirm, and we successfully obtain both our own avatar and the URL to redirect to after confirmation.
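The two regular expressions used above can be exercised offline. The sketch below wraps them in one hypothetical helper, parse_login_response, and runs it on two made-up response bodies. The exact response shapes, including the window.code field and its values 201 and 200, are assumptions for illustration and are not confirmed by this article.

```python
import base64
import re


def parse_login_response(text):
    """Pull the status code, avatar bytes and redirect_uri out of a login poll body."""
    result = {"code": None, "avatar": None, "redirect_uri": None}
    code = re.search(r"window\.code\s*=\s*(\d+)", text)  # ASSUMED field
    if code:
        result["code"] = int(code.group(1))
    avatar = re.search(r"base64,(.*?)';", text)  # same pattern as the article
    if avatar:
        result["avatar"] = base64.b64decode(avatar.group(1))
    redirect = re.search(r'redirect_uri="(.*?)"', text)  # same pattern as the article
    if redirect:
        result["redirect_uri"] = redirect.group(1)
    return result


# Made-up sample bodies (ASSUMPTIONS, not captured traffic).
fake_avatar = base64.b64encode(b"fake-image-bytes").decode()
scanned = f"window.code=201;window.userAvatar = 'data:img/jpg;base64,{fake_avatar}';"
confirmed = (
    "window.code=200;\n"
    'window.redirect_uri="https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxnewloginpage?ticket=EXAMPLE";'
)

print(parse_login_response(scanned))
print(parse_login_response(confirmed))
```

Returning a dict with explicit None values lets the caller decide whether it is still waiting for a scan, has an avatar to show, or can move on to the redirect.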

4. Complete login

After obtaining redirect_uri, we request it directly to obtain the required ticket. The result is an XML-format string that needs processing. We write the following code to simulate and process it.

def get_auth_data(resp):
    return {
        key: resp.html.find(key)[0].text
        for key in ["skey", "wxsid", "wxuin", "pass_ticket", "isgrayscale"]
    }


def get_ticket():
    resp = session.get(
        redirect_uri, params={"fun": "new", "lang": "zh_CN", "_": get_timestamp()}
    )
    print("Get Ticket:", requests_html.requests.utils.dict_from_cookiejar(resp.cookies))
    auth_data = get_auth_data(resp)
    session.cookies.update(
        requests_html.requests.utils.cookiejar_from_dict(
            {"last_wxuin": auth_data["wxuin"]}
        )
    )
    if list(filter(lambda item: item[1], auth_data.items())):
        return auth_data


auth_data = get_ticket()

Now we have also successfully obtained the ticket information. It will let us fetch the real personal information.
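For readers who want to try the ticket parsing without a live session, here is a sketch using only the standard library's xml.etree.ElementTree instead of requests_html. The sample XML is an assumption about the shape of the response (the root element name and all values are made up); only the five field names come from the code above. parse_auth_xml is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Field names taken from get_auth_data in the article.
AUTH_KEYS = ["skey", "wxsid", "wxuin", "pass_ticket", "isgrayscale"]

# ASSUMPTION: made-up sample; the real root element and values may differ.
SAMPLE_XML = (
    "<error><ret>0</ret><message></message>"
    "<skey>@crypt_example</skey><wxsid>SID_EXAMPLE</wxsid>"
    "<wxuin>123456</wxuin><pass_ticket>TICKET_EXAMPLE</pass_ticket>"
    "<isgrayscale>1</isgrayscale></error>"
)


def parse_auth_xml(xml_text):
    """Return the auth fields as a dict, raising ValueError if any is missing or empty."""
    root = ET.fromstring(xml_text)
    data = {key: root.findtext(key) for key in AUTH_KEYS}
    missing = [key for key, value in data.items() if not value]
    if missing:
        raise ValueError(f"missing auth fields: {missing}")
    return data


print(parse_auth_xml(SAMPLE_XML))
```

Unlike the check in get_ticket, which passes if any field is non-empty, this sketch requires every field to be present; that is a deliberate design choice of the sketch, not the article's behavior.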

5. Information acquisition after login

Next, we see that my nickname, S045pd, is included in the response body of the next request. In other words, we have now logged in successfully, and that request has returned the current personal information.

Here are some of the more intuitive fields:

  • ChatSet: the set of IDs of the current conversation objects (the top 10 you are chatting with)
  • ContactList: the list of current conversation objects (the top 10 you are chatting with)
    • MemberList: the list of group members, if the object is a group
  • MPSubscribeMsgList: the list of official accounts
  • SKey: an important parameter
  • SyncKey: an important parameter for message interaction
  • SystemTime: system time
  • User: user information
    • HeadImgUrl: avatar
    • NickName: nickname
    • Sex: sex
    • Signature: personal signature
    • UserName: current ID
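As an illustration of how the fields listed above might be read, here is a sketch that summarizes a trimmed, made-up response body. Only the field names and the nickname S045pd come from this article; every other value, and the helper name summarize_init, is hypothetical.

```python
import json

# ASSUMPTION: heavily trimmed, invented payload; real responses carry many more fields.
SAMPLE_INIT = json.loads("""
{
  "SKey": "@crypt_example",
  "SystemTime": 1568960086,
  "User": {"UserName": "@me_example", "NickName": "S045pd", "Sex": 1, "Signature": ""},
  "ContactList": [
    {"UserName": "@friend_example", "NickName": "A friend", "MemberList": []},
    {"UserName": "@@group_example", "NickName": "A group",
     "MemberList": [{"UserName": "@m1"}, {"UserName": "@m2"}]}
  ]
}
""")


def summarize_init(payload):
    """Pick out a few intuitive fields from the post-login payload."""
    user = payload.get("User", {})
    contacts = payload.get("ContactList", [])
    return {
        "nickname": user.get("NickName"),
        "username": user.get("UserName"),
        "skey": payload.get("SKey"),
        "contact_count": len(contacts),
        # Only contacts with a non-empty MemberList are treated as groups here.
        "group_sizes": {
            contact["UserName"]: len(contact["MemberList"])
            for contact in contacts
            if contact.get("MemberList")
        },
    }


print(summarize_init(SAMPLE_INIT))
```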

Of course, what we get here is only basic information, and the friend list is not complete. The next installment will be "Webot Weixin Robot Project in Practice [Friends Export]".

Reply "Wechat Robot" to the official account to get the code used in this article.

For the complete code, follow the open-source project: Webot

More great content on the official account: the attacking Hector

Posted by bl00dshooter on Thu, 19 Sep 2019 23:14:46 -0700