A step-by-step guide to implementing a custom application-layer protocol

Keywords: C++ network JSON C ftp

1. Brief introduction

  • The Internet is full of network services of all kinds. For a server to provide a service, it and its clients must follow the same data communication protocol in order to communicate, just as you would speak Minnan (Hokkien) with someone from Taiwan and Cantonese with a Cantonese speaker.

  • When the well-known protocols (HTTP, SMTP, FTP, etc.) cannot meet an application's security or scalability requirements, you need to design and implement your own application-layer protocol.

2. Classification of protocols

2.1 By encoding

  • Binary protocol
    For example, TCP at the transport layer.

  • Plaintext protocol
    For example, HTTP and the Redis protocol at the application layer.

  • Hybrid protocol (binary + plaintext)
    For example, Apple's early APNs push protocol.

2.2 By message boundary

  • Fixed-boundary protocol
    The length of a message can be determined exactly, for example from a length field in the header. Such a protocol is easy to parse; TCP is an example.

  • Fuzzy-boundary protocol
    The length of a message cannot be known in advance, so parsing is more complex. The end of a message usually has to be marked by specific bytes; HTTP, for example, ends its header block with a blank line.
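
To make the difference concrete, here is a minimal sketch of how a receiver might find the end of a message under each scheme. The function names fixedFrameSize and delimitedFrameSize and the 4-byte length prefix are assumptions made for illustration; they are not part of TCP, HTTP, or the protocol built later in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Fixed boundary: a hypothetical 4-byte big-endian length prefix says exactly
// how many payload bytes follow. Returns the total frame size, or 0 if the
// buffer does not yet hold a complete frame.
std::size_t fixedFrameSize(const std::string& buf)
{
    if (buf.size() < 4) return 0;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<std::uint8_t>(buf[i]);
    std::size_t total = 4 + static_cast<std::size_t>(n);
    return buf.size() >= total ? total : 0;
}

// Fuzzy boundary: the message ends wherever the delimiter appears, so the
// receiver has to scan the bytes. Returns the size up to and including the
// delimiter, or 0 if the delimiter has not arrived yet.
std::size_t delimitedFrameSize(const std::string& buf, const std::string& delim)
{
    std::size_t pos = buf.find(delim);
    return pos == std::string::npos ? 0 : pos + delim.size();
}
```

With a length prefix the receiver only compares two numbers; with a delimiter it has to scan the buffer, and the delimiter must never occur inside the payload itself.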

3. Basic criteria for evaluating a protocol

  • Efficient
    Fast packing and unpacking reduces CPU usage, and a high compression ratio reduces network bandwidth usage.

  • Simple
    Easy to understand and to parse.

  • Easy to extend
    Flexible enough that foreseeable changes can be accommodated by extension.

  • Compatible
    Backward compatibility: messages sent with the old protocol version can be parsed by the new version, although the features added in the new version are not available to them.

    Forward compatibility: messages sent with the new protocol version can be parsed by the old version, although the features added in the new version are not available to it.

4. Advantages and disadvantages of a custom application-layer protocol

4.1 Advantages

  • The protocol is not publicly known, which makes communication harder to attack: before looking for protocol vulnerabilities, an attacker must first reverse-engineer the protocol itself.

  • Better extensibility: you can extend your own protocol as the business grows and requirements change, whereas well-known protocols are hard to extend.

4.2 Disadvantages

  • Design is difficult: the protocol must be extensible, and ideally both backward and forward compatible.

  • Implementing serialization and deserialization is tedious.

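5. Background knowledge

5.1 Big-endian and little-endian

Endianness describes whether a computer system stores the most significant byte of a multi-byte value at the lowest or at the highest address.

  • Big-endian
    The most significant byte is stored at the lowest address.

  • Little-endian
    The least significant byte is stored at the lowest address.

  • Illustration
    For the 32-bit value 0x12345678 stored at address A, a big-endian system holds the bytes 12 34 56 78 at A, A+1, A+2, A+3, while a little-endian system holds 78 56 34 12.

  • Detection
    The C/C++ code below uses a union, whose members share the same memory, to detect the byte order of the machine.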

#include <stdint.h>
#include <iostream>
using namespace std;

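// Returns true on a big-endian machine, false on a little-endian one.
bool bigCheck()
{
    // Both members start at the same address, so `a` aliases the
    // lowest-addressed byte of `data`.
    union Check
    {
        char a;
        uint32_t data;
    };

    Check c;
    c.data = 1;

    // Little-endian: the least significant byte (value 1) comes first.
    if (1 == c.a)
    {
        return false;
    }

    return true;
}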

int main()
{
    if (bigCheck())
    {
        cout << "big" << endl;
    }
    else
    {
        cout << "small" << endl;
    }
    return 0;
}

5.2 Network byte order

As the name implies, network byte order is the order in which the bytes of a multi-byte value appear in the byte stream sent over the network. To keep network communication free of per-machine differences, network byte order is fixed: it is big-endian.

5.3 Host byte order

Host byte order is the byte order of the local machine. It is determined by the CPU architecture, so different machines may use different byte orders.
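
As a small sketch of converting between the two byte orders, the helpers below write and read a 32-bit integer in network byte order using the POSIX functions htonl and ntohl. The names putU32 and getU32 are hypothetical and exist only for this illustration.

```cpp
#include <arpa/inet.h>  // htons, htonl, ntohs, ntohl (POSIX)
#include <cassert>
#include <cstring>
#include <stdint.h>

// Writes `value` into `out` in network byte order.
void putU32(uint8_t* out, uint32_t value)
{
    uint32_t net = htonl(value);        // host order -> network order
    std::memcpy(out, &net, sizeof net);
}

// Reads a 32-bit value stored in network byte order from `in`.
uint32_t getU32(const uint8_t* in)
{
    uint32_t net = 0;
    std::memcpy(&net, in, sizeof net);
    return ntohl(net);                  // network order -> host order
}
```

Whatever the host byte order is, putU32 always puts the most significant byte first, so both ends of a connection agree on the value.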

5.4 Memory Objects and Layout

Every variable, whether on the heap or on the stack, corresponds to a block of memory. Because of alignment requirements, the members of a structure are not necessarily stored compactly: the compiler may insert padding bytes between them. For example, in the C structure Test below, on a typical platform a and b occupy offsets 0 and 1, two padding bytes follow, and c starts at offset 4, so the structure occupies 8 bytes rather than 6.

struct Test
{
    char a;
    char b;
    int32_t c;
};
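
The layout can be inspected with sizeof and offsetof. The sketch below repeats the structure so that it is self-contained; the helper paddingBytes is a hypothetical name, and the exact amount of padding is decided by the compiler and platform.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t
#include <stdint.h>

struct Test
{
    char a;
    char b;
    int32_t c;
};

// Number of padding bytes the compiler added to Test.
// The members themselves need 1 + 1 + 4 = 6 bytes.
std::size_t paddingBytes()
{
    return sizeof(Test) - (sizeof(char) + sizeof(char) + sizeof(int32_t));
}
```

On common 32-bit and 64-bit platforms offsetof(Test, c) is 4 and paddingBytes() returns 2, but code that sends data over the network should not rely on this.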

5.5 Serialization and Deserialization

  • Serialization: converting an in-memory object of a programming language into a network byte stream, for example converting the C structure Test into the byte stream uint8_t data[6].

  • Deserialization: converting a network byte stream back into an in-memory object, for example converting the byte stream uint8_t data[6] into the C structure Test.
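
A minimal sketch of both directions for the structure Test follows. The function names serialize and deserialize are assumptions for illustration; the point is that members are copied one by one, so padding is never sent and the multi-byte member travels in network byte order.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cassert>
#include <cstring>
#include <stdint.h>

struct Test
{
    char a;
    char b;
    int32_t c;
};

// Serialize: 1 byte for a, 1 byte for b, 4 bytes for c in network byte order.
void serialize(const Test& t, uint8_t data[6])
{
    data[0] = static_cast<uint8_t>(t.a);
    data[1] = static_cast<uint8_t>(t.b);
    uint32_t net = htonl(static_cast<uint32_t>(t.c));
    std::memcpy(data + 2, &net, sizeof net);
}

// Deserialize: the exact reverse of serialize().
Test deserialize(const uint8_t data[6])
{
    Test t;
    t.a = static_cast<char>(data[0]);
    t.b = static_cast<char>(data[1]);
    uint32_t net = 0;
    std::memcpy(&net, data + 2, sizeof net);
    t.c = static_cast<int32_t>(ntohl(net));
    return t;
}
```

Copying the whole structure with a single memcpy would also send the padding bytes and the host byte order, which is why the fields are handled individually.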

6. An example

6.1 Protocol Design

This protocol uses a fixed boundary and hybrid encoding.

  • Protocol header
    A fixed-length, 8-byte binary header. It carries a version number, a magic number for fast validation, and a service number that lets different services share the protocol. The fixed length makes the protocol easy and efficient to parse.

  • Protocol body
    Variable-length JSON. JSON is a plaintext encoding that is readable, easy to extend, and universally supported, which gives the protocol good extensibility and compatibility.

  • Protocol layout
    offset 0, 1 byte: version
    offset 1, 1 byte: magic
    offset 2, 2 bytes: server (network byte order)
    offset 4, 4 bytes: len (network byte order; header length + body length)
    offset 8, len - 8 bytes: JSON body
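
As a rough sketch of the header described above, the code below packs and unpacks the 8 header bytes with explicit shifts, which produces network byte order without calling htons or htonl. The names Header, packHeader, unpackHeader, kMagic and kHeadSize are hypothetical; the article's own implementation follows in section 6.2.

```cpp
#include <cassert>
#include <stdint.h>

const uint8_t  kMagic    = 88;  // same value as MY_PROTO_MAGIC in this article
const uint32_t kHeadSize = 8;   // fixed header length in bytes

struct Header
{
    uint8_t  version;
    uint8_t  magic;
    uint16_t server;
    uint32_t len;     // header length + body length
};

// Writes the 8 header bytes; multi-byte fields go out most significant
// byte first, i.e. in network byte order.
void packHeader(const Header& h, uint8_t out[8])
{
    out[0] = h.version;
    out[1] = h.magic;
    out[2] = static_cast<uint8_t>(h.server >> 8);
    out[3] = static_cast<uint8_t>(h.server & 0xFF);
    out[4] = static_cast<uint8_t>(h.len >> 24);
    out[5] = static_cast<uint8_t>((h.len >> 16) & 0xFF);
    out[6] = static_cast<uint8_t>((h.len >> 8) & 0xFF);
    out[7] = static_cast<uint8_t>(h.len & 0xFF);
}

// Reads the 8 header bytes; returns false if the magic number is wrong
// or the length is smaller than the header itself.
bool unpackHeader(const uint8_t in[8], Header& h)
{
    h.version = in[0];
    h.magic   = in[1];
    h.server  = static_cast<uint16_t>((in[2] << 8) | in[3]);
    h.len     = (static_cast<uint32_t>(in[4]) << 24) |
                (static_cast<uint32_t>(in[5]) << 16) |
                (static_cast<uint32_t>(in[6]) << 8)  |
                 static_cast<uint32_t>(in[7]);
    return h.magic == kMagic && h.len >= kHeadSize;
}
```

The magic number check lets a receiver reject data that does not belong to the protocol after looking at only two bytes.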

6.2 Protocol Implementation

Talk is cheap; let's write the code, using C/C++.

6.2.1 C/C++ implementation

  • The MyProtoHead structure stores the protocol header

/*
    Protocol header
 */
struct MyProtoHead
{
    uint8_t version;    //Protocol version number
    uint8_t magic;      //Protocol magic number
    uint16_t server;    //Service number, identifies which service on top of the protocol the message is for
    uint32_t len;       //Total length (header length + variable-length JSON body length)
};
/*
    Protocol message body
 */
struct MyProtoMsg
{
    MyProtoHead head;   //Protocol header
    Json::Value body;   //Protocol body
};

  • Packing class

/*
    MyProto Pack class
 */
class MyProtoEnCode
{
public:
    //Protocol Message Body Packaging Function
    uint8_t * encode(MyProtoMsg * pMsg, uint32_t & len);
private:
    //Protocol Header Packing Function
    void headEncode(uint8_t * pData, MyProtoMsg * pMsg);
};

  • Unpacking class

typedef enum MyProtoParserStatus
{
    ON_PARSER_INIT = 0,
    ON_PARSER_HAED = 1,
    ON_PARSER_BODY = 2,
}MyProtoParserStatus;
/*
    MyProto Unwrapping class
 */
class MyProtoDeCode
{
public:
    void init();
    void clear();
    bool parser(void * data, size_t len);
    bool empty();
    MyProtoMsg * front();
    void pop();
private:
    bool parserHead(uint8_t ** curData, uint32_t & curLen, 
        uint32_t & parserLen, bool & parserBreak);
    bool parserBody(uint8_t ** curData, uint32_t & curLen, 
        uint32_t & parserLen, bool & parserBreak);
    
private:
    MyProtoMsg mCurMsg;                     //Message currently being parsed
    queue<MyProtoMsg *> mMsgQ;              //Queue of fully parsed messages
    vector<uint8_t> mCurReserved;           //Bytes received but not yet parsed
    MyProtoParserStatus mCurParserStatus;   //Current parsing state
};

6.2.2 Packaging (serialization)

void MyProtoEnCode::headEncode(uint8_t * pData, MyProtoMsg * pMsg)
{
    //The protocol version number is currently always 1
    *pData = 1;
    ++pData;

    //Magic number
    *pData = MY_PROTO_MAGIC;
    ++pData;

    //Service number: convert head.server from host to network byte order.
    //memcpy avoids writing through a cast pointer that may be misaligned.
    uint16_t server = htons(pMsg->head.server);
    memcpy(pData, &server, sizeof(server));
    pData += 2;

    //Total length: convert head.len from host to network byte order
    uint32_t totalLen = htonl(pMsg->head.len);
    memcpy(pData, &totalLen, sizeof(totalLen));
}

uint8_t * MyProtoEnCode::encode(MyProtoMsg * pMsg, uint32_t & len)
{
    uint8_t * pData = NULL;
    Json::FastWriter fWriter;
    
    //Serialize the JSON body
    string bodyStr = fWriter.write(pMsg->body);
    //Compute the total length of the serialized message
    len = MY_PROTO_HEAD_SIZE + (uint32_t)bodyStr.size();
    pMsg->head.len = len;
    //Allocate the buffer for the serialized message (the caller must delete[] it)
    pData = new uint8_t[len];
    //Pack the protocol header
    headEncode(pData, pMsg);
    //Pack the protocol body
    memcpy(pData + MY_PROTO_HEAD_SIZE, bodyStr.data(), bodyStr.size());
    
    return pData;
}
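
One possible variation, sketched here under stated assumptions, is to return a std::vector so that the caller does not have to remember delete[]. The function encodeFrame and the constants kMagic and kHeadSize are hypothetical; the body is passed in as an already serialized string so that the sketch does not depend on a JSON library.

```cpp
#include <arpa/inet.h>  // htons, htonl (POSIX)
#include <cassert>
#include <cstring>
#include <stdint.h>
#include <string>
#include <vector>

const uint8_t  kMagic    = 88;  // hypothetical, mirrors MY_PROTO_MAGIC
const uint32_t kHeadSize = 8;   // hypothetical, mirrors MY_PROTO_HEAD_SIZE

// Builds one frame: the 8-byte header followed by the serialized body.
// The vector owns the memory, so the caller has nothing to delete.
std::vector<uint8_t> encodeFrame(uint16_t server, const std::string& body)
{
    const uint32_t total = kHeadSize + static_cast<uint32_t>(body.size());
    std::vector<uint8_t> out(total);

    out[0] = 1;        // version
    out[1] = kMagic;   // magic number
    const uint16_t netServer = htons(server);
    std::memcpy(&out[2], &netServer, sizeof netServer);
    const uint32_t netLen = htonl(total);
    std::memcpy(&out[4], &netLen, sizeof netLen);

    // Copy the body only if there is one; out[kHeadSize] does not exist
    // for an empty body.
    if (!body.empty())
        std::memcpy(&out[kHeadSize], body.data(), body.size());
    return out;
}
```

A caller could then pass the vector's data pointer and size to send() and let the vector release the buffer when it goes out of scope.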

6.2.3 Unpacking (deserialization)

bool MyProtoDeCode::parserHead(uint8_t ** curData, uint32_t & curLen, 
    uint32_t & parserLen, bool & parserBreak)
{
    parserBreak = false;
    if (curLen < MY_PROTO_HEAD_SIZE)
    {
        parserBreak = true; //Not enough data for a header yet: stop and wait for more
        return true;
    }

    uint8_t * pData = *curData;
    //Parse the version number
    mCurMsg.head.version = *pData;
    pData++;
    //Parse the magic number
    mCurMsg.head.magic = *pData;
    pData++;
    //A wrong magic number means the data is not ours: fail
    if (MY_PROTO_MAGIC != mCurMsg.head.magic)
    {
        return false;
    }
    //Parse the service number (memcpy, because pData may be misaligned)
    uint16_t server = 0;
    memcpy(&server, pData, sizeof(server));
    mCurMsg.head.server = ntohs(server);
    pData+=2;
    //Parse the total message length
    uint32_t totalLen = 0;
    memcpy(&totalLen, pData, sizeof(totalLen));
    mCurMsg.head.len = ntohl(totalLen);
    //Fail on a length shorter than the header (it would underflow the body
    //size) or on an abnormally large message
    if (mCurMsg.head.len < MY_PROTO_HEAD_SIZE ||
        mCurMsg.head.len > MY_PROTO_MAX_SIZE)
    {
        return false;
    }
    
    //Moving the parsing pointer forward MY_PROTO_HEAD_SIZE bytes
    (*curData) += MY_PROTO_HEAD_SIZE;
    curLen -= MY_PROTO_HEAD_SIZE;
    parserLen += MY_PROTO_HEAD_SIZE;
    mCurParserStatus = ON_PARSER_HAED;

    return true;
}

bool MyProtoDeCode::parserBody(uint8_t ** curData, uint32_t & curLen, 
    uint32_t & parserLen, bool & parserBreak)
{
    parserBreak = false;
    uint32_t jsonSize = mCurMsg.head.len - MY_PROTO_HEAD_SIZE;
    if (curLen < jsonSize)
    {
        parserBreak = true; //Body not complete yet: stop and wait for more data
        return true;
    }

    Json::Reader reader;    //JSON parser
    if (!reader.parse((char *)(*curData), 
        (char *)((*curData) + jsonSize), mCurMsg.body, false))
    {
        return false;
    }

    //Moving the parsing pointer forward jsonSize bytes
    (*curData) += jsonSize;
    curLen -= jsonSize;
    parserLen += jsonSize;
    mCurParserStatus = ON_PARSER_BODY;

    return true;
}

bool MyProtoDeCode::parser(void * data, size_t len)
{
    if (len <= 0)
    {
        return false;
    }

    uint32_t curLen = 0;
    uint32_t parserLen = 0;
    uint8_t * curData = NULL;
    
    curData = (uint8_t *)data;
    //Append the newly received bytes to the buffer of unparsed bytes
    mCurReserved.insert(mCurReserved.end(), curData, curData + len);

    curLen = mCurReserved.size();
    curData = (uint8_t *)&mCurReserved[0];

    //Keep parsing as long as unparsed bytes remain
    while (curLen > 0)
    {
        bool parserBreak = false;
        //Parsing protocol header
        if (ON_PARSER_INIT == mCurParserStatus ||
            ON_PARSER_BODY == mCurParserStatus)
        {
            if (!parserHead(&curData, curLen, parserLen, parserBreak))
            {
                return false;
            }

            if (parserBreak) break;
        }

        //After parsing the protocol header, parsing the protocol body
        if (ON_PARSER_HAED == mCurParserStatus)
        {
            if (!parserBody(&curData, curLen, parserLen, parserBreak))
            {
                return false;
            }

            if (parserBreak) break;
        }

        if (ON_PARSER_BODY == mCurParserStatus)
        {
            //Copy parsed message body into queue
            MyProtoMsg * pMsg = NULL;
            pMsg = new MyProtoMsg;
            *pMsg = mCurMsg;
            mMsgQ.push(pMsg);
        }
    }

    if (parserLen > 0)
    {
        //Delete the network byte stream that has been parsed
        mCurReserved.erase(mCurReserved.begin(), mCurReserved.begin() + parserLen);
    }

    return true;
}
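
The reason for buffering is that a TCP read may deliver half a message or several messages at once. The sketch below is a simplified, self-contained illustration of that reassembly idea, not the article's MyProtoDeCode: the class FrameSplitter and its constants are hypothetical, the body is kept as a raw string instead of being parsed as JSON, and a frame is consumed only once it has arrived completely.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <string>
#include <vector>

const uint8_t  kMagic    = 88;                // mirrors MY_PROTO_MAGIC
const uint32_t kHeadSize = 8;                 // mirrors MY_PROTO_HEAD_SIZE
const uint32_t kMaxSize  = 10 * 1024 * 1024;  // mirrors MY_PROTO_MAX_SIZE

class FrameSplitter
{
public:
    // Appends received bytes and extracts every complete frame body.
    // Returns false on a bad magic number or an impossible length.
    bool feed(const uint8_t* data, std::size_t len)
    {
        mBuf.insert(mBuf.end(), data, data + len);
        std::size_t pos = 0;
        while (mBuf.size() - pos >= kHeadSize)
        {
            const uint8_t* p = &mBuf[pos];
            if (p[1] != kMagic) return false;
            // Total length is stored big-endian in header bytes 4..7.
            const uint32_t total = (uint32_t(p[4]) << 24) | (uint32_t(p[5]) << 16) |
                                   (uint32_t(p[6]) << 8)  |  uint32_t(p[7]);
            if (total < kHeadSize || total > kMaxSize) return false;
            if (mBuf.size() - pos < total) break;     // body incomplete: wait
            bodies.push_back(std::string(p + kHeadSize, p + total));
            pos += total;
        }
        mBuf.erase(mBuf.begin(), mBuf.begin() + pos); // drop consumed bytes
        return true;
    }

    std::vector<std::string> bodies;   // bodies of the frames seen so far

private:
    std::vector<uint8_t> mBuf;         // received but not yet consumed bytes
};
```

Feeding the same byte stream in chunks of any size yields the same sequence of bodies, which is the property the parser above provides for real network reads.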

7. Complete source code and testing

The code is simple; just run it.

7.1 Source Code

#include <stdint.h>
#include <stdio.h>
#include <queue>
#include <vector>
#include <iostream>
#include <string.h>
#include <json/json.h>
#include <arpa/inet.h>
using namespace std;

const uint8_t MY_PROTO_MAGIC = 88;
const uint32_t MY_PROTO_MAX_SIZE = 10 * 1024 * 1024; //10M
const uint32_t MY_PROTO_HEAD_SIZE = 8;

typedef enum MyProtoParserStatus
{
    ON_PARSER_INIT = 0,
    ON_PARSER_HAED = 1,
    ON_PARSER_BODY = 2,
}MyProtoParserStatus;

/*
    Protocol header
 */
struct MyProtoHead
{
    uint8_t version;    //Protocol version number
    uint8_t magic;      //Protocol magic number
    uint16_t server;    //Service number, identifies which service on top of the protocol the message is for
    uint32_t len;       //Total length (header length + variable-length JSON body length)
};

/*
    Protocol message body
 */
struct MyProtoMsg
{
    MyProtoHead head;   //Protocol header
    Json::Value body;   //Protocol body
};

void myProtoMsgPrint(MyProtoMsg & msg)
{
    string jsonStr = "";
    Json::FastWriter fWriter;
    jsonStr = fWriter.write(msg.body);
    
    printf("Head[version=%d,magic=%d,server=%d,len=%u]\n"
        "Body:%s", msg.head.version, msg.head.magic, 
        msg.head.server, msg.head.len, jsonStr.c_str());
}
/*
    MyProto Pack class
 */
class MyProtoEnCode
{
public:
    //Protocol Message Body Packaging Function
    uint8_t * encode(MyProtoMsg * pMsg, uint32_t & len);
private:
    //Protocol Header Packing Function
    void headEncode(uint8_t * pData, MyProtoMsg * pMsg);
};

void MyProtoEnCode::headEncode(uint8_t * pData, MyProtoMsg * pMsg)
{
    //The protocol version number is currently always 1
    *pData = 1;
    ++pData;

    //Magic number
    *pData = MY_PROTO_MAGIC;
    ++pData;

    //Service number: convert head.server from host to network byte order.
    //memcpy avoids writing through a cast pointer that may be misaligned.
    uint16_t server = htons(pMsg->head.server);
    memcpy(pData, &server, sizeof(server));
    pData += 2;

    //Total length: convert head.len from host to network byte order
    uint32_t totalLen = htonl(pMsg->head.len);
    memcpy(pData, &totalLen, sizeof(totalLen));
}

uint8_t * MyProtoEnCode::encode(MyProtoMsg * pMsg, uint32_t & len)
{
    uint8_t * pData = NULL;
    Json::FastWriter fWriter;
    
    //Serialize the JSON body
    string bodyStr = fWriter.write(pMsg->body);
    //Compute the total length of the serialized message
    len = MY_PROTO_HEAD_SIZE + (uint32_t)bodyStr.size();
    pMsg->head.len = len;
    //Allocate the buffer for the serialized message (the caller must delete[] it)
    pData = new uint8_t[len];
    //Pack the protocol header
    headEncode(pData, pMsg);
    //Pack the protocol body
    memcpy(pData + MY_PROTO_HEAD_SIZE, bodyStr.data(), bodyStr.size());
    
    return pData;
}

/*
    MyProto Unwrapping class
 */
class MyProtoDeCode
{
public:
    void init();
    void clear();
    bool parser(void * data, size_t len);
    bool empty();
    MyProtoMsg * front();
    void pop();
private:
    bool parserHead(uint8_t ** curData, uint32_t & curLen, 
        uint32_t & parserLen, bool & parserBreak);
    bool parserBody(uint8_t ** curData, uint32_t & curLen, 
        uint32_t & parserLen, bool & parserBreak);
    
private:
    MyProtoMsg mCurMsg;                     //Message currently being parsed
    queue<MyProtoMsg *> mMsgQ;              //Queue of fully parsed messages
    vector<uint8_t> mCurReserved;           //Bytes received but not yet parsed
    MyProtoParserStatus mCurParserStatus;   //Current parsing state
};

void MyProtoDeCode::init()
{
    mCurParserStatus = ON_PARSER_INIT;
}

void MyProtoDeCode::clear()
{
    MyProtoMsg * pMsg = NULL;
    
    while (!mMsgQ.empty())
    {
        pMsg = mMsgQ.front();
        delete pMsg;
        mMsgQ.pop();
    }
}

bool MyProtoDeCode::parserHead(uint8_t ** curData, uint32_t & curLen, 
    uint32_t & parserLen, bool & parserBreak)
{
    parserBreak = false;
    if (curLen < MY_PROTO_HEAD_SIZE)
    {
        parserBreak = true; //Not enough data for a header yet: stop and wait for more
        return true;
    }

    uint8_t * pData = *curData;
    //Parse the version number
    mCurMsg.head.version = *pData;
    pData++;
    //Parse the magic number
    mCurMsg.head.magic = *pData;
    pData++;
    //A wrong magic number means the data is not ours: fail
    if (MY_PROTO_MAGIC != mCurMsg.head.magic)
    {
        return false;
    }
    //Parse the service number (memcpy, because pData may be misaligned)
    uint16_t server = 0;
    memcpy(&server, pData, sizeof(server));
    mCurMsg.head.server = ntohs(server);
    pData+=2;
    //Parse the total message length
    uint32_t totalLen = 0;
    memcpy(&totalLen, pData, sizeof(totalLen));
    mCurMsg.head.len = ntohl(totalLen);
    //Fail on a length shorter than the header (it would underflow the body
    //size) or on an abnormally large message
    if (mCurMsg.head.len < MY_PROTO_HEAD_SIZE ||
        mCurMsg.head.len > MY_PROTO_MAX_SIZE)
    {
        return false;
    }
    
    //Moving the parsing pointer forward MY_PROTO_HEAD_SIZE bytes
    (*curData) += MY_PROTO_HEAD_SIZE;
    curLen -= MY_PROTO_HEAD_SIZE;
    parserLen += MY_PROTO_HEAD_SIZE;
    mCurParserStatus = ON_PARSER_HAED;

    return true;
}

bool MyProtoDeCode::parserBody(uint8_t ** curData, uint32_t & curLen, 
    uint32_t & parserLen, bool & parserBreak)
{
    parserBreak = false;
    uint32_t jsonSize = mCurMsg.head.len - MY_PROTO_HEAD_SIZE;
    if (curLen < jsonSize)
    {
        parserBreak = true; //Body not complete yet: stop and wait for more data
        return true;
    }

    Json::Reader reader;    //JSON parser
    if (!reader.parse((char *)(*curData), 
        (char *)((*curData) + jsonSize), mCurMsg.body, false))
    {
        return false;
    }

    //Moving the parsing pointer forward jsonSize bytes
    (*curData) += jsonSize;
    curLen -= jsonSize;
    parserLen += jsonSize;
    mCurParserStatus = ON_PARSER_BODY;

    return true;
}

bool MyProtoDeCode::parser(void * data, size_t len)
{
    if (len <= 0)
    {
        return false;
    }

    uint32_t curLen = 0;
    uint32_t parserLen = 0;
    uint8_t * curData = NULL;
    
    curData = (uint8_t *)data;
    //Append the newly received bytes to the buffer of unparsed bytes
    mCurReserved.insert(mCurReserved.end(), curData, curData + len);

    curLen = mCurReserved.size();
    curData = (uint8_t *)&mCurReserved[0];

    //Keep parsing as long as unparsed bytes remain
    while (curLen > 0)
    {
        bool parserBreak = false;
        //Parsing protocol header
        if (ON_PARSER_INIT == mCurParserStatus ||
            ON_PARSER_BODY == mCurParserStatus)
        {
            if (!parserHead(&curData, curLen, parserLen, parserBreak))
            {
                return false;
            }

            if (parserBreak) break;
        }

        //After parsing the protocol header, parsing the protocol body
        if (ON_PARSER_HAED == mCurParserStatus)
        {
            if (!parserBody(&curData, curLen, parserLen, parserBreak))
            {
                return false;
            }

            if (parserBreak) break;
        }

        if (ON_PARSER_BODY == mCurParserStatus)
        {
            //Copy parsed message body into queue
            MyProtoMsg * pMsg = NULL;
            pMsg = new MyProtoMsg;
            *pMsg = mCurMsg;
            mMsgQ.push(pMsg);
        }
    }

    if (parserLen > 0)
    {
        //Delete the network byte stream that has been parsed
        mCurReserved.erase(mCurReserved.begin(), mCurReserved.begin() + parserLen);
    }

    return true;
}

bool MyProtoDeCode::empty()
{
    return mMsgQ.empty();
}

MyProtoMsg * MyProtoDeCode::front()
{
    MyProtoMsg * pMsg = NULL;
    pMsg = mMsgQ.front();
    return pMsg;
}

void MyProtoDeCode::pop()
{
    mMsgQ.pop();
}

int main()
{
    uint32_t len = 0;
    uint8_t * pData = NULL;
    MyProtoMsg msg1;
    MyProtoMsg msg2;
    MyProtoDeCode myDecode;
    MyProtoEnCode myEncode;

    msg1.head.server = 1;
    msg1.body["op"] = "set";
    msg1.body["key"] = "id";
    msg1.body["value"] = "9856";

    msg2.head.server = 2;
    msg2.body["op"] = "get";
    msg2.body["key"] = "id";

    myDecode.init();
    pData = myEncode.encode(&msg1, len);
    if (!myDecode.parser(pData, len))
    {
        cout << "parser failed!" << endl;
    }
    else
    {
        cout << "msg1 parser successful!" << endl;
    }
    delete[] pData;     //encode() allocates the buffer with new[]

    pData = myEncode.encode(&msg2, len);
    if (!myDecode.parser(pData, len))
    {
        cout << "parser failed!" << endl;
    }
    else
    {
        cout << "msg2 parser successful!" << endl;
    }
    delete[] pData;

    MyProtoMsg * pMsg = NULL;
    while (!myDecode.empty())
    {
        pMsg = myDecode.front();
        myProtoMsgPrint(*pMsg);
        myDecode.pop();
        delete pMsg;    //pop() only removes the pointer from the queue
    }
    
    return 0;
}

7.2 Running Test

8. Summary

In fewer than 350 lines of code, we have seen how to implement a custom application-layer protocol. The protocol is of course far from perfect and can be improved further, for example by encrypting the protocol body to make it more secure.

Posted by SuperTini on Wed, 17 Jul 2019 17:51:53 -0700