1. Brief introduction
The Internet is full of network services. For a service to work, the server and the client must follow the same data communication protocol, just as two people need a common language, such as Minnan or Cantonese, before they can talk to each other.
When you implement your own application, the well-known protocols (HTTP, SMTP, FTP, etc.) may not meet your security and extensibility requirements. In that case you need to design and implement your own application-layer protocol.
2. Classification of protocols
2.1 By encoding
Binary protocol
For example, TCP in the transport layer.
Plaintext protocol
For example, HTTP and the Redis protocol in the application layer.
Hybrid protocol (binary + plaintext)
For example, Apple's early APNs push protocol.
2.2 By message boundary
Fixed-boundary protocol
The length of a message is known explicitly, so the protocol is easy to parse. TCP is an example.
Fuzzy-boundary protocol
The length of a message is not known in advance, so parsing is more complex. The end of a message is usually marked by specific bytes. HTTP, for example, ends its header section with a blank line (\r\n\r\n).
3. Basic criteria for evaluating a protocol
Efficient
Fast packing and unpacking reduces CPU usage, and a high data compression ratio reduces network bandwidth usage.
Simple
Easy to understand and to parse.
Extensible
Flexible enough to be extended for foreseeable changes.
Compatible
Backward compatibility: messages sent with the old protocol version can be parsed by the new version, although the features added in the new version are not available.
Forward compatibility: messages sent with the new protocol version can be parsed by the old version, although the features added in the new version are not available.
4. Advantages and disadvantages of custom application layer protocol
4.1 Advantages
The protocol is not public, so communication is harder to attack: before hackers can look for protocol vulnerabilities, they must first reverse-engineer your protocol.
Better extensibility: you can extend your own protocol as the business grows, whereas well-known protocols are hard to extend.
4.2 Disadvantages
It is hard to design: the protocol needs to be extensible and, ideally, both backward and forward compatible.
Implementing serialization and deserialization is tedious.
5. Background knowledge
5.1 Big-endian and little-endian
Endianness describes whether a computer system stores the most significant byte of a multi-byte value at the lowest or at the highest memory address.
Big-endian
The most significant byte is stored at the lowest address.
Little-endian
The least significant byte is stored at the lowest address.
Detection
This article uses C/C++ code for its examples. The check below relies on the behavior of a union.
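#include <stdint.h>
#include <iostream>
using namespace std;

// Returns true on a big-endian host, false on a little-endian host.
bool bigCheck()
{
    union Check
    {
        char a;
        uint32_t data;
    };

    Check c;
    c.data = 1;
    // On a little-endian host the least significant byte (1) sits at the lowest address.
    if (1 == c.a)
    {
        return false;
    }
    return true;
}

int main()
{
    if (bigCheck())
    {
        cout << "big" << endl;
    }
    else
    {
        cout << "small" << endl;
    }
    return 0;
}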
5.2 Network byte order
As the name implies, this is the byte order of multi-byte values in a network byte stream. To avoid introducing extra complexity into network communication, a single order is used everywhere: network byte order is big-endian.
5.3 Host byte order
The byte order used by the local machine. Different platforms may use different byte orders.
5.4 Memory Objects and Layout
Every variable, whether it lives on the heap or on the stack, occupies a piece of memory. Because of alignment requirements, variables are not necessarily stored compactly. For example, the C structure Test below is typically laid out with two bytes of padding between b and c, so that c starts on a 4-byte boundary.
struct Test
{
    char a;
    char b;
    int32_t c;
};
5.5 Serialization and Deserialization
Serialization: converting an in-memory object into a network byte stream, for example converting the C structure Test into a uint8_t data[6] byte stream.
Deserialization: converting a network byte stream back into an in-memory object, for example converting a uint8_t data[6] byte stream into the C structure Test.
6. An example
6.1 Protocol Design
This protocol uses a fixed boundary with hybrid encoding.
Protocol header
An 8-byte fixed-length header. It carries a version number, a magic number for fast validation, and a service number so that different services can share the protocol. The fixed-length header makes the protocol easy and efficient to parse.
Protocol body
Variable-length JSON. JSON is a plaintext encoding that is readable, easy to extend, compatible and widely supported, so it gives the protocol good extensibility and compatibility.
Protocol layout
| version (1 byte) | magic (1 byte) | server (2 bytes) | len (4 bytes) | JSON body (len - 8 bytes) |
6.2 Protocol Implementation
Talk is cheap, so let's write the code, in C/C++.
6.2.1 C/C++ implementation
The MyProtoHead structure stores the protocol header
/* Protocol header */
struct MyProtoHead
{
    uint8_t version;  // protocol version number
    uint8_t magic;    // protocol magic number
    uint16_t server;  // service number; identifies the service that uses the protocol
    uint32_t len;     // total message length (header length + variable-length JSON body length)
};
The open-source Jsoncpp library stores the protocol body
https://sourceforge.net/proje...
Protocol message
/* Protocol message */
struct MyProtoMsg
{
    MyProtoHead head;  // protocol header
    Json::Value body;  // protocol body
};
Packing class
/* MyProto packing class */
class MyProtoEnCode
{
public:
    // Packs a whole protocol message
    uint8_t * encode(MyProtoMsg * pMsg, uint32_t & len);
private:
    // Packs the protocol header
    void headEncode(uint8_t * pData, MyProtoMsg * pMsg);
};
Unpacking class
typedef enum MyProtoParserStatus
{
    ON_PARSER_INIT = 0,
    ON_PARSER_HAED = 1,
    ON_PARSER_BODY = 2,
} MyProtoParserStatus;

/* MyProto unpacking class */
class MyProtoDeCode
{
public:
    void init();
    void clear();
    bool parser(void * data, size_t len);
    bool empty();
    MyProtoMsg * front();
    void pop();
private:
    bool parserHead(uint8_t ** curData, uint32_t & curLen,
        uint32_t & parserLen, bool & parserBreak);
    bool parserBody(uint8_t ** curData, uint32_t & curLen,
        uint32_t & parserLen, bool & parserBreak);
private:
    MyProtoMsg mCurMsg;                    // message currently being parsed
    queue<MyProtoMsg *> mMsgQ;             // queue of fully parsed messages
    vector<uint8_t> mCurReserved;          // network bytes not yet parsed
    MyProtoParserStatus mCurParserStatus;  // current parsing state
};
6.2.2 Packing (serialization)
void MyProtoEnCode::headEncode(uint8_t * pData, MyProtoMsg * pMsg)
{
    // The first protocol version is 1
    *pData = 1;
    ++pData;

    // Set the magic number
    *pData = MY_PROTO_MAGIC;
    ++pData;

    // Set the service number, converting head.server from host to network byte order
    *(uint16_t *)pData = htons(pMsg->head.server);
    pData += 2;

    // Set the total length, converting head.len from host to network byte order
    *(uint32_t *)pData = htonl(pMsg->head.len);
}

uint8_t * MyProtoEnCode::encode(MyProtoMsg * pMsg, uint32_t & len)
{
    uint8_t * pData = NULL;
    Json::FastWriter fWriter;

    // Serialize the JSON body
    string bodyStr = fWriter.write(pMsg->body);
    // Compute the total length of the serialized message
    len = MY_PROTO_HEAD_SIZE + (uint32_t)bodyStr.size();
    pMsg->head.len = len;
    // Allocate the buffer for the serialized message (the caller must delete[] it)
    pData = new uint8_t[len];
    // Pack the header
    headEncode(pData, pMsg);
    // Pack the body
    memcpy(pData + MY_PROTO_HEAD_SIZE, bodyStr.data(), bodyStr.size());

    return pData;
}
6.2.3 Unpacking (deserialization)
bool MyProtoDeCode::parserHead(uint8_t ** curData, uint32_t & curLen,
    uint32_t & parserLen, bool & parserBreak)
{
    parserBreak = false;
    if (curLen < MY_PROTO_HEAD_SIZE)
    {
        parserBreak = true;  // not enough data for a header yet; stop parsing
        return true;
    }

    uint8_t * pData = *curData;
    // Parse the version number
    mCurMsg.head.version = *pData;
    pData++;
    // Parse the magic number
    mCurMsg.head.magic = *pData;
    pData++;
    // A mismatched magic number means parsing failed
    if (MY_PROTO_MAGIC != mCurMsg.head.magic)
    {
        return false;
    }
    // Parse the service number
    mCurMsg.head.server = ntohs(*(uint16_t*)pData);
    pData += 2;
    // Parse the total message length
    mCurMsg.head.len = ntohl(*(uint32_t*)pData);
    // A length shorter than the header or larger than the limit means parsing failed
    if (mCurMsg.head.len < MY_PROTO_HEAD_SIZE || mCurMsg.head.len > MY_PROTO_MAX_SIZE)
    {
        return false;
    }

    // Advance the parse position by MY_PROTO_HEAD_SIZE bytes
    (*curData) += MY_PROTO_HEAD_SIZE;
    curLen -= MY_PROTO_HEAD_SIZE;
    parserLen += MY_PROTO_HEAD_SIZE;
    mCurParserStatus = ON_PARSER_HAED;

    return true;
}

bool MyProtoDeCode::parserBody(uint8_t ** curData, uint32_t & curLen,
    uint32_t & parserLen, bool & parserBreak)
{
    parserBreak = false;
    uint32_t jsonSize = mCurMsg.head.len - MY_PROTO_HEAD_SIZE;
    if (curLen < jsonSize)
    {
        parserBreak = true;  // the body has not fully arrived yet; stop parsing
        return true;
    }

    Json::Reader reader;  // JSON parser
    if (!reader.parse((char *)(*curData),
        (char *)((*curData) + jsonSize), mCurMsg.body, false))
    {
        return false;
    }

    // Advance the parse position by jsonSize bytes
    (*curData) += jsonSize;
    curLen -= jsonSize;
    parserLen += jsonSize;
    mCurParserStatus = ON_PARSER_BODY;

    return true;
}

bool MyProtoDeCode::parser(void * data, size_t len)
{
    if (len == 0)
    {
        return false;
    }

    uint32_t curLen = 0;
    uint32_t parserLen = 0;
    uint8_t * curData = NULL;

    curData = (uint8_t *)data;
    // Append the newly received bytes to the unparsed byte stream
    while (len--)
    {
        mCurReserved.push_back(*curData);
        ++curData;
    }

    curLen = mCurReserved.size();
    curData = (uint8_t *)&mCurReserved[0];

    // Keep parsing as long as unparsed bytes remain
    while (curLen > 0)
    {
        bool parserBreak = false;
        // Parse the protocol header
        if (ON_PARSER_INIT == mCurParserStatus ||
            ON_PARSER_BODY == mCurParserStatus)
        {
            if (!parserHead(&curData, curLen, parserLen, parserBreak))
            {
                return false;
            }
            if (parserBreak) break;
        }

        // Once the header is parsed, parse the protocol body
        if (ON_PARSER_HAED == mCurParserStatus)
        {
            if (!parserBody(&curData, curLen, parserLen, parserBreak))
            {
                return false;
            }
            if (parserBreak) break;
        }

        if (ON_PARSER_BODY == mCurParserStatus)
        {
            // Copy the parsed message into the queue
            MyProtoMsg * pMsg = NULL;
            pMsg = new MyProtoMsg;
            *pMsg = mCurMsg;
            mMsgQ.push(pMsg);
        }
    }

    if (parserLen > 0)
    {
        // Remove the bytes that have already been parsed
        mCurReserved.erase(mCurReserved.begin(), mCurReserved.begin() + parserLen);
    }

    return true;
}
7. Complete source code and testing
The code is simple; just run it.
7.1 Source Code
#include <stdint.h>
#include <stdio.h>
#include <queue>
#include <vector>
#include <string>
#include <iostream>
#include <string.h>
#include <json/json.h>
#include <arpa/inet.h>
using namespace std;

const uint8_t MY_PROTO_MAGIC = 88;
const uint32_t MY_PROTO_MAX_SIZE = 10 * 1024 * 1024;  // 10 MB
const uint32_t MY_PROTO_HEAD_SIZE = 8;

typedef enum MyProtoParserStatus
{
    ON_PARSER_INIT = 0,
    ON_PARSER_HAED = 1,
    ON_PARSER_BODY = 2,
} MyProtoParserStatus;

/* Protocol header */
struct MyProtoHead
{
    uint8_t version;  // protocol version number
    uint8_t magic;    // protocol magic number
    uint16_t server;  // service number; identifies the service that uses the protocol
    uint32_t len;     // total message length (header length + variable-length JSON body length)
};

/* Protocol message */
struct MyProtoMsg
{
    MyProtoHead head;  // protocol header
    Json::Value body;  // protocol body
};

void myProtoMsgPrint(MyProtoMsg & msg)
{
    string jsonStr = "";
    Json::FastWriter fWriter;
    jsonStr = fWriter.write(msg.body);

    printf("Head[version=%d,magic=%d,server=%d,len=%u]\n"
        "Body:%s", msg.head.version, msg.head.magic,
        msg.head.server, msg.head.len, jsonStr.c_str());
}

/* MyProto packing class */
class MyProtoEnCode
{
public:
    // Packs a whole protocol message
    uint8_t * encode(MyProtoMsg * pMsg, uint32_t & len);
private:
    // Packs the protocol header
    void headEncode(uint8_t * pData, MyProtoMsg * pMsg);
};

void MyProtoEnCode::headEncode(uint8_t * pData, MyProtoMsg * pMsg)
{
    // The first protocol version is 1
    *pData = 1;
    ++pData;

    // Set the magic number
    *pData = MY_PROTO_MAGIC;
    ++pData;

    // Set the service number, converting head.server from host to network byte order
    *(uint16_t *)pData = htons(pMsg->head.server);
    pData += 2;

    // Set the total length, converting head.len from host to network byte order
    *(uint32_t *)pData = htonl(pMsg->head.len);
}

uint8_t * MyProtoEnCode::encode(MyProtoMsg * pMsg, uint32_t & len)
{
    uint8_t * pData = NULL;
    Json::FastWriter fWriter;

    // Serialize the JSON body
    string bodyStr = fWriter.write(pMsg->body);
    // Compute the total length of the serialized message
    len = MY_PROTO_HEAD_SIZE + (uint32_t)bodyStr.size();
    pMsg->head.len = len;
    // Allocate the buffer for the serialized message (the caller must delete[] it)
    pData = new uint8_t[len];
    // Pack the header
    headEncode(pData, pMsg);
    // Pack the body
    memcpy(pData + MY_PROTO_HEAD_SIZE, bodyStr.data(), bodyStr.size());

    return pData;
}

/* MyProto unpacking class */
class MyProtoDeCode
{
public:
    void init();
    void clear();
    bool parser(void * data, size_t len);
    bool empty();
    MyProtoMsg * front();
    void pop();
private:
    bool parserHead(uint8_t ** curData, uint32_t & curLen,
        uint32_t & parserLen, bool & parserBreak);
    bool parserBody(uint8_t ** curData, uint32_t & curLen,
        uint32_t & parserLen, bool & parserBreak);
private:
    MyProtoMsg mCurMsg;                    // message currently being parsed
    queue<MyProtoMsg *> mMsgQ;             // queue of fully parsed messages
    vector<uint8_t> mCurReserved;          // network bytes not yet parsed
    MyProtoParserStatus mCurParserStatus;  // current parsing state
};

void MyProtoDeCode::init()
{
    mCurParserStatus = ON_PARSER_INIT;
}

void MyProtoDeCode::clear()
{
    MyProtoMsg * pMsg = NULL;

    while (!mMsgQ.empty())
    {
        pMsg = mMsgQ.front();
        delete pMsg;
        mMsgQ.pop();
    }
}

bool MyProtoDeCode::parserHead(uint8_t ** curData, uint32_t & curLen,
    uint32_t & parserLen, bool & parserBreak)
{
    parserBreak = false;
    if (curLen < MY_PROTO_HEAD_SIZE)
    {
        parserBreak = true;  // not enough data for a header yet; stop parsing
        return true;
    }

    uint8_t * pData = *curData;
    // Parse the version number
    mCurMsg.head.version = *pData;
    pData++;
    // Parse the magic number
    mCurMsg.head.magic = *pData;
    pData++;
    // A mismatched magic number means parsing failed
    if (MY_PROTO_MAGIC != mCurMsg.head.magic)
    {
        return false;
    }
    // Parse the service number
    mCurMsg.head.server = ntohs(*(uint16_t*)pData);
    pData += 2;
    // Parse the total message length
    mCurMsg.head.len = ntohl(*(uint32_t*)pData);
    // A length shorter than the header or larger than the limit means parsing failed
    if (mCurMsg.head.len < MY_PROTO_HEAD_SIZE || mCurMsg.head.len > MY_PROTO_MAX_SIZE)
    {
        return false;
    }

    // Advance the parse position by MY_PROTO_HEAD_SIZE bytes
    (*curData) += MY_PROTO_HEAD_SIZE;
    curLen -= MY_PROTO_HEAD_SIZE;
    parserLen += MY_PROTO_HEAD_SIZE;
    mCurParserStatus = ON_PARSER_HAED;

    return true;
}

bool MyProtoDeCode::parserBody(uint8_t ** curData, uint32_t & curLen,
    uint32_t & parserLen, bool & parserBreak)
{
    parserBreak = false;
    uint32_t jsonSize = mCurMsg.head.len - MY_PROTO_HEAD_SIZE;
    if (curLen < jsonSize)
    {
        parserBreak = true;  // the body has not fully arrived yet; stop parsing
        return true;
    }

    Json::Reader reader;  // JSON parser
    if (!reader.parse((char *)(*curData),
        (char *)((*curData) + jsonSize), mCurMsg.body, false))
    {
        return false;
    }

    // Advance the parse position by jsonSize bytes
    (*curData) += jsonSize;
    curLen -= jsonSize;
    parserLen += jsonSize;
    mCurParserStatus = ON_PARSER_BODY;

    return true;
}

bool MyProtoDeCode::parser(void * data, size_t len)
{
    if (len == 0)
    {
        return false;
    }

    uint32_t curLen = 0;
    uint32_t parserLen = 0;
    uint8_t * curData = NULL;

    curData = (uint8_t *)data;
    // Append the newly received bytes to the unparsed byte stream
    while (len--)
    {
        mCurReserved.push_back(*curData);
        ++curData;
    }

    curLen = mCurReserved.size();
    curData = (uint8_t *)&mCurReserved[0];

    // Keep parsing as long as unparsed bytes remain
    while (curLen > 0)
    {
        bool parserBreak = false;
        // Parse the protocol header
        if (ON_PARSER_INIT == mCurParserStatus ||
            ON_PARSER_BODY == mCurParserStatus)
        {
            if (!parserHead(&curData, curLen, parserLen, parserBreak))
            {
                return false;
            }
            if (parserBreak) break;
        }

        // Once the header is parsed, parse the protocol body
        if (ON_PARSER_HAED == mCurParserStatus)
        {
            if (!parserBody(&curData, curLen, parserLen, parserBreak))
            {
                return false;
            }
            if (parserBreak) break;
        }

        if (ON_PARSER_BODY == mCurParserStatus)
        {
            // Copy the parsed message into the queue
            MyProtoMsg * pMsg = NULL;
            pMsg = new MyProtoMsg;
            *pMsg = mCurMsg;
            mMsgQ.push(pMsg);
        }
    }

    if (parserLen > 0)
    {
        // Remove the bytes that have already been parsed
        mCurReserved.erase(mCurReserved.begin(), mCurReserved.begin() + parserLen);
    }

    return true;
}

bool MyProtoDeCode::empty()
{
    return mMsgQ.empty();
}

MyProtoMsg * MyProtoDeCode::front()
{
    MyProtoMsg * pMsg = NULL;
    pMsg = mMsgQ.front();
    return pMsg;
}

void MyProtoDeCode::pop()
{
    mMsgQ.pop();
}

int main()
{
    uint32_t len = 0;
    uint8_t * pData = NULL;
    MyProtoMsg msg1;
    MyProtoMsg msg2;
    MyProtoDeCode myDecode;
    MyProtoEnCode myEncode;

    msg1.head.server = 1;
    msg1.body["op"] = "set";
    msg1.body["key"] = "id";
    msg1.body["value"] = "9856";

    msg2.head.server = 2;
    msg2.body["op"] = "get";
    msg2.body["key"] = "id";

    myDecode.init();
    pData = myEncode.encode(&msg1, len);
    if (!myDecode.parser(pData, len))
    {
        cout << "parser failed!" << endl;
    }
    else
    {
        cout << "msg1 parser successful!" << endl;
    }
    delete[] pData;  // the decoder keeps its own copy of the bytes

    pData = myEncode.encode(&msg2, len);
    if (!myDecode.parser(pData, len))
    {
        cout << "parser failed!" << endl;
    }
    else
    {
        cout << "msg2 parser successful!" << endl;
    }
    delete[] pData;

    MyProtoMsg * pMsg = NULL;
    while (!myDecode.empty())
    {
        pMsg = myDecode.front();
        myProtoMsgPrint(*pMsg);
        myDecode.pop();
        delete pMsg;  // pop() only removes the pointer from the queue
    }

    return 0;
}
7.2 Running Test
8. Summary
In fewer than 350 lines of code, we have seen how to implement a custom application-layer protocol. The protocol is of course far from perfect and can be improved further, for example by encrypting the protocol body to strengthen its security.