Implementing a Redis from Scratch with Node.js and Protocol Buffers

Keywords: Node.js, socket, Redis, Google Protocol Buffers, networking

Write at the beginning

  • If you want to learn how to build things from scratch, you can see my previous collection of original articles: https://mp.weixin.qq.com/s/RsvI5AFzbp3rm6sOlTmiYQ
  • To get 3700G of free learning materials or join the technology exchange group (no ads), add me on WeChat via the contact at the end of the article; we talk technology, not small talk

What is protobuffer?

  • Protocol Buffers (PB) is Google's data interchange format, independent of language and platform.
  • Google provides implementations for several languages: Java, C#, C++, Go, and Python. Each implementation includes a compiler and library files for that language. Because it is a binary format, it is much faster than exchanging the same data as XML.
  • It can be used for data communication between distributed applications or data exchange in heterogeneous environments. As a binary transfer format with excellent efficiency and compatibility, it works well in many areas such as network transmission, configuration files, and data storage.

Summarize the advantages

  • In short, Protobuf's main advantages are simplicity and speed.
  • Why?
  • Because the Protocol Buffers wire format is very compact: messages are smaller, so they naturally need fewer resources. Fewer bytes are transmitted over the network and less IO is required, which improves performance.
  • For the same message, the byte sequence serialized with Protobuf is:
08 65 12 06 48 65 6C 6C 6F 77
  • With XML, it looks like this (an excerpt of the byte sequence):
31 30 31 3C 2F 69 64 3E 3C 6E 61 6D 65 3E 68 65 
 6C 6C 6F 3C 2F 6E 61 6D 65 3E 3C 2F 68 65 6C 6C 
 6F 77 6F 72 6C 64 3E 
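To see where those Protobuf bytes come from, we can hand-encode what appears to be the example message. A sketch, assuming a schema like `message Helloworld { int32 id = 1; string str = 2; }` with `id = 101` and `str = "Hellow"` (an assumption inferred from the bytes, not stated in the article):

```javascript
// Hand-rolled encoding of { id: 101, str: "Hellow" } in Protobuf wire format.
// Each field is a tag byte (field number << 3 | wire type) followed by its value.
function tag(fieldNumber, wireType) {
  return Buffer.from([(fieldNumber << 3) | wireType]); // single byte for small field numbers
}

const idField = Buffer.concat([
  tag(1, 0),             // field 1, wire type 0 (varint)           -> 0x08
  Buffer.from([101]),    // 101 fits in a single varint byte         -> 0x65
]);
const strField = Buffer.concat([
  tag(2, 2),             // field 2, wire type 2 (length-delimited)  -> 0x12
  Buffer.from([6]),      // payload length                           -> 0x06
  Buffer.from('Hellow'), // UTF-8 bytes 48 65 6C 6C 6F 77
]);
const encoded = Buffer.concat([idField, strField]);
console.log(encoded.toString('hex')); // 0865120648656c6c6f77 — matches the sequence above
```

There is no XML-style markup at all: field names live in the schema, not in the bytes, which is exactly why the encoding is so compact.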

Using PB in Node.js

yarn add protobufjs -D
mkdir proto 
cd proto 
vi message.proto

//message.proto file
package message;
option optimize_for = LITE_RUNTIME;
message Account{
    required string accountName = 1;
    required string pwd = 2; 
}
message AccountList{
    required int32 index = 1;
    repeated Account list = 2;
}

Start using PB protocol

  • Import protobufjs and read the root object
const ProtoBufJs = require("protobufjs");
const root = ProtoBufJs.loadSync("./proto/message.proto");
  • Load the defined pb file and dynamically look up the types it declares
const ProtoBufJs = require("protobufjs");
const root = ProtoBufJs.loadSync("./proto/message.proto");
const AccountList = root.lookupType("message.AccountList");
const Account = root.lookupType("message.Account");
const accountListObj = AccountList.create();
for (let i = 0; i < 5; i++) {
  const accountObj = Account.create();
  accountObj.accountName = "Front end peak" + i;
  accountObj.pwd = "Peter Sauce is fatter than technology" + i;
  accountListObj.list.push(accountObj);
}
const buffer = AccountList.encode(accountListObj).finish();

console.log(buffer)
  • Starting a project with nodemon

  • When printing the Buffer, convert it to a string
const ProtoBufJs = require("protobufjs");
const root = ProtoBufJs.loadSync("./proto/message.proto");
const AccountList = root.lookupType("message.AccountList");
const Account = root.lookupType("message.Account");
const accountListObj = AccountList.create();
const accountObj = Account.create();
accountObj.accountName = "Front end peak";
accountObj.pwd = "Peter Sauce is fatter than technology";
accountObj.test = "Why does great health care become more and more deficient"; // test is not defined in the schema
accountListObj.list.push(accountObj);
const buffer = AccountList.encode(accountListObj).finish();
console.log(buffer.toString());
  • Printed

  • So here's a question for you: why does the extra `test` field not appear in the encoded output?
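A quick aside on printing Buffers: the PB wire format is binary, so `toString()` with a text encoding gives mostly unreadable output, and hex is a more faithful view. A small sketch, reusing the example bytes from the comparison section:

```javascript
// The example PB byte sequence from the Protobuf-vs-XML comparison above
const buffer = Buffer.from([0x08, 0x65, 0x12, 0x06, 0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x77]);

console.log(buffer.toString('hex'));   // faithful: 0865120648656c6c6f77
console.log(buffer.toString('utf-8')); // lossy: control bytes mixed in with "Hellow"
```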

Introducing socket communication, with better binary support

  • We use the native net module's socket communication to implement redis, built on reliable TCP rather than UDP. First, the redis server code:
const net = require("net");
const listenPort = 6380; //Listening port
const server = net
  .createServer(function (socket) {
    // Create socket server
    console.log("connect: " + socket.remoteAddress + ":" + socket.remotePort);
    socket.setKeepAlive(true);
    socket.setEncoding("utf-8"); // fine for plain text; must be removed later when receiving binary PB data
    //Data received
    socket.on("data", function (data) {
      console.log("client send:" + data);
    });
    socket.write("Hello client!\r\n");
    //Data error event
    socket.on("error", function (exception) {
      socket.end();
    });
    //Client shutdown event
    socket.on("close", function (data) {
    });
  })
  .listen(listenPort);
//Server listening events
server.on("listening", function () {
  console.log("server listening:" + server.address().port);
});
//Server error events
server.on("error", function (exception) {
  console.log("server error:" + exception);
});
  • Redis's default port is 6379; we listen on 6380 instead. Keep-alive happens at both layers: socket.setKeepAlive(true) at the TCP level, plus an application-layer heartbeat, to maintain the long-lived connection

Write redis client

  • Introduce Socket communication
const { Socket } = require("net");
//Other codes introduced into pb file remain unchanged
  • The pb file code is the same on both sides: one copy for the client, one for the server, so both ends of the duplex channel share the schema
const port = 6380;
const host = "127.0.0.1";
const client = new Socket();
client.setKeepAlive(true);
client.setEncoding("utf-8"); // remove this once binary PB data is exchanged
//Connect to the server
client.connect(port, host, function () {
  client.write("hello server");
  //Write data to the port to the server
});
client.on("data", function (data) {
  console.log("from server:" + data);
  //Get the data returned by the server
});
client.on("error", function (error) {
  //Close connection after error
  console.log("error:" + error);
  client.destroy();
});
client.on("close", function () {
  //Close connection normally
  console.log("Connection closed");
});
  • Connect to the server on port 6380 through the socket and establish a long-lived connection

Application layer heartbeat and reconnection

  • Redefine the pb file, adding a PingPong message (note: the syntax statement must come first in the file)
syntax = "proto3";
package message;

message PingPong {
    string message_type = 1; // protobufjs exposes snake_case fields as camelCase: messageType
    string ping = 2; 
    string pong = 3; 
}
  • The client loads the pb file and uses the PB protocol to communicate with the server
const root = ProtoBufJs.loadSync('./proto/message.proto');
const PingPong = root.lookupType('message.PingPong');
 setInterval(() => {
    const payload = { message_type: '1', ping: '1', pong: '2' };
    const errMsg = PingPong.verify(payload);
    if (errMsg) throw Error(errMsg);
    const message = PingPong.create(payload);
    const buffer = PingPong.encode(message).finish();
    client.write(buffer);
  }, 3000);
  • Send heartbeat packets every 3 seconds

Server receives heartbeat

  • Load the pb definitions
const root = ProtoBufJs.loadSync('./proto/message.proto');
const PingPong = root.lookupType('message.PingPong');
  • Receive the heartbeat packet
const server = createServer(function (socket) {
  socket.setKeepAlive(true);
  // Create socket server
  //Data received
  socket.on('data', function (data) {
    const decodedMessage = PingPong.decode(data);
    console.log(decodedMessage, 'decodedMessage');
  });
  socket.write('Hello client!\r\n');
  //Data error event
  socket.on('error', function (exception) {
    console.log('socket error:' + exception);
    socket.end();
  });
  //Client shutdown event
  socket.on('close', function (data) {
    console.log('client closed!');
  });
}).listen(listenPort);
  • Now the Buffer transmitted via the PB protocol can be received and parsed
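One caveat the article glosses over: TCP is a byte stream, so decoding each 'data' chunk as exactly one PB message only works while messages happen to arrive one per packet. A common fix (my addition, not from the article) is a 4-byte length prefix; a minimal sketch:

```javascript
// Frame a PB buffer with a 4-byte big-endian length prefix
function frame(payload) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Accumulates stream chunks and yields only complete payloads
class Deframer {
  constructor() { this.pending = Buffer.alloc(0); }
  feed(chunk) {
    this.pending = Buffer.concat([this.pending, chunk]);
    const messages = [];
    while (this.pending.length >= 4) {
      const len = this.pending.readUInt32BE(0);
      if (this.pending.length < 4 + len) break; // wait for the rest of this message
      messages.push(this.pending.slice(4, 4 + len));
      this.pending = this.pending.slice(4 + len);
    }
    return messages;
  }
}
```

On the receiving side you would then call PingPong.decode() on each yielded payload instead of on the raw 'data' chunk.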

  • Heartbeat keep-alive: the client sends heartbeats
  timer = setInterval(() => {
    count++;
    const payload = { messageType: '1', ping: '1' };
    const errMsg = PingPong.verify(payload);
    if (errMsg) throw Error(errMsg);
    const message = PingPong.create(payload);
    const buffer = PingPong.encode(message).finish();
    client.write(buffer);
  }, 3000);
  • Server returns heartbeat
 socket.on('data', function (data) {
    const decodedMessage = PingPong.decode(data);
    console.log(decodedMessage, 'decodedMessage');
    if (decodedMessage.messageType === '1') {
      console.log('Enter judgment');
      const payload = { messageType: '1', pong: '1' };
      const errMsg = PingPong.verify(payload);
      if (errMsg) throw Error(errMsg);
      const message = PingPong.create(payload);
      const buffer = PingPong.encode(message).finish();
      socket.write(buffer);
    }
  });
  • The client records the heartbeat and handles timeout and disconnection
client.on('data', function (data) {
  const decodedMessage = PingPong.decode(data);
  if (decodedMessage.messageType === '1') {
    console.log('Received the heartbeat back package');
    count = 0;
  }
  console.log('from server:' + decodedMessage.messageType);
  //Get the data returned by the server
});
  • Check when sending the heartbeat: if no reply arrives after three attempts, throw an error and stop sending
  timer = setInterval(() => {
    if (count > 3) {
      clearInterval(timer);
      client.end();
      throw Error('timeout')
    }
    count++;
    const payload = { messageType: '1', ping: '1' };
    const errMsg = PingPong.verify(payload);
    if (errMsg) throw Error(errMsg);
    const message = PingPong.create(payload);
    const buffer = PingPong.encode(message).finish();
    client.write(buffer);
  }, 3000);
  • To test, make the server intentionally not reply to the heartbeat by commenting out the write:
  // socket.write(buffer);
  • The client then throws a timeout error, cancels the heartbeat interval, and closes the socket connection

  • At this point there should be a reconnection mechanism, and a send queue as well; we won't implement them here
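A reconnection mechanism usually retries with exponential backoff. As a sketch of just the delay schedule (names and constants are my own, not from the article):

```javascript
// Delay before reconnect attempt n (0-based): base * 2^n, capped at max
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
// attempts 0..5 -> 500, 1000, 2000, 4000, 8000, 16000 ms
```

In the client's 'close' handler you would `setTimeout(connect, backoffDelay(attempt++))` and reset the attempt counter to 0 once a connection succeeds.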

Implement the get and set methods of redis

  • Data storage: the server stores data in a Map
  • Transmission uses the PB protocol
  • Reply with an ACK when a message is received

Define the Payload pb field for data transfer

  • Define the fields (note: required is not allowed in proto3, so the Payload fields are plain strings)
message Data {
    string message_type = 1; // protobufjs exposes this as messageType
    Payload data = 2;
}


message Payload {
    string key = 1;
    string value = 2;
}
  • Define the RedisSet function:
const Data = root.lookupType('message.Data');
function RedisSet() {
  const msg = { messageType: '2', data: { key: '1', value: '2' } };
  const errMsg = Data.verify(msg);
  if (errMsg) throw Error(errMsg);
  const message = Data.create(msg);
  const buffer = Data.encode(message).finish();
  client.write(buffer);
}
  • Decode and deserialize on the server
  socket.on('data', function (data) {
    const decodedMessage = PingPong.decode(data);
    console.log(decodedMessage, 'decodedMessage');
    if (decodedMessage.messageType === '1') {
      const payload = { messageType: '1', pong: '1' };
      const errMsg = PingPong.verify(payload);
      if (errMsg) throw Error(errMsg);
      const message = PingPong.create(payload);
      const buffer = PingPong.encode(message).finish();
      socket.write(buffer);
    }
  });
  • Deserialization successful

  • We got data at this point, but on closer inspection we decoded with the wrong message type, which corrupts the result. The receiver needs to be told which type to use for deserialization
  • The cleanest solution is to define a common field and deserialize that first
message Common {
    string message_type = 1; 
}
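Why can a single `Common` type decode every message? Because `message_type` is field number 1 with the same wire type in all of them, and protobuf decoders simply skip fields they don't know. A hand-rolled illustration of that idea (not using protobufjs):

```javascript
// Bytes resembling a Data-style message:
// tag 0x0a = field 1, wire type 2 (length-delimited): message_type = "2"
// tag 0x12 = field 2: some other payload that a Common decoder would skip
const data = Buffer.concat([
  Buffer.from([0x0a, 0x01]), Buffer.from('2'),
  Buffer.from([0x12, 0x03]), Buffer.from('abc'),
]);

// A minimal Common-style reader: only look at field 1, ignore the rest
function readMessageType(buf) {
  if (buf[0] !== 0x0a) return null; // expect field 1, wire type 2
  const len = buf[1];
  return buf.slice(2, 2 + len).toString('utf-8');
}
console.log(readMessageType(data)); // '2'
```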
  • On the server, first deserialize with Common to get the messageType, then process it and deserialize a second time with the matching type
  socket.on('data', function (data) {
    const res = Common.decode(data);
    if (res.messageType === '1') {
      const payload = { messageType: '1', pong: '1' };
      const errMsg = PingPong.verify(payload);
      if (errMsg) throw Error(errMsg);
      const message = PingPong.create(payload);
      const buffer = PingPong.encode(message).finish();
      socket.write(buffer);
    } else if (res.messageType === '2') {
      const message = Data.decode(data);
      const payload = message.data;
      console.log(payload.key, 'payload');
      M.set(payload.key, payload.value);
      console.log(M, 'm');
    }
  });
  • A simple set method is complete

  • Define RedisGet method:
const M = new Map();
M.set('1','Peter Braised beef with sauce')

function RedisGet() {
  const msg = { messageType: '3', data: { key: '1' } };
  const errMsg = Data.verify(msg);
  if (errMsg) throw Error(errMsg);
  const message = Data.create(msg);
  const buffer = Data.encode(message).finish();
  client.write(buffer);
}
  • The server handles message type '3'
else if (res.messageType === '3') {
      const message = Data.decode(data); // the client encoded this with Data, so decode with Data
      const value = M.get(message.data.key);
      console.log(value, 'value');
    }

  • The get method now works and returns data: define a GetData message, serialize, deserialize. But of course it's not that simple
  • Redis set and get are very high-frequency operations. Even though this is a cache rather than a database, failures are still possible: we communicate over a socket, and if network jitter or anything else makes the write fail, the data never enters the cache and we have a problem

The set method needs a cb (callback), and the get method needs a return value

  • Based on these two requirements, we need to design a new approach to the set and get functions
  • Whether it succeeds or fails, the caller can know the result

Actually implementing Redis

  • First, make sure that the communication still uses socket and long connection
  • Need for heartbeat
  • Need to import send queue
  • set triggers a cb; get returns data (could be based on promises, generators, or async/await)
  • Transmission is based on the PB protocol
  • An ACK reply mechanism guarantees the cb is invoked

Processing queue

  • Set and set's callback queue
  • At first I thought that after a successful set the client should keep a local copy of the data, so redis.get could read it directly without the socket. But with multiple machines connected to redis, the data must stay consistent, and while there are many ways to guarantee consistency, the client does not cache here
const cbQueue = []; //Callback queue
const setQueue = []; //set's queue
const getQueue = []; //get's queue
  • Implement the set queue, trigger the cb, and modify RedisSet
function RedisSet(data, cb) {
  cbQueue.push(cb);
  setQueue.push(data);
  console.log(cbQueue, setQueue, "queue");
  const errMsg = Data.verify(data);
  if (errMsg) throw Error(errMsg);
  const message = Data.create(data);
  const buffer = Data.encode(message).finish();
  client.write(buffer);
}
  • After receiving the set, the server writes the data into the Map, then replies to the client over the socket
else if (res.messageType === '2') {
      const message = Data.decode(data);
      const payload = message.data;
      M.set(payload.key, payload.value);
    } 

After M.set, notify the client over the socket that the cache write succeeded

  • First define the pb message; we use message_type = "5" for the notification
message setSucceed {
    string message_type = 1;  
}
const msg = { messageType: "5" };
const errMsg = setSucceed.verify(msg);
if (errMsg) throw Error(errMsg);
const m = setSucceed.create(msg);
const buffer = setSucceed.encode(m).finish();
socket.write(buffer);
  • The client triggers the cb from the set queue and consumes the queue
  RedisSet(data, () => {
    console.log("set success, trigger cb");
  });

  else if (decodedMessage.messageType === "5") {
    const cb = cbQueue.shift();
    cb && cb();
  }
  • Results, in line with expectations

But this approach has a bug

  • Socket writes are asynchronous, so replies may come back out of order. We need an ACK reply mechanism here
  • When the client sets, it generates a UUID and sends it along. Once the server finishes storing into its Map, it sends the UUID back to the client (this is the ACK)
  • The client receives the ACK and triggers the matching cb in cbQueue (changing cbQueue from an array to a Map makes this easy), then removes the cb after triggering it
  • Join UUID
yarn add node-uuid
const uuid = require('node-uuid');

// v1 generates the uuid from a timestamp and random numbers
const creatuuid= uuid.v1()
  • Modify pb file of Data and add uuid field
message Data {
     string message_type = 1; // Becomes messageType
     string uuid = 2;
     Payload data = 3;
}
  • Modify the set method: each set generates a UUID as the key, with the cb as the value, stored in a Map
function RedisSet(data, cb) {
  // uuid generated from time stamp and random number in v1
  const creatuuid = uuid.v1();
  data.uuid = creatuuid;
  cbQueue.set(creatuuid, cb);
  const errMsg = Data.verify(data);
  if (errMsg) throw Error(errMsg);
  const message = Data.create(data);
  const buffer = Data.encode(message).finish();
  client.write(buffer);
}
  • Modify the server to return the ACK field and notify the client to consume the cb
else if (res.messageType === '2') {
      const message = Data.decode(data);
      const payload = message.data;
      M.set(payload.key, payload.value);
      const msg = { messageType: '5', uuid: message.uuid };
      const errMsg = setSucceed.verify(msg);
      if (errMsg) throw Error(errMsg);
      const m = setSucceed.create(msg);
      const buffer = setSucceed.encode(m).finish();
      socket.write(buffer);
    } 
  • The client receives the set-success ACK and consumes the cb by UUID
 else if (decodedMessage.messageType === '5') {
      const res = setSucceed.decode(data);
      const cb = cbQueue.get(res.uuid);
      cbQueue.delete(res.uuid); // Map has delete(), not remove()
      cb && cb();
    }
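The register-then-consume pattern above can be captured in a tiny helper. A sketch (the class and method names are mine, not from the article):

```javascript
// Minimal ACK registry: callbacks keyed by UUID, consumed exactly once
class AckRegistry {
  constructor() {
    this.pending = new Map();
  }
  register(uuid, cb) {
    this.pending.set(uuid, cb);
  }
  ack(uuid, payload) {
    const cb = this.pending.get(uuid);
    if (!cb) return false;     // unknown or already-consumed ACK
    this.pending.delete(uuid); // delete first so a throwing cb can't be re-run
    cb(payload);
    return true;
  }
}
```

The set path would call register(uuid, cb) before writing to the socket, and the '5' handler would just call ack(res.uuid).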

With that, set triggers its cb; what remains is making get return a value

  • get needed some thought too. At first I wanted to be crude and sync all the data straight to the client, with the client applying updates from setQueue & cbQueue. Then I decided that wasn't elegant, because redis also has clusters, data synchronization, warm-up, two different persistence mechanisms, and so on
  • You could also fetch via curl, an HTTP request, etc.; I haven't read the redis source code, so I don't know how it really implements this
  • But the Node.js redis client works through redis.get(): you pass in a callback and receive the data, without promises or await (as I remember it)

Define the pb message for get

  • Define Query
message Query {
    string message_type = 1; 
    string key = 2;
    string uuid = 3;
}
  • Define get method
get = function (key, cb) {
    // uuid generated from time stamp and random number in v1
    const creatuuid = uuid.v1();
    getCbQueue.set(creatuuid, cb);
    const msg = { messageType: '6', key, uuid: creatuuid };
    const errMsg = Query.verify(msg);
    if (errMsg) throw Error(errMsg);
    const message = Query.create(msg);
    const buffer = Query.encode(message).finish();
    TCPClient.write(buffer);
  };
  • The client sends a packet with messageType 6 to the server, and the server handles type 6
else if (res.messageType === "6") {
      const message = Query.decode(data);
      const value = M.get(message.key); // renamed to avoid shadowing the outer res
      const msg = { messageType: "6", uuid: message.uuid, data: value };
      const errMsg = getSucceed.verify(msg);
      if (errMsg) throw Error(errMsg);
      const m = getSucceed.create(msg);
      const buffer = getSucceed.encode(m).finish();
      socket.write(buffer);
    }
  • Type 6 represents a client get: we first look the key up in the Map, then return the result to the client, still as type 6
  • When the client receives a message with type 6, it uses the uuid to find the matching cb in getCbQueue, invokes it with the returned data, then deletes it
else if (decodedMessage.messageType === '6') {
        const res = getSucceed.decode(data);
        const cb = getCbQueue.get(res.uuid);
        cb && cb(res.data);
        getCbQueue.delete(res.uuid);
      }

Many people asked to see the real code, so here is my optimized version; I think it's quite neat

  • redis is implemented as a class with static method definitions

  • How to use my Redis?
const Redis = require('./redis');
const port = 6380;
const host = '127.0.0.1';
const RedisStore = Redis.connect(port, host);

const data = { messageType: '2', data: { key: '1', value: '2' } };

RedisStore.set(data, () => {
  console.log('set success,trigger cb');
});

RedisStore.get('1', (data) => {
  console.log('get success data:', data);
});

  • Meet expectations

Still missing: a daemon process and data persistence

  • Daemon process: I wrote about the cluster module's source code before. Anyone can use pm2 or docker, but really implementing it yourself takes thought
  • If you're interested, see my earlier articles on the Cluster source code and how PM2 works: https://segmentfault.com/a/1190000021230376

  • Building something like PM2 is probably as involved as this redis; maybe I'll write one in the future. Today we'll just start it with PM2 to get the daemon effect
pm2 start server.js

Realize redis data persistence

  • Two ways of redis data persistence

    • RDB: save data snapshot within specified time interval
    • AOF: append the command to the end of the operation log to save all historical operations
  • Persisting here is actually a bit fiddly: real redis has many key data types
  • What is redis data persistence used for?

    • Redis data is stored in memory. If the server restarts or redis hangs / restarts, if data persistence is not done, the data will be lost
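For contrast with AOF, an RDB-style snapshot can be sketched in a few lines: serialize the whole Map at an interval and restore it on startup. This is a simplification of my own; real Redis forks and writes a compact binary RDB file.

```javascript
// Serialize a Map of string keys/values to a JSON snapshot and back
function snapshot(map) {
  return JSON.stringify([...map.entries()]);
}
function restore(text) {
  return new Map(JSON.parse(text));
}

const store = new Map([['1', 'a'], ['2', 'b']]);
const saved = snapshot(store);  // would be written to disk with fs.writeFile on a timer
const restored = restore(saved); // would run once when the server starts
console.log(restored.get('1')); // 'a'
```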

First, implement AOF: append to the end of the log

  • On the server, append to the log whenever a redis.set is received
M.set(payload.key, payload.value);
fs.appendFile(
  './redis.db',
  `${payload.key},${payload.value}\n`,
  (error) => {
    if (error) return console.log('Failed to append file: ' + error.message);
    console.log('Append succeeded');
  }
);
  • result

  • Writing it like this is a problem: it's hard to split the values back out reliably when reading. Here we can borrow a trick from my earlier handwritten rich-text editor: use zero-width characters as placeholders, then split on them when reading the data back~
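An alternative to delimiter tricks (my suggestion, not the route the article takes) is one JSON object per line, which sidesteps escaping entirely because JSON.stringify handles any characters in keys and values:

```javascript
// Append-friendly record format: one JSON object per line
function toAofLine(key, value) {
  return JSON.stringify({ key, value }) + '\n';
}
function fromAofLines(text) {
  return text
    .split('\n')
    .filter(Boolean) // drop the trailing empty line
    .map((line) => JSON.parse(line));
}

const log = toAofLine('a', '1') + toAofLine('b,c', '2-3'); // commas and dashes are safe
console.log(fromAofLines(log));
```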

What is a zero width character

  • A non-printable Unicode character that is invisible in browsers and other environments, but really exists: it occupies a position in the string's length, representing a character used for a control function
  • What are the common zero width characters
  • Zero width space (ZWSP) is used where line breaks may be required.
    Unicode: U+200B  HTML: &#8203;
  • Zero width non Joiner (ZWNJ) is placed between two characters of the electronic text to suppress the ligatures that would have happened, and is drawn with the original glyphs of the two characters.
    Unicode: U+200C  HTML: &#8204;
  • Zero width Joiner (ZWJ) is a control character, which is placed between two characters of some complex typesetting languages (such as Arabic and Hindi), so that the two characters that would not have ligatures have a ligature effect.
    Unicode: U+200D  HTML: &#8205;
  • Left to right mark (LRM) is a kind of control character, which is used in two-way typesetting of computer.
    Unicode: U+200E  HTML: &lrm; &#x200E; or&#8206;
  • Right to left mark (RLM) is a kind of control character, which is used in two-way typesetting of computer.
    Unicode: U+200F  HTML: &rlm; &#x200F; or&#8207;
  • Byte order mark (BOM) is often used as a mark that is encoded in UTF-8, UTF-16 or UTF-32.
    Unicode: U+FEFF
  • The application of zero width character in JavaScript
  • Data anti creep
  • Inserts a zero width character into the text to interfere with keyword matching. The crawler's data with zero width characters will affect their analysis, but will not affect the user's reading data.
  • information transfer
  • Insert the zero width character of the custom combination into the text, and the user will carry the invisible information after copying, so as to achieve the transmission function.
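A quick demonstration in Node of how a zero-width space behaves in a string:

```javascript
const s = 'key\u200bvalue';
console.log(s);                 // looks like 'keyvalue' in most terminals
console.log(s.length);          // 9 — the zero-width space still counts as one character
console.log(s.split('\u200b')); // [ 'key', 'value' ] — it splits cleanly
```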

Use zero width characters

  • I like using it because it's such a quirky trick
 `${payload.key},${payload.value}\u200b\n`,
  • Insert persistence effect

Data preheating

  • Warm up the data in the server's listening event: read the disk data back into memory
//Server listening events
server.on('listening', function () {
  fs.readFile('./redis.db', (err, data) => {
    console.log(data.toString(), 'xxx');
  });
  console.log('server listening:' + server.address().port);
});
  • Results in line with expectations

  • Actually there's a problem above: to make it easier to split out the cold data from disk, I changed the record format to put the zero-width character after each segment
 `${payload.key}-${payload.value}\u200b`,
  • The inserted data becomes like this

  • The algorithm of reading data should also be considered
//Server listening events
server.on('listening', function () {
  fs.readFile('./redis.db', (err, data) => {
    const string = data.toString();
    if (string.length > 0) {
      // Each record is `key-value` followed by a zero-width space
      const result = string.split('\u200b');
      for (const entry of result) {
        if (!entry) continue; // skip the empty tail after the last separator
        const idx = entry.indexOf('-');
        idx === -1
          ? M.set(entry, null)
          : M.set(entry.slice(0, idx), entry.slice(idx + 1));
      }
    }
  });
  console.log('server listening:' + server.address().port);
});
  • Final effect, in line with expectations

  • When redis exits or errors, you can flush the data to disk, and also persist on a regular schedule; if you want to implement that, a similar idea works. Of course this is not how redis really does it, only a simulation

If you found this useful, follow my WeChat official account.

  • This is Peter. I designed desktop IM software with end-to-end-encrypted super groups for 200,000 people. My WeChat is CALASFxiaotan
  • Also, feel free to bookmark my resource site, the front-end life community: https://qianduan.life — and if you liked this, give it a look and a like from the bottom-right corner of the official account.

Posted by wutanggrenade on Sun, 28 Jun 2020 18:08:35 -0700