Efficient Generation of JSON Strings: json-gen

Keywords: Go JSON github encoding

Summary

Many operations on our game server (both player and non-player) have to be reported to the company's central data platform, which aggregates and analyzes them according to operational needs. The platform requires the data to be sent as JSON. At first we used encoding/json from the Go standard library, but its performance was not ideal: serialization relies on reflection and involves many memory allocations. Our data starts out as map[string]interface{} and has to be built field by field anyway, so it occurred to me that the length of the final JSON string could be computed while the structure is being built, and the output buffer would then need only one memory allocation.

Usage

Download:

$ go get github.com/darjun/json-gen

Import:

import (
  jsongen "github.com/darjun/json-gen"
)

It is convenient to use:

m := jsongen.NewMap()
m.PutUint("key1", 123)
m.PutInt("key2", -456)
m.PutUintArray("key3", []uint64{78, 90})
data := m.Serialize(nil)

data holds the final serialized JSON. Types can be nested arbitrarily; see the code on GitHub for details.

The GitHub repository includes a benchmark. In the results below, json-gen takes roughly one seventh of the time of encoding/json and performs a single allocation instead of 127:

Library            Time/op (ns)   B/op   allocs/op
encoding/json      22209          6673   127
darjun/json-gen    3300           1152   1
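
The allocs/op column can be checked independently for the standard library side. The following is only a sketch: the payload is an assumed sample modeled on the usage example, not the payload used in the repository's benchmark, and marshalAllocs is a hypothetical helper name. It uses testing.AllocsPerRun to count the average number of allocations per json.Marshal call.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// marshalAllocs reports the average number of heap allocations made by one
// json.Marshal call on a small sample payload (the payload is an assumption).
func marshalAllocs() float64 {
	payload := map[string]interface{}{
		"key1": uint64(123),
		"key2": int64(-456),
		"key3": []uint64{78, 90},
	}
	return testing.AllocsPerRun(1000, func() {
		if _, err := json.Marshal(payload); err != nil {
			panic(err)
		}
	})
}

func main() {
	fmt.Printf("encoding/json allocations per Marshal: %.0f\n", marshalAllocs())
}
```

The exact number depends on the Go version and the payload, so no expected output is given here.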

Implementation

First, an interface Value is defined, which is implemented by all values that can be serialized into JSON:

type Value interface {
  Serialize(buf []byte) []byte
  Size() int
}

  • Serialize accepts an already allocated buffer, appends the serialized JSON to buf, and returns the extended slice.
  • Size returns the number of bytes the value occupies in the final JSON string.

Classification

I classify the values that can be serialized into JSON strings into four categories:

  • QuotedValue: a value that must be wrapped in double quotes (") in the final string, such as a Go string.
  • UnquotedValue: a value that needs no surrounding quotes in the final string, such as uint/int/bool/float32.
  • Array: Corresponds to an array in JSON.
  • Map: Maps in JSON.

For now these four types cover my needs, and extending the library later is easy: a new type only has to implement the Value interface. The implementations of the four types are discussed below in terms of Value's two methods.
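
To illustrate how such an extension might look, here is a minimal sketch of a fifth value type. NullValue is a hypothetical name chosen for this example and is not part of json-gen; the Value interface is copied from the article so the snippet is self-contained.

```go
package main

import "fmt"

// Value mirrors the interface defined in the article.
type Value interface {
	Serialize(buf []byte) []byte
	Size() int
}

// NullValue is a hypothetical extension type representing JSON null.
type NullValue struct{}

// Serialize appends the literal null to buf.
func (NullValue) Serialize(buf []byte) []byte {
	return append(buf, "null"...)
}

// Size reports the four bytes that the literal null occupies.
func (NullValue) Size() int { return 4 }

func main() {
	var v Value = NullValue{}
	out := v.Serialize(make([]byte, 0, v.Size()))
	fmt.Println(string(out), len(out) == v.Size()) // prints: null true
}
```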

QuotedValue

The underlying definition of QuotedValue is based on string type:

type QuotedValue string

Because a QuotedValue ends up surrounded by two " characters in the JSON string, its size is its length + 2. The Serialize and Size methods are implemented as follows:

func (q QuotedValue) Serialize(buf []byte) []byte {
  buf = append(buf, '"')
  buf = append(buf, []byte(q)...)
  return append(buf, '"')
}

func (q QuotedValue) Size() int {
  return len(q) + 2
}
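
As a quick check of the length + 2 rule, the following self-contained sketch repeats the two methods and serializes one value into a buffer whose capacity is exactly Size(). The value "hello" is just an example input.

```go
package main

import "fmt"

type QuotedValue string

// Serialize appends the value wrapped in double quotes.
func (q QuotedValue) Serialize(buf []byte) []byte {
	buf = append(buf, '"')
	buf = append(buf, []byte(q)...)
	return append(buf, '"')
}

// Size is the string length plus the two quote characters.
func (q QuotedValue) Size() int {
	return len(q) + 2
}

func main() {
	q := QuotedValue("hello")
	buf := make([]byte, 0, q.Size()) // capacity of exactly 7 bytes
	out := q.Serialize(buf)
	// The capacity is unchanged, so append never had to grow the buffer.
	fmt.Println(string(out), len(out), cap(out)) // prints: "hello" 7 7
}
```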

UnquotedValue

UnquotedValue is also defined based on string type:

type UnquotedValue string

Unlike QuotedValue, UnquotedValue needs no surrounding ". Its Serialize and Size methods follow the same pattern as above, minus the quotes, so they are even simpler.
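
The article does not list these two methods, so the following is a sketch of what they might look like, assuming they simply drop the quote handling from the QuotedValue versions.

```go
package main

import (
	"fmt"
	"strconv"
)

type UnquotedValue string

// Serialize appends the value as-is, without quotes (assumed implementation).
func (u UnquotedValue) Serialize(buf []byte) []byte {
	return append(buf, []byte(u)...)
}

// Size is simply the length of the underlying string (assumed implementation).
func (u UnquotedValue) Size() int {
	return len(u)
}

func main() {
	u := UnquotedValue(strconv.FormatInt(-456, 10))
	out := u.Serialize(nil)
	fmt.Println(string(out), u.Size()) // prints: -456 4
}
```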

Array

Array represents a JSON array. Because a JSON array can contain values of any type, Array is defined with []Value as its underlying type:

type Array []Value

Thus, the bytes an Array occupies in the final JSON string consist of the sizes of all elements, the , separators between elements, and the enclosing [ and ]. The Size method is implemented as follows:

func (a Array) Size() int {
  size := 0
  for _, e := range a {
    // Recursive calculation of element size
    size += e.Size()
  }

  // for []
  size += 2
  if len(a) > 1 {
    // for ,
    size += len(a) - 1
  }

  return size
}

The Serialize method calls Serialize on each element recursively, puts a , between elements, and wraps the whole array in [ and ].

func (a Array) Serialize(buf []byte) []byte {
  if len(buf) == 0 {
    // If no allocated space is passed in, allocate space according to Size
    buf = make([]byte, 0, a.Size())
  }

  buf = append(buf, '[')
  count := len(a)
  for i, e := range a {
    buf = e.Serialize(buf)
    if i != count-1 {
      // Append , after every element except the last
      buf = append(buf, ',')
    }
  }

  return append(buf, ']')
}
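
To see Size and Serialize working together, the sketch below puts the pieces into one runnable program and serializes a nested array. The UnquotedValue methods are an assumed implementation (the article does not list them). For [78,[90]], the inner array needs 2 + 2 = 4 bytes and the outer one 2 + 4 + 2 (brackets) + 1 (comma) = 9 bytes.

```go
package main

import "fmt"

type Value interface {
	Serialize(buf []byte) []byte
	Size() int
}

type UnquotedValue string

func (u UnquotedValue) Serialize(buf []byte) []byte { return append(buf, []byte(u)...) }
func (u UnquotedValue) Size() int                   { return len(u) }

type Array []Value

func (a Array) Size() int {
	size := 2 // for [ and ]
	for _, e := range a {
		size += e.Size()
	}
	if len(a) > 1 {
		size += len(a) - 1 // for the commas between elements
	}
	return size
}

func (a Array) Serialize(buf []byte) []byte {
	if len(buf) == 0 {
		// Only the outermost call reaches this branch when nil is passed in.
		buf = make([]byte, 0, a.Size())
	}
	buf = append(buf, '[')
	for i, e := range a {
		buf = e.Serialize(buf)
		if i != len(a)-1 {
			buf = append(buf, ',')
		}
	}
	return append(buf, ']')
}

func main() {
	a := Array{UnquotedValue("78"), Array{UnquotedValue("90")}}
	out := a.Serialize(nil)
	// len and cap both equal Size, so the buffer was allocated once and never grew.
	fmt.Println(string(out), a.Size(), len(out), cap(out)) // prints: [78,[90]] 9 9 9
}
```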

To make arrays convenient to work with, I added a number of methods to Array. Each commonly used basic type, as well as Array and Map, has corresponding methods, named AppendType and AppendTypeArray (where Type is uint/int/bool/float/Array/Map or another type name).

Apart from string, Array and Map, the basic types are converted to a string with strconv and then cast to UnquotedValue, because they need no surrounding ".

func (a *Array) AppendUint(u uint64) {
    value := strconv.FormatUint(u, 10)

    *a = append(*a, UnquotedValue(value))
}

func (a *Array) AppendString(value string) {
    *a = append(*a, QuotedValue(escapeString(value)))
}

func (a *Array) AppendUintArray(u []uint64) {
    value := make([]Value, 0, len(u))
    for _, v := range u {
        value = append(value, UnquotedValue(strconv.FormatUint(v, 10)))
    }

    *a = append(*a, Array(value))
}

func (a *Array) AppendStringArray(s []string) {
    value := make([]Value, 0, len(s))
    for _, v := range s {
        value = append(value, QuotedValue(escapeString(v)))
    }

    *a = append(*a, Array(value))
}
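
The escapeString helper used above is not shown in the article. Since QuotedValue.Size counts len(q), any escaping has to happen before the value is stored, which is consistent with where the calls appear. The following is a hypothetical stand-in for that helper, not the library's actual code; it escapes double quotes, backslashes and control characters and passes all other bytes through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeString is a hypothetical stand-in for the helper named in the article.
// It escapes ", \ and control characters so the result is safe inside a JSON string.
func escapeString(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == '"' || c == '\\':
			b.WriteByte('\\')
			b.WriteByte(c)
		case c == '\n':
			b.WriteString(`\n`)
		case c == '\r':
			b.WriteString(`\r`)
		case c == '\t':
			b.WriteString(`\t`)
		case c < 0x20:
			// Remaining control characters use the \u00XX form.
			fmt.Fprintf(&b, `\u%04x`, c)
		default:
			// Includes the bytes of multi-byte UTF-8 sequences, copied unchanged.
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(escapeString("say \"hi\"\n")) // prints: say \"hi\"\n
}
```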

One thing to note: because the Append* methods modify the Array itself (a slice), they must use a pointer receiver.
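
The difference is easy to demonstrate. In this sketch Array is simplified to a slice of strings, and appendByValue and appendByPointer are hypothetical method names used only for the comparison.

```go
package main

import "fmt"

// Array is simplified to a slice of strings for this illustration.
type Array []string

// appendByValue receives a copy of the slice header, so the caller
// never sees the appended element.
func (a Array) appendByValue(s string) {
	a = append(a, s)
	_ = a
}

// appendByPointer writes the grown slice back through the pointer.
func (a *Array) appendByPointer(s string) {
	*a = append(*a, s)
}

func main() {
	var a Array
	a.appendByValue("x")
	fmt.Println(len(a)) // prints: 0
	a.appendByPointer("y")
	fmt.Println(len(a)) // prints: 1
}
```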

Map

There were two options for implementing Map. The first is to define it as map[string]Value, which is structurally simple, but because Go randomizes map iteration order, the same Map could produce a different JSON string each time it is serialized. I chose the second option: keys and values are stored in separate slices, which guarantees that keys appear in the final JSON string in insertion order:

type Map struct {
  keys []string
  values []Value
}

Map size consists of several parts:

  • The sizes of the keys and values.
  • The enclosing { and }.
  • The two " around each key.
  • A : between each key and its value.
  • A , between adjacent key-value pairs.

With these components in mind, the implementation of the Size method is simple:

func (m Map) Size() int {
    size := 0
    for i, key := range m.keys {
        // +2 for ", +1 for :
        size += len(key) + 2 + 1
        size += m.values[i].Size()
    }

    // +2 for {}
    size += 2

    if len(m.keys) > 1 {
        // for ,
        size += len(m.keys) - 1
    }

    return size
}

Serialize assembles multiple key-value pairs:

func (m Map) Serialize(buf []byte) []byte {
    if len(buf) == 0 {
        buf = make([]byte, 0, m.Size())
    }

    buf = append(buf, '{')
    count := len(m.keys)
    for i, key := range m.keys {
        buf = append(buf, '"')
        buf = append(buf, []byte(key)...)
        buf = append(buf, '"')
        buf = append(buf, ':')
        buf = m.values[i].Serialize(buf)
        if i != count-1 {
            buf = append(buf, ',')
        }
    }
    return append(buf, '}')
}

As with Array, I added a number of methods to Map to make it convenient to work with. Each commonly used basic type, as well as Array and Map, has corresponding methods, named PutType and PutTypeArray (where Type is uint/int/bool/float/Array/Map, etc.).

func (m *Map) put(key string, value Value) {
    m.keys = append(m.keys, key)
    m.values = append(m.values, value)
}

func (m *Map) PutUint(key string, u uint64) {
    value := strconv.FormatUint(u, 10)

    m.put(key, UnquotedValue(value))
}

func (m *Map) PutUintArray(key string, u []uint64) {
    value := make([]Value, 0, len(u))
    for _, v := range u {
        value = append(value, UnquotedValue(strconv.FormatUint(v, 10)))
    }

    m.put(key, Array(value))
}
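
Putting the Map pieces together, the sketch below builds a two-field object and checks that the output keeps insertion order and that Size matches the serialized length. The UnquotedValue methods and the PutInt signature (taking an int64) are assumptions, since the article does not show them. For {"key1":123,"key2":-456}, each key contributes 4 + 2 + 1 = 7 bytes, the values 3 + 4 bytes, the braces 2 and the comma 1, for a total of 24.

```go
package main

import (
	"fmt"
	"strconv"
)

type Value interface {
	Serialize(buf []byte) []byte
	Size() int
}

type UnquotedValue string

func (u UnquotedValue) Serialize(buf []byte) []byte { return append(buf, []byte(u)...) }
func (u UnquotedValue) Size() int                   { return len(u) }

type Map struct {
	keys   []string
	values []Value
}

func (m *Map) put(key string, value Value) {
	m.keys = append(m.keys, key)
	m.values = append(m.values, value)
}

func (m *Map) PutUint(key string, u uint64) {
	m.put(key, UnquotedValue(strconv.FormatUint(u, 10)))
}

// PutInt is assumed to take an int64; the article only shows it being called.
func (m *Map) PutInt(key string, i int64) {
	m.put(key, UnquotedValue(strconv.FormatInt(i, 10)))
}

func (m Map) Size() int {
	size := 2 // for { and }
	for i, key := range m.keys {
		size += len(key) + 2 + 1 // key, two quotes, one colon
		size += m.values[i].Size()
	}
	if len(m.keys) > 1 {
		size += len(m.keys) - 1 // commas between pairs
	}
	return size
}

func (m Map) Serialize(buf []byte) []byte {
	if len(buf) == 0 {
		buf = make([]byte, 0, m.Size())
	}
	buf = append(buf, '{')
	for i, key := range m.keys {
		buf = append(buf, '"')
		buf = append(buf, key...)
		buf = append(buf, '"', ':')
		buf = m.values[i].Serialize(buf)
		if i != len(m.keys)-1 {
			buf = append(buf, ',')
		}
	}
	return append(buf, '}')
}

func main() {
	m := &Map{}
	m.PutUint("key1", 123)
	m.PutInt("key2", -456)
	out := m.Serialize(nil)
	fmt.Println(string(out), m.Size(), len(out), cap(out))
	// prints: {"key1":123,"key2":-456} 24 24 24
}
```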

Epilogue

To meet my own needs, I implemented a library for generating JSON strings that improves performance considerably. It is not perfect, but it is easy to extend. I hope it offers some inspiration to anyone with similar needs.

Posted by Submerged on Wed, 09 Oct 2019 02:40:50 -0700