Network programming: Okio, a more concise and efficient alternative to Java IO and NIO

Keywords: Java Android socket

Introduction

Unlike the ubiquitous java.io and java.nio packages, OkHttp in its early days required adding the Okio library as an extra Gradle dependency. Like most developers, I paid no attention to it at first. Only after I got into the habit of digging into the core of various frameworks did I realize that Okio is a treasure trove, with many design ideas worth learning from. So, a suggestion up front: abandon traditional Java IO and NIO, and whether you write Java or Android, embrace Okio as soon as possible. Much of this article is adapted from the official Okio documentation.

Based on Okio 1.x

1. Okio overview

Okio was originally a component of OkHttp, serving as its internal IO library, and it is an excellent component that is more efficient than the JDK's native IO and NIO. Under the hood, Okio uses an object (segment) pool to avoid frequent GC, and it abstracts IO into a small set of highly unified objects (somewhat like the socket design), which makes accessing, storing and processing data more efficient and convenient.

2. Core elements of Okio

1. Two data types of Okio

Okio is highly abstract, and it is mainly built around two data types:

  • ByteStrings
  • Buffers

All data operations are abstracted and unified into corresponding APIs. It is precisely this internal design that lets Okio save CPU and memory compared with traditional IO.

1.1. ByteStrings -- immutable byte sequence

The ByteString class implements the JDK's Serializable and Comparable interfaces. In essence, it is an immutable byte sequence; the most basic analogous type for character data is String. A ByteString is like a smarter String because it comes with flexible encoding and decoding for hexadecimal, base64 and UTF-8.

When encoding a string into ByteString using UTF-8, the string will be cached internally and can be directly obtained from the cache during decoding.
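As a small sketch of these codecs (assuming Okio 1.x on the classpath; the class name is just for illustration):

```java
import okio.ByteString;

public class ByteStringDemo {
    public static void main(String[] args) {
        // Encode a String into an immutable ByteString.
        ByteString bs = ByteString.encodeUtf8("Okio");

        // Built-in codecs: hex, base64, and back to UTF-8.
        System.out.println(bs.hex());    // 4f6b696f
        System.out.println(bs.base64());
        System.out.println(bs.utf8());   // Okio

        // decodeHex() recovers the same byte sequence.
        System.out.println(ByteString.decodeHex("4f6b696f").equals(bs)); // true
    }
}
```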

1.2 Buffers -- mutable byte sequence

The Buffer class implements the BufferedSource, BufferedSink, Cloneable and ByteChannel interfaces defined by Okio. In essence, it is a mutable byte sequence. Like ArrayList, you do not need to set the size of a Buffer in advance. You read and write a Buffer as a queue: write data to the tail, then read it from the head. Internally, a Buffer is implemented as a linked list of segments; when data moves from one buffer to another, ownership of the segments is reassigned rather than copying bytes across segments. This is especially useful for multithreading: a thread talking to the network can exchange data with a worker thread without any copying or redundant work.
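A minimal sketch of the queue behavior (Okio 1.x assumed; note no capacity is declared anywhere):

```java
import okio.Buffer;

public class BufferQueueDemo {
    public static void main(String[] args) throws Exception {
        Buffer buffer = new Buffer();   // grows on demand, like ArrayList
        buffer.writeUtf8("hello ");     // writes append at the tail
        buffer.writeUtf8("world");

        System.out.println(buffer.readUtf8(6)); // reads consume from the head: "hello "
        System.out.println(buffer.readUtf8());  // the rest: "world"
    }
}
```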

2. Stream types of Okio

Okio also highly abstracts the Stream type. All streams can be represented by the following two types:

  • Source
  • Sink

They can be regarded as the base interfaces of all Okio streams.

2.1. Source - InputStream in Okio

Supplies a stream of bytes. Use this interface to read data from wherever it's located: from the network, storage, or a buffer in memory.

public interface Source extends Closeable {
  /**
   * Removes at least 1, and up to {@code byteCount} bytes from this and appends
   * them to {@code sink}. Returns the number of bytes read, or -1 if this
   * source is exhausted.
   */
  long read(Buffer sink, long byteCount) throws IOException;

  /** Returns the timeout for this source. */
  Timeout timeout();

  /**
   * Closes this source and releases the resources held by this source. It is an
   * error to read a closed source. It is safe to close a source more than once.
   */
  @Override void close() throws IOException;
}

When we need to read data, we can directly call the methods of Source and its sub-interface BufferedSource. The design of BufferedSource resembles the decorator pattern of the JDK's original I/O architecture: it plays the role of a decorator around a Source. When an input stream is obtained through Okio.buffer(fileSource), the method creates a RealBufferedSource object that implements the BufferedSource interface; the internal Buffer it holds is what makes IO faster.

2.2 Sink - OutputStream in Okio

Receives a stream of bytes. Use this interface to write data wherever it's needed: to the network, storage, or a buffer in memory. Sinks may be layered to transform received data, such as to compress, encrypt, throttle, or add protocol framing.

public interface Sink extends Closeable, Flushable {
  /** Removes {@code byteCount} bytes from {@code source} and appends them to this. */
  void write(Buffer source, long byteCount) throws IOException;

  /** Pushes all buffered bytes to their final destination. */
  @Override void flush() throws IOException;

  /** Returns the timeout for this sink. */
  Timeout timeout();

  /**
   * Pushes all buffered bytes to their final destination and releases the
   * resources held by this sink. It is an error to write a closed sink. It is
   * safe to close a sink more than once.
   */
  @Override void close() throws IOException;
}

When we need to write data, we can directly call the methods of Sink and its sub-interface BufferedSink. The design of BufferedSink mirrors that of BufferedSource. When an output stream is obtained through Okio.buffer(fileSink), the method creates a RealBufferedSink object that implements the BufferedSink interface; the internal Buffer it holds is what makes IO faster.

2.3 comparison between Okio Stream and JDK Stream

Source and Sink interoperate with InputStream and OutputStream in the JDK: any Source can be viewed as an InputStream, and any InputStream can be viewed as a Source; the same holds between Sink and OutputStream. But there are some differences between them:

  • Timeouts. Unlike the socket streams of java.io, Okio streams support timeouts on the underlying I/O: both read() and write() honor a timeout mechanism.

  • Easy to implement, and safe. Source defines only three methods -- read(), close(), and timeout() -- which avoids the hidden pitfalls of APIs such as java.io.InputStream#available() or single-byte reads.

    Source omits the impossible-to-implement java.io.InputStream#available() method. Instead, callers specify how many bytes they need up front via the BufferedSource#require() method.

  • Easy to use. Although the base interfaces Source and Sink have only three methods each, their sub-interfaces BufferedSource and BufferedSink have rich APIs. Besides supporting your own extended implementations, they encapsulate implementation classes for most kinds of stream operations, so you can call them directly, making offset-style operations similar to C/C++ pointer arithmetic very simple.

  • No distinction between byte streams and character streams. It is all data: read and write it as bytes, UTF-8 strings, big-endian 32-bit integers, little-endian shorts, and so on. There is no need for the extra stream-conversion wrappers of the JDK; just call the corresponding method directly.
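For instance, a single Buffer can serve typed reads and writes directly, with no wrapper streams (a sketch assuming Okio 1.x):

```java
import okio.Buffer;

public class TypedDataDemo {
    public static void main(String[] args) throws Exception {
        Buffer buffer = new Buffer();
        buffer.writeInt(3);        // big-endian 32-bit int
        buffer.writeShortLe(3);    // little-endian 16-bit short
        buffer.writeUtf8("abc");   // UTF-8 text, same buffer

        System.out.println(buffer.readInt());     // 3
        System.out.println(buffer.readShortLe()); // 3
        System.out.println(buffer.readUtf8(3));   // abc
    }
}
```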

3. Simple use of Okio

1. BufferedSource reads text files

  • Get the source input stream through the Okio.source(file) method
  • Then wrap the original input stream into BufferedSource
  • Call the BufferedSource method. For example, call readLines when reading in line form.
public void readLines(File file) throws IOException {
    // Whether the resources in the try are user-created or JDK built-in types,
    // try-with-resources is a robust way to ensure they are closed correctly.
    try (Source fileSource = Okio.source(file);
         BufferedSource bufferedSource = Okio.buffer(fileSource)) {
        while (true) {
            String line = bufferedSource.readUtf8Line();
            if (line == null) break;

            if (line.contains("square")) {
                System.out.println(line);
            }
        }
    }
}

Another form:

public void readLines(File file) throws IOException {
  try (BufferedSource source = Okio.buffer(Okio.source(file))) {
    for (String line; (line = source.readUtf8Line()) != null; ) {
      if (line.contains("square")) {
        System.out.println(line);
      }
    }
  }
}

Try-with-resources is syntactic sugar introduced in JDK 1.7: for each resource object declared in the try (...), the JDK automatically calls its close() method, so you don't have to close it manually in a finally block, even when the try declares multiple resources. Note that using it on Android requires API level 19 or higher.

Generally speaking, the readUtf8Line() method can read most files, but for some special cases consider readUtf8LineStrict(). The function is similar; the difference is that readUtf8LineStrict() requires each line to end with \n or \r\n. If it reaches the end of the file before finding one, it throws an EOFException.

After JDK 1.7, you can call the System.lineSeparator() method to obtain the line separator of the current system.

  • \n -- end-of-line marker on Unix systems
  • \r\n -- end-of-line marker on Windows systems
public void readLines(File file) throws IOException {
  try (BufferedSource source = Okio.buffer(Okio.source(file))) {
    while (!source.exhausted()) {
      /*
       * From the readUtf8LineStrict(limit) javadoc: the returned string will
       * have at most {@code limit} UTF-8 bytes, and the maximum number of
       * bytes scanned is {@code limit + 2}. If {@code limit == 0} this will
       * always throw an {@code EOFException} because no bytes will be scanned.
       *
       *   Buffer buffer = new Buffer();
       *   buffer.writeUtf8("12345\r\n");
       *   // This will throw! There must be \r\n or \n at the limit or before it.
       *   buffer.readUtf8LineStrict(4);
       *   // No bytes have been consumed so the caller can retry.
       *   assertEquals("12345", buffer.readUtf8LineStrict(5));
       */
      String line = source.readUtf8LineStrict(1024L);
      System.out.println(line);
    }
  }
}

In addition to text files, Okio also works with Java serialization via ObjectOutputStream and ObjectInputStream.

2. Serialization and deserialization

2.1. Serialize the object into ByteString

Okio's Buffer replaces the JDK's ByteArrayOutputStream: obtain an output stream from the Buffer, and write the object into it through the JDK's ObjectOutputStream. Data written to the Buffer always goes to its tail (determined by the internal data structure). Finally, readByteString() on the Buffer reads a ByteString back out, consuming from the head. The readByteString() method can take the number of bytes to read; if unspecified, it reads the entire contents.

private ByteString serialize(Object o) throws IOException {
  Buffer buffer = new Buffer();
  try (ObjectOutputStream objectOut = new ObjectOutputStream(buffer.outputStream())) {
    objectOut.writeObject(o);
  }
  return buffer.readByteString();
}

Through the method above you can serialize an object into a ByteString, and then directly call ByteString#base64() to Base64-encode it:

Point point = new Point(8.0, 15.0);
ByteString pointBytes = serialize(point);
System.out.println(pointBytes.base64());

Okio calls the string obtained after Base64 processing a "golden value". ByteString#decodeBase64 turns a golden value back into a ByteString:

ByteString goldenBytes = ByteString.decodeBase64(pointBytes.base64());

2.2. Deserialize ByteString as an object

You can recover the original ByteString from the golden value, construct the JDK's ObjectInputStream over it, and deserialize the object via its readObject() method:

private Object deserialize(ByteString byteString) throws IOException, ClassNotFoundException {
  Buffer buffer = new Buffer();
  buffer.write(byteString);
  try (ObjectInputStream objectIn = new ObjectInputStream(buffer.inputStream())) {
    return objectIn.readObject();
  }
}

In short, golden values can be used to verify that serialization and deserialization produce consistent objects.

Point point = new Point(8.0, 15.0);
ByteString pointBytes = serialize(point);
String goldenValue = pointBytes.base64();
ByteString goldenBytes = ByteString.decodeBase64(goldenValue);
Point decoded = (Point) deserialize(goldenBytes);
assertEquals(new Point(8.0, 15.0), decoded);//true

An obvious difference between Okio serialization and JDK native serialization is that a golden value is portable across clients (as long as the same class is serialized and deserialized). For example, a golden-value string generated by serializing a User object with Okio on a PC can still be deserialized into a User object when received on a mobile device.

3. BufferedSink writes text files

  • Get the Sink output stream through the Okio.sink(file) method
  • Then wrap the original output stream into BufferedSink
  • Call the write series method of BufferedSink
public void writeEnv(File file) throws IOException {
  try (Sink fileSink = Okio.sink(file);
       BufferedSink bufferedSink = Okio.buffer(fileSink)) {

    for (Map.Entry<String, String> entry : System.getenv().entrySet()) {
      bufferedSink.writeUtf8(entry.getKey());
      bufferedSink.writeUtf8("=");
      bufferedSink.writeUtf8(entry.getValue());
      bufferedSink.writeUtf8("\n"); // or System.lineSeparator() for the platform's line ending
    }
  }
}

The other write methods follow the same design as the read methods. Most of the write-series methods use UTF-8 encoding, and Okio recommends preferring UTF-8 because it has been standardized worldwide. If you need other character sets, use readString() and writeString(), which take an explicit character-encoding parameter; in most cases, though, you should stick to the UTF-8 methods.
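A sketch of the Charset-parameterized variants (writeString/readString are from the BufferedSink/BufferedSource API; Okio 1.x assumed):

```java
import java.nio.charset.Charset;
import okio.Buffer;

public class CharsetDemo {
    public static void main(String[] args) throws Exception {
        Charset latin1 = Charset.forName("ISO-8859-1");
        Buffer buffer = new Buffer();

        // Only reach for these when the data genuinely isn't UTF-8.
        buffer.writeString("héllo", latin1);           // 5 bytes in ISO-8859-1
        System.out.println(buffer.readString(latin1)); // héllo
    }
}
```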

4. Write binary byte sequence

Every write in Okio is initiated through BufferedSink. Compared with traditional IO, Okio makes writing binary byte sequences particularly convenient, achieving the effect of pointer-offset insertion in C/C++. This introduces a few concepts:

  • The width of each field -- the number of bytes written. Okio has no mechanism to write a partial byte; if necessary, perform bit shifts and similar operations before writing.

  • The endianness of each field -- multi-byte values in a computer are ordered either from the most significant byte to the least significant (big endian) or from the least significant byte to the most significant (little endian).

    Okio's little-endian methods all carry the Le suffix; methods without a suffix default to big endian.

  • Signed vs unsigned -- in Java, every primitive type except char is signed. Nevertheless, an "unsigned" byte value such as 255 (as an int) can be passed directly into writeByte() and writeShort(); Okio handles the truncation itself.

method        width  byte order  value            encoded value
writeByte     1      --          3                03
writeShort    2      big         3                00 03
writeInt      4      big         3                00 00 00 03
writeLong     8      big         3                00 00 00 00 00 00 00 03
writeShortLe  2      little      3                03 00
writeIntLe    4      little      3                03 00 00 00
writeLongLe   8      little      3                03 00 00 00 00 00 00 00
writeByte     1      --          Byte.MAX_VALUE   7f
writeShort    2      big         Short.MAX_VALUE  7f ff
writeInt      4      big         Int.MAX_VALUE    7f ff ff ff
writeLong     8      big         Long.MAX_VALUE   7f ff ff ff ff ff ff ff
writeShortLe  2      little      Short.MAX_VALUE  ff 7f
writeIntLe    4      little      Int.MAX_VALUE    ff ff ff 7f
writeLongLe   8      little      Long.MAX_VALUE   ff ff ff ff ff ff ff 7f
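The rows above can be verified directly, since a Buffer exposes its encoded bytes (a sketch, Okio 1.x assumed):

```java
import okio.Buffer;

public class EndiannessDemo {
    public static void main(String[] args) {
        // Big endian: most significant byte first.
        System.out.println(new Buffer().writeShort(3).readByteString().hex());   // 0003
        System.out.println(new Buffer().writeInt(3).readByteString().hex());     // 00000003

        // Little endian ("Le" suffix): least significant byte first.
        System.out.println(new Buffer().writeShortLe(3).readByteString().hex()); // 0300
        System.out.println(new Buffer().writeIntLe(3).readByteString().hex());   // 03000000
    }
}
```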

The following example implements the BMP file format encoding, put together quickly with Okio.

public final class BitmapEncoder {
    static final class Bitmap {
        private final int[][] pixels;
        Bitmap(int[][] pixels) {
            this.pixels = pixels;
        }
        int width() {
            return pixels[0].length;
        }
        int height() {
            return pixels.length;
        }
        int red(int x, int y) {
            return (pixels[y][x] & 0xff0000) >> 16;
        }
        int green(int x, int y) {
            return (pixels[y][x] & 0xff00) >> 8;
        }
        int blue(int x, int y) {
            return (pixels[y][x] & 0xff);
        }
    }
    /**
     * Returns a bitmap that lights up red subpixels at the bottom, green subpixels on the right, and
     * blue subpixels in bottom-right.
     */
    Bitmap generateGradient() {
        int[][] pixels = new int[1080][1920];
        for (int y = 0; y < 1080; y++) {
            for (int x = 0; x < 1920; x++) {
                int r = (int) (y / 1080f * 255);
                int g = (int) (x / 1920f * 255);
                int b = (int) ((Math.hypot(x, y) / Math.hypot(1080, 1920)) * 255);
                pixels[y][x] = r << 16 | g << 8 | b;
            }
        }
        return new Bitmap(pixels);
    }

    void encode(Bitmap bitmap, File file) throws IOException {
        try (BufferedSink sink = Okio.buffer(Okio.sink(file))) {
            encode(bitmap, sink);
        }
    }

    /**
     * https://en.wikipedia.org/wiki/BMP_file_format
     */
    void encode(Bitmap bitmap, BufferedSink sink) throws IOException {
        int height = bitmap.height();
        int width = bitmap.width();

        int bytesPerPixel = 3;
        int rowByteCountWithoutPadding = (bytesPerPixel * width);
        int rowByteCount = ((rowByteCountWithoutPadding + 3) / 4) * 4;
        int pixelDataSize = rowByteCount * height;
        int bmpHeaderSize = 14;
        int dibHeaderSize = 40;

        // BMP Header
        sink.writeUtf8("BM"); // ID.
        sink.writeIntLe(bmpHeaderSize + dibHeaderSize + pixelDataSize); // File size.
        sink.writeShortLe(0); // Unused.
        sink.writeShortLe(0); // Unused.
        sink.writeIntLe(bmpHeaderSize + dibHeaderSize); // Offset of pixel data.

        // DIB Header
        sink.writeIntLe(dibHeaderSize);
        sink.writeIntLe(width);
        sink.writeIntLe(height);
        sink.writeShortLe(1);  // Color plane count.
        sink.writeShortLe(bytesPerPixel * Byte.SIZE);
        sink.writeIntLe(0);    // No compression.
        sink.writeIntLe(16);   // Size of bitmap data including padding.
        sink.writeIntLe(2835); // Horizontal print resolution in pixels/meter. (72 dpi).
        sink.writeIntLe(2835); // Vertical print resolution in pixels/meter. (72 dpi).
        sink.writeIntLe(0);    // Palette color count.
        sink.writeIntLe(0);    // 0 important colors.
        // Pixel data.
        for (int y = height - 1; y >= 0; y--) {
            for (int x = 0; x < width; x++) {
                sink.writeByte(bitmap.blue(x, y));
                sink.writeByte(bitmap.green(x, y));
                sink.writeByte(bitmap.red(x, y));
            }

            // Padding for 4-byte alignment.
            for (int p = rowByteCountWithoutPadding; p < rowByteCount; p++) {
                sink.writeByte(0);
            }
        }
    }
    public static void main(String[] args) throws Exception {
        BitmapEncoder encoder = new BitmapEncoder();
        Bitmap bitmap = encoder.generateGradient();
        encoder.encode(bitmap, new File("gradient.bmp"));
    }
}

The code writes binary data to the file in BMP format, producing a viewable BMP image. The BMP format requires each pixel row to be padded to a multiple of 4 bytes, which is why the code appends zeros for byte alignment. Encoding other binary formats is very similar, and a few points are worth noting:

  • Writing tests with golden values makes it easier to debug and to confirm the program's expected results.
  • Use the Utf8.size() method to calculate the byte length of the encoded string.
  • Use Float.floatToIntBits() and Double.doubleToLongBits() to encode floating-point values.
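The last two points can be sketched with plain JDK calls (Utf8.size() in Okio computes the same count as encoding the string to UTF-8 bytes):

```java
import java.nio.charset.StandardCharsets;

public class BinaryEncodingHelpers {
    public static void main(String[] args) {
        // Byte length of a string once UTF-8 encoded (what okio.Utf8.size() returns).
        String s = "héllo";
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 6

        // Fixed-width integer encodings of floating-point values.
        System.out.println(Integer.toHexString(Float.floatToIntBits(1.0f))); // 3f800000
        System.out.println(Long.toHexString(Double.doubleToLongBits(1.0)));  // 3ff0000000000000
    }
}
```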

5. Using Okio in Socket communication

The essence of network communication is I/O, so Okio matters a great deal for network performance. Okio uses BufferedSink to encode output and BufferedSource to decode input. Network protocols can be text, binary, or a mixture of both.

There are some substantive differences between networks and file systems, though. With a file you choose to read or write, but a network connection can be read and written at the same time, which raises thread-safety questions. In some protocols the two sides therefore take turns: write a request, read the response, and repeat. When multiple threads are involved, a dedicated thread is usually required to read data, and writes go through a dedicated thread or are guarded with synchronized. If multiple threads share one Sink, thread safety must be considered: Okio streams are not safe for concurrent use by default.

Okio does flush automatically once the buffered data exceeds a threshold, but that exists only to save memory and cannot be relied on for protocol interaction. To push data out and minimize I/O operations, you must call flush() on the Sink's buffer manually.

Okio builds connections on top of java.net.Socket. After creating a server or client socket, use Okio.source(Socket) to read and Okio.sink(Socket) to write (these APIs also work with SSLSocket). From any thread, you can call Socket.close() to close the connection, which causes the sources and sinks to fail immediately with an IOException. Okio can configure timeouts for all socket operations, and you do not need to call socket methods to set them: Source and Sink expose a Timeout interface.

private void handleSocket(final Socket fromSocket) {
    try {
      final BufferedSource fromSource = Okio.buffer(Okio.source(fromSocket));
      final BufferedSink fromSink = Okio.buffer(Okio.sink(fromSocket));
	  //..............
	  //..................
    }  catch (IOException e) {
       .....
    }
  }

As you can see, creating sources and sinks from a Socket works the same as from a file: first obtain the Source or Sink corresponding to the Socket via Okio.source() or Okio.sink(), then get the buffered decorator via Okio.buffer().

In Okio, once you create a Source or Sink for a Socket, you should no longer use the socket's InputStream or OutputStream directly.

Buffer buffer = new Buffer();
for (long byteCount; (byteCount = source.read(buffer, 8192L)) != -1; ) {
  sink.write(buffer, byteCount);
  sink.flush();
}

The loop reads data from the source into the buffer, writes it to the sink, and calls flush(). If you don't need to flush() after every write, the two statements in the for loop can be replaced by a single call: BufferedSink.writeAll(Source). The read() call passes 8192 as the maximum byte count. In fact, any number works, but Okio prefers 8 KiB because that is the most Okio can handle in a single system call.
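A sketch of the one-call form (Okio 1.x; writeAll() returns the number of bytes copied):

```java
import java.io.IOException;
import okio.BufferedSink;
import okio.Source;

public final class CopyAll {
    // Copies everything from source into sink in one call. Okio does the
    // chunked reads internally, so no explicit loop or per-chunk flush is needed.
    static long copy(Source source, BufferedSink sink) throws IOException {
        return sink.writeAll(source);
    }
}
```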

int addressType = fromSource.readByte() & 0xff;
int port = fromSource.readShort() & 0xffff;

Okio uses signed types such as byte and short, but protocols usually call for unsigned values. The preferred way to convert a signed value to its unsigned equivalent in Java is the bitwise AND (&) operator. Below is the conversion table for byte, short and int:

Java type  Signed range                    Unsigned range     Signed-to-unsigned formula
byte       -128...127                      0...255            int u = s & 0xff;
short      -32,768...32,767                0...65,535         int u = s & 0xffff;
int        -2,147,483,648...2,147,483,647  0...4,294,967,295  long u = s & 0xffffffffL;

Java has no primitive type that can represent an unsigned long.
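The formulas above can be checked with plain Java (no Okio involved):

```java
public class UnsignedConversions {
    public static void main(String[] args) {
        byte b = (byte) 0xFF;      // -1 when read as a signed byte
        short s = (short) 0xFFFF;  // -1 when read as a signed short
        int i = -1;

        int ub = b & 0xff;         // 255
        int us = s & 0xffff;       // 65535
        long ui = i & 0xffffffffL; // 4294967295

        System.out.println(ub + " " + us + " " + ui); // 255 65535 4294967295
    }
}
```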

6. Hash functions and HMAC

Cryptographic hash functions are widely used: HTTPS certificates, Git commits, BitTorrent integrity checks, blockchain blocks. Good use of hashes can improve the performance, privacy, security and simplicity of an application. Each cryptographic hash function accepts a variable-length byte stream and produces a fixed-length value called the "hash". Hash functions have the following important properties:

  • Deterministic -- the same input always produces the same output.
  • Uniform -- every output byte string is equally likely; it is difficult to find or create distinct inputs that produce the same output (a "collision").
  • One-way -- knowing the output does not help you find the input.
  • Well understood -- hashing is implemented in many environments and rigorously studied.

Okio ships with several common cryptographic hash functions:

  • MD5 -- a 128-bit (16-byte) hash. It is both insecure and obsolete because it is cheap to reverse! It is provided only because it is popular and convenient in low-security contexts.
  • SHA-1 -- a 160-bit (20-byte) hash. Creating SHA-1 collisions is now feasible; consider upgrading from SHA-1 to SHA-256.
  • SHA-256 -- a 256-bit (32-byte) hash. SHA-256 is widely understood and expensive to reverse. This is the hash most systems should use.
  • SHA-512 -- a 512-bit (64-byte) hash. It is very expensive to reverse.

6.1. Generate a hash from a ByteString

ByteString byteString = readByteString(new File("README.md"));
System.out.println("   md5: " + byteString.md5().hex());
System.out.println("  sha1: " + byteString.sha1().hex());
System.out.println("sha256: " + byteString.sha256().hex());
System.out.println("sha512: " + byteString.sha512().hex());

6.2. Generate a hash from a Buffer

Buffer buffer = readBuffer(new File("README.md"));
System.out.println("   md5: " + buffer.md5().hex());
System.out.println("  sha1: " + buffer.sha1().hex());
System.out.println("sha256: " + buffer.sha256().hex());
System.out.println("sha512: " + buffer.sha512().hex());

6.3. Generate a hash from a Source input stream

  try (HashingSource hashingSource = HashingSource.sha256(Okio.source(file));
       BufferedSource source = Okio.buffer(hashingSource)) {
      source.readAll(Okio.blackhole());
      System.out.println("    sha256: " + hashingSource.hash().hex());
  }

6.4. Generate a hash from a Sink output stream

try (HashingSink hashingSink = HashingSink.sha256(Okio.blackhole());
     BufferedSink sink = Okio.buffer(hashingSink);
     Source source = Okio.source(file)) {
     sink.writeAll(source);
     sink.close(); // Emit anything buffered.
     System.out.println("    sha256: " + hashingSink.hash().hex());
}

Okio also supports HMAC (hash-based message authentication code), which combines a secret key with a hash. Applications can use HMACs for data integrity and authentication.

Internally, Okio uses Java's java.security.MessageDigest for cryptographic hashes and javax.crypto.Mac for HMACs.

ByteString secret = ByteString.decodeHex("7065616e7574627574746572");
System.out.println("hmacSha256: " + byteString.hmacSha256(secret).hex());

Of course, HMAC can also be generated from ByteString, Buffer, HashingSource, and HashingSink. However, Okio does not implement HMAC for MD5.
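For reference, the same HMAC can be produced with the JDK classes Okio delegates to (note the hex key 7065616e7574627574746572 decodes to the ASCII string "peanutbutter"); a plain-JDK sketch:

```java
import java.nio.charset.StandardCharsets;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class HmacUnderTheHood {
    public static void main(String[] args) throws Exception {
        // Same key bytes as ByteString.decodeHex("7065616e7574627574746572").
        byte[] key = "peanutbutter".getBytes(StandardCharsets.UTF_8);

        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        byte[] tag = mac.doFinal("hello".getBytes(StandardCharsets.UTF_8));

        // SHA-256-based HMACs are always 32 bytes (64 hex characters).
        StringBuilder hex = new StringBuilder();
        for (byte x : tag) hex.append(String.format("%02x", x));
        System.out.println(hex.length()); // 64
    }
}
```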

Posted by 01hanstu on Sat, 25 Sep 2021 01:26:16 -0700