Serialization and deserialization, and the differences between serialization protocols

Keywords: Java Session network Web Server

1, What is serialization and deserialization?
  • Java serialization refers to the process of converting Java objects into byte sequences;
  • Java deserialization refers to the process of restoring byte sequences to Java objects.
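The two definitions above can be sketched as a round trip through a byte array. This is a minimal illustration using the JDK's ObjectOutputStream and ObjectInputStream; the class RoundTripDemo and its Point type are hypothetical names, not from the original article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; none of these names come from the article.
final class RoundTripDemo {

    // A minimal serializable type used only for this illustration.
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serialization: Java object -> byte sequence.
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return bos.toByteArray();
    }

    // Deserialization: byte sequence -> Java object.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new Point(3, 4));
        Point copy = (Point) fromBytes(bytes);
        System.out.println(bytes.length + " bytes, restored x=" + copy.x + ", y=" + copy.y);
    }
}
```

The restored object is a new instance with the same field values, not the original instance.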
2, What does serialization do?
  • Object persistence: the byte sequence of an object can be saved permanently to the hard disk, usually in a file, which frees memory;
    For example:
    When 100,000 users access a Web server concurrently, there may be 100,000 Session objects, which may be more than memory can hold.
    The Web container can then serialize some of the sessions to the hard disk to free memory, and restore them from disk into memory when they are needed again.
  • Network transmission: the byte sequence of an object can be sent over the network (all data sent over a network, whether text or pictures, travels as a binary sequence).
3, Serialization and deserialization in Java?
3.1 Java serialization API

There are three ways to serialize in Java:

  • If class A only implements the Serializable interface, then
    ObjectOutputStream uses the default serialization mechanism to serialize the non-transient instance variables of an A object
    ObjectInputStream uses the default deserialization mechanism to deserialize the non-transient instance variables of an A object
  • If class A implements the Serializable interface and also defines private writeObject(ObjectOutputStream out) and readObject(ObjectInputStream in) methods, then
    ObjectOutputStream calls the writeObject(ObjectOutputStream out) method of the A object for serialization
    ObjectInputStream calls the readObject(ObjectInputStream in) method of the A object for deserialization
  • If class A implements the Externalizable interface, it must implement the readExternal(ObjectInput in) and writeExternal(ObjectOutput out) methods and provide a public no-argument constructor, then
    ObjectOutputStream calls the writeExternal(ObjectOutput out) method of the A object for serialization
    ObjectInputStream calls the readExternal(ObjectInput in) method of the A object for deserialization
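The three ways listed above can be compared side by side. The following is a sketch only: DefaultUser, CustomUser, ExternalUser and their fields are hypothetical, and reversing the password string is a toy stand-in for whatever custom logic a real writeObject would perform.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// All class and field names below are hypothetical.
final class ThreeWaysDemo {

    // Way 1: Serializable only. The transient field is skipped by default.
    static final class DefaultUser implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient String password;

        DefaultUser(String name, String password) { this.name = name; this.password = password; }
    }

    // Way 2: Serializable plus private writeObject/readObject hooks.
    static final class CustomUser implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient String password;

        CustomUser(String name, String password) { this.name = name; this.password = password; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes the non-transient field (name)
            // Reversing the string is a toy stand-in for real encryption.
            out.writeObject(new StringBuilder(password).reverse().toString());
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            password = new StringBuilder((String) in.readObject()).reverse().toString();
        }
    }

    // Way 3: Externalizable. The class decides everything that is written.
    static final class ExternalUser implements Externalizable {
        String name;
        String password;

        public ExternalUser() { } // required: public no-argument constructor

        ExternalUser(String name, String password) { this.name = name; this.password = password; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name); // password is deliberately not written
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF();
        }
    }

    // Serializes obj to a byte array and reads it straight back.
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return (T) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new DefaultUser("xiaoxiao", "145263")).password);
        System.out.println(roundTrip(new CustomUser("xiaoxiao", "145263")).password);
        System.out.println(roundTrip(new ExternalUser("xiaoxiao", "145263")).password);
    }
}
```

In this sketch the password is lost in the first and third cases (the field is transient, or is simply never written), while the custom hooks in the second case carry it across.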
3.2 serialization steps
--1-- Create an object output stream, which can wrap another kind of output stream, such as a file output stream:
ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("c:\\object.out"));

--2-- Write the object with the object output stream's writeObject() method:
oos.writeObject(new A("xiaoxiao", "145263", "female"));
3.3 deserialization steps
--1-- Create an object input stream, which can wrap another kind of input stream, such as a file input stream:
ObjectInputStream ois = new ObjectInputStream(new FileInputStream("object.out"));

--2-- Read the object with the object input stream's readObject() method:
A a = (A) ois.readObject();

--3-- To read the data correctly and complete deserialization, objects must be read from the object input stream in the same order in which they were written to the object output stream.
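The serialization and deserialization steps, including the ordering rule, might look like this when put together. This is a sketch under assumptions: the article does not show class A, so the field names here are guesses, and a temporary file stands in for object.out.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class FileOrderDemo {

    // Stand-in for the article's class A; the field names are guesses.
    static final class A implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String id;
        final String gender;

        A(String name, String id, String gender) {
            this.name = name;
            this.id = id;
            this.gender = gender;
        }
    }

    // Creates a temporary file that is removed when the JVM exits.
    static File tempFile() {
        try {
            File file = File.createTempFile("object", ".out");
            file.deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes an A first, then a String.
    static void write(File file, A first, String second) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(file))) {
            oos.writeObject(first);
            oos.writeObject(second);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads in the same order as write(): the A first, then the String.
    static Object[] read(File file) {
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(file))) {
            A first = (A) ois.readObject();
            String second = (String) ois.readObject();
            return new Object[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        File file = tempFile();
        write(file, new A("xiaoxiao", "145263", "female"), "end of data");
        Object[] restored = read(file);
        System.out.println(((A) restored[0]).name + " / " + restored[1]);
    }
}
```

If read() cast the first object to String instead, the cast would fail with a ClassCastException, which is why the read order has to mirror the write order.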
3.4 what is the purpose of serialVersionUID, and how do explicit and implicit definitions differ?
  • What serialVersionUID is really for:
    During deserialization, the serialVersionUID stored in the byte stream is compared with the serialVersionUID of the local class. If we modify an entity class that implements the Serializable interface and does not declare a serialVersionUID, a new value will generally be generated after recompilation. Once the serialVersionUID does not match the previous one, deserialization fails.
  • Explicit versus implicit serialVersionUID:
  1. Implicit definition:
    When a field is added to the entity, the class recorded in the file stream and the class on the classpath, that is, the modified class, are treated as incompatible. As a safety mechanism, the program throws an error and refuses to load the object. For example, if the serialVersionUID of class A is not specified, the Java serialization runtime computes one by hashing a description of the class (its name, interfaces, fields and methods), similar to a fingerprint algorithm. Changing any of these changes the UID, while formatting changes such as extra whitespace in the source file do not. Therefore, after a field is added, a new UID is computed because the serialVersionUID was not explicitly specified. It is not the same as the one saved in the file, so deserialization fails with an InvalidClassException that reports the two inconsistent version numbers.
  2. Explicit definition:
    When the serialVersionUID is explicitly defined, a field or method can be added after serialization without changing the serialVersionUID, so data serialized earlier can still be deserialized and restored (a newly added field simply takes its default value).
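One way to see the two cases is to ask the JDK which serialVersionUID it would use for a class, via ObjectStreamClass.lookup(...).getSerialVersionUID(). The sketch below uses two hypothetical classes, Explicit and Implicit; the value 42L is arbitrary.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class SerialVersionUidDemo {

    // Explicit: the declared value is used no matter how the class evolves.
    static final class Explicit implements Serializable {
        private static final long serialVersionUID = 42L; // arbitrary value for the demo
        String name;
    }

    // Implicit: no declaration, so the runtime derives a value from the class shape.
    static final class Implicit implements Serializable {
        String name;
    }

    // Returns the serialVersionUID the serialization runtime associates with the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("explicit: " + uidOf(Explicit.class));
        System.out.println("implicit: " + uidOf(Implicit.class));
    }
}
```

Adding a field to Implicit and recompiling would normally change the second number, while the first stays at the declared value.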
4, Hessian serialization and deserialization?
4.1 define abstract serializer
public abstract class AbstractSerializer {
    /**
     * Serializes an object into a byte array.
     *
     * @param o the object to serialize
     * @return the serialized bytes
     */
    public abstract byte[] toBytes(Object o) throws Exception;

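    /**
     * Deserializes a byte array back into an object.
     *
     * @param bytes the serialized bytes
     * @param <T> the type the caller expects
     * @return the deserialized object
     */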
    public abstract <T> T fromBytes(byte[] bytes) throws Exception;
}
4.2 implementation of Hessian serializer
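// Imports needed by this class. Logger and LoggerFactory are assumed to be SLF4J.
import com.caucho.hessian.io.Hessian2Input;
import com.caucho.hessian.io.Hessian2Output;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HessianSerializer extends AbstractSerializer {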

    private static final Logger LOGGER = LoggerFactory.getLogger(HessianSerializer.class);

    private static final com.caucho.hessian.io.SerializerFactory SERIALIZER_FACTORY = new com.caucho.hessian.io.SerializerFactory();

    @Override
    public byte[] toBytes(Object o) throws Exception {
        Hessian2Output h2os = null;
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            h2os = new Hessian2Output(bos);

            h2os.setSerializerFactory(SERIALIZER_FACTORY);
            h2os.writeObject(o);
            h2os.flush();

            return bos.toByteArray();
        } finally {
            close(h2os);
        }
    }

    private void close(Hessian2Output h2os) {
        if (h2os != null) {
            try {
                h2os.close();
            } catch (Exception e) {
                LOGGER.info("failed to close hessian output stream", e);
            }
        }
    }

    @Override
    @SuppressWarnings("unchecked") // the caller chooses T; the cast below is not checked at runtime
    public <T> T fromBytes(byte[] bytes) throws Exception {
        Hessian2Input h2is = null;
        try {
            ByteArrayInputStream bis = new ByteArrayInputStream(bytes);
            h2is = new Hessian2Input(bis);
            h2is.setSerializerFactory(SERIALIZER_FACTORY);
            Object rv = h2is.readObject();

            return (T)rv;
        } finally {
            close(h2is);
        }
    }

    private void close(Hessian2Input h2is) {
        if (h2is != null) {
            try {
                h2is.close();
            } catch (Exception e) {
                LOGGER.info("failed to close hessian input stream", e);
            }
        }
    }
}
4.3 create a serializer enumeration so that more serialization types can be added easily
public enum SerializerType {

    HESSIAN(5, new HessianSerializer());

    private final int type;

    private final AbstractSerializer instance;

    SerializerType(int type, AbstractSerializer instance) {
        this.type = type;
        this.instance = instance;
    }

    public int getType() {
        return type;
    }

    public AbstractSerializer getInstance() {
        return instance;
    }

    public static SerializerType getByType(int type) {
        for (SerializerType serializerType : values()) {
            if (serializerType.type == type) {
                return serializerType;
            }
        }
        return null;
    }
}
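To show how the enumeration could be extended with another serialization type, here is a sketch of a second serializer built on the JDK's own object streams. Everything in it is an assumption for illustration: JdkSerializer, SketchSerializerType and the type code 1 are hypothetical, and AbstractSerializer from 4.1 is repeated in minimal form so that the block compiles without the Hessian library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

final class JdkSerializerSketch {

    // Same contract as AbstractSerializer in 4.1, repeated so this block compiles alone.
    abstract static class AbstractSerializer {
        public abstract byte[] toBytes(Object o) throws Exception;

        public abstract <T> T fromBytes(byte[] bytes) throws Exception;
    }

    // Hypothetical second implementation backed by java.io serialization.
    static final class JdkSerializer extends AbstractSerializer {
        @Override
        public byte[] toBytes(Object o) throws Exception {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(o);
            }
            return bos.toByteArray();
        }

        @Override
        @SuppressWarnings("unchecked")
        public <T> T fromBytes(byte[] bytes) throws Exception {
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return (T) ois.readObject();
            }
        }
    }

    // Hypothetical registry in the style of SerializerType; the code 1 is arbitrary.
    enum SketchSerializerType {
        JDK(1, new JdkSerializer());

        private final int type;
        private final AbstractSerializer instance;

        SketchSerializerType(int type, AbstractSerializer instance) {
            this.type = type;
            this.instance = instance;
        }

        AbstractSerializer getInstance() {
            return instance;
        }

        // Returns the matching constant, or null when the code is unknown.
        static SketchSerializerType getByType(int type) {
            for (SketchSerializerType candidate : values()) {
                if (candidate.type == type) {
                    return candidate;
                }
            }
            return null;
        }
    }

    // Looks up the serializer by its code and round-trips a string through it.
    static String roundTrip(int type, String value) {
        try {
            AbstractSerializer serializer = SketchSerializerType.getByType(type).getInstance();
            return serializer.fromBytes(serializer.toBytes(value));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1, "hello"));
    }
}
```

In the article's own enum, the equivalent change would be one more constant next to HESSIAN; callers that look serializers up by type code would not need to change.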
5, The difference between Hessian and Java serialization?
  • Java serialization:
    Java serialization writes both the class metadata and the business data of the object into the byte stream, including the whole inheritance hierarchy. The resulting byte stream is a complete description of the object, from structure to content, so it is relatively large and slow to produce. But because everything is serialized, almost anything can be transferred, which makes it more generally usable and reliable.
  • Hessian serialization:
    Hessian serialization focuses on the data and attaches only simple type information. For example, for Integer a = 1, Hessian produces a stream like I 1, where I stands for int or Integer and 1 is the data content. For complex objects, Hessian uses Java reflection to serialize all the properties of the object as a map, generating a stream like M className propertyName1 I 1 propertyName S stringValue, which contains the basic type descriptions and the data content. During serialization, if an object has already appeared, Hessian inserts a block such as R index that refers to the earlier occurrence, saving the cost of serializing and deserializing it again. The price is that Hessian has to handle each type separately (so Hessian takes a shortcut and has no separate short type) and needs special handling for some objects (such as StackTraceElement). Also, because it does not look into how a class implements its own serialization, the result can be inconsistent in some cases, for example for a map obtained from Collections.synchronizedMap.
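The size difference described above can be made concrete with a toy comparison. The sketch below is not the Hessian library and does not reproduce its actual wire format; encodeTagged is a hypothetical encoder that only imitates the "I 1" idea (one type tag followed by the value), set against the JDK's serialization of the same Integer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

final class TaggedIntSketch {

    // Toy encoding: one tag byte 'I' followed by the 4-byte big-endian value.
    static byte[] encodeTagged(int value) {
        return ByteBuffer.allocate(5).put((byte) 'I').putInt(value).array();
    }

    // Reads the toy encoding back, checking the type tag first.
    static int decodeTagged(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        if (buffer.get() != 'I') {
            throw new IllegalArgumentException("not a tagged int");
        }
        return buffer.getInt();
    }

    // JDK serialization of the same value, which also carries class metadata.
    static byte[] encodeJava(Integer value) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("tagged: " + encodeTagged(1).length + " bytes");
        System.out.println("java:   " + encodeJava(1).length + " bytes");
    }
}
```

The JDK output is larger because it also records the class name, serialVersionUID and field descriptors of java.lang.Integer and its superclass, which is the metadata overhead the comparison above refers to.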
6, Common serialization protocols?

Reference link: comparison and selection of serialization performance in common use

Posted by everknown on Fri, 26 Jun 2020 21:35:28 -0700