Summary of Java serialization

Keywords: Java Back-end

Catalog

What is object serialization

Why serialization and deserialization are required

Knowledge of serialization and deserialization

What if you do not want some fields to be serialized?

Java Serialization Interface: java.io.Serializable

Class refactoring using serialization and serialVersionUID

Java Externalization Interface: java.io.Externalizable

Differences and connections between Serializable and Externalizable

Four methods of serialization

Serialization Combined Inheritance

Serialization Proxy Pattern

The underlying principle of serialization

Serialization: writeObject()

Deserialization: readObject()

static and transient fields cannot be serialized

How do I implement custom serialization and deserialization?

Note: References for this article:

Java Serialization: JDK Serialization Summary - ixenos - Blog Park

Learn more about Java serialization - Blog of dingxie1963 - CSDN Blog

What is object serialization

Simply put:

Serialization: The process of converting a data structure or object into a binary byte stream

Deserialization: The process of converting the binary byte stream generated during serialization into a data structure or object

1. Object serialization is the process of converting an object's state into a byte stream.

We can store such a stream of bytes as a file in order to duplicate (deep copy) the object; in some distributed applications, we can also send the byte streams of objects to other computers on the network.

In Java, an object-oriented programming language, what we serialize are objects, that is, instances of classes. In C++, a semi-object-oriented language, struct defines a data-structure type and class corresponds to the object type.

Deserialization restores an object from its byte-stream form to its original form.

2. The Java platform allows us to create reusable Java objects in memory, but generally these objects only exist when the JVM is running, that is, they will not have a longer life cycle than the JVM. However, in real applications, it may be required that the specified object be saved (persisted) after the JVM stops running and that the saved object be read again in the future. Serialization of Java objects can help us do this.

3. With Java object serialization, the state of an object is saved as a sequence of bytes and later reassembled into an object. Note that object serialization preserves the object's "state", that is, its member variables (instance fields). It follows that object serialization does not cover static variables in classes.

This is because static variables belong to the class rather than to any instance: their storage is allocated and initialized during the class's load-link-initialize phase, and static constants are resolved even earlier, at compile time.

4. Besides persisting objects, object serialization is also used with RMI (Remote Method Invocation) and whenever objects are passed across the network. The Java serialization API provides a standard, easy-to-use mechanism for handling object serialization.

5. To sum up: the main purpose of serialization is to transfer objects over the network or to store them in a file system, a database, or memory.
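The deep-copy use mentioned above can be sketched with an in-memory byte array instead of a file. This is a minimal illustration, not code from the referenced articles; the class name DeepCopyDemo and the helper deepCopy are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class DeepCopyDemo {

    // Serializes obj into a byte array and reads it back as a brand-new object graph.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("a", "b"));
        ArrayList<String> copy = deepCopy(original);
        copy.add("c"); // changing the copy leaves the original untouched
        System.out.println(original + " " + copy); // prints "[a, b] [a, b, c]"
    }
}
```

The same byte array could just as well be written to a file or a socket; only the stream underneath ObjectOutputStream changes.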

Why serialization and deserialization are required

We know that when two processes communicate remotely, they can send each other various types of data, including text, pictures, audio, video, and so on, which are transmitted over the network as binary sequences.

So when two Java processes communicate, can they transfer objects between them? The answer is yes! How can I do that? This requires Java serialization and deserialization!

In other words, on the one hand, the sender needs to convert the Java object into a sequence of bytes and transfer it over the network; On the other hand, the receiver needs to recover the Java object from the byte sequence.

Once we understand why Java serialization and deserialization are needed, the benefits follow naturally. One is data persistence: serialization lets data be saved permanently to disk (usually in a file). The other is remote communication: the byte sequence of an object can be transmitted over the network.

In general, it can be summarized as the following:

(1) Permanently save an object by writing its byte sequence to a local file or database;
(2) Pass and receive objects over the network as byte streams;
(3) Transfer objects between processes.

Knowledge of serialization and deserialization

1. In Java, a class can be serialized as long as it implements the java.io.Serializable interface.

2. Objects are serialized and deserialized through ObjectOutputStream and ObjectInputStream.

3. Whether the virtual machine allows deserialization depends not only on the class path and the code of the class being consistent, but also on the serialization IDs of the two classes being the same (private static final long serialVersionUID).

4. Serialization does not save static variables.

5. To serialize the fields of a parent class, the parent class must implement the Serializable interface as well.

6. The transient keyword controls whether a variable is serialized. Adding the keyword before a variable declaration prevents the variable from being written to the stream. After deserialization, a transient variable holds the default value for its type, such as 0 for int and null for object references.

7. Suppose a server sends serialized object data to a client and some of the data in the object is sensitive, such as a password string. The password field can be encrypted during serialization; the password can then be recovered during deserialization only by a client that holds the decryption key. This gives the serialized object a certain degree of data security.
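Point 5 can be observed directly. The sketch below uses hypothetical classes (PlainParent, SerializableChild, ParentStateDemo) to show what happens when the parent does not implement Serializable: the parent's fields are not written to the stream, and on deserialization they are re-initialized by the parent's no-argument constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes for illustration only.
class PlainParent {                 // does NOT implement Serializable
    int parentValue = 1;            // re-initialized by the no-arg constructor on deserialization
}

class SerializableChild extends PlainParent implements Serializable {
    private static final long serialVersionUID = 1L;
    int childValue;                 // written to and restored from the stream
}

class ParentStateDemo {

    // Serializes the object to memory and immediately deserializes it again.
    static SerializableChild roundTrip(SerializableChild original) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(original);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SerializableChild) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializableChild child = new SerializableChild();
        child.parentValue = 99;
        child.childValue = 7;
        SerializableChild copy = roundTrip(child);
        // The parent's 99 is lost; the child's 7 survives.
        System.out.println(copy.parentValue + " " + copy.childValue); // prints "1 7"
    }
}
```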

What if you do not want some fields to be serialized?

For variables that you do not want to serialize, use the transient keyword modifier.

The transient keyword prevents the instance variables it modifies from being serialized; when the object is deserialized, the values of transient variables are not restored.

There are a few more points to note about transient:

transient can only modify variables, not classes or methods.

A transient variable is set to the default value of its type after deserialization. For example, a transient int is 0 after deserialization.

static variables do not belong to any object, so they are not serialized whether or not they are marked transient.
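The behavior of transient and static fields described above can be checked with a small round trip. The classes Gauge and StaticTransientDemo are hypothetical and exist only for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class for illustration only.
class Gauge implements Serializable {
    private static final long serialVersionUID = 1L;
    static int created = 0;     // belongs to the class, never written to the stream
    transient int cache = 42;   // skipped during serialization
    int value = 5;              // ordinary instance field, serialized
}

class StaticTransientDemo {

    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Gauge gauge = new Gauge();
        Gauge.created = 1;
        byte[] data = toBytes(gauge);
        Gauge.created = 100; // change the static field after serializing
        Gauge copy = (Gauge) fromBytes(data);
        // value is restored; cache is 0 because no Gauge constructor or field
        // initializer runs during deserialization; created keeps its current value.
        System.out.println(copy.value + " " + copy.cache + " " + Gauge.created); // prints "5 0 100"
    }
}
```

Note that the transient field comes back as 0 rather than 42: the field initializer is part of the constructor, and the constructor of a Serializable class is not run when an instance is deserialized.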

Java Serialization Interface: java.io.Serializable

If you want objects of a class to be serializable, all you have to do is implement the java.io.Serializable interface. Serializable is a marker interface that declares no fields or methods to implement; implementing it is an opt-in by which a class declares that its objects can be serialized.

Serialization is performed by ObjectOutputStream and deserialization by ObjectInputStream, so all we have to do is wrap them around another stream, either to save objects to a file or to send them over the network. Let's take a simple serialization example.

package com.journaldev.serialization;

import java.io.Serializable;

public class Employee implements Serializable {

//  private static final long serialVersionUID = -6470090944414208496L;

    private String name;
    private int id;
    transient private int salary;
//  private String password;

    @Override
    public String toString(){
        return "Employee{name="+name+",id="+id+",salary="+salary+"}";
    }

    //getter and setter methods
    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public int getSalary() {
        return salary;
    }

    public void setSalary(int salary) {
        this.salary = salary;
    }

//  public String getPassword() {
//      return password;
//  }
//
//  public void setPassword(String password) {
//      this.password = password;
//  }

}

Note that this is a simple Java bean with some properties and getter and setter methods. If you do not want a property to be serialized into the stream, mark it with the transient keyword, as I did with the salary variable in the example.

Now let's assume that we need to write our objects to a file and then deserialize them from the same file, so we need some utility methods that serialize with ObjectOutputStream and deserialize with ObjectInputStream.

package com.journaldev.serialization;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/**
 * A simple class with generic serialize and deserialize method implementations
 *
 * @author pankaj
 *
 */
public class SerializationUtil {

    // deserialize to Object from given file
    public static Object deserialize(String fileName) throws IOException,
            ClassNotFoundException {
        // try-with-resources closes both streams even if readObject() throws
        try (FileInputStream fis = new FileInputStream(fileName);
             ObjectInputStream ois = new ObjectInputStream(fis)) {
            return ois.readObject();
        }
    }

    // serialize the given object and save it to file
    public static void serialize(Object obj, String fileName)
            throws IOException {
        // close the ObjectOutputStream itself (not only the file stream) so its buffered data is flushed
        try (FileOutputStream fos = new FileOutputStream(fileName);
             ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(obj);
        }
    }

}

Note that the methods take and return Object, the base class of every Java class, so the utility works for any serializable type.

Now let's write a test program to see Java serialization in action.

package com.journaldev.serialization;

import java.io.IOException;

public class SerializationTest {

    public static void main(String[] args) {
        String fileName="employee.ser";
        Employee emp = new Employee();
        emp.setId(100);
        emp.setName("Pankaj");
        emp.setSalary(5000);

        //serialize to file
        try {
            SerializationUtil.serialize(emp, fileName);
        } catch (IOException e) {
            e.printStackTrace();
            return;
        }

        Employee empNew = null;
        try {
            empNew = (Employee) SerializationUtil.deserialize(fileName);
        } catch (ClassNotFoundException | IOException e) {
            e.printStackTrace();
        }

        System.out.println("emp Object::"+emp);
        System.out.println("empNew Object::"+empNew);
    }

}

Run the above test program to get the following output.

emp Object::Employee{name=Pankaj,id=100,salary=5000}
empNew Object::Employee{name=Pankaj,id=100,salary=0}

Since salary is a transient variable, its value will not be saved in the file and therefore will not be restored in the new object. Similarly, the values of static variables are not serialized because they belong to classes rather than objects.

Class refactoring using serialization and serialVersionUID

Java serialization tolerates some changes to a class, provided they are changes that deserialization can ignore. Some changes that do not affect the deserialization process are:

Adding new variables to the class.

Changing a variable from transient to non-transient, which for serialization is like adding a new variable.

Changing a variable from static to non-static, which for serialization is like adding a new variable.

For these changes to work properly, however, the java class needs to have a serialVersionUID defined for it, so let's write a test class that deserializes only the serialization files that have been generated by the previous test class.

package com.journaldev.serialization;

import java.io.IOException;

public class DeserializationTest {

    public static void main(String[] args) {

        String fileName="employee.ser";
        Employee empNew = null;

        try {
            empNew = (Employee) SerializationUtil.deserialize(fileName);
        } catch (ClassNotFoundException | IOException e) {
            e.printStackTrace();
        }

        System.out.println("empNew Object::"+empNew);

    }

}

Now uncomment the password variable and its getter and setter methods in the Employee class and run the program. You get the following exception.

java.io.InvalidClassException: com.journaldev.serialization.Employee; local class incompatible: stream classdesc serialVersionUID = -6470090944414208496, local class serialVersionUID = -6234198221249432383
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1601)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
    at com.journaldev.serialization.SerializationUtil.deserialize(SerializationUtil.java:22)
    at com.journaldev.serialization.DeserializationTest.main(DeserializationTest.java:13)
empNew Object::null

The reason is clear: the serialVersionUID of the previous class differs from that of the new class. In fact, if a class does not define a serialVersionUID, one is computed automatically and assigned to the class. Java uses the class name, package, variables, methods, and so on to produce this long number. If you work in an IDE, you will typically be warned that the serializable class Employee does not declare a static final serialVersionUID field of type long.

We can use the java tool "serialver" to generate a serialVersionUID for a class, and for the Employee class, you can execute the following commands.

SerializationExample/bin$ serialver -classpath . com.journaldev.serialization.Employee

Keep in mind that the serial version does not have to be generated by this tool; we can assign any value we like. Its only role is to tell the deserialization mechanism that the new class is a new version of the same class, so that deserialization should be attempted where possible.

For example, in the Employee class, uncomment only the serialVersionUID field and run the SerializationTest program. Then uncomment the password field in the Employee class and run the DeserializationTest program. You will see that the object stream is deserialized successfully, because the change to the Employee class is compatible with the serialization process.
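The value that the serialization runtime associates with a class can also be inspected from code through java.io.ObjectStreamClass. The following sketch uses two hypothetical classes, one with an explicit serialVersionUID and one without, to contrast a declared value with a computed one.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical classes for illustration only.
class WithExplicitUid implements Serializable {
    private static final long serialVersionUID = 42L;
    String name;
}

class WithDefaultUid implements Serializable { // no serialVersionUID declared
    String name;
}

class SerialVersionUidDemo {

    // Returns the serialVersionUID the serialization runtime uses for the given class.
    static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Declared value: stays 42 no matter how the class body changes.
        System.out.println(uidOf(WithExplicitUid.class));
        // Computed value: derived from the class structure, so adding a field
        // or method to WithDefaultUid would change it.
        System.out.println(uidOf(WithDefaultUid.class));
    }
}
```

The second number is the kind of value that appears as "local class serialVersionUID" in the InvalidClassException shown earlier.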

Java Externalization Interface: java.io.Externalizable

If you look closely at the serialization process, you will notice that it is handled automatically. Sometimes we want to obscure the object data or verify its integrity; this can be done by implementing the java.io.Externalizable interface and providing implementations of the writeExternal() and readExternal() methods, which are then used for serialization and deserialization.

package com.journaldev.externalization;

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

public class Person implements Externalizable{

    private int id;
    private String name;
    private String gender;

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(id);
        out.writeObject(name+"xyz");
        out.writeObject("abc"+gender);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException,
            ClassNotFoundException {
        id=in.readInt();
        //read in the same order as written
        name=(String) in.readObject();
        if(!name.endsWith("xyz")) throw new IOException("corrupted data");
        name=name.substring(0, name.length()-3);
        gender=(String) in.readObject();
        if(!gender.startsWith("abc")) throw new IOException("corrupted data");
        gender=gender.substring(3);
    }

    @Override
    public String toString(){
        return "Person{id="+id+",name="+name+",gender="+gender+"}";
    }
    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getGender() {
        return gender;
    }

    public void setGender(String gender) {
        this.gender = gender;
    }

}

Note that I change the field values before writing them to the stream and reverse those changes when reading them back. This way, data integrity can be checked to some extent: if the data read from the stream does not have the expected form, we throw an exception to indicate that the integrity check failed.

Take a look at a test program.

package com.journaldev.externalization;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class ExternalizationTest {

    public static void main(String[] args) {

        String fileName = "person.ser";
        Person person = new Person();
        person.setId(1);
        person.setName("Pankaj");
        person.setGender("Male");

        try {
            FileOutputStream fos = new FileOutputStream(fileName);
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(person);
            oos.close();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        FileInputStream fis;
        try {
            fis = new FileInputStream(fileName);
            ObjectInputStream ois = new ObjectInputStream(fis);
            Person p = (Person)ois.readObject();
            ois.close();
            System.out.println("Person Object Read="+p);
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }

    }

}

Run the above test program to get the following output.

Person Object Read=Person{id=1,name=Pankaj,gender=Male}

So which approach is better for serialization? It is actually better to use the Serializable interface, and by the end of this tutorial you will know why.

Differences and connections between Serializable and Externalizable

1. Externalizable extends Serializable. Externalizable instances can also designate a replacement object through the writeReplace and readResolve methods documented in the Serializable interface.

2. A class that implements the Externalizable interface must have a public no-argument constructor. When an Externalizable object is read in, the object stream first creates an instance with the no-argument constructor and then calls the readExternal method, which must read the data in the format that writeExternal wrote.

3. A class implements the writeExternal and readExternal methods of the Externalizable interface to take full control of the stream format and contents for an object and its supertypes. These methods must explicitly coordinate with the supertype to save its state, and they supersede any custom writeObject and readObject methods. writeExternal and readExternal are fully responsible for storing and restoring the entire object, including superclass data, whereas with Serializable the default mechanism writes the fields of each serializable class in the hierarchy automatically.

public void readExternal(ObjectInput in) throws IOException,
      ClassNotFoundException {
     name = (String) in.readObject();
     password = (String) in.readObject();
}

public void writeExternal(ObjectOutput out) throws IOException {
    out.writeObject( name);
    out.writeObject( password);
}

4. Externalizable serialization is usually faster than Serializable and produces less data, but both reading and writing must be implemented by the developer. Serializable is simpler to develop with, but it is slower and the serialized data is larger.

5. Both serialization approaches share one property: if several objects a each have an attribute that refers to the same object b, then after those objects are serialized to one stream and deserialized again, the attributes still refer to a single shared object b; b is not deserialized more than once.

However, if the objects a are serialized separately, in more than one serialization pass, object b is deserialized once per pass. Each deserialization creates new objects on the heap (for Externalizable, by calling the no-argument constructor) at different memory addresses, so the b attributes of the various objects a no longer refer to the same object.

6. Object serialization uses the Serializable and Externalizable interfaces, and object persistence mechanisms can use them as well. Each object to be stored is checked for support of the Externalizable interface.

a) If the object supports Externalizable, its writeExternal method is called. If the object does not support Externalizable but implements Serializable, it is saved using ObjectOutputStream.

b) When an Externalizable object is reconstructed, an instance is first created using the public no-argument constructor, and then the readExternal method is called. Serializable objects are restored by reading them from an ObjectInputStream.
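The shared-reference behavior from point 5 can be demonstrated with a short experiment. SharedPart, Holder, and SharedReferenceDemo are hypothetical names used only for this sketch; it serializes two holders of the same object once through a single stream and once through two separate streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical classes for illustration only.
class SharedPart implements Serializable {
    private static final long serialVersionUID = 1L;
}

class Holder implements Serializable {
    private static final long serialVersionUID = 1L;
    final SharedPart b;
    Holder(SharedPart b) { this.b = b; }
}

class SharedReferenceDemo {

    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Both holders travel in ONE stream: the stream writes b once and a back-reference after that.
    static boolean sharedAfterSingleStream() {
        SharedPart b = new SharedPart();
        Holder[] copy = (Holder[]) fromBytes(toBytes(new Holder[] { new Holder(b), new Holder(b) }));
        return copy[0].b == copy[1].b;
    }

    // Each holder travels in its OWN stream: b is written, and later rebuilt, twice.
    static boolean sharedAfterSeparateStreams() {
        SharedPart b = new SharedPart();
        Holder first = (Holder) fromBytes(toBytes(new Holder(b)));
        Holder second = (Holder) fromBytes(toBytes(new Holder(b)));
        return first.b == second.b;
    }

    public static void main(String[] args) {
        System.out.println(sharedAfterSingleStream() + " " + sharedAfterSeparateStreams()); // prints "true false"
    }
}
```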

Four methods of serialization

Java serialization is automatic: all we have to do is implement the Serializable interface, and the implementation already exists in the ObjectInputStream and ObjectOutputStream classes. But what if we want to change the way data is stored, for example to encrypt sensitive information in an object before storing it and decrypt it on retrieval? For this purpose there are four methods that a class can provide to change the serialization behavior.

If the following methods exist in the serialized class, they will be used for serialization processing.

readObject(ObjectInputStream ois): If this method exists, the ObjectInputStream readObject() method calls it to read objects from the stream.

writeObject(ObjectOutputStream oos): If this method exists, the ObjectOutputStream writeObject() method calls it to write the object to the stream. A common use is to obscure the object's values to protect their integrity.

Object writeReplace(): If this method exists, it is called during serialization, and the object it returns is written to the stream in place of the original.

Object readResolve(): If this method exists, it is called after deserialization and returns the final object to the caller. One use is to implement the singleton pattern in a serializable class; see the article "Serialization and singletons" for more. The object returned by this method is used as the return value of ObjectInputStream's readObject() (even though the class's private readObject method itself returns nothing).

Typically, when implementing these methods, you should declare them private so that subclasses cannot override them. They exist solely for serialization, and keeping them private avoids some security issues. (readObject and writeObject must in fact be private for the serialization mechanism to call them.)
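As a sketch of the encrypt-before-storing idea, the class below keeps its password field transient and writes an AES-GCM encrypted copy from a private writeObject, then decrypts it in readObject. The class names and the hard-coded demo key are assumptions made for this example; a real application would obtain the key from a key store rather than from source code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical class; the hard-coded 16-byte key is a demo-only assumption.
class SecretAccount implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final SecretKeySpec KEY =
            new SecretKeySpec("0123456789abcdef".getBytes(StandardCharsets.UTF_8), "AES");

    String user;
    transient String password; // excluded from default serialization, handled manually below

    SecretAccount(String user, String password) {
        this.user = user;
        this.password = password;
    }

    private static Cipher cipher(int mode, byte[] iv) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, KEY, new GCMParameterSpec(128, iv));
        return cipher;
    }

    private void writeObject(ObjectOutputStream oos) throws IOException {
        oos.defaultWriteObject(); // writes user
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv); // fresh nonce for every serialization
        try {
            byte[] encrypted = cipher(Cipher.ENCRYPT_MODE, iv)
                    .doFinal(password.getBytes(StandardCharsets.UTF_8));
            oos.writeObject(iv);
            oos.writeObject(encrypted);
        } catch (GeneralSecurityException e) {
            throw new IOException("could not encrypt password", e);
        }
    }

    private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
        ois.defaultReadObject(); // restores user
        byte[] iv = (byte[]) ois.readObject(); // same order as written
        byte[] encrypted = (byte[]) ois.readObject();
        try {
            password = new String(cipher(Cipher.DECRYPT_MODE, iv).doFinal(encrypted),
                    StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IOException("could not decrypt password", e);
        }
    }
}

class EncryptedFieldDemo {
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = toBytes(new SecretAccount("pankaj", "s3cret"));
        SecretAccount copy = (SecretAccount) fromBytes(data);
        // Check whether the plain-text password is visible in the serialized bytes.
        boolean leaked = new String(data, StandardCharsets.ISO_8859_1).contains("s3cret");
        System.out.println(copy.user + " " + copy.password + " " + leaked);
    }
}
```

Because GCM authenticates the ciphertext, a tampered password block makes decryption fail, and readObject reports that as an IOException instead of returning corrupted data.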

For specific examples of readObject and writeObject, see: Serialization and Deserialization of Java Objects

For specific examples of writeReplace and readResolve, see: Java Object serialization and singleton mode, and the Serialization Proxy Pattern section of this article
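The singleton use of readResolve mentioned above can be sketched in a few lines. AppRegistry and SingletonDemo are hypothetical names; without the readResolve method, deserialization would produce a second instance and the check below would be false.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical singleton for illustration only.
class AppRegistry implements Serializable {
    private static final long serialVersionUID = 1L;
    static final AppRegistry INSTANCE = new AppRegistry();

    private AppRegistry() { }

    // Called after deserialization; the freshly read copy is discarded
    // and the existing singleton is returned to the caller instead.
    private Object readResolve() {
        return INSTANCE;
    }
}

class SingletonDemo {

    static boolean sameInstanceAfterRoundTrip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(AppRegistry.INSTANCE);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return ois.readObject() == AppRegistry.INSTANCE;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAfterRoundTrip()); // prints "true"
    }
}
```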

Serialization Combined Inheritance

Sometimes we need to extend a class that does not implement the Serializable interface. If we rely on the automatic serialization behavior and some of the state belongs to the parent class, that state is not written to the stream and cannot be recovered later.

This is where readObject() and writeObject() are useful: by implementing them, we can store the state of the parent class in the stream and restore it later. Let's look at a practical example.

package com.journaldev.serialization.inheritance;

public class SuperClass {

    private int id;
    private String value;

    public int getId() {
        return id;
    }
    public void setId(int id) {
        this.id = id;
    }
    public String getValue() {
        return value;
    }
    public void setValue(String value) {
        this.value = value;
    }

}

The parent class is a simple java bean and does not implement a serialization interface.

package com.journaldev.serialization.inheritance;

import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SubClass extends SuperClass implements Serializable, ObjectInputValidation{

    private static final long serialVersionUID = -1322322139926390329L;

    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString(){
        return "SubClass{id="+getId()+",value="+getValue()+",name="+getName()+"}";
    }

    //adding helper method for serialization to save/initialize super class state
    private void readObject(ObjectInputStream ois) throws ClassNotFoundException, IOException{
        ois.defaultReadObject();

        //read in the same order as written, and assign the values to the parent class attributes
        setId(ois.readInt());
        setValue((String) ois.readObject());

        //register this object so that validateObject() is called once the whole graph has been read
        ois.registerValidation(this, 0);
    }

    private void writeObject(ObjectOutputStream oos) throws IOException{
        oos.defaultWriteObject();
        //Call the getId of the parent class to get a value written separately to the serialized stream
        oos.writeInt(getId());
        oos.writeObject(getValue());
    }

    @Override
    public void validateObject() throws InvalidObjectException {
        //validate the object here
        if(name == null || "".equals(name)) throw new InvalidObjectException("name can't be null or empty");
        if(getId() <=0) throw new InvalidObjectException("ID can't be negative or zero");
    }

}

Note that the additional data must be read from the stream in the same order in which it was written, and we can add some logic to these methods to make them more secure.

Also note that this class implements the ObjectInputValidation interface. By implementing the validateObject() method, you can add business validation to ensure that data integrity has not been compromised. The stream calls validateObject() only for objects that have been registered with ObjectInputStream.registerValidation() from within readObject().

Here, by writing a test class, we will see if we can get the state of the parent class from the serialized data.

package com.journaldev.serialization.inheritance;

import java.io.IOException;

import com.journaldev.serialization.SerializationUtil;

public class InheritanceSerializationTest {

    public static void main(String[] args) {
        String fileName = "subclass.ser";

        SubClass subClass = new SubClass();
        subClass.setId(10);
        subClass.setValue("Data");
        subClass.setName("Pankaj");

        try {
            SerializationUtil.serialize(subClass, fileName);
        } catch (IOException e) {
            e.printStackTrace();
            return;
        }

        try {
            SubClass subNew = (SubClass) SerializationUtil.deserialize(fileName);
            System.out.println("SubClass read = "+subNew);
        } catch (ClassNotFoundException | IOException e) {
            e.printStackTrace();
        }
    }

}

Run the above test program to get the following output.

SubClass read = SubClass{id=10,value=Data,name=Pankaj}

In this way, the state of the parent class can be serialized even if it does not implement a serialization interface. This strategy is useful when the parent is a third-party class that we cannot change.

Serialization Proxy Pattern

Java serialization also comes with some serious pitfalls, such as:

The structure of a class cannot be changed significantly without breaking serialization, so we have to keep some variables even when we no longer need them, just for backward compatibility.

Serialization poses a serious security risk: an attacker can tamper with the byte stream and thereby harm the system. For example, if a user role is serialized, an attacker can change the value in the stream to admin and then run malicious code.

The serialization proxy pattern is a way to make serialization more secure. In this pattern, a private static nested class serves as a serialization proxy that holds the state of the main class. Implementing the pattern requires suitable readResolve() and writeReplace() methods.

Let's write a class that implements the serialization proxy pattern and then analyze it to better understand how it works.

package com.journaldev.serialization.proxy;

import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.Serializable;

public class Data implements Serializable{

    private static final long serialVersionUID = 2087368867376448459L;

    private String data;

    public Data(String d){
        this.data=d;
    }

    public String getData() {
        return data;
    }

    public void setData(String data) {
        this.data = data;
    }

    @Override
    public String toString(){
        return "Data{data="+data+"}";
    }

    //serialization proxy class
    private static class DataProxy implements Serializable{

        private static final long serialVersionUID = 8333905273185436744L;

        private String dataProxy;
        private static final String PREFIX = "ABC";
        private static final String SUFFIX = "DEFG";

        public DataProxy(Data d){
            //obscuring data for security
            this.dataProxy = PREFIX + d.data + SUFFIX;
        }

        private Object readResolve() throws InvalidObjectException {
            if(dataProxy.startsWith(PREFIX) && dataProxy.endsWith(SUFFIX)){
            return new Data(dataProxy.substring(3, dataProxy.length() -4));
            }else throw new InvalidObjectException("data corrupted");
        }

    }

    //replacing serialized object to DataProxy object
    private Object writeReplace(){
        return new DataProxy(this);
    }

    private void readObject(ObjectInputStream ois) throws InvalidObjectException{
        throw new InvalidObjectException("Proxy is not used, something fishy");
    }
}

The Data and DataProxy classes should both implement the Serializable interface.

DataProxy should be able to preserve the state of the Data object.

DataProxy is a private static nested class, so other classes cannot access it.

DataProxy should have a single constructor that takes Data as a parameter.

The Data class should provide a writeReplace() method that returns a DataProxy instance, so that when a Data object is serialized, what is written to the stream is a DataProxy object. Because the DataProxy class is not visible externally, it cannot be used directly.

DataProxy should implement the readResolve() method and return a Data object. When a Data object is deserialized, it is actually a DataProxy that is deserialized internally; its readResolve() method is then called, and we get the Data object.

Finally, the Data class implements a readObject() method that throws an InvalidObjectException, which prevents attackers from forging a stream of Data objects and having it parsed in order to mount an attack.

Let's write a small test to see if this implementation works.

package com.journaldev.serialization.proxy;

import java.io.IOException;

import com.journaldev.serialization.SerializationUtil;

public class SerializationProxyTest {

    public static void main(String[] args) {
        String fileName = "data.ser";

        Data data = new Data("Pankaj");

        try {
            SerializationUtil.serialize(data, fileName);
        } catch (IOException e) {
            e.printStackTrace();
        }

        try {
            Data newData = (Data) SerializationUtil.deserialize(fileName);
            System.out.println(newData);
        } catch (ClassNotFoundException | IOException e) {
            e.printStackTrace();
        }
    }

}

Run the above test program to get the following output.

Data{data=Pankaj}

If you open the data.ser file, you can see that the DataProxy object has been saved as a stream in the file.
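The same check can be made in memory instead of opening the file. The following is a minimal sketch, not the article's Data class: Name and NameProxy are hypothetical stand-ins, and the helper methods are assumptions made for illustration. It serializes to a byte array and looks for the proxy's class name in the raw bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ProxyStreamDemo {

    // Hypothetical stand-in for the Data class above.
    static class Name implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String value;

        Name(String value) { this.value = value; }

        String value() { return value; }

        // Serialization writes the proxy instead of this object.
        private Object writeReplace() { return new NameProxy(value); }

        // A stream that claims to contain a Name directly is rejected.
        private void readObject(ObjectInputStream in) throws InvalidObjectException {
            throw new InvalidObjectException("proxy required");
        }

        private static class NameProxy implements Serializable {
            private static final long serialVersionUID = 1L;
            private final String value;

            NameProxy(String value) { this.value = value; }

            // Deserialization hands back a Name built through its constructor.
            private Object readResolve() { return new Name(value); }
        }
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the raw stream bytes contain the given ASCII text.
    static boolean streamMentions(byte[] data, String text) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Name("Pankaj"));
        System.out.println(streamMentions(data, "NameProxy"));  // true
        System.out.println(((Name) deserialize(data)).value()); // Pankaj
    }
}
```

The class descriptor written to the stream names the proxy, and readResolve() turns it back into the outer type on the way in.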

That covers the basics of Java serialization. It looks simple, but it should be used cautiously, and it is generally best not to rely on the default implementation. You can download the project from the links above and experiment with it, which will help you learn more.

The underlying principle of serialization

Serialization: writeObject()

Before writeObject() is called for serialization, the constructor of ObjectOutputStream is called to create an ObjectOutputStream object. The constructor is as follows:

public ObjectOutputStream(OutputStream out) throws IOException {
    verifySubclass();
    // bout wraps the underlying byte data container
    bout = new BlockDataOutputStream(out);
    handles = new HandleTable(10, (float) 3.00);
    subs = new ReplaceTable(10, (float) 3.00);
    enableOverride = false;
    writeStreamHeader(); // Write the stream header
    bout.setBlockDataMode(true); // Switch to block data mode
    if (extendedDebugInfo) {
        debugInfoStack = new DebugTraceInfoStack();
    } else {
        debugInfoStack = null;
    }
}

The constructor first binds bout to the underlying byte data container, then calls the writeStreamHeader() method, which is implemented as follows:

protected void writeStreamHeader() throws IOException {
    bout.writeShort(STREAM_MAGIC);
    bout.writeShort(STREAM_VERSION);
}

The writeStreamHeader() method writes the magic number that identifies a serialization stream, followed by the version number, to the underlying byte container. Both are defined as follows:

/**
 * Magic number that is written to the stream header.
 */
final static short STREAM_MAGIC = (short)0xaced;

/**
 * Version number that is written to the stream header.
 */
final static short STREAM_VERSION = 5;
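The header is easy to observe from user code. The following is a small sketch (the class and helper names are made up for illustration) that serializes a value into a byte array and prints the first four bytes of the stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

class StreamHeaderDemo {

    // Serializes the value in memory and returns only the first four bytes.
    static byte[] headerOf(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Arrays.copyOf(bytes.toByteArray(), 4);
    }

    // Formats bytes as lowercase hex without separators.
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // STREAM_MAGIC (aced) followed by STREAM_VERSION (0005)
        System.out.println(hex(headerOf("hello"))); // aced0005
    }
}
```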

Next, the writeObject() method is called to serialize as follows:

public final void writeObject(Object obj) throws IOException {
     if (enableOverride) {
         writeObjectOverride(obj);
         return ;
     }
     try {
         // Call writeObject0() method serialization
         writeObject0(obj, false );
     } catch (IOException ex) {
         if (depth == 0 ) {
             writeFatalException(ex);
         }
         throw ex;
     }
}

Normally, writeObject0() is called to perform the serialization. It is implemented as follows:

private void writeObject0(Object obj, boolean unshared)
     throws IOException
{
     // Some omitted codes
     try {
         // Some omitted codes
         Object orig = obj;
         // Gets the Class object of the object to be serialized
         Class cl = obj.getClass();
         ObjectStreamClass desc;
         for (;;) {
             Class repCl;
             // Create ObjectStreamClass object describing cl
             desc = ObjectStreamClass.lookup(cl, true );
             // Other ellipsis codes
         }
         // Some omitted codes
         // Write differently depending on the actual type
         // remaining cases
         if (obj instanceof String) {
             writeString((String) obj, unshared);
         } else if (cl.isArray()) {
             writeArray(obj, desc, unshared);
         } else if (obj instanceof Enum) {
             writeEnum((Enum) obj, desc, unshared);
         } else if (obj instanceof Serializable) {
             // Serializable interface implemented by serialized object
             writeOrdinaryObject(obj, desc, unshared);
         } else {
             if (extendedDebugInfo) {
                 throw new NotSerializableException(
                     cl.getName() + "\n" + debugInfoStack.toString());
             } else {
                 throw new NotSerializableException(cl.getName());
             }
         }
     } finally {
         depth--;
         bout.setBlockDataMode(oldMode);
     }
}

As you can see from the code, the program will:

1. Generate an ObjectStreamClass object that describes the class metadata of the class being serialized.

2. Perform a different serialization operation depending on the actual type of the object passed in. The code shows that String types, array types, and Enum types are each written by a dedicated method. If the object implements the Serializable interface, the writeOrdinaryObject() method is called for serialization. Otherwise, a NotSerializableException is thrown.

This explains a common question: the Serializable interface is empty and defines no methods, so why is implementing it enough to make an object serializable?

The answer is that the Serializable interface is only a marker. It tells the program that objects of every class implementing it may be serialized.
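The marker behaviour can be seen with two otherwise identical classes. This is a minimal sketch; Plain, Marked, and the outcome() helper are hypothetical names used only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class MarkerInterfaceDemo {

    // Same shape, but only one of them carries the marker interface.
    static class Plain { int value = 1; }
    static class Marked implements Serializable { int value = 1; }

    // Returns "ok" if the object could be written, otherwise the failure.
    static String outcome(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return "ok";
        } catch (NotSerializableException e) {
            return "NotSerializableException: " + e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(new Marked()));         // ok
        System.out.println(outcome("a string"));           // ok
        System.out.println(outcome(new int[] {1, 2, 3}));  // ok
        // Falls through to the final else branch of writeObject0().
        System.out.println(outcome(new Plain()));
    }
}
```

The last call reports a NotSerializableException whose message is the name of the offending class.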

Therefore, the serialization process proceeds to the method writeOrdinaryObject(), which is implemented as follows:

private void writeOrdinaryObject(Object obj,
                                  ObjectStreamClass desc,
                                  boolean unshared) throws IOException
{
     if (extendedDebugInfo) {
         debugInfoStack.push(
             (depth == 1 ? "root " : "" ) + "object (class \"" +
             obj.getClass().getName() + "\", " + obj.toString() + ")" );
     }
     try {
         desc.checkSerialize();
 
         bout.writeByte(TC_OBJECT); // Write Object Flag Bits
         writeClassDesc(desc, false ); // Write Class Metadata
         handles.assign(unshared ? null : obj);
         if (desc.isExternalizable() && !desc.isProxy()) {
             writeExternalData((Externalizable) obj);
         } else {
             writeSerialData(obj, desc); // Write instance data of the serialized object
         }
     } finally {
         if (extendedDebugInfo) {
             debugInfoStack.pop();
         }
     }
}

In this method, TC_OBJECT is first written to the underlying byte container, meaning that a new Object follows:

/**
  * new Object.
  */
final static byte TC_OBJECT =       ( byte ) 0x73 ;

Next, the writeClassDesc() method is called to write the class metadata of the class being serialized, and the writeClassDesc() method is implemented as follows:

private void writeClassDesc(ObjectStreamClass desc, boolean unshared)
     throws IOException
{
     int handle;
     if (desc == null ) {
         // If desc is null
         writeNull();
     } else if (!unshared && (handle = handles.lookup(desc)) != - 1 ) {
         writeHandle(handle);
     } else if (desc.isProxy()) {
         writeProxyDesc(desc, unshared);
     } else {
         writeNonProxyDesc(desc, unshared);
     }
}

This method first checks whether the incoming desc is null. If it is, the writeNull() method is called:

private void writeNull() throws IOException {
     // TC_NULL =         (byte)0x70;
     // Represents the end of a description of an Object reference
     bout.writeByte(TC_NULL);
}

If it is not null, the writeNonProxyDesc() method is typically called next. It is implemented as follows:

private void writeNonProxyDesc(ObjectStreamClass desc, boolean unshared)
     throws IOException
{
     // TC_CLASSDESC =    (byte)0x72;
     // Represents a new Class descriptor
     bout.writeByte(TC_CLASSDESC);
     handles.assign(unshared ? null : desc);
 
     if (protocol == PROTOCOL_VERSION_1) {
         // do not invoke class descriptor write hook with old protocol
         desc.writeNonProxy( this );
     } else {
         writeClassDescriptor(desc);
     }
 
     Class cl = desc.forClass();
     bout.setBlockDataMode( true );
     if (cl != null && isCustomSubclass()) {
         ReflectUtil.checkPackageAccess(cl);
     }
     annotateClass(cl);
     bout.setBlockDataMode( false );
     bout.writeByte(TC_ENDBLOCKDATA);
 
     writeClassDesc(desc.getSuperDesc(), false );
}

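In this method, one byte, TC_CLASSDESC, is written first to indicate that the data that follows is a new class descriptor. The actual class metadata is then written by the writeNonProxy() method, which is called directly under PROTOCOL_VERSION_1 and through writeClassDescriptor() otherwise. It is implemented as follows: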

void writeNonProxy(ObjectOutputStream out) throws IOException {
     out.writeUTF(name); // Write the name of the class
     out.writeLong(getSerialVersionUID()); // Write Serial Number of Class
 
     byte flags = 0 ;
     // Get the identity of the class
     if (externalizable) {
         flags |= ObjectStreamConstants.SC_EXTERNALIZABLE;
         int protocol = out.getProtocolVersion();
         if (protocol != ObjectStreamConstants.PROTOCOL_VERSION_1) {
             flags |= ObjectStreamConstants.SC_BLOCK_DATA;
         }
     } else if (serializable) {
         flags |= ObjectStreamConstants.SC_SERIALIZABLE;
     }
     if (hasWriteObjectData) {
         flags |= ObjectStreamConstants.SC_WRITE_METHOD;
     }
     if (isEnum) {
         flags |= ObjectStreamConstants.SC_ENUM;
     }
     out.writeByte(flags); // Write flag to class
 
     out.writeShort(fields.length); // Number of fields written to the object
     for ( int i = 0 ; i < fields.length; i++) {
         ObjectStreamField f = fields[i];
         out.writeByte(f.getTypeCode());
         out.writeUTF(f.getName());
         if (!f.isPrimitive()) {
             // If it is not a primitive type, it is an object or interface type
             // Write a type string that represents the class or interface
             out.writeTypeString(f.getTypeString());
         }
     }
}

The writeNonProxy() method writes data following several procedures:

1. Call the writeUTF() method to write the name of the class to which the object belongs. For this example, name = com.beautyboss.slogen.TestObject. For the writeUTF() method, the number of bytes of the name is written before the actual data is written, as follows:

void writeUTF(String s, long utflen) throws IOException {
         if (utflen > 0xFFFFL) {
             throw new UTFDataFormatException();
         }
         // Length of s written to two bytes
         writeShort(( int ) utflen);
         if (utflen == ( long ) s.length()) {
             writeBytes(s);
         } else {
             writeUTFBody(s);
         }
     }

2. The writeLong() method is then called to write the class's serialVersionUID, which is obtained by the getSerialVersionUID() method.

3. The flags of the class to which the serialized object belongs are then determined and written to the underlying byte container (taking up one byte). The flags are defined as follows:

final static byte SC_EXTERNALIZABLE = 0x04; Indicates that the class is an Externalizable class, which implements the Externalizable interface.

final static byte SC_SERIALIZABLE = 0x02; Indicates that the class implements the Serializable interface.

final static byte SC_WRITE_METHOD = 0x01; Indicates that the class implements the Serializable interface and customizes the writeObject() method.

final static byte SC_ENUM = 0x10; Indicates that the class is an Enum type.

For this example, flags = 0x02, which means the class is only Serializable.

4. The fourth step writes the metadata of the fields of the serialized object in turn.

<1> First, the number of fields of the serialized object is written, taking up two bytes. In this case it is 2, because the TestObject class has only two fields: testValue of type int and innerObject of type InnerObject.

<2> The metadata for each field is then written in turn. Each field is represented by an ObjectStreamField object.

1) Write the type code of the field, which takes up one byte. The type codes map to types as follows:

B byte
C char
D double
F float
I int
J long
S short
Z boolean
L class or interface
[ array

2) Call the writeUTF() method to write the name of the field. Note that writeUTF() writes the number of bytes occupied by the name first.

3) If the field is not a primitive type, the writeTypeString() method is then called to write a type string representing the class or interface. It takes the type string of the corresponding class or interface as a parameter and ultimately calls the writeString() method, which is implemented as follows:

private void writeString(String str, boolean unshared) throws IOException {
     handles.assign(unshared ? null : str);
     long utflen = bout.getUTFLength(str);
     if (utflen <= 0xFFFF ) {
         // final static byte TC_STRING = (byte)0x74;
         // Indicates that the next byte represents a string
         bout.writeByte(TC_STRING);
         bout.writeUTF(str, utflen);
     } else {
         bout.writeByte(TC_LONGSTRING);
         bout.writeLongUTF(str, utflen);
     }
}

In this method, the flag TC_STRING is written first to indicate that the data that follows is a string, followed by a call to writeUTF() to write the string.
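The layout that writeNonProxy() produces can be read back with a plain DataInputStream. The following is a rough sketch under stated assumptions: Sample is a hypothetical class, the parser only understands the start of a stream holding one ordinary object, and it expects each object field's type string to be written in full (TC_STRING) rather than as a back-reference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

class ClassDescParseDemo {

    // Hypothetical class used only to produce a small stream.
    static class Sample implements Serializable {
        private static final long serialVersionUID = 42L;
        int testValue = 300;
        String label = "x";
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the stream header and the first class descriptor, in write order.
    static String describe(byte[] stream) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            StringBuilder sb = new StringBuilder();
            sb.append(String.format("magic=%04x version=%d",
                    in.readUnsignedShort(), in.readUnsignedShort()));
            // TC_OBJECT, then TC_CLASSDESC
            sb.append(String.format(" tags=%02x,%02x",
                    in.readUnsignedByte(), in.readUnsignedByte()));
            sb.append(" name=").append(in.readUTF());    // 2-byte length + name
            sb.append(" suid=").append(in.readLong());   // serialVersionUID
            sb.append(String.format(" flags=%02x", in.readUnsignedByte()));
            int fieldCount = in.readUnsignedShort();
            sb.append(" fields=").append(fieldCount);
            for (int i = 0; i < fieldCount; i++) {
                char typeCode = (char) in.readUnsignedByte();
                sb.append(' ').append(typeCode).append(':').append(in.readUTF());
                if (typeCode == 'L' || typeCode == '[') {
                    // Non-primitive fields are followed by their type string.
                    if (in.readUnsignedByte() != ObjectStreamConstants.TC_STRING) {
                        throw new IOException("sketch only handles a first-time type string");
                    }
                    sb.append('(').append(in.readUTF()).append(')');
                }
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(serialize(new Sample())));
    }
}
```

The printed summary lists the magic number, version, the two tags, the class name, the serialVersionUID, the flags byte, and each field with its type code, in the same order in which the steps above write them.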

After executing the above procedure, the program flow returns to the writeNonProxyDesc() method

private void writeNonProxyDesc(ObjectStreamClass desc, boolean unshared)
     throws IOException
{
     // Other ellipsis codes
 
     // TC_ENDBLOCKDATA = (byte)0x78;
     // Represents the end of a description block for an object
     bout.writeByte(TC_ENDBLOCKDATA);
 
     writeClassDesc(desc.getSuperDesc(), false ); // Tail recursive call, writing parent class metadata
}

Next, one byte, the flag TC_ENDBLOCKDATA, is written to mark the end of the description block for the class.

The writeClassDesc() method is then called, passing in the ObjectStreamClass object of the parent class, to write its class metadata.

It is important to note that writeClassDesc() is called recursively. The recursion ends when there is no parent class, that is, when the incoming ObjectStreamClass object is null, at which point one byte, the flag TC_NULL, is written.

After the recursive calls finish writing the class metadata, execution returns to the writeOrdinaryObject() method.

private void writeOrdinaryObject(Object obj,
                                  ObjectStreamClass desc,
                                  boolean unshared) throws IOException
{
     // Other ellipsis codes
     try {
         desc.checkSerialize();
         // Other ellipsis codes
         if (desc.isExternalizable() && !desc.isProxy()) {
             writeExternalData((Externalizable) obj);
         } else {
             writeSerialData(obj, desc); // Write instance data of the serialized object
         }
     } finally {
         if (extendedDebugInfo) {
             debugInfoStack.pop();
         }
     }
}

From the above analysis, we can know that when you write a class's metadata, you write the subclass's metadata first, then recursively call it to write the parent class's metadata.

Next, the writeSerialData() method is called to write the data of the field of the serialized object as follows:

private void writeSerialData(Object obj, ObjectStreamClass desc)
     throws IOException
{
     // Gets the ClassDataSlot array that represents the data layout of the serialized object, with the parent class first
     ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
     for ( int i = 0 ; i < slots.length; i++) {
         ObjectStreamClass slotDesc = slots[i].desc;
         if (slotDesc.hasWriteObjectMethod()) {
            // If the serialized object implements the writeObject() method itself, the code in the if block is executed
 
            // Some omitted codes
         } else {
             // Call default method to write instance data
             defaultWriteFields(obj, slotDesc);
         }
     }
}

The getClassDataLayout() method is called first in this method to get the data layout of the serialized object. The comment on this method in the JDK source describes it as follows:

/**
  * Returns array of ClassDataSlot instances representing the data layout
  * (including superclass data) for serialized objects described by this
  * class descriptor.  ClassDataSlots are ordered by inheritance with those
  * containing "higher" superclasses appearing first.  The final
  * ClassDataSlot contains a reference to this descriptor.
  */
  ClassDataSlot[] getClassDataLayout() throws InvalidClassException;

It is important to note that this method also returns the data inherited from the parent class, and that the ClassDataSlot object representing the inherited data comes first in the array.

For objects that do not have a custom writeObject() method, the defaultWriteFields() method is then called to write data, which is implemented as follows:

private void defaultWriteFields(Object obj, ObjectStreamClass desc)
     throws IOException
{
     // Some other omission codes
 
     int primDataSize = desc.getPrimDataSize();
     if (primVals == null || primVals.length < primDataSize) {
         primVals = new byte [primDataSize];
     }
     // Get data of the basic data type in the corresponding class and save it in the primVals byte array
     desc.getPrimFieldValues(obj, primVals);
     // Write data of the basic data type into the underlying byte container
     bout.write(primVals, 0 , primDataSize, false );
 
     // Gets all the field objects of the corresponding class
     ObjectStreamField[] fields = desc.getFields( false );
     Object[] objVals = new Object[desc.getNumObjFields()];
     int numPrimFields = fields.length - objVals.length;
     // Save the values of the object-typed (non-primitive) fields of the corresponding class in the objVals array
     desc.getObjFieldValues(obj, objVals);
     for ( int i = 0 ; i < objVals.length; i++) {
         // Some omitted code
 
         try {
             // Recursively call the writeObject0() method on all Object types of fields to write the corresponding data
             writeObject0(objVals[i],
                          fields[numPrimFields + i].isUnshared());
         } finally {
             if (extendedDebugInfo) {
                 debugInfoStack.pop();
             }
         }
     }
}

You can see that this method does the following:

<1> Gets the values of the primitive-type fields of the corresponding class and writes them to the underlying byte container.
<2> Gets the object-type (non-primitive) fields of the corresponding class and recursively calls the writeObject0() method to write their data.

From the analysis above, you can see that instance data is written parent class first, then child class.
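The parent-first order of the instance data can be checked by looking at the end of a stream. This is a minimal sketch with hypothetical Parent and Child classes that each hold a single int, so the last eight bytes of the stream are exactly the two field values.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class FieldOrderDemo {

    static class Parent implements Serializable {
        private static final long serialVersionUID = 1L;
        int parentValue = 100; // 0x00000064
    }

    static class Child extends Parent {
        private static final long serialVersionUID = 1L;
        int testValue = 300;   // 0x0000012c
    }

    // Serializes the value and returns the last `count` bytes as hex.
    static String tailHex(Serializable value, int count) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = bytes.toByteArray();
        StringBuilder sb = new StringBuilder();
        for (int i = all.length - count; i < all.length; i++) {
            sb.append(String.format("%02x", all[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Class descriptors come first (child, then parent);
        // the instance data follows with the parent's field first.
        System.out.println(tailHex(new Child(), 8)); // 000000640000012c
    }
}
```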

This completes the analysis of the Java serialization process.

With it, we can now interpret the contents of the serialized temp.out file.

aced        Stream magic number
0005        Serialization stream version
73          Flag: TC_OBJECT, a new object follows
72          Flag: TC_CLASSDESC, a new class descriptor follows
0020        Class name length: 32 bytes
636f 6d2e 6265 6175 7479 626f 7373 2e73 com.beautyboss.s
6c6f 6765 6e2e 5465 7374 4f62 6a65 6374 logen.TestObject
d3c6 7e1c 4f13 2afe serialVersionUID
02          Flags: SC_SERIALIZABLE
0002        Number of fields in TestObject: 2
49          Type code I: int
0009        Field name length: 9 bytes
7465 7374 5661 6c75 65      Field name: testValue
4c          Type code L: class or interface
000b        Field name length: 11 bytes
696e 6e65 724f 626a 6563 74 Field name: innerObject
74          Flag: TC_STRING, a string follows
0023        Type string length: 35 bytes
4c63 6f6d 2f62 6561 7574 7962 6f73 732f  Lcom/beautyboss/
736c 6f67 656e 2f49 6e6e 6572 4f62 6a65  slogen/InnerObje
6374 3b                                  ct;
78          Flag: TC_ENDBLOCKDATA, end of the class descriptor's block data

Next, the instance data is written, starting with the parent class.

0000 0064   Value of parentValue: 100

0000 012c   Value of testValue: 300

The next step is to write class meta information for InnerObject

73          Flag: TC_OBJECT, a new object follows
72          Flag: TC_CLASSDESC, a new class descriptor follows
0021        Class name length: 33 bytes
636f 6d2e 6265 6175 7479 626f 7373 com.beautyboss
2e73 6c6f 6765 6e2e 496e 6e65 724f .slogen.InnerO
626a 6563 74 bject
4f2c 148a 4024 fb12 serialVersionUID
02          Flags: SC_SERIALIZABLE
0001        Number of fields: 1
49          Type code I: int
000a        Field name length: 10 bytes
69 6e6e 6572 5661 6c75 65 Field name: innerValue
78          Flag: TC_ENDBLOCKDATA, end of the class descriptor's block data
70          Flag: TC_NULL, null reference (no superclass descriptor)
0000 00c8   Value of innerValue: 200

Deserialization: readObject()

Deserialization parses the binary data by following the serialization format described above.

One thing to note: what happens when a child class implements the Serializable interface but its parent class does not?

Answer: If the parent class has a no-argument constructor that the child class can access, this is not a problem even though the parent does not implement Serializable. The parent's fields are not written to the stream, and that constructor is called to initialize them during deserialization. Otherwise, deserialization throws an InvalidClassException whose reason is "no valid constructor".
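Both cases can be reproduced with a few lines. The classes below (Base, Derived, BaseNoDefault, Broken) are hypothetical and exist only for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NonSerializableParentDemo {

    // Not Serializable, but has an accessible no-argument constructor.
    static class Base {
        int baseValue;
        Base() { baseValue = 7; }
    }

    static class Derived extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        int own;
    }

    // Not Serializable and no no-argument constructor.
    static class BaseNoDefault {
        final int seed;
        BaseNoDefault(int seed) { this.seed = seed; }
    }

    static class Broken extends BaseNoDefault implements Serializable {
        private static final long serialVersionUID = 1L;
        Broken() { super(1); }
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        }
    }

    // Returns {baseValue, own} as seen after a round trip.
    static int[] derivedRoundTrip() {
        try {
            Derived original = new Derived();
            original.baseValue = 99; // lost: Base is not serialized
            original.own = 5;        // kept: Derived is serialized
            Derived copy = (Derived) deserialize(serialize(original));
            return new int[] { copy.baseValue, copy.own };
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the failure message, or "ok" if the round trip succeeded.
    static String brokenRoundTrip() {
        try {
            deserialize(serialize(new Broken()));
            return "ok";
        } catch (InvalidClassException e) {
            return e.getMessage();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] values = derivedRoundTrip();
        System.out.println(values[0] + " " + values[1]); // 7 5
        System.out.println(brokenRoundTrip());           // ends with "no valid constructor"
    }
}
```

Note that writing the Broken object succeeds; the exception only appears when the stream is read back.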

static and transient fields cannot be serialized

When serializing, all field information comes from the ObjectStreamClass object. The constructor that creates the ObjectStreamClass calls fields = getSerialFields(cl); to get the fields that need to be serialized. By default, getSerialFields() calls the getDefaultSerialFields() method, which is implemented as follows:

private static ObjectStreamField[] getDefaultSerialFields(Class<?> cl) {
     Field[] clFields = cl.getDeclaredFields();
     ArrayList<ObjectStreamField> list = new ArrayList<>();
     int mask = Modifier.STATIC | Modifier.TRANSIENT;
 
     for ( int i = 0 ; i < clFields.length; i++) {
         if ((clFields[i].getModifiers() & mask) == 0 ) {
             // If the field is neither static nor transient, it will be added to the list of fields that need to be serialized
             list.add( new ObjectStreamField(clFields[i], false , true ));
         }
     }
     int size = list.size();
     return (size == 0 ) ? NO_FIELDS :
         list.toArray( new ObjectStreamField[size]);
}

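It is clear from the code above that fields marked static or transient are filtered out when the fields to be serialized are computed.

On deserialization, a transient field is therefore given the default value for its type. A static field belongs to the class rather than the object, so it keeps whatever value the class currently holds.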
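The filtering is visible through the public ObjectStreamClass.lookup() API, and the effect on values shows up in a round trip. The following is a sketch with a hypothetical Account class; the helper names are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class StaticTransientDemo {

    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        static int instances = 0;   // static: belongs to the class, never written
        String user;                // written
        transient String password;  // transient: skipped

        Account(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    // Names of the fields that default serialization will write.
    static List<String> serialFieldNames(Class<?> cl) {
        List<String> names = new ArrayList<>();
        for (ObjectStreamField field : ObjectStreamClass.lookup(cl).getFields()) {
            names.add(field.getName());
        }
        return names;
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Account deserialize(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Account) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialFieldNames(Account.class)); // [user]

        Account.instances = 5;
        byte[] data = serialize(new Account("alice", "s3cret"));
        Account.instances = 9; // changed after the object was written

        Account copy = deserialize(data);
        System.out.println(copy.user);         // alice
        System.out.println(copy.password);     // null
        System.out.println(Account.instances); // 9
    }
}
```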

How do I implement custom serialization and deserialization?

The class of the serialized object only needs to define the methods private void writeObject(ObjectOutputStream oos) and private void readObject(ObjectInputStream ois). Java calls them when it serializes and deserializes the object. How does this work?

1. The constructor of the ObjectStreamClass class contains the following lines of code:

cons = getSerializableConstructor(cl);
writeObjectMethod = getPrivateMethod(cl, "writeObject" ,
     new Class<?>[] { ObjectOutputStream. class },
     Void.TYPE);
readObjectMethod = getPrivateMethod(cl, "readObject" ,
     new Class<?>[] { ObjectInputStream. class },
     Void.TYPE);
readObjectNoDataMethod = getPrivateMethod(
     cl, "readObjectNoData" , null , Void.TYPE);
hasWriteObjectData = (writeObjectMethod != null );

The getPrivateMethod() method is implemented as follows:

private static Method getPrivateMethod(Class<?> cl, String name,
                                    Class<?>[] argTypes,
                                    Class<?> returnType)
{
     try {
         Method meth = cl.getDeclaredMethod(name, argTypes);
         meth.setAccessible( true );
         int mods = meth.getModifiers();
         return ((meth.getReturnType() == returnType) &&
                 ((mods & Modifier.STATIC) == 0 ) &&
                 ((mods & Modifier.PRIVATE) != 0 )) ? meth : null ;
     } catch (NoSuchMethodException ex) {
         return null ;
     }
}

You can see that the constructor of ObjectStreamClass looks in the serialized class for a private, non-static method declared as void writeObject(ObjectOutputStream oos). If one is found, it is assigned to the variable writeObjectMethod; otherwise writeObjectMethod is null.
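The same three checks can be reproduced with ordinary reflection. This is an illustrative sketch, not the JDK code itself: hasSerializationHook() and the two sample classes are hypothetical, and the point is only that a public writeObject() does not qualify.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class WriteObjectLookupDemo {

    static class WithPrivateHook implements Serializable {
        private static final long serialVersionUID = 1L;

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();
        }
    }

    static class WithPublicHook implements Serializable {
        private static final long serialVersionUID = 1L;

        // Wrong visibility: the lookup treats this as an unrelated method.
        public void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();
        }
    }

    // Mirrors the conditions in getPrivateMethod():
    // declared in the class itself, returns void, non-static, private.
    static boolean hasSerializationHook(Class<?> cl) {
        try {
            Method method = cl.getDeclaredMethod("writeObject", ObjectOutputStream.class);
            int mods = method.getModifiers();
            return method.getReturnType() == Void.TYPE
                    && !Modifier.isStatic(mods)
                    && Modifier.isPrivate(mods);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasSerializationHook(WithPrivateHook.class)); // true
        System.out.println(hasSerializationHook(WithPublicHook.class));  // false
        System.out.println(hasSerializationHook(String.class));          // false
    }
}
```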

2. When calling the writeSerialData() method to write serialized data

private void writeSerialData(Object obj, ObjectStreamClass desc)
     throws IOException
{
     ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
     for ( int i = 0 ; i < slots.length; i++) {
         ObjectStreamClass slotDesc = slots[i].desc;
         if (slotDesc.hasWriteObjectMethod()) {
             // Some other omission codes
             try {
                 curContext = new SerialCallbackContext(obj, slotDesc);
                 bout.setBlockDataMode( true );
                 // Call user-defined methods here
                 slotDesc.invokeWriteObject(obj, this );
                 bout.setBlockDataMode( false );
                 bout.writeByte(TC_ENDBLOCKDATA);
             } finally {
                 curContext.setUsed();
                 curContext = oldContext;
                 if (extendedDebugInfo) {
                     debugInfoStack.pop();
                 }
             }
 
             curPut = oldPut;
         } else {
             defaultWriteFields(obj, slotDesc);
         }
     }
}

The hasWriteObjectMethod() method is called first to determine whether there is a custom writeObject() method. Its code is as follows:

boolean hasWriteObjectMethod() {
     return (writeObjectMethod != null );
}

hasWriteObjectMethod() simply checks whether writeObjectMethod is null. As mentioned above, if the user defines a method such as private void writeObject(ObjectOutputStream oos), writeObjectMethod is not null, and slotDesc.invokeWriteObject(obj, this) is called inside the if block. That method in turn calls the user-defined writeObject() method.
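A small end-to-end example of such custom methods is sketched below. The Session class is hypothetical, and reversing the token is only a visible transformation chosen for the demonstration, not a security measure.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CustomHooksDemo {

    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        String user;
        transient String token; // not written by default serialization

        Session(String user, String token) {
            this.user = user;
            this.token = token;
        }

        // Found through getPrivateMethod() and invoked by writeSerialData().
        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject(); // writes the non-transient fields
            oos.writeUTF(new StringBuilder(token).reverse().toString());
        }

        // The matching hook on the read side; must read in the same order.
        private void readObject(ObjectInputStream ois)
                throws IOException, ClassNotFoundException {
            ois.defaultReadObject();
            token = new StringBuilder(ois.readUTF()).reverse().toString();
        }
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Session deserialize(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Session) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the raw stream bytes contain the given ASCII text.
    static boolean streamMentions(byte[] data, String text) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Session("alice", "s3cret"));
        System.out.println(streamMentions(data, "s3cret")); // false
        System.out.println(streamMentions(data, "terc3s")); // true
        System.out.println(deserialize(data).token);        // s3cret
    }
}
```

Because the class defines writeObject(), the data it writes is placed in block data mode and terminated with TC_ENDBLOCKDATA, as in the writeSerialData() listing above.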

Posted by utahcon on Fri, 03 Dec 2021 11:13:51 -0800