Serialization is the process of saving an object's state to a sequence of bytes; deserialization is the process of rebuilding those bytes into a live object. The Java Serialization API provides a standard mechanism for developers to handle object serialization. In this tip, you will see how to serialize an object, and why serialization is sometimes necessary. You'll learn about the serialization algorithm used in Java, and see an example that illustrates the serialized format of an object. By the time you're done, you should have a solid knowledge of how the serialization algorithm works and what entities are serialized as part of the object at a low level.
Why is serialization required?
In today's world, a typical enterprise application has multiple components and is distributed across various systems and networks. In Java, nearly everything is represented as an object; if two Java components want to communicate with each other, there needs to be a mechanism to exchange data. One way to achieve this is to define your own protocol and transfer an object. This means that the receiving end must know the protocol used by the sender to re-create the object, which would make it very difficult to talk to third-party components. Hence, there needs to be a generic and efficient protocol to transfer the object between components. Serialization is defined for this purpose, and Java components use this protocol to transfer objects.
Figure 1 shows a high-level view of client/server communication, where an object is transferred from the client to the server through serialization.

Figure 1. A high-level view of serialization in action
How to serialize an object
In order to serialize an object, you need to ensure that the class of the object implements the java.io.Serializable interface, as shown in Listing 1.
Listing 1. Implementing Serializable

    import java.io.Serializable;

    class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

In Listing 1, the only thing you had to do differently from creating a normal class is implement the java.io.Serializable interface. The Serializable interface is a marker interface; it declares no methods at all. It tells the serialization mechanism that the class can be serialized.
Now that you have made the class eligible for serialization, the next step is to actually serialize the object. That is done by calling the writeObject() method of the java.io.ObjectOutputStream class, as shown in Listing 2.
Listing 2. Calling writeObject()

    // Needs java.io.FileOutputStream, java.io.IOException, and java.io.ObjectOutputStream.
    public static void main(String args[]) throws IOException {
        FileOutputStream fos = new FileOutputStream("temp.out");
        ObjectOutputStream oos = new ObjectOutputStream(fos);
        TestSerial ts = new TestSerial();
        oos.writeObject(ts);   // serializes ts into temp.out
        oos.flush();
        oos.close();           // also closes the underlying FileOutputStream
    }

Listing 2 stores the state of the TestSerial object in a file called temp.out. The call oos.writeObject(ts); actually kicks off the serialization algorithm, which in turn writes the object to temp.out.
To re-create the object from the persistent file, you would employ the code in Listing 3.
Listing 3. Recreating a serialized object

    // Needs java.io.FileInputStream, java.io.IOException, and java.io.ObjectInputStream.
    public static void main(String args[]) throws IOException, ClassNotFoundException {
        FileInputStream fis = new FileInputStream("temp.out");
        ObjectInputStream oin = new ObjectInputStream(fis);
        TestSerial ts = (TestSerial) oin.readObject();   // rebuilds the object from the bytes
        oin.close();
        System.out.println("version="+ts.version);
    }

In Listing 3, the object's restoration occurs with the oin.readObject() method call. This method call reads in the raw bytes that we previously persisted and creates a live object that is an exact replica of the original object graph. Because readObject() can read any serializable object, a cast to the correct type is required. The method also declares the checked ClassNotFoundException, so main must declare or handle it. Executing this code will print version=100 on the standard output.
The serialized format of an object
What does the serialized version of the object look like? Remember, the sample code in the previous section saved the serialized version of the TestSerial object into the file temp.out. Listing 4 shows the contents of temp.out, displayed in hexadecimal. (You need a hexadecimal editor to see the output in hexadecimal format.)
Listing 4. Hexadecimal form of TestSerial

    AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
    73 74 A0 0C 34 00 FE B1 DD F9 02 00 02 42 00 05
    63 6F 75 6E 74 42 00 07 76 65 72 73 69 6F 6E 78
    70 00 64

Note that the class name recorded in this dump, bytes 53 65 72 69 61 6C 54 65 73 74, decodes to SerialTest rather than TestSerial.
If you look again at the actual TestSerial object, you'll see that it has only two byte members, as shown in Listing 5.
Listing 5. TestSerial's byte members

    public byte version = 100;
    public byte count = 0;

The size of a byte variable is one byte, and hence the total size of the object (without the header) is two bytes. But if you look at the size of the serialized object in Listing 4, you'll see 51 bytes. Surprise! Where did the extra bytes come from, and what is their significance? They are introduced by the serialization algorithm, and are required in order to re-create the object. In the next section, you'll explore this algorithm in detail.
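A hex editor is not strictly necessary to inspect the bytes. As a minimal sketch, the code below serializes an object into an in-memory buffer and prints it 16 bytes per row, in the style of Listing 4. The names TwoBytes, HexDumpDemo, serialize, and toHex are illustrative assumptions, not part of the original example. Because the class name differs from the one in the listings, the class-name and serial version identifier bytes will differ as well.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for TestSerial: two byte fields and nothing else.
class TwoBytes implements Serializable {
    public byte version = 100;
    public byte count = 0;
}

class HexDumpDemo {
    // Serializes obj into memory rather than into a file such as temp.out.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Formats bytes as two-digit uppercase hex, 16 per row.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            sb.append(String.format("%02X", data[i]));
            if (i + 1 < data.length) {
                sb.append((i + 1) % 16 == 0 ? '\n' : ' ');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new TwoBytes());
        System.out.println(toHex(bytes));
        System.out.println(bytes.length + " bytes written for 2 bytes of field data");
    }
}
```

The last two bytes of the dump are the field values themselves; everything before them is protocol header and class description.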
Java's serialization algorithm
By now, you should have a pretty good knowledge of how to serialize an object. But how does the process work under the hood? In general the serialization algorithm does the following:
- It writes out the metadata of the class associated with an instance.
- It recursively writes out the description of the superclass until it finds java.lang.Object (more precisely, until it reaches a superclass that is not serializable, which in this example is java.lang.Object).
- Once it finishes writing the metadata information, it then starts with the actual data associated with the instance. But this time, it starts from the topmost superclass.
- It recursively writes the data associated with the instance, starting from the topmost superclass and ending with the most-derived class.
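The metadata that the first two steps refer to can be inspected through java.io.ObjectStreamClass. The sketch below walks a class hierarchy from the most-derived class upward and lists each serializable class with its serial version identifier and fields. Base, Derived, DescriptorWalk, and describe are hypothetical names chosen for this illustration; the code only queries descriptors and does not reproduce the stream format itself.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical two-level hierarchy used only for this illustration.
class Base implements Serializable {
    int baseValue = 1;
}

class Derived extends Base {
    int derivedValue = 2;
    String label = "x";
}

class DescriptorWalk {
    // Collects one line per serializable class, most-derived class first,
    // stopping at the first superclass that has no serialization descriptor.
    static List<String> describe(Class<?> cls) {
        List<String> lines = new ArrayList<>();
        for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
            ObjectStreamClass desc = ObjectStreamClass.lookup(c);
            if (desc == null) {
                break; // not Serializable, for example java.lang.Object
            }
            StringBuilder sb = new StringBuilder(desc.getName());
            sb.append(" uid=").append(desc.getSerialVersionUID());
            for (ObjectStreamField f : desc.getFields()) {
                // Type code is 'I' for int, 'B' for byte, 'L' for object types.
                sb.append(' ').append(f.getTypeCode()).append(':').append(f.getName());
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(Derived.class).forEach(System.out::println);
    }
}
```

The descriptor order (Derived, then Base) mirrors the order in which class descriptions are written, while the instance data is later written in the opposite order.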
Listing 6. Sample serialized object

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    class parent implements Serializable {
        int parentVersion = 10;
    }

    class contain implements Serializable {
        int containVersion = 11;
    }

    public class SerialTest extends parent implements Serializable {
        int version = 66;
        contain con = new contain();

        public int getVersion() {
            return version;
        }

        public static void main(String args[]) throws IOException {
            FileOutputStream fos = new FileOutputStream("temp.out");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            SerialTest st = new SerialTest();
            oos.writeObject(st);   // writes st, its parent state, and the contained object
            oos.flush();
            oos.close();
        }
    }

This example is a straightforward one. It serializes an object of type SerialTest, which is derived from parent and has a container object, contain. The serialized format of this object is shown in Listing 7.
Listing 7. Serialized form of sample object

    AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
    73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07
    76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09
    4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72
    65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00
    0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70
    00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74
    61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00
    0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78
    70 00 00 00 0B

Figure 2 offers a high-level look at the serialization algorithm for this scenario.

Figure 2. An outline of the serialization algorithm
Let's go through the serialized format of the object in detail and see what each byte represents. Begin with the serialization protocol information:
- AC ED: STREAM_MAGIC. Specifies that this is a serialization protocol.
- 00 05: STREAM_VERSION. The serialization version.
- 0x73: TC_OBJECT. Specifies that this is a new Object.
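These three values are published as constants in java.io.ObjectStreamConstants, so the start of a stream can be checked in code. The following is a small sketch under that assumption; StreamHeaderCheck and startsWithObject are hypothetical names, and the ArrayList in main is just a convenient serializable object to write.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;
import java.util.ArrayList;

class StreamHeaderCheck {
    // Returns true if the bytes begin with the magic number, the stream
    // version, and the TC_OBJECT marker.
    static boolean startsWithObject(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            short magic = in.readShort();    // AC ED
            short version = in.readShort();  // 00 05
            byte tag = in.readByte();        // 0x73
            return magic == ObjectStreamConstants.STREAM_MAGIC
                    && version == ObjectStreamConstants.STREAM_VERSION
                    && tag == ObjectStreamConstants.TC_OBJECT;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new ArrayList<String>());
        }
        System.out.println(startsWithObject(bos.toByteArray()));
    }
}
```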
The example serializes an object of type SerialTest, so the algorithm starts by writing the description of the SerialTest class.
- 0x72: TC_CLASSDESC. Specifies that this is a new class.
- 00 0A: Length of the class name.
- 53 65 72 69 61 6C 54 65 73 74: SerialTest, the name of the class.
- 05 52 81 5A AC 66 02 F6: serialVersionUID, the serial version identifier of this class.
- 0x02: Various flags. This particular flag (SC_SERIALIZABLE) says that the object supports serialization.
- 00 02: Number of fields in this class.
Next, the algorithm writes the field int version = 66;.
- 0x49: Field type code. 49 represents "I", which stands for int.
- 00 07: Length of the field name.
- 76 65 72 73 69 6F 6E: version, the name of the field.
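Then the algorithm writes the next field, contain con = new contain();. This is an object, so it will write the canonical JVM signature of this field.
- 0x4C: Field type code. 4C represents "L", which marks an object type.
- 00 03: Length of the field name.
- 63 6F 6E: con, the name of the field.
- 0x74: TC_STRING. Represents a new string.
- 00 09: Length of the string.
- 4C 63 6F 6E 74 61 69 6E 3B: Lcontain;, the canonical JVM signature.
- 0x78: TC_ENDBLOCKDATA, the end of the optional block data for an object.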
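The class description walked through above can also be decoded programmatically. The sketch below feeds the corresponding bytes from Listing 7, from TC_CLASSDESC through TC_ENDBLOCKDATA, to a small hand-written parser. ClassDescReader, fromHex, parse, and expect are hypothetical helpers. The parser assumes a description with no class annotations and no back-references to earlier strings, so it is not a general-purpose reader.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ClassDescReader {
    // Converts a whitespace-separated hex string into bytes.
    static byte[] fromHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Parses one class description and returns a one-line summary.
    static String parse(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            expect(in, ObjectStreamConstants.TC_CLASSDESC);
            String name = in.readUTF();              // 2-byte length, then the name
            long uid = in.readLong();                // serialVersionUID
            int flags = in.readUnsignedByte();
            int fieldCount = in.readUnsignedShort();
            List<String> fields = new ArrayList<>();
            for (int i = 0; i < fieldCount; i++) {
                char type = (char) in.readUnsignedByte();
                String field = type + " " + in.readUTF();
                if (type == 'L' || type == '[') {
                    // Object and array fields carry their JVM signature as a string.
                    expect(in, ObjectStreamConstants.TC_STRING);
                    field += " " + in.readUTF();
                }
                fields.add(field);
            }
            expect(in, ObjectStreamConstants.TC_ENDBLOCKDATA);
            return name + " uid=" + String.format("%016X", uid)
                    + " flags=" + flags + " fields=" + fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void expect(DataInputStream in, byte tag) throws IOException {
        byte actual = in.readByte();
        if (actual != tag) {
            throw new IOException(String.format("expected 0x%02X but read 0x%02X", tag, actual));
        }
    }

    public static void main(String[] args) {
        String listing7Slice =
                "72 00 0A 53 65 72 69 61 6C 54 65 73 74 05 52 81 5A AC 66 02 F6 "
              + "02 00 02 49 00 07 76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E "
              + "74 00 09 4C 63 6F 6E 74 61 69 6E 3B 78";
        // prints: SerialTest uid=0552815AAC6602F6 flags=2 fields=[I version, L con Lcontain;]
        System.out.println(parse(fromHex(listing7Slice)));
    }
}
```

DataInputStream.readUTF() is used here because the class and field names in the stream are stored as a two-byte length followed by the encoded characters, which is the layout that method expects.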
The next step is to write the description of the parent class, which is the immediate superclass of SerialTest.
- 0x72: TC_CLASSDESC. Specifies that this is a new class.
- 00 06: Length of the class name.
- 70 61 72 65 6E 74: parent, the name of the class.
- 0E DB D2 BD 85 EE 63 7A: serialVersionUID, the serial version identifier of this class.
- 0x02: Various flags. This flag notes that the object supports serialization.
- 00 01: Number of fields in this class.
Next, the algorithm writes the field descriptions for the parent class. parent has one field, int parentVersion = 10;.
- 0x49: Field type code. 49 represents "I", which stands for int.
- 00 0D: Length of the field name.
- 70 61 72 65 6E 74 56 65 72 73 69 6F 6E: parentVersion, the name of the field.
- 0x78: TC_ENDBLOCKDATA, the end of block data for this object.
- 0x70: TC_NULL, which represents the fact that there are no more superclasses because we have reached the top of the class hierarchy.
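With the class descriptions written, the algorithm writes the actual data associated with the instance, beginning with the members of the parent class.
- 00 00 00 0A: 10, the value of parentVersion.
Then it moves on to SerialTest.
- 00 00 00 42: 66, the value of version.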
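Each int value occupies four bytes with the most significant byte first. As a quick sketch of that decoding, the hypothetical helper toInt below relies on the fact that java.nio.ByteBuffer reads in big-endian order by default.

```java
import java.nio.ByteBuffer;

class FieldValueDecode {
    // Interprets the first four bytes as a big-endian int, the byte order
    // used for int field values in the serialized stream.
    static int toInt(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).getInt();
    }

    public static void main(String[] args) {
        System.out.println(toInt(new byte[] {0x00, 0x00, 0x00, 0x0A})); // prints 10
        System.out.println(toInt(new byte[] {0x00, 0x00, 0x00, 0x42})); // prints 66
    }
}
```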
Next, the algorithm needs to write the information about the contain object, shown in Listing 8.
Listing 8. The contain object

    contain con = new contain();

Remember, the serialization algorithm hasn't written the class description for the contain class yet. This is the opportunity to write this description.
- 0x73: TC_OBJECT, designating a new object.
- 0x72: TC_CLASSDESC.
- 00 07: Length of the class name.
- 63 6F 6E 74 61 69 6E: contain, the name of the class.
- FC BB E6 0E FB CB 60 C7: serialVersionUID, the serial version identifier of this class.
- 0x02: Various flags. This flag indicates that this class supports serialization.
- 00 01: Number of fields in this class.
Next, the algorithm writes the description for contain's only field, int containVersion = 11;.
- 0x49: Field type code. 49 represents "I", which stands for int.
- 00 0E: Length of the field name.
- 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E: containVersion, the name of the field.
- 0x78: TC_ENDBLOCKDATA.
Next, the serialization algorithm checks whether contain has any parent classes. If it did, the algorithm would start writing that class; but in this case there is no superclass for contain, so the algorithm writes TC_NULL.
- 0x70: TC_NULL.
Finally, the algorithm writes the actual data associated with contain.
- 00 00 00 0B: 11, the value of containVersion.
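To confirm that all three pieces of state traced above (superclass field, own field, contained object) survive the trip, an object graph of the same shape can be written and read back in memory. This is only a sketch: Upper, Inner, Lower, RoundTrip, write, and read are hypothetical names that mirror the structure of parent, contain, and SerialTest. The field values are changed before writing so that the restored values clearly come from the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class Upper implements Serializable {
    int upperVersion = 10;
}

class Inner implements Serializable {
    int innerVersion = 11;
}

class Lower extends Upper {
    int version = 66;
    Inner inner = new Inner();
}

class RoundTrip {
    // Serializes the whole object graph reachable from obj into a byte array.
    static byte[] write(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Rebuilds an object graph from bytes produced by write().
    static Object read(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Lower original = new Lower();
        original.upperVersion = 20;
        original.version = 76;
        original.inner.innerVersion = 21;

        Lower copy = (Lower) read(write(original));
        System.out.println(copy.upperVersion + " " + copy.version + " " + copy.inner.innerVersion);
    }
}
```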