Friday, November 7, 2008

Serialization - Basics

Serialization is the process of converting an object or a con-nected graph of objects into a contiguous stream of bytes. Deserialization is the process of converting a contiguous stream of bytes back into its graph of connected objects. The ability to convert objects to and from a byte stream is an incredibly useful mechanism.

Formatters

Formatters know how to serialize the complete object graph by referring to the metadata that describes each object's type. The Serialize method uses reflection to see what instance fields are in each object's type as it is serialized. If any of these fields refer to other objects, then the formatter's Serialize method knows to serialize these objects, too.

Formatters have very intelligent algorithms. They know to serialize each object in the graph out to the stream no more than once. That is, if two objects in the graph refer to each other, then the formatter detects this, serializes each object just once, and avoids entering into an infinite loop.

1. Binary formatter
2. Soap formatter

Serialization Steps

1. The developer must apply the System.SerializableAttribute custom attribute to this type he wants to serialize.

2. When serializing an object, the full name of the type and the name of the type's defining assembly are written to the byte stream. By default, the BinaryFormatter and SoapFormatter types output the assembly's full identity. However, you can make these formatters write the simple assembly name (just file name; no version, culture, or public key information) for each serialized type by setting the formatter's AssemblyFormat property to FormatterAssemblyStyle.Simple.

3. When serializing a graph of objects, some of the object's types may be serializable while some of the objects may not be serializable. For performance reasons, formatters do not verify that all of the objects in the graph are serializable before serializing the graph. So, when serializing an object graph, it is entirely possible that some objects may be serialized to the byte stream before the SerializationException is thrown. If this happens, the byte stream is corrupt.

Your application code should try to recover gracefully from this situation. If you think you may be serializing an object graph where some objects may not be serializable, I recommend that you serialize the objects into a MemoryStream first. Then, if all objects are successfully serialized, you can copy the bytes in the MemoryStream to whatever stream (file or network, for example) you really want the bytes written to.

4. When you apply the SerializableAttribute custom attribute to a type, all instance fields (public, private, protected, and so on) are serialized.



Deserialization Steps

1. When deserializing an object, the formatter first grabs the assembly identity and ensures that the assembly is loaded into the executing AppDomain.

Calls Assembly class's Load or LoadWithPartialName methods.

2. After an assembly has been loaded, the formatter looks in the assembly for a type matching that of the object being deserialized. If the assembly doesn't contain a matching type, an exception is thrown and no more objects can be deserialized.

3. If a matching type is found, an instance of the type is created and its fields are initialized from the values contained in the byte stream.

4. If you use Assembly.LoadFrom to load an assembly and then construct objects from types defined in the loaded assembly. These objects can be serialized to a stream without any trouble. However, when deserializing this stream, the formatter attempts to load the assembly by calling Assembly's Load or LoadWithPartialName method instead of calling the LoadFrom method.

You implement a method whose signature matches the System.ResolveEventHandler delegate and register this method with System.AppDomain's AssemblyResolve event just before calling a formatter's Deserialize method. (Unregister this method with the event after Deserialize returns.) Now, whenever the formatter fails to load an assembly, the CLR calls your ResolveEventHandler method. The identity of the assembly that failed to load is passed to this method. The method can extract the assembly file name from the assembly's identity and use this name to construct the path where the application knows the assembly file can be found. Then, the method can call Assembly.LoadFrom to load the assembly and return the resulting Assembly reference back from the ResolveEventHandler method.

Summary of http://msdn.microsoft.com/en-us/magazine/cc301761.aspx

No comments: