.NET: November 2008

Friday, November 7, 2008

Serialization - Advanced Topics

1. IDeserializationCallback

2. FormatterServices

3. Serializing a type that was not designed to be serializable.

4. Overriding the Assembly or Type When Deserializing an Object

http://msdn.microsoft.com/en-us/magazine/cc188950.aspx

Serialization - ISerializable

Occasionally, you will design a type that requires complete control over how it is serialized and deserialized. Such a type must do two things:

1. Implement the ISerializable interface


public interface ISerializable {
   void GetObjectData(SerializationInfo info, StreamingContext context);
}

The GetObjectData method is responsible for determining what information is necessary to serialize the object and for adding this information to the SerializationInfo object. The formatter then takes all of the values added to the SerializationInfo object and serializes each of them to the byte stream.

There are many destinations for a serialized set of objects: same process, different process on the same machine, different process on a different machine, and so on. In some rare situations, an object might want to know where it is going to be deserialized so that it can emit its state differently.

A method that receives a StreamingContext structure can examine the State property's bit flags to determine the source or destination of the objects being serialized/deserialized.

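Note: Always call one of the overloaded AddValue methods to add serialization information for your type. If a field's type implements the ISerializable interface, don't call GetObjectData on the field yourself. Pass the field to AddValue instead; the formatter will see that the field's type implements ISerializable and call GetObjectData for you.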

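2. Provide a special constructor that takes a SerializationInfo and a StreamingContext

The formatter calls this constructor when it deserializes the object. If your class is sealed, I highly recommend that you declare this special constructor to be private. If the class is not sealed, declare it protected, as Hashtable does: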


protected Hashtable(
      SerializationInfo info, StreamingContext context) {
}

Note: Instead of calling the various Get methods, the special constructor could instead call GetEnumerator, which returns a SerializationInfoEnumerator object that iterates through all the values contained within the SerializationInfo object. Each value enumerated is a System.Runtime.Serialization.SerializationEntry object.

A type may include fields that refer to other objects. When the special constructor is called, any fields that refer to other objects are guaranteed to be set correctly. That is, the fields' values will contain references to allocated objects. As these referenced objects may not have had their fields initialized yet, you should not execute any code in the special constructor that accesses any members on a referenced object.

If your type must access members (such as call methods) on a referenced type, then it is recommended that your type also implement the IDeserializationCallback interface's OnDeserialization method. When this method is called, all objects have had their fields set. But there's no way to tell what order multiple objects have their OnDeserialization method called. So, while the fields may be initialized, you still don't know if a referenced object is completely deserialized if that referenced object also implements the IDeserializationCallback interface.
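
A minimal sketch of the two requirements for a sealed type follows. Book and its field names are made up for illustration, and RoundTrip is a hypothetical helper that stands in for a formatter by building the SerializationInfo by hand; a real program would hand the object to a formatter's Serialize and Deserialize methods instead.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
public sealed class Book : ISerializable
{
    private readonly string title;
    private readonly int pages;

    public Book(string title, int pages)
    {
        this.title = title;
        this.pages = pages;
    }

    // Special constructor: private because the class is sealed.
    private Book(SerializationInfo info, StreamingContext context)
    {
        title = info.GetString("title");
        pages = info.GetInt32("pages");
    }

    public string Title { get { return title; } }
    public int Pages { get { return pages; } }

    // Decides which values represent this object in the stream.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("title", title);
        info.AddValue("pages", pages);
    }

    // Hypothetical helper: does by hand what a formatter would do,
    // without writing anything to a byte stream.
    public static Book RoundTrip(Book source)
    {
        StreamingContext context = new StreamingContext(StreamingContextStates.Clone);
        SerializationInfo info =
            new SerializationInfo(typeof(Book), new FormatterConverter());
        source.GetObjectData(info, context);
        return new Book(info, context);
    }
}
```

Calling Book.RoundTrip(new Book("CLR", 650)) yields a second Book whose state came only from the values passed to AddValue.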

Case 1 : Base class implements ISerializable

If your type also implements ISerializable, then your implementation of GetObjectData must call the base class's GetObjectData, and your special constructor must call the base class's special constructor, in order for the object to be serialized and deserialized properly.

If your derived type doesn't have any additional fields and therefore has no special serialization/deserialization needs, then you do not have to implement ISerializable at all. Like all interface members, GetObjectData is virtual and will be called to properly serialize the object. In addition, the formatter treats the special constructor as "virtualized." That is, during deserialization, the formatter will check the type that it is trying to instantiate. If that type doesn't offer the special constructor, then the formatter will scan all of the base classes until it finds one that implements the special constructor.
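
Case 1 might look like the sketch below when the derived type does add a field. Document and Report are hypothetical types, and RoundTrip again stands in for a formatter by creating the SerializationInfo directly.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
public class Document : ISerializable
{
    private string title;

    public Document(string title) { this.title = title; }

    // Protected so that derived types can chain to it.
    protected Document(SerializationInfo info, StreamingContext context)
    {
        title = info.GetString("title");
    }

    public string Title { get { return title; } }

    public virtual void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("title", title);
    }
}

[Serializable]
public class Report : Document
{
    private int year;

    public Report(string title, int year) : base(title) { this.year = year; }

    // Calls the base special constructor first, then reads its own field.
    protected Report(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        year = info.GetInt32("year");
    }

    public int Year { get { return year; } }

    // Calls the base GetObjectData first, then adds its own field.
    public override void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        base.GetObjectData(info, context);
        info.AddValue("year", year);
    }

    // Hypothetical helper standing in for a formatter.
    public static Report RoundTrip(Report source)
    {
        StreamingContext context = new StreamingContext(StreamingContextStates.Clone);
        SerializationInfo info =
            new SerializationInfo(typeof(Report), new FormatterConverter());
        source.GetObjectData(info, context);
        return new Report(info, context);
    }
}
```

If either base call were omitted, the title would be missing from the SerializationInfo or left unset on the new object.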

Case 2 : Base class does not implement ISerializable

In this case, your class must manually serialize the base type's fields.



void ISerializable.GetObjectData(
      SerializationInfo info, StreamingContext context) {

   // Serialize the desired values for this class
   info.AddValue("title", title);

   // Get the set of serializable members for our class and base classes
   Type thisType = this.GetType();
   MemberInfo[] mi =
      FormatterServices.GetSerializableMembers(thisType, context);

   // Serialize the base class's fields to the info object
   for (Int32 i = 0; i < mi.Length; i++) {
      // Don't serialize fields for this class
      if (mi[i].DeclaringType == thisType) continue;
      info.AddValue(mi[i].Name, ((FieldInfo) mi[i]).GetValue(this));
   }
}

Summary of http://msdn.microsoft.com/en-us/magazine/cc301767.aspx

Serialization - Basics

Serialization is the process of converting an object or a connected graph of objects into a contiguous stream of bytes. Deserialization is the process of converting a contiguous stream of bytes back into its graph of connected objects. The ability to convert objects to and from a byte stream is an incredibly useful mechanism.

Formatters

Formatters know how to serialize the complete object graph by referring to the metadata that describes each object's type. The Serialize method uses reflection to see what instance fields are in each object's type as it is serialized. If any of these fields refer to other objects, then the formatter's Serialize method knows to serialize these objects, too.

Formatters have very intelligent algorithms. They know to serialize each object in the graph out to the stream no more than once. That is, if two objects in the graph refer to each other, then the formatter detects this, serializes each object just once, and avoids entering into an infinite loop.

The FCL provides two formatters:

1. BinaryFormatter
2. SoapFormatter

Serialization Steps

1. The developer must apply the System.SerializableAttribute custom attribute to the type that is to be serialized.

2. When serializing an object, the full name of the type and the name of the type's defining assembly are written to the byte stream. By default, the BinaryFormatter and SoapFormatter types output the assembly's full identity. However, you can make these formatters write the simple assembly name (just file name; no version, culture, or public key information) for each serialized type by setting the formatter's AssemblyFormat property to FormatterAssemblyStyle.Simple.

3. When serializing a graph of objects, some of the objects' types may be serializable while others may not be. For performance reasons, formatters do not verify that all of the objects in the graph are serializable before serializing the graph. So, when serializing an object graph, it is entirely possible that some objects will be serialized to the byte stream before a SerializationException is thrown. If this happens, the byte stream is corrupt.

Your application code should try to recover gracefully from this situation. If you think you may be serializing an object graph where some objects may not be serializable, I recommend that you serialize the objects into a MemoryStream first. Then, if all objects are successfully serialized, you can copy the bytes in the MemoryStream to whatever stream (file or network, for example) you really want the bytes written to.

4. When you apply the SerializableAttribute custom attribute to a type, all instance fields (public, private, protected, and so on) are serialized.
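
The MemoryStream recommendation in step 3 can be sketched as follows. This is only an illustration: SafeSerializer and WriteAtomically are hypothetical names, and the serialize delegate stands in for a call such as formatter.Serialize(stream, graph).

```csharp
using System;
using System.IO;

public static class SafeSerializer
{
    // serialize: any action that writes an object graph to the stream it is
    // given, for example s => formatter.Serialize(s, graph) (assumed usage).
    public static void WriteAtomically(Action<Stream> serialize, Stream destination)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            // If this throws part-way through, only the temporary buffer
            // holds the partial (corrupt) bytes.
            serialize(buffer);

            // Reached only when the whole graph was written successfully.
            buffer.WriteTo(destination);
        }
    }
}
```

If the delegate throws, the exception propagates to the caller and the destination stream is left untouched.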

Deserialization Steps

1. When deserializing an object, the formatter first grabs the assembly identity and ensures that the assembly is loaded into the executing AppDomain.

To do this, the formatter calls the Assembly class's Load or LoadWithPartialName method.

2. After an assembly has been loaded, the formatter looks in the assembly for a type matching that of the object being deserialized. If the assembly doesn't contain a matching type, an exception is thrown and no more objects can be deserialized.

3. If a matching type is found, an instance of the type is created and its fields are initialized from the values contained in the byte stream.

4. Suppose you use Assembly.LoadFrom to load an assembly and then construct objects from types defined in the loaded assembly. These objects can be serialized to a stream without any trouble. However, when deserializing this stream, the formatter attempts to load the assembly by calling Assembly's Load or LoadWithPartialName method instead of calling the LoadFrom method, so it may fail to find the assembly.

To work around this, implement a method whose signature matches the System.ResolveEventHandler delegate and register this method with System.AppDomain's AssemblyResolve event just before calling a formatter's Deserialize method. (Unregister this method from the event after Deserialize returns.) Now, whenever the formatter fails to load an assembly, the CLR calls your ResolveEventHandler method. The identity of the assembly that failed to load is passed to this method. The method can extract the assembly file name from the assembly's identity and use this name to construct the path where the application knows the assembly file can be found. Then, the method can call Assembly.LoadFrom to load the assembly and return the resulting Assembly reference from the ResolveEventHandler method.
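
The AssemblyResolve approach described in step 4 could be sketched like this. LoadFromResolver, ProbeDirectory, BuildPath and WithResolver are hypothetical names, and the sketch assumes the assembly file is named after the simple assembly name with a .dll extension.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class LoadFromResolver
{
    // Assumption: the folder where the application knows its assemblies live.
    public static string ProbeDirectory = AppDomain.CurrentDomain.BaseDirectory;

    // Turns "MyLib, Version=1.0.0.0, Culture=neutral, ..." into "<dir>/MyLib.dll".
    public static string BuildPath(string assemblyIdentity, string directory)
    {
        string simpleName = new AssemblyName(assemblyIdentity).Name;
        return Path.Combine(directory, simpleName + ".dll");
    }

    // Matches the System.ResolveEventHandler delegate signature.
    public static Assembly OnAssemblyResolve(object sender, ResolveEventArgs args)
    {
        string path = BuildPath(args.Name, ProbeDirectory);
        // Returning null tells the CLR that this handler could not help.
        return File.Exists(path) ? Assembly.LoadFrom(path) : null;
    }

    // Registers the handler only while the deserialize delegate runs.
    public static T WithResolver<T>(Func<T> deserialize)
    {
        AppDomain.CurrentDomain.AssemblyResolve += OnAssemblyResolve;
        try
        {
            return deserialize();
        }
        finally
        {
            AppDomain.CurrentDomain.AssemblyResolve -= OnAssemblyResolve;
        }
    }
}
```

A caller would wrap the deserialization, for example LoadFromResolver.WithResolver(() => formatter.Deserialize(stream)), where formatter and stream are whatever the application already uses.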

Summary of http://msdn.microsoft.com/en-us/magazine/cc301761.aspx

Saturday, November 1, 2008

C# Tips

1. The as and is operators

if (o is Employee)
{
   Employee e = (Employee) o;
}

In this code, the CLR is actually checking the object's type twice: The is operator first checks to see if o is compatible with the Employee type. If it is, inside the if statement, the CLR again verifies that o refers to an Employee when performing the cast.

C# offers a way to simplify this code and improve its performance by providing an as operator:

Employee e = o as Employee;
if (e != null) {
   // Use e within the 'if' statement.
}

2. Should you use string or String?

Because in C# the string (a keyword) maps exactly to System.String (an FCL type), there is no difference and either can be used.

Because the C# compiler supports them directly with keywords, object and string are also considered primitive types.
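
A quick way to convince yourself that the keyword and the FCL type name are the same thing is to compare the Type objects. AliasDemo is a hypothetical name used only for this sketch.

```csharp
using System;

public static class AliasDemo
{
    // The keyword and the FCL name denote the same type, so no
    // conversion happens when assigning one to the other.
    public static Boolean StringIsSystemString()
    {
        string a = "hello";
        String b = a;
        return typeof(string) == typeof(System.String)
            && Object.ReferenceEquals(a, b);
    }

    // The same holds for object and System.Object.
    public static Boolean ObjectIsSystemObject()
    {
        return typeof(object) == typeof(System.Object);
    }
}
```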

3. Type casting

C# allows implicit casts if the conversion is "safe," that is, no loss of data is possible, such as converting an Int32 to an Int64. But C# requires explicit casts if the conversion is potentially unsafe.
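
As a small illustration of the difference (CastDemo is a hypothetical name), widening needs no cast, while narrowing must be written explicitly and can discard the high-order bits. The unchecked keyword is used here so the result does not depend on compiler switches.

```csharp
using System;

public static class CastDemo
{
    // Int32 -> Int64 cannot lose data, so the conversion is implicit.
    public static Int64 Widen(Int32 value)
    {
        Int64 wide = value;
        return wide;
    }

    // Int64 -> Int32 can lose data, so C# requires an explicit cast.
    // In an unchecked context only the low 32 bits are kept.
    public static Int32 Narrow(Int64 value)
    {
        return unchecked((Int32) value);
    }
}
```

For example, Narrow(4294967297) returns 1, because 4294967297 is 2^32 + 1 and only the low 32 bits survive the cast.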

4. Checked and Unchecked Primitive Type Operations

a. The CLR offers IL instructions that allow the compiler to choose the desired behavior. The CLR has an instruction called add that adds two values together. The add instruction performs no overflow checking. The CLR also has an instruction called add.ovf that also adds two values together. However, add.ovf throws a System.OverflowException if an overflow occurs.

b. C# allows the programmer to decide how overflows should be handled. By default, overflow checking is turned off. As a result, the code runs faster—but developers must be assured that overflows won't occur or that their code is designed to anticipate these overflows.

One way to get the C# compiler to control overflows is to use the /checked+ compiler switch. The code executes more slowly because the CLR is checking these operations to determine whether an overflow will occur.

c. There are also checked/unchecked operators/statements.

Byte b = 100;
b = checked((Byte) (b + 200)); // OverflowException is thrown

d. Here's the best way to go about using checked and unchecked:

i. As you write your code, explicitly use checked around blocks where an unwanted overflow might occur due to invalid input data, such as processing a request with data supplied from an end user or a client machine.

ii. As you write your code, explicitly use unchecked around blocks where an overflow is OK, such as calculating a checksum.

iii. For any code that doesn't use checked or unchecked, the assumption is that you do want an exception to occur on overflow.

Now, as you develop your application, turn on the compiler's /checked+ switch for debug builds. Your application will run more slowly because the system will be checking for overflows on any code that you didn't explicitly mark as checked or unchecked. If an exception occurs, you'll easily detect it and be able to fix the bug in your code. For the release build of your application, use the compiler's /checked- switch so that the code runs faster and exceptions won't be generated.
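
Guidelines i and ii can be illustrated with the checked and unchecked statements. OverflowDemo, Checksum and TotalPrice are hypothetical names chosen for this sketch.

```csharp
using System;

public static class OverflowDemo
{
    // Guideline ii: wrapping around is the intended behavior for a
    // checksum, so the block is explicitly unchecked.
    public static Byte Checksum(Byte[] data)
    {
        Byte sum = 0;
        unchecked
        {
            foreach (Byte b in data)
            {
                sum += b;
            }
        }
        return sum;
    }

    // Guideline i: values supplied by a user must not silently wrap,
    // so the block is explicitly checked.
    public static Int32 TotalPrice(Int32 unitPrice, Int32 quantity)
    {
        checked
        {
            return unitPrice * quantity;
        }
    }
}
```

Because both blocks are marked explicitly, they behave the same way whether the build uses /checked+ or /checked-: Checksum wraps modulo 256, and TotalPrice throws a System.OverflowException when the product does not fit in an Int32.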

5. Value types and reference types

a. Value type instances are usually allocated on a thread's stack (although they can also be embedded in a reference type object).

b. Reference types are always allocated from the managed heap.

c. Value types are implicitly sealed.

d. Value types can implement interfaces.

e. Value types have two representations - boxed and unboxed.

f. Value types can't be assigned null (unless they are wrapped in Nullable<T>, written T? in C#).

g. When you assign a value type variable to another value type variable, a field-by-field copy is made.

h. The C# compiler selects LayoutKind.Auto for reference types (classes) and LayoutKind.Sequential for value types (structures). However, if you're creating a value type that has nothing to do with interoperability with unmanaged code, you probably want to override the C# compiler's default by applying System.Runtime.InteropServices.StructLayoutAttribute with LayoutKind.Auto, which lets the CLR arrange the fields as it sees fit.

i. The StructLayoutAttribute also allows you to explicitly indicate the offset of each field by passing LayoutKind.Explicit to its constructor. Then you apply an instance of the System.Runtime.InteropServices.FieldOffsetAttribute to each field. This allows you to create unions in C#.
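
Points g and i can be sketched together. Point, SignedOrUnsigned and ValueTypeDemo are hypothetical names; the union overlays two 4-byte fields at offset 0, so the same bits can be read through either field.

```csharp
using System;
using System.Runtime.InteropServices;

public struct Point
{
    public Int32 X;
    public Int32 Y;
}

// Both fields start at offset 0, so they share the same 4 bytes.
[StructLayout(LayoutKind.Explicit)]
public struct SignedOrUnsigned
{
    [FieldOffset(0)] public Int32 Signed;
    [FieldOffset(0)] public UInt32 Unsigned;
}

public static class ValueTypeDemo
{
    // Assignment copies every field, so changing the copy
    // leaves the original untouched.
    public static Boolean CopyIsIndependent()
    {
        Point a = new Point();
        a.X = 1;
        a.Y = 2;
        Point b = a;
        b.X = 99;
        return a.X == 1 && b.X == 99 && b.Y == 2;
    }

    // Writes through one field of the union and reads through the other.
    public static UInt32 ReinterpretAsUnsigned(Int32 value)
    {
        SignedOrUnsigned u = new SignedOrUnsigned();
        u.Signed = value;
        return u.Unsigned;
    }
}
```

ReinterpretAsUnsigned(-1) returns 4294967295, since all 32 bits of -1 are set and the Unsigned field reads those same bits.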
