Sunday, June 6, 2010

XPathDocument vs XmlDocument

XmlDocument

- Based on W3C DOM Model
- Loads entire XML in memory
- Read/Write
- Slower than XPathDocument
- Support for XPath/XSLT
- Supports both XPathNavigator and DOM interfaces

XPathDocument

- Based on XPath Data Model
- Loads entire XML in memory
- Read Only
- Faster than XmlDocument
- Optimized support for XPath/XSLT
- Supports only XPathNavigator interfaces


XmlDocument is based on the W3C XML DOM, an object model that covers essentially all of XML syntax, including low-level details such as entities, CDATA sections, DTDs, and notations. It is a document-centric model that allows full fidelity when loading and saving XML documents.

XPathDocument is based on the XPath 1.0 data model: a read-only, data-centric object model, compatible with the XML Infoset, that covers only the semantically significant parts of XML. It leaves out insignificant syntax details (no DTD, no entities, no CDATA sections, no adjacent text nodes) and exposes only the significant data as a tree with seven types of nodes.

This model is simple and lightweight, which is why XPathDocument is the preferred data store for read-only scenarios, especially when XPath or XSLT is involved.
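To make the difference between the two models concrete, here is a minimal sketch. The class name, method names and sample XML are made up for illustration. It loads the same markup into both stores and checks how a CDATA section is exposed and whether the navigator can edit the tree.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class DocumentModels
{
    // Hypothetical sample input containing a CDATA section.
    public const string Sample = "<note><body><![CDATA[a < b]]></body></note>";

    // XmlDocument (DOM) keeps the CDATA section as its own node type.
    public static XmlNodeType DomChildType(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        // <note> -> <body> -> first child of <body>
        return doc.DocumentElement.FirstChild.FirstChild.NodeType;
    }

    // XPathDocument (XPath data model) has no CDATA node type;
    // the same content shows up as an ordinary text node.
    public static XPathNodeType XPathChildType(string xml)
    {
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator(); // cursor starts at the root node
        nav.MoveToFirstChild(); // root   -> <note>
        nav.MoveToFirstChild(); // <note> -> <body>
        nav.MoveToFirstChild(); // <body> -> text
        return nav.NodeType;
    }

    // Navigators over an XPathDocument are read-only.
    public static bool XPathIsEditable(string xml)
    {
        var doc = new XPathDocument(new StringReader(xml));
        return doc.CreateNavigator().CanEdit;
    }

    public static void Main()
    {
        Console.WriteLine(DomChildType(Sample));    // CDATA
        Console.WriteLine(XPathChildType(Sample));  // Text
        Console.WriteLine(XPathIsEditable(Sample)); // False
    }
}
```

If the document has to be written back with its CDATA sections intact, that points to XmlDocument. If it only has to be queried, the leaner XPathDocument tree is enough.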

Saturday, June 5, 2010

Reading XML

1. Parsing the XML 1.0 byte stream

a. XmlTextReader: Parses the XML 1.0 byte stream and hides the complexities of the XML 1.0 syntax, serving up the document as a logical tree structure through higher-level APIs.

- Performance: Fastest
- Memory: Most efficient, as only one node needs to be in memory
- Traversal: Forward Only
- Operation: Read Only
- XPath: No
- XSLT: No

2. Processing the Logical Tree via XML APIs

i. Streaming

a. XmlReader

- Models an XML document as a forward-only, linear stream of nodes.
- XmlReader lets the client pull nodes one at a time, much like the firehose cursor model in data access technologies.
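The pull model is easiest to see in a read loop. The sketch below is illustrative only: the class name and sample document are made up. It uses the XmlReader.Create factory rather than constructing an XmlTextReader directly, and it collects element names as the reader moves forward through the stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class StreamingRead
{
    // Pulls nodes one at a time and returns the element names in document order.
    public static List<string> ElementNames(string xml)
    {
        var names = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Read() advances to the next node; there is no way to move back.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    names.Add(reader.Name);
                }
            }
        }
        return names;
    }

    public static void Main()
    {
        // Hypothetical sample document.
        string xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        Console.WriteLine(string.Join(", ", ElementNames(xml))); // orders, order, order
    }
}
```

Only the current node is held by the reader at any time, which is where the memory advantage of the streaming approach comes from.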


ii. Traversal Oriented

a. XmlNode (XmlDocument)

- Built on top of XmlReader


- Performance: 2 to 3x slower than XmlTextReader
- Memory: Loads entire XML/tree structure in memory
- Traversal: Full Traversal
- Operation: Read/Write
- XPath: Yes
- XSLT: Yes (through its XPathNavigator, but not optimized for it)
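The read/write side of the DOM can be sketched as follows. The class and method names, the XPath expression and the sample document are assumptions made for the example. It loads the tree, finds a node with XPath, changes it in place, appends a new element and serializes the result.

```csharp
using System;
using System.Xml;

public static class DomEdit
{
    // Marks the order with id 2 as shipped, appends a new order,
    // and returns the serialized document.
    public static string ShipSecondOrder(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // the whole tree is now in memory

        // XPath query against the DOM.
        var order = (XmlElement)doc.SelectSingleNode("/orders/order[@id='2']");
        if (order != null)
        {
            order.SetAttribute("status", "shipped"); // in-place update
        }

        // The DOM is writable: build a new element and attach it.
        XmlElement added = doc.CreateElement("order");
        added.SetAttribute("id", "3");
        added.SetAttribute("status", "open");
        doc.DocumentElement.AppendChild(added);

        return doc.OuterXml;
    }

    public static void Main()
    {
        // Hypothetical sample document.
        string xml = "<orders><order id=\"1\" status=\"open\"/><order id=\"2\" status=\"open\"/></orders>";
        Console.WriteLine(ShipSecondOrder(xml));
    }
}
```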

b. XPathNavigator

- Uses a cursor model, which gives the underlying implementation more options in terms of how the tree is actually stored.

- Performance: Faster than XmlDocument
- Memory: More efficient than XmlDocument
- Traversal: Full Traversal
- Operation: Read Only
- XPath: Yes
- XSLT: Yes
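A short sketch of the cursor model, using an XPathDocument as the backing store. The class name, method names and sample data are hypothetical. The navigator is moved around the tree explicitly, and the same navigator can also run XPath queries.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class NavigatorDemo
{
    // Moves the cursor by hand: root -> <orders> -> first <order>.
    public static string FirstOrderId(string xml)
    {
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator(); // cursor starts at the root node
        nav.MoveToFirstChild();                     // -> <orders>
        nav.MoveToFirstChild();                     // -> first <order>
        return nav.GetAttribute("id", "");
    }

    // Selects a node set and walks it with an iterator.
    public static List<string> OrderIds(string xml)
    {
        var doc = new XPathDocument(new StringReader(xml));
        var ids = new List<string>();
        XPathNodeIterator it = doc.CreateNavigator().Select("/orders/order");
        while (it.MoveNext())
        {
            ids.Add(it.Current.GetAttribute("id", ""));
        }
        return ids;
    }

    // Evaluates an XPath expression that returns a number.
    public static double TotalQuantity(string xml)
    {
        var doc = new XPathDocument(new StringReader(xml));
        return (double)doc.CreateNavigator().Evaluate("sum(/orders/order/@qty)");
    }

    public static void Main()
    {
        // Hypothetical sample document.
        string xml = "<orders><order id=\"1\" qty=\"2\"/><order id=\"2\" qty=\"3\"/></orders>";
        Console.WriteLine(FirstOrderId(xml));
        Console.WriteLine(string.Join(",", OrderIds(xml)));
        Console.WriteLine(TotalQuantity(xml));
    }
}
```

Because the calling code only sees the cursor, the same methods would work over any other store that can hand out an XPathNavigator.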

3. Choosing which class


  • What kind of reader should I use?
    Use XmlTextReader if:
    * Performance is your highest priority and…
    * You don't need XSD/DTD validation and…
    * You don't need XSD type information at runtime and…
    * You don't need XPath/XSLT services

    Use XmlValidatingReader if:
    * You need XSD/DTD validation or…
    * You need XSD type information at runtime or…
  • Should I load the tree into memory?
    Use the DOM if:
    * Productivity is your highest priority or…
    * You need XPath services or…
    * You need to update the document (read/write)
  • XmlDocument or XPathDocument?
    Use XPathNavigator if:
    * You need to execute an XSLT transformation or…
    * You want to leverage an XPathNavigator implementation (like XPathDocument)
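For the XSLT case, a minimal sketch might look like this. It assumes XslCompiledTransform as the XSLT processor, and the stylesheet, class name and sample data are invented for the example. The input is loaded into an XPathDocument and handed straight to the transform.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class TransformDemo
{
    // Hypothetical stylesheet: writes each order id followed by a semicolon.
    const string Stylesheet =
@"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='text'/>
  <xsl:template match='/'>
    <xsl:for-each select='orders/order'><xsl:value-of select='@id'/>;</xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";

    public static string ListOrderIds(string xml)
    {
        // Compile the stylesheet.
        var xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        // XPathDocument as the read-only input store.
        var input = new XPathDocument(new StringReader(xml));

        using (var output = new StringWriter())
        {
            xslt.Transform(input, null, output); // no stylesheet parameters
            return output.ToString();
        }
    }

    public static void Main()
    {
        // Hypothetical sample document.
        string xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        Console.WriteLine(ListOrderIds(xml));
    }
}
```

An XmlDocument could be passed to Transform in the same way, since it also implements IXPathNavigable. The XPathDocument is simply the store this post recommends when the input is read-only.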