Organizations that need to exchange information can agree on a standard set of vocabularies. Alternatively, they can enlist a transformation technology to dynamically translate vocabulary formats.
Vocabularies can be formally defined using an XML schema language. In the same way that database schemas establish a structural model for the data they represent, XML schemas define the structure of XML documents. They protect the integrity of XML document data by providing structure, validation rules, type constraints and inter-element relationships. In other words, XML schemas dictate what can and cannot be done with XML data.
Numerous XML schema languages exist. The two most common are explained in the “Document Type Definitions (DTD)” and “XML Schema Definition Language” tutorials, following this section.
XML documents are generally manipulated using tree-based, event-based or class-based interfaces. The W3C provides a standard tree-based API called the Document Object Model (DOM). The most popular event-based API is the Simple API for XML (SAX), and most development platforms offer proprietary data binding APIs that supply a class-based interface into XML documents.
Vendor-specific implementations of the DOM and SAX APIs can vary in their compliance with the DOM and SAX standards. Some compliant implementations go further and extend the standard functionality with proprietary features. Using a compatible programming language, you can interact with the parser's API to manipulate XML documents in many different ways.
The Document Object Model expresses an XML document using a hierarchical tree view. Each branch of the tree represents an element in the hierarchy. The DOM classifies these elements as nodes, and the API provided by the DOM is also referred to as the node interface. Though the use of DOM-compliant APIs is very common, they can introduce some performance challenges. The API loads the entire XML tree view into memory, which can consume a significant amount of resources when processing large documents.
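As a rough illustration of the node interface, the sketch below uses Python's standard `xml.dom.minidom` module, one DOM-style implementation among many. The sample markup and the variable names are assumptions made for this example only.

```python
from xml.dom import minidom

# Sample markup assumed for illustration.
XML_TEXT = "<book><title>Joy of Integration</title><author>Joe Smith</author></book>"

# parseString builds the complete node tree in memory before returning,
# which is the source of the memory cost for large documents.
document = minidom.parseString(XML_TEXT)
book = document.documentElement

# Walk the children of the root element through the node interface.
children = {}
for node in book.childNodes:
    if node.nodeType == node.ELEMENT_NODE:
        # The text inside an element is itself a child (text) node.
        children[node.tagName] = node.firstChild.data

print(book.tagName)
print(children)
```

Because the whole tree is available at once, the application can move freely between parents, children and siblings, at the price of holding every node in memory.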
The event-based API provided by SAX establishes a linear processing model that notifies the application logic of certain events prior to delivering the data. This approach is very efficient, and addresses many of the performance concerns of DOM. The SAX and DOM APIs complement each other and collectively provide a flexible programming model for XML.
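The linear, event-driven model can be sketched with Python's standard `xml.sax` module. The `EventRecorder` class and the sample markup are hypothetical names chosen for this example; the handler simply records each notification in the order the parser delivers it.

```python
import xml.sax

# Sample markup assumed for illustration.
XML_TEXT = "<book><title>Joy of Integration</title><author>Joe Smith</author></book>"


class EventRecorder(xml.sax.ContentHandler):
    """Hypothetical handler that records each event in arrival order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called when the parser encounters an opening tag.
        self.events.append(("start", name))

    def characters(self, content):
        # Called with character data; a parser may deliver it in several chunks.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        # Called when the parser encounters a closing tag.
        self.events.append(("end", name))


recorder = EventRecorder()
xml.sax.parseString(XML_TEXT.encode("utf-8"), recorder)

for event in recorder.events:
    print(event)
```

No tree is built here: once an event has been handled, the parser moves on, so memory use stays roughly constant regardless of document size. The trade-off is that the application must keep track of any context it needs.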
Data binding APIs are a departure from the structure-oriented nature of DOM and SAX. They allow for a data-centric programming approach, where business classes are provided as the interface into XML document data. Many variations of data binding APIs exist, each with a unique feature-set.
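To give a feel for the data-centric approach, here is a minimal hand-written sketch in Python. Real data binding products usually generate such classes from a schema; the `Book` class, its `from_xml` and `to_xml` methods, and the sample markup are all assumptions made for this example rather than part of any particular API.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Book:
    """Hypothetical business class acting as the interface to the XML data."""

    title: str
    author: str

    @classmethod
    def from_xml(cls, xml_text: str) -> "Book":
        # Map child elements onto fields; missing elements become empty strings.
        root = ET.fromstring(xml_text)
        return cls(
            title=root.findtext("title", default=""),
            author=root.findtext("author", default=""),
        )

    def to_xml(self) -> str:
        # Map fields back onto child elements of a book element.
        root = ET.Element("book")
        ET.SubElement(root, "title").text = self.title
        ET.SubElement(root, "author").text = self.author
        return ET.tostring(root, encoding="unicode")


book = Book.from_xml(
    "<book><title>Joy of Integration</title><author>Joe Smith</author></book>"
)
print(book.title, "by", book.author)
print(book.to_xml())
```

The application code works with `book.title` and `book.author` and never touches nodes or events directly.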
Let’s take a brief look inside a simple XML document. The first line of markup you will encounter is the XML declaration. It establishes the version of the XML specification being used:
<?xml version="1.0"?>
The part of a document within which data is represented is considered the document instance. It consists of a series of elements that tag data values with meta information.
An XML document instance orders its information into a hierarchical structure, defined by parent-child relationships between elements. Typically, the parent element establishes a context that is inherited by the child element.
In the example below, for instance, the book element has two child elements, title and author:
<book>
  <title>Joy of Integration</title>
  <author>Joe Smith</author>
</book>
Individual elements can also have properties, known as attributes. Whereas a parent element can have multiple layers of nested child elements, an attribute cannot be nested: each attribute name can appear only once on an element, and it holds a single text value.
In our example, we’ve added the category attribute to the book element:
<book category="Fiction">
...
</book>
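The difference between attributes and child elements can be seen from the parser's side. The sketch below uses Python's standard `xml.etree.ElementTree` module; the sample markup (including the `title` child and the second `Drama` value) and the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Sample markup assumed for illustration.
root = ET.fromstring(
    '<book category="Fiction"><title>Joy of Integration</title></book>'
)

# Attributes are exposed as a flat name-to-value mapping on the element.
category = root.get("category")
attributes = dict(root.attrib)
print(category, attributes)

# Repeating an attribute name on the same element is not well-formed XML,
# so the parser rejects the document.
try:
    ET.fromstring('<book category="Fiction" category="Drama"/>')
    duplicate_allowed = True
except ET.ParseError:
    duplicate_allowed = False
print(duplicate_allowed)
```

An element may carry several attributes as long as their names differ, but none of them can contain further structure in the way a child element can.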
To associate a document with a DTD, a separate declaration statement is typically required. Here we link the DTD to our XML document.
<!DOCTYPE book SYSTEM "book.dtd">
Finally, here’s a look at the entire document we just built.
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book category="Fiction">
  <title>Joy of Integration</title>
  <author>Joe Smith</author>
</book>
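As a quick check that the pieces fit together, the complete document can be handed to a parser. The sketch below assumes Python's standard `xml.dom.minidom` module, which is non-validating: it records the document type declaration but does not read `book.dtd` or check the document against it. The variable names are chosen for this example only.

```python
from xml.dom import minidom

# The complete document, with straight quotes as XML requires.
XML_TEXT = """<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book category="Fiction">
  <title>Joy of Integration</title>
  <author>Joe Smith</author>
</book>"""

document = minidom.parseString(XML_TEXT)

# The document type declaration is exposed as a node of its own.
doctype = document.doctype
print(doctype.name, doctype.systemId)

# The document instance starts at the root element.
root = document.documentElement
print(root.tagName, root.getAttribute("category"))
```

Checking the document against the rules in the DTD would require a validating parser, which is outside the scope of this sketch.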
The syntactical conventions introduced here form the basis for all specifications that exist as specialized implementations (or applications) of XML.