What is XML-Schema Toolkit?
XML Schema Toolkit is a set of applications which implement various XML vocabularies. The most
important application is SchemaCoder.exe, which implements the schema for the W3C XML Schema
standard. This means that a schema (which is written in XML) can be parsed and a set of code
files produced. These files can be compiled into a library that understands the structure
created by the schema. So a schema that defines an email element can be parsed by
SchemaCoder, and code files will be produced that can recognise the email element in
an XML document. A programmer can then extend the generated code so that, for example, the
email element starts a new email message when it is parsed.
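As a rough illustration of that idea, the sketch below shows one way a generated class and a programmer's extension of it might fit together. It is not SchemaCoder's actual output: the names EmailElement, MailingEmailElement, setContent and onParsed are all hypothetical, and the "start an email" step is reduced to setting a flag.

```cpp
#include <string>

// Hypothetical shape of a class generated for an <email> element.
// None of these names come from the toolkit itself.
class EmailElement {
public:
    virtual ~EmailElement() {}

    // A parser would hand the element its character data.
    void setContent(const std::string& text) { address_ = text; }
    const std::string& address() const { return address_; }

    // Hook called once the element has been read completely.
    // The generated default does nothing.
    virtual void onParsed() {}

private:
    std::string address_;
};

// The programmer's refinement: react when the element is parsed.
class MailingEmailElement : public EmailElement {
public:
    MailingEmailElement() : started_(false) {}

    // Placeholder for launching a real mail client: here we only
    // record that a message would have been started.
    void onParsed() override { started_ = !address().empty(); }

    bool started() const { return started_; }

private:
    bool started_;
};
```

The point of the sketch is only that the behaviour lives in the element's own class, so no separate pass over a generic tree is needed to trigger it.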
What's wrong with DOM?
Nothing, if that is what you require. The DOM (which stands for Document Object Model) creates
element, attribute, character data and other types of object whenever an XML document is parsed,
and these can be used to work out the information in the document and act upon it. The
problem in an object-oriented language is that unless the XML types are commands, a programmer
generally builds up another object model from the data extracted from the DOM. For example,
suppose you have the following XML:
<street>10 Park Lane</street>
After this is parsed into a DOM structure, the programmer has to extract the information,
usually into an 'address' object, which will have data members street, city and country. The
toolkit aims to avoid this unnecessary task by parsing straight into the appropriate objects.
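The extra step described above can be sketched as follows. DomNode is a deliberately minimal, hypothetical stand-in for a generic DOM node (it is not the API of any real parser), and Address is the application object the programmer actually wants; addressFromDom is the hand-written copying code that a DOM-based approach typically requires.

```cpp
#include <string>
#include <vector>

// Hypothetical, minimal stand-in for a generic DOM node.
struct DomNode {
    std::string name;               // element name, e.g. "street"
    std::string text;               // character data of the element
    std::vector<DomNode> children;  // child elements
};

// The application-level object the programmer really wants.
struct Address {
    std::string street;
    std::string city;
    std::string country;
};

// The manual second pass: walk the generic nodes and copy each
// recognised child into the matching member of Address.
Address addressFromDom(const DomNode& addressNode) {
    Address a;
    for (const DomNode& child : addressNode.children) {
        if (child.name == "street") {
            a.street = child.text;
        } else if (child.name == "city") {
            a.city = child.text;
        } else if (child.name == "country") {
            a.country = child.text;
        }
    }
    return a;
}
```

Every new vocabulary needs another function of this kind, which is the repetitive work that generating classes directly from the schema is meant to remove.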
I've downloaded the toolkit. How do I use it?
The best guide to using XML Schema Toolkit is the set of
introductory pages. Briefly, SchemaCoder is the application
that converts schemas into code. You also need Microsoft's Visual C++ 6 development
environment. Open a schema in the SchemaCoder application, make a few choices from
dialog boxes and you have created the code for a compilable DLL that will read in XML
using that schema. Then create an application that will load the parser (a COM DLL called
XMLParser.dll which has similar functionality to the Microsoft XML parser version 3),
and straight away you have a schema-validating DOM-like environment. The advantage is that
you have access to the code for each different element and attribute, not just access to
the structure at run-time. See the
description of the toolkit for more information.
Why C++ and not (Java, C#, ...)?
Java is a useful, well supported language, with many XML tools. It is also cross-platform, so
that Java class files will operate (in theory) on Unix, Windows and Mac operating systems.
However, XML Schema Toolkit was not created in Java. The reason is that Java is not yet used for
commercial client-side software, and the toolkit is intended to encompass XML display languages
such as MathML, SVG and XHTML. Java limits a programmer, preventing the full power of the
operating system from being used, and it is difficult to move from Java to other languages.
Another recent language that looks promising is C#, Microsoft's alternative to Java. However,
at the time the toolkit was begun, C# had barely been announced. Currently such issues as speed
and tool support have not been settled. C# is promoted as being able to handle COM components,
so it should be possible to use C# code with the schema toolkit.
Why not use WSDL/SOAP toolkits to turn code into schemas?
The issue of whether to turn existing code structures into XML or XML structures into new code
is complex. WSDL/SOAP toolkits generally employ the former. XML-Schema Toolkit does the opposite.
The reason for this is that XML schemas can naturally integrate with each other, allowing a
complex XML vocabulary to be built from many schemas (or namespaces). Starting from the schema
also allows powerful XML functional standards (such as XPath, XLink and XML-Signature) to be
employed without further coding (once these have been implemented).
How does the Toolkit relate to/use COM?
COM, Microsoft's Component Object Model, is a binary description of how code can create an
interface into its functionality from a separate library or process. It also consists of a
library of functions to aid this communication. In the toolkit, each non-trivial element type is
essentially a COM object, with functionality that allows both early and late binding. In COM
terms this means that each type has a dual interface (one derived from IDispatch). The
late-binding functionality allows the parser to connect each parent element with its element and
attribute children. The difference from normal COM objects is that construction is not via class
factories but via a common namespace factory for each namespace (that is, for each schema, since
a schema defines a namespace). The COM interface also allows each child element to be retrieved
from its parent,
a model similar to the DOM. The interface can be extended by the programmer to take into
account the specific characteristics of the element it represents. For example a 'Signature'
element from the XML-Signature standard may have extended interface methods to sign or verify
the data it holds.
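The namespace-factory idea can be sketched in plain C++, leaving COM out entirely. This is an assumption-laden illustration, not the toolkit's code: Element, SignatureElement, NamespaceFactory, registerElement and makeSignature are hypothetical names, and verify() is a placeholder for the kind of extended method mentioned above.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical base class standing in for a generated element type.
class Element {
public:
    virtual ~Element() {}
    virtual std::string name() const = 0;
};

// An element type with an extra, vocabulary-specific method.
class SignatureElement : public Element {
public:
    std::string name() const override { return "Signature"; }
    // Placeholder only: a real implementation would check the signature.
    bool verify() const { return false; }
};

// One factory per namespace: maps local element names to creator
// functions, so a parser can build the right object for each tag.
class NamespaceFactory {
public:
    typedef std::unique_ptr<Element> (*Creator)();

    explicit NamespaceFactory(const std::string& uri) : uri_(uri) {}

    void registerElement(const std::string& localName, Creator creator) {
        creators_[localName] = creator;
    }

    // Returns a null pointer when the name is not part of this namespace.
    std::unique_ptr<Element> create(const std::string& localName) const {
        std::map<std::string, Creator>::const_iterator it = creators_.find(localName);
        if (it == creators_.end()) {
            return std::unique_ptr<Element>();
        }
        return it->second();
    }

    const std::string& uri() const { return uri_; }

private:
    std::string uri_;
    std::map<std::string, Creator> creators_;
};

// Creator function for the Signature element.
std::unique_ptr<Element> makeSignature() {
    return std::unique_ptr<Element>(new SignatureElement());
}
```

A parser built along these lines would select a factory by namespace URI, for example http://www.w3.org/2000/09/xmldsig# for XML-Signature, and then ask it to create the object for each element name it meets.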
What restrictions are there to using the Toolkit?
For the initial releases, the Schema Toolkit (comprising the SchemaCoder application and
its libraries) is free to use, but not free to redistribute. The Toolkit is not
Open Source or Free software (as defined by the GPL or a similar license) for two reasons. First,
the Toolkit is intended to become a commercial product, and the free software route would prevent
this. Secondly, such a license would prevent commercial software developers from using the
Toolkit to create their own applications. See the
future intentions section for more details.