Moving to XML

Author and presenter: Simon Brooke.

The full text of this presentation is online at <URL: http://www.jasmine.org.uk/~simon/bookshelf/courses/xml/>

Written May 2001; last revised 30th June 2004

Note: the original version of this text had a horrible error of understanding in the use of the org.w3c.dom package. My apologies to all those misled by it. This remains, essentially, a three-year-old course; I have fixed the major error but have not otherwise brought it up to date. Bits of it are still worth reading.

Changes to the presentation since your handouts were printed are highlighted like this.

Simon Brooke, 21 Main Street, Auchencairn, DG7 1QU, Scotland.

What we're going to do today


How we're going to get there

Breaks, meals, fire exits, nearest WCs, how to get to coffee?


Before we start: what do you know

We've got a lot of ground to cover... Just so I can know which bits to concentrate on and which bits to skip, what do you know about...

            nothing    a little    can do it    expert
HTML
Java
XML

Before we start: Namespace

[get participants to write their names on bits of paper. Make sure there are not two in the class with the same name. If there are, get them to add something to their names to disambiguate]


A brief chautauqua on language

Words

We can recognise words as belonging to a language because we know them... (sometimes, we can recognise words as belonging to a language even when we don't know them, because they sound right).


Sentences


Language


Words

Although language families have rules about what can be in a word and what can't, it's much harder to tell whether a word is valid or not, unless we know which language we're looking at.


Sentences


Meta-Language [1]


Meta-Language [2]


HTML [1]


HTML [2]


Well Formedness


What is XML


Key Features

A universal, application-independent framework for the communication of semantically rich structured information between software agents.


Differences from HTML


Extensible: what does this mean for you?


Extensible: a simple example [1]

<?xml version="1.0"?>
<!DOCTYPE meeting PUBLIC "-//WEFT//DTD MEETING 0.1//EN" 
        "meeting.dtd">
<meeting id="June Board Meeting">
  <venue>
    28 Forth Street, Edinburgh
  </venue>
  <invitees>
    <attendee attendance="required" 
        meeting-role="convenor">
      <name>
    Simon Brooke
      </name>
      <position>
    Technical Director
      </position>
    </attendee>
    <attendee attendance="required">
      <name>Angela Stormont</name>
      <position>
    Communications Director
      </position>
    </attendee> 
  </invitees>
</meeting>
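
To give a feel for what software can do with a document like this, here is a minimal sketch in Java that reads a cut-down copy of the meeting document and lists the attendees. It assumes only the JAXP parser which ships with the JDK; the class name MeetingReader and its methods are made up for illustration, and the DOCTYPE is left out so that the parser does not need to find meeting.dtd.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** hypothetical illustration: not part of the course software */
class MeetingReader {
    // a cut-down copy of the meeting document, without the DOCTYPE
    static final String MEETING =
        "<meeting id='June Board Meeting'>"
        + "<venue>28 Forth Street, Edinburgh</venue>"
        + "<invitees>"
        + "<attendee attendance='required' meeting-role='convenor'>"
        + "<name>Simon Brooke</name>"
        + "<position>Technical Director</position>"
        + "</attendee>"
        + "<attendee attendance='required'>"
        + "<name>Angela Stormont</name>"
        + "<position>Communications Director</position>"
        + "</attendee>"
        + "</invitees></meeting>";

    /** return "name (position)" for each attendee element in the document */
    static List<String> attendees(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc =
                builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("attendee");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element attendee = (Element) nodes.item(i);
                result.add(childText(attendee, "name") + " ("
                    + childText(attendee, "position") + ")");
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse meeting", e);
        }
    }

    /** the trimmed text of the first descendant of parent with this tag */
    static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0)
            .getTextContent().trim();
    }

    public static void main(String[] args) {
        attendees(MEETING).forEach(System.out::println);
    }
}
```

Because the tags say what each piece of text is, the program can pick out names and positions without guessing from layout.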
   

Extensible: a simple example [2]


Strictly parsed: what does this mean for you? [1]


Strictly parsed: what does this mean for you? [2]


Differences from SGML


A bit about the other bits


A bit about the other bits [ii]: XLink


A bit about the other bits [iii]: XPath and XPointer


A bit about the other bits [iv]: XSL-T


Digression: Visual Appearance and Stylesheets [i]

If people are interested, you can open the stylesheet for this presentation, slideshow.css, in a text editor.


Visual Appearance and Stylesheets [ii]


Visual Appearance and Stylesheets [iii]: Status of XSL


Visual Appearance and Stylesheets [iv]: XSL Summary


Visual Appearance and Stylesheets [v]: What about CSS?


A bit about the other bits [v]: XML Schemas


Digression: Dialects of XML


What is a Document Type Definition?


What about Schemas? (Schemata?)


More about Schemas [i]: benefits


More about Schemas [ii]: examples

The pattern specification seems to have changed at some stage in the drafting process. The examples given in Learning XML don't work with Daniel Potter's tutorial applet. Treat all tutorials with care and refer back to the formal specification!


More about Schemas [iii]: conversion


Do I have to use a DTD or Schema?


What DTDs and Schemas are available?


Who will write DTDs and Schemas? [i]


Who will write DTDs and Schemas? [ii]


Ownership of DTDs: Communities vs single vendors


Another cautionary tale about software vendors


A bit about the other bits [vi]: SOAP


More about SOAP


XML in your context


Applications which benefit greatly from XML

At present, only where the audience is controlled


Applications which will benefit little from XML


XML in action: content syndication


What is content syndication


History of Syndication


Standards for Syndication


Offering Syndication


Incorporating Syndication [i]


Incorporating Syndication [ii]: Sample code

<!-- sidebar sections: show title and top eight entries -->
  <xsl:template match="rss">
    <h2>
      <xsl:apply-templates select="channel/title" />
    </h2>
    <xsl:for-each select="channel/item">
      <xsl:if test="9 > position()">
    <p>
      <a>
        <xsl:attribute name="href"><xsl:value-of 
          select="link"/> 
        </xsl:attribute>
        <xsl:apply-templates select="title" />
      </a>
    </p>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
Sample XSL code.
Moreover Internet Europe headlines, processed with this XSL, 22nd May 2001.
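
If you want to try this template outside a web server, the following is a rough sketch using the javax.xml.transform API from the JDK. The class name RssSidebar, the made-up feed and the example.org links are all assumptions for illustration; the template itself is the one above, wrapped in a stylesheet element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** hypothetical illustration: applies the sidebar template to a feed */
class RssSidebar {
    // the template from the slide, wrapped in a stylesheet element
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='rss'>"
        + "<h2><xsl:apply-templates select='channel/title'/></h2>"
        + "<xsl:for-each select='channel/item'>"
        + "<xsl:if test='9 > position()'>"
        + "<p><a>"
        + "<xsl:attribute name='href'>"
        + "<xsl:value-of select='link'/></xsl:attribute>"
        + "<xsl:apply-templates select='title'/>"
        + "</a></p>"
        + "</xsl:if>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** build a made-up RSS feed with the given number of items */
    static String feed(int items) {
        StringBuilder sb = new StringBuilder(
            "<rss version='0.91'><channel><title>Example feed</title>");
        for (int i = 1; i <= items; i++) {
            sb.append("<item><title>Story ").append(i).append("</title>")
              .append("<link>http://example.org/item").append(i)
              .append("</link></item>");
        }
        return sb.append("</channel></rss>").toString();
    }

    /** run the stylesheet over the feed and return the markup produced */
    static String render(String rss) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(
                    new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(rss)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // ten items go in; only the first eight should come out
        System.out.println(render(feed(10)));
    }
}
```

The test 9 > position() is what limits the sidebar to eight entries, however many items the feed carries.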

Aggregation


Worked Example: a meeting arranger system


Creating an example document (quite easy)

This is a good opportunity for a whiteboard and some interaction! If possible, get the participants to do an example for themselves.


Creating the DTD and/or Schema (hard, but we'll use a trick)

Again, if possible, get the participants to actually do this.


Viewing it: creating a style-sheet (harder)


Using it: applications

Now we need to write applications which will:

Specifying


The Structure of an XML document


Overall Structure


Processing Instructions


XML Namespaces

xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
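
A small sketch of what the namespace declaration means to a program: the prefix is only a local nickname, and what identifies the element is the namespace URI plus the local name. This assumes the JAXP parser in the JDK (which is not namespace-aware unless you ask); the class NamespaceDemo is invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** hypothetical illustration: prefixes versus namespace URIs */
class NamespaceDemo {
    static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    /** return the root element's name in the form {namespace-uri}local */
    static String describeRoot(String xml) {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
            // JAXP parsers ignore namespaces unless told otherwise
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse", e);
        }
    }

    public static void main(String[] args) {
        // two different prefixes bound to the same namespace URI
        String a = "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'/>";
        String b = "<t:stylesheet version='1.0' xmlns:t='" + XSL_NS + "'/>";
        System.out.println(describeRoot(a));
        System.out.println(describeRoot(b));
    }
}
```

Both documents describe the same element as far as a namespace-aware processor is concerned, even though the prefixes differ.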


Elements [i]

Syntactically, an element is what is delimited by its tags.


Elements [ii]


Attributes


When to use which

    <meeting id="June Board Meeting">
      <agenda>
        <item proposer="Simon Brooke">
          <title>
            Adoption of new project management
            procedures manual
          </title>
        </item>
        <item proposer="Angela Stormont">
          <title>
            Transfer of shares
          </title>
        </item>
      </agenda>
    </meeting>
   

Exercise period [i]


Creating


Building XML applications: tools and technologies


Why Java?


Other languages for building XML applications


Tools, components and toolkits


Transformation engines

Apply XSL stylesheets to transform a document from one representation to another.


What we will be using today

Apache Xalan
XSL processor contributed to the Apache Foundation by IBM; closely related to IBM's LotusXSL processor
Apache Xerces
XML parser contributed to the Apache Foundation by IBM; based on IBM's XML4J parser
SAX
Simple API for XML, by David Megginson and others
DOM
The W3C Document Object Model API
W3C Jigsaw
HTTP Server and Servlet Server developed by W3C
Jacquard
A toolkit of useful bits for sticking it all together. By me. Not necessarily the best, but it's what I know and use.

Constructing the document


First an apology

Previous versions of this course contained a howling error at this point. It suggested creating DOM objects essentially by calling the newInstance method of the implementing classes. This only works with the particular DOM implementation you happen to be using and is not portable between DOM implementations (or even, necessarily, between successive versions of the same DOM implementation). So clearly it is very bad practice to do this.

I can only apologise to people who were misled by this.


The Document Object Model


The DOM: what is a Document?


The DOM: what is an Element?


The DOM: what is a Text?


Create a document object

    // get a handle on a DOM implementation...
    DOMImplementation di = DOMStub.getDOMImplementation( context);
    // and use it to create a document object
    Document doc = di.createDocument( getNamespaceURI( context), 
						  rootName, doctype);
      

DOMStub is a Jacquard utility class which gets hold of whatever DOM implementation is available. If you don't use Jacquard you'll have to instantiate a DOM implementation for yourself.
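
If you are not using Jacquard, one portable route is to ask JAXP for whatever DOM implementation is installed, rather than naming an implementing class. This is only a sketch, on the assumption that a JAXP parser is on the classpath (one ships with the JDK); the class name PortableDom is mine and is not part of any library.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;

/** hypothetical illustration: obtaining a DOM without naming a vendor */
class PortableDom {
    /** create an empty document whose root element has the given name */
    static Document newDocument(String rootName) {
        try {
            // JAXP finds whichever parser is installed; no
            // implementation class is mentioned anywhere in this code
            DOMImplementation di = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .getDOMImplementation();
            // no namespace and no doctype; the root element is
            // created as part of creating the document
            return di.createDocument(null, rootName, null);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no usable DOM available", e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument("eventsdiary");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Everything after this point is written against the org.w3c.dom interfaces only, so swapping the underlying parser should not require code changes.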


Add a root ('content') element

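    // createDocument() has already given the document its root element
    // (named by rootName, here "eventsdiary"), so we just fetch it;
    // appending a second root element would throw a DOMException
    Element content = doc.getDocumentElement();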
    

Add further elements recursively as required

            // match the pattern against the convenience view and pull
            // back the rows that match as namespaces
            Contexts events =
                TableDescriptor.getDescriptor( VIEW, null, 
                                               context ).match( pattern );

            Enumeration e = events.elements(  );

            // and pass each of those namespaces in turn to my event element 
            // generator to generate children for my element
            while ( e.hasMoreElements(  ) )
                content.appendChild( eventEltGenerator.generate( doc,
                        (Context) e.nextElement(  ) ) );
      

This is a bit of a cheat. It depends on having a view in the database which collects together all the necessary fields for us:

---- EVENTS_VIEW -----------------------------------------------------

CREATE VIEW events_view AS
     SELECT EVENT.Actor,
            EVENT.Event,
            CATEGORY.Description AS Type,
            LOCATION.Description AS Location,
            EVENT.Eventdate,
            EVENT.Starttime,
            EVENT.Endtime,
            EVENT.Description
       FROM EVENT,
            CATEGORY,
            LOCATION
      WHERE EVENT.Location = LOCATION.Location
        AND EVENT.Category = CATEGORY.Category
   ORDER BY Eventdate,Starttime
;

Let's see that again [i] the source

public class DayView extends DocumentGeneratorImpl
{
    //~ Static fields/initializers --------------------------------------------

    /**
     * the name of the convenience view in the database from which I will
     * collect all the information I need
     */
    protected static final String VIEW = "events_view";

    /** the field in that view which represents the date of the event */
    protected static final String EVENTDATEFIELD = "when";

    //~ Instance fields -------------------------------------------------------

    /** a generator to generate the event elements which will be my children */
    protected EventElementGenerator eventEltGenerator =
        new EventElementGenerator(  );

    //~ Methods ---------------------------------------------------------------

    /**
     * generate a document containing all the events on the day implied by
     * this context
     */
    public Document generate( Context context ) throws GenerationException
    {
        DOMImplementation di = DOMStub.getDOMImplementation( context );
        Document doc = di.createDocument( "", "eventsdiary", null );

        String day = context.getValueAsString( "day" );
        uk.co.weft.dbutil.Calendar when = new uk.co.weft.dbutil.Calendar(  );

        if ( day != null )
        {
            // if we've got a date, set my calendar to that day
            // (by default it sets itself to today)
            when.setTime( java.sql.Date.valueOf( day ) );
        }

        Element content = doc.getDocumentElement(  );

        content.setAttribute( "date", when.toString(  ) );

        try
        {
            // create a new, blank, context as a pattern to match
            Context pattern = new Context(  );

            // give it the database username, password and url from the current context
            pattern.copyDBTokens( context );

            // put the date we're interested in into the pattern
            pattern.put( EVENTDATEFIELD, when );

            // match the pattern against the convenience view and pull
            // back the rows that match as namespaces
            Contexts events =
                TableDescriptor.getDescriptor( VIEW, null, context ).match( pattern );

            Enumeration e = events.elements(  );

            // and pass each of those namespaces in turn to my event element 
            // generator to generate children for my element
            while ( e.hasMoreElements(  ) )
                content.appendChild( eventEltGenerator.generate( doc,
                        (Context) e.nextElement(  ) ) );
        }
        catch ( DataStoreException dex )
        {
            throw new GenerationException( "Failed to read from data store: " +
                dex.getMessage(  ) );
        }

        return doc;
    }


Let's see that again [ii]: the event element generator

The event element is a simple wrapper round a context element generator:

    //~ Inner Classes ---------------------------------------------------------

    /**
     * a generator for an XML element representing a single event. This uses
     * ContextElementGenerator which knows how to construct a DOM element
     * node by taking values out of a context, so all we need to do is tell
     * it which value names to treat as attributes and which as children
     */
    class EventElementGenerator extends ContextElementGenerator
    {
        //~ Constructors ------------------------------------------------------

        /**
         * the tag of the element I generate is 'event'
         */
        public EventElementGenerator(  )
        {
            super( "event" );
        }

        //~ Methods -----------------------------------------------------------

        /**
         * return a String array of the names of my properties to output as
         * attributes
         */
        protected String[] getAttrNames(  )
        {
            String[] attrNames =
            { "event", "type", "location", "starttime", "endtime", "actor" };

            return attrNames;
        }

        /**
         * return a String array of the names of my properties to output as
         * children
         */
        protected String[] getChildNames(  )
        {
            String[] childNames = { "description" };

            return childNames;
        }
    }
}
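
For those without Jacquard to hand, here is a rough sketch of the idea behind a context element generator, using a plain java.util.Map in place of a Jacquard Context. The class MapElementGenerator is a stand-in I have made up for this purpose; it is not the real ContextElementGenerator and makes no claim to match its API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** hypothetical stand-in for a context element generator */
class MapElementGenerator {
    private final String tag;
    private final String[] attrNames;
    private final String[] childNames;

    MapElementGenerator(String tag, String[] attrNames, String[] childNames) {
        this.tag = tag;
        this.attrNames = attrNames;
        this.childNames = childNames;
    }

    /** build an element from the values; names with no value are skipped */
    Element generate(Document doc, Map<String, String> values) {
        Element elt = doc.createElement(tag);
        for (String name : attrNames) {
            String value = values.get(name);
            if (value != null) {
                elt.setAttribute(name, value);
            }
        }
        for (String name : childNames) {
            String value = values.get(name);
            if (value != null) {
                Element child = doc.createElement(name);
                child.appendChild(doc.createTextNode(value));
                elt.appendChild(child);
            }
        }
        return elt;
    }

    /** build one sample event and attach it to an eventsdiary document */
    static Element sampleEvent() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation()
                .createDocument(null, "eventsdiary", null);
            Map<String, String> row = new LinkedHashMap<>();
            row.put("actor", "simon");
            row.put("location", "Yokohama, Japan");
            row.put("description", "Lecture, Java XML, all day");
            MapElementGenerator generator = new MapElementGenerator("event",
                new String[] { "event", "type", "location",
                               "starttime", "endtime", "actor" },
                new String[] { "description" });
            Element event = generator.generate(doc, row);
            doc.getDocumentElement().appendChild(event);
            return event;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no usable DOM available", e);
        }
    }

    public static void main(String[] args) {
        Element event = sampleEvent();
        System.out.println(event.getAttribute("actor") + ": "
            + event.getFirstChild().getTextContent());
    }
}
```

The only decisions the caller makes are the tag name and which value names become attributes and which become child elements, which is the same division of labour as in the event element generator above.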


Let's see that again: [iii] the context element generator


Let's see that again: [iv] the output

<?xml version="1.0"?>
 <eventsdiary
  date="Jul 18, 2000">
  <event
   actor="simon"
   endtime="5:30:00 PM"
   event="19"
   location="Yokohama, Japan"
   starttime="9:00:00 AM"
   type="Otherwise unavailable">
   <description>
    Lecture, Java XML, all day
   </description>
  </event>
 </eventsdiary>

Should be online here (login required). HTML formatted view here


Exercise period [ii]

We may skip this one if time's short or the group is struggling!


Transforming


Beginning XSL-T [i] The 'stylesheet'

    
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Basic XSL stylesheet for day view of events diary.  -->

  <xsl:output indent="yes" method="html" 
          doctype-public="-//W3C//DTD HTML 4.0 Transitional//EN"/>

  <xsl:template match="eventsdiary">
    <html>
      <head>
    <title>
      Diary for <xsl:value-of select="@date" />
    </title>
    <link rel="StyleSheet" href="/styles/jacquard.css" type="text/css" 
      media="screen"/>
      </head>
      <body>
    <h1>
      Diary for <xsl:value-of select="@date" />
    </h1>
    <table>
      <tr>
        <th rowspan="2">
        Who
        </th>
        <th rowspan="2">
        Where
        </th>
        <th colspan="2">
        When
        </th>
        <th rowspan="2">
        What
        </th>
        <th rowspan="2">
        Details
        </th>
        <th rowspan="2">
        <a href="event">Add</a>
        </th>
      </tr>
      <tr>
        <th>
          Starts
        </th>
        <th>
          Ends
        </th>
      </tr>
      <xsl:apply-templates select="event" />
    </table>
      </body>
   </html>
  </xsl:template>

  <xsl:template match="event">
    <tr>
      <td>
    <xsl:value-of select="@actor"/>
      </td>
      <td>
    <xsl:value-of select="@location"/>
      </td>
      <td>
    <xsl:value-of select="@starttime"/>
      </td>
      <td>
    <xsl:value-of select="@endtime"/>
      </td>
      <td>
    <xsl:value-of select="@type"/>
      </td>
      <td>
    <xsl:value-of select="description"/>
      </td>
      <td>
    <a>
      <xsl:attribute name="href">event?event=<xsl:value-of 
        select="@event"/>
      </xsl:attribute>
      Edit
    </a>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

Beginning XSL-T [ii] The 'stylesheet' tag

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

Beginning XSL-T [iii] comments

<!-- Basic XSL stylesheet for day view of events diary.  -->

Beginning XSL-T [iv] output specifier

  <xsl:output indent="yes" method="html" 
          doctype-public="-//W3C//DTD HTML 4.0 Transitional//EN"/>

Beginning XSL-T [v] declaring a template

  <xsl:template match="eventsdiary">

This template matches every instance of the element eventsdiary which is found in the document being processed. As eventsdiary is the root element of the document type we're interested in, there will only be one.

    <html>
      <head>
        <title>

As you can see, what is in the template is just the HTML markup that will be output (if we were outputting XML, it would be XML, of course)...

      Diary for <xsl:value-of select="@date" />

with scattered among it special xsl tags which cause things to be spliced into the output. This one says 'use the value of the date attribute of the current element'

        </title>
      </head>
      <body>
        <h1>
          Diary for <xsl:value-of select="@date" />
        </h1>
        <table>
          <tr>
            <th rowspan="2">
              Who
            </th>
            <th rowspan="2">
              Where
            </th>
            <th colspan="2">
              When
            </th>
            <th rowspan="2">
              What
            </th>
            <th rowspan="2">
              Details
            </th>
            <th rowspan="2">
              <a href="event">Add</a>
            </th>
          </tr>
          <tr>
            <th>
              Starts
            </th>
            <th>
              Ends
            </th>
          </tr>
          <xsl:apply-templates select="event" />

This is the important one. It says "apply the templates in this stylesheet to all the instances of event elements which are children of the current node".

        </table>
      </body>
   </html>
  </xsl:template>

Beginning XSL-T [vi] other useful bits

<xsl:template match="section[ @slot='main']">

This template will match only section elements which have an attribute named slot whose value is main

  <p>
    <xsl:call-template name="toc"/>

paste in the output of the named template called toc.

  </p>
  <xsl:apply-templates select="section">
    <xsl:sort select="title"/>

Apply templates in this stylesheet to sections which are children of this section, sorted alphabetically by their title sub-element

  </xsl:apply-templates>
</xsl:template>

<xsl:template name="toc">

This is the named template which was called earlier. Most templates are not named: they are applied automatically if their patterns match an element

  <xsl:for-each select="section">

for-each iterates over matching elements in turn

      <xsl:sort select="title"/>
        <a>
              <xsl:attribute name="href">#<xsl:value-of select="title"/>
                </xsl:attribute>

xsl:attribute allows us to construct the value of an attribute of the enclosing tag

          <xsl:value-of select="title"/>
      </a> |
  </xsl:for-each>
</xsl:template>
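
To see a named template, for-each and sort working together, here is a small sketch which runs a cut-down version of the toc template from Java. It assumes the javax.xml.transform API in the JDK; the class TocDemo, the doc root element and the three section titles are invented for the example, and the output method is set to text so that only the titles come out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** hypothetical illustration: call-template, for-each and sort */
class TocDemo {
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        // the pattern-matched template just hands over to the named one
        + "<xsl:template match='/doc'>"
        + "<xsl:call-template name='toc'/>"
        + "</xsl:template>"
        // the named template lists section titles in alphabetical order
        + "<xsl:template name='toc'>"
        + "<xsl:for-each select='section'>"
        + "<xsl:sort select='title'/>"
        + "<xsl:value-of select='title'/>|"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // sections deliberately out of alphabetical order
    static final String DOC =
        "<doc>"
        + "<section><title>Parsing</title></section>"
        + "<section><title>Creating</title></section>"
        + "<section><title>Transforming</title></section>"
        + "</doc>";

    /** apply the stylesheet to the document and return the text output */
    static String toc(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(
                    new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toc(DOC));
    }
}
```

Note that call-template does not change the current node: inside toc, the for-each still selects the section children of the element that was being processed when the template was called.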

XSL-T elements: reprise

xsl:output
allows us to define how we want the output to be formatted
xsl:template
defines what should be output for elements matching a given pattern
xsl:apply-templates
applies the templates to the elements which match its pattern
xsl:call-template
calls a template with a particular name, overriding the pattern-matching system
xsl:for-each
produces output iteratively, overriding the pattern-matching system
xsl:sort
orders the result of its enclosing element (an xsl:apply-templates or an xsl:for-each)
xsl:value-of
produces the value of the thing matched by its pattern
xsl:attribute
outputs an attribute for the output element which encloses it

There are a few more XSL elements, but these will do most things for you.


Beginning XSL-T [vii]: Patterns

*
matches any element
foo
matches any element whose type is foo
foo | bar
matches any element whose type is foo or bar
foo/bar
matches any bar element with a foo parent
foo//bar
matches any bar element with a foo ancestor
foo[ @bar='baz']
matches any foo element which has a bar attribute which has the value baz
foo[1]
matches any foo element which is the first foo child of its parent
foo[ position() = 1]
the same as foo[1]: matches any foo element which is the first foo child of its parent
*[ position() < 5]
matches any element which is the first, second, third or fourth element child of its parent (inside a stylesheet attribute, the < must be written as &lt;)
text()
matches any text node.

This is just the basics. The full definition is in the W3C XSLT specification.
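
A quick way to experiment with patterns is the javax.xml.xpath API in the JDK. XSLT match patterns are a restricted form of XPath, so in this sketch I put // in front of each pattern to select the nodes that the pattern would match. The class PatternDemo and the little test document are made up for the purpose.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** hypothetical illustration: trying out patterns as XPath selections */
class PatternDemo {
    // every foo carries an id so we can see which ones were selected
    static final String DOC =
        "<root><bar/><foo id='a'/><foo id='b'/>"
        + "<group><foo id='c'/><bar/></group></root>";

    /** return the ids of the elements selected, separated by commas */
    static String ids(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression,
                          new InputSource(new StringReader(DOC)),
                          XPathConstants.NODESET);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(((Element) nodes.item(i)).getAttribute("id"));
            }
            return sb.toString();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("bad expression", e);
        }
    }

    public static void main(String[] args) {
        // first foo child of each parent
        System.out.println(ids("//foo[1]"));
        // foo elements which are the very first child of their parent
        System.out.println(ids("//*[1][self::foo]"));
        // foo elements with a particular attribute value
        System.out.println(ids("//foo[@id='b']"));
        // foo elements whose parent is a group
        System.out.println(ids("//group/foo"));
    }
}
```

The first two expressions are worth comparing: foo[1] counts only foo siblings, whereas *[1][self::foo] asks whether the very first element child happens to be a foo.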


XSL-T: A deceptively simple language


Exercise period [iii]


Communicating


Just a bit about transport


Parsers

Parsing is quite compute-intensive - don't do it if you don't have to!


More about parsers [i] types


More about parsers [ii] types


Parsing from XML into the database


Identifying the data to store


Other things to bear in mind


Parsing: very simple worked example

Sample XML document

<?xml version="1.0"?>
<workshop tutor="Simon Brooke" 
  title="Parsing XML" venue="small">
  <attendee name="Jon Smith" age="37" 
    sex="M" country="UK" />
  <attendee name="Jane Doe" age="42" 
      sex="F" country="US" />
</workshop>

Those who were here yesterday will probably recognise this from the 'WORKSHOP' database. I'm using it because I can't predict what your 'MEETING' databases will look like.
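
Before the DOM-based class below, it may help to see how little is needed when all you want is to pull a few values out as the document streams past. This is a sketch using the SAX parser that comes with the JDK; the class AttendeeLister is my own invention for illustration and touches no database.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** hypothetical illustration: collecting attendee names with SAX */
class AttendeeLister extends DefaultHandler {
    // the sample workshop document from the slide above
    static final String WORKSHOP =
        "<workshop tutor='Simon Brooke' title='Parsing XML' venue='small'>"
        + "<attendee name='Jon Smith' age='37' sex='M' country='UK'/>"
        + "<attendee name='Jane Doe' age='42' sex='F' country='US'/>"
        + "</workshop>";

    final List<String> names = new ArrayList<>();

    /** called by the parser each time it meets a start tag */
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        if ("attendee".equals(qName)) {
            names.add(attributes.getValue("name"));
        }
    }

    /** parse the document and return the attendee names in order */
    static List<String> list(String xml) {
        try {
            AttendeeLister handler = new AttendeeLister();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse workshop", e);
        }
    }

    public static void main(String[] args) {
        list(WORKSHOP).forEach(System.out::println);
    }
}
```

No tree is built here: the parser simply calls startElement as each tag arrives, which is why this style suits large documents where you only care about a few elements.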


Sample Java class

import java.io.*;       // to read things from the user
import java.sql.*;      // to talk to the database
import uk.co.weft.domutil.*;    // things to convert elements to namespaces
import uk.co.weft.dbutil.*; // things to store namespaces in databases
import org.w3c.dom.*;       // interrogates a DOM tree...
import org.apache.xerces.dom.*; // using Apache's DOM implementation
import org.apache.xalan.xslt.*; // Apache's XSL processor
import org.apache.xerces.parsers.DOMParser;
                // and Apache's XML parser


public class ParseExample
{
    static Context connectionContext = new Context();
                // a context to hold database
                // connection details

    /** walk down a document tree looking for nodes we recognise */
    public static void walk( Node node)
    throws SQLException, DataStoreException
    {
    if ( node.getNodeType() == Node.ELEMENT_NODE)
        {
        Element elt = ( Element) node;

        System.out.println( "Considering element of type " +
                    elt.getTagName());

        if ( elt.getTagName().equals( "workshop"))
            handleWorkshop( elt);
        else
            {
            NodeList children = elt.getChildNodes();

            for ( int i = 0; i < children.getLength(); i++)
                walk( children.item( i));
                // recurse down through the children
            }
        }
    }


    /** handle a workshop element; extract its attribute (and
     *  actually, its text-only child) values, and store them in the
     *  database. Then look for attendees.*/
    protected static void handleWorkshop( Element elt) 
    throws SQLException, DataStoreException
    {
    Object key = null;

    Context c = ( Context)connectionContext.clone(); 
                // construct a new namespace with just
                // the database connection details in
                // it
    ContextElement.populateContext( elt, c);
                // fill it with values from the element

    TableDescriptor workshopDescriptor = 
        TableDescriptor.getDescriptor( "WORKSHOP", "Workshop", c);
                // get a descriptor on the WORKSHOP table

    Contexts rows = workshopDescriptor.match( c);
                // try to match that against what's
                // already in the table

    if ( rows != null && rows.size() > 0)
        {           // there was a match
        key = ( ( Context)rows.get( 0)).getValueAsInteger( "Workshop");
                // get its primary key value
        System.out.println( "Found workshop " + key.toString());
        }
    else
        {
        key = workshopDescriptor.store( c);
                // store it and get its primary key value
        System.out.println( "Created workshop " + key.toString());
        }

    NodeList children = elt.getChildNodes();

    for ( int i = 0; i < children.getLength(); i++)
        {           // look through the children for my attendees
        Node child = children.item( i);

        if ( child.getNodeType() == Node.ELEMENT_NODE &&
             ( ( Element) child).getTagName().equals( "attendee"))
            {
            handleAttendee( ( Element)child, key);
            }
        }
    }

    /** handle an attendee element by finding or storing it in the
     *  database, and fixing up the link table */
    protected static void handleAttendee( Element elt, Object workshopKey)
    throws SQLException, DataStoreException
    {
    Object attendeeKey = null;

    Context c = ( Context)connectionContext.clone(); 
                // construct a new namespace with just
                // the database connection details in
                // it
    ContextElement.populateContext( elt, c);
                // fill it with values from the element

    TableDescriptor attendeeDescriptor = 
        TableDescriptor.getDescriptor( "ATTENDEE", "Attendee", c);
                // get a descriptor on the ATTENDEE table

    Contexts rows = attendeeDescriptor.match( c);
                // try to match that against what's
                // already in the table

    if ( rows != null && rows.size() > 0)
        {           // there was a match
        attendeeKey = 
            ( ( Context)rows.get( 0)).getValueAsInteger( "Attendee");
                // get its primary key value
        System.out.println( "Found attendee " + 
                    attendeeKey.toString());
        }
    else
        {
        attendeeKey = attendeeDescriptor.store( c);
                // store it and get its primary key value
        System.out.println( "Created attendee " + 
                    attendeeKey.toString());
        }

    String q = "insert into ATTENDANCE ( Attendee, Workshop) values ("
        + attendeeKey.toString() + ", " + workshopKey.toString() + ")";

    Connection conn = c.getConnection();
    Statement s = conn.createStatement();
                // set up a database connection

    s.executeUpdate( q);    // run the statement
    System.out.println( "Inserted link into link table");

    s.close();      // close it...
    c.releaseConnection( conn);
                // and release it back into the pool
    }

    /** prompt the user for input; if we get any, return it */
    protected static String maybeGetFromUser( BufferedReader in, String prompt,
                       String val) throws IOException
    {
    System.out.print( prompt + " ] ");

    String s = in.readLine();

    // keep the default unless the user actually typed something
    if ( s != null && s.trim().length() > 0)
        val = s.trim();
    
    return val;
    }

    /** start me up... */
    public static void main(String args[]) 
    {
    BufferedReader in = new 
        BufferedReader( new InputStreamReader( System.in));

                // get from the user the name of the
                // database driver to use
    try
        {
        Class.forName( 
              maybeGetFromUser( in, "Database Driver", 
                    "sun.jdbc.odbc.JdbcOdbcDriver"));

                // get from the user the details
                // needed to connect to the database
        connectionContext.put( "db_url", 
                   maybeGetFromUser( in, "Database URL", 
                         "jdbc:odbc:workshop"));
        connectionContext.put( "db_username", 
                   maybeGetFromUser( in, "Database Username", 
                         "nobody"));
        connectionContext.put( "db_password", 
                   maybeGetFromUser( in, "Database Password", 
                         "doesntmatter"));


        DOMParser p = new DOMParser();
            
        p.parse( maybeGetFromUser( in, "URL of XML to handle", 
                         "file:workshop.xml"));

        walk( p.getDocument().getDocumentElement());

        System.exit( 0); // all satisfactory
        }
    catch ( Exception e)
        {
        System.out.println( "Failed: " + e.getClass().getName() +
                    ": " +e.getMessage());
        System.exit( 1); // whoops
        }
    }
}

Exercise period [iv]


References

XML

news:comp.text.xml
Newsgroup for XML - recommended

FAQs, Directories and Resources

Extensible Markup Language (XML): http://www.oasis-open.org/cover/xml.html
A useful and authoritative overview of the technology; another good place to start.
Frequently Asked Questions about the Extensible Markup Language: http://www.ucc.ie/xml/
The best of the FAQs. Everyone seriously interested in XML should start here.
SCHEMA.NET: The XML Schema Site: http://www.schema.net/
Cafe con Leche XML News, and Resources: http://metalab.unc.edu/xml/index.html
DEVELOPERLIFE.COM brought to you by Nazmul Idris.: http://developerlife.com/
xmlTree - The leading directory of XML content on the Web: http://www.xmltree.com/

News

Welcome to XMLNews.org: http://www.xmlnews.org/
Mulberry Technologies, Inc.: XSL-List -- Open Forum on XSL: http://www.mulberrytech.com/xsl/xsl-list/
XMLephant: News: http://www.xmlephant.com/pages/News/
XML.ORG - A good XML Portal: http://www.xml.org/
XML.com - Another good XML portal: http://www.xml.com/pub

Standards

Authoritative sources of standards documents, mostly from the World Wide Web Consortium (W3C)

Core standards

The Annotated XML Specification: http://www.xml.com/axml/testaxml.htm
The standard, annotated with personal comments by one of its editors -- very revealing!
Extensible Markup Language (XML) 1.0: http://www.w3.org/TR/1998/REC-xml-19980210
XML Linking Language (XLink): http://www.w3.org/TR/WD-xlink#addressing

Resource Description Framework

W3C Resource Description Framework: http://www.w3.org/RDF/
java tutorial help resource only at gamelan.com: http://www.gamelan.com/journal/techfocus/090199_rdf1.html
UKOLN: DC-dot, A Dublin Core Generator: http://www.ukoln.ac.uk/metadata/dcdot/
Dublin Core Metadata Initiative / Documents / Proposed Recommendations / Dublin Core Element Set, Version 1.1: http://purl.org/DC/documents/rec-dces-19990702.htm
Dublin Core Metadata Initiative: http://purl.org/dc/index.htm
UKOLN Metadata Resources - DC: http://www.ukoln.ac.uk/metadata/resources/dc/
Welcome to XMLNews.org: http://www.xmlnews.org/

XSL

XSL Transformations (XSLT) Specification: http://www.w3.org/TR/WD-xslt

DocBook

The nwalsh.com Home Page - XSL DocBook Stylesheets: http://nwalsh.com/docbook/xsl/
XSL DocBook Stylesheets: http://nwalsh.com/docbook/xsl/

WML

WAP WAP Binary XML (WBXML) Encoding Specification: http://www.w3.org/TR/wbxml/
Welcome to WAP School: http://www.refsnesdata.no/wap/default.asp
Nokia WAP Developer Forum: Nokia WAP Toolkit: http://www.forum.nokia.com/wapforum/main/1,6668,1_1_3_2,00.html

RSS: Rich Site Summary

Tutorials

My Netscape Network: http://my.netscape.com/publish/
Using RSS News Feeds - Webreference.com: http://www.webreference.com/perl/tutorial/8/

Feed Directories

Webfeeds: http://www.stirbitch.com/cgi-bin/agg/sources.pl
Moreover... Top stories: http://w.moreover.com/
StartsHere Channel List: http://theweb.startshere.net/channels.phtml
Open Directory - Computers: Internet: WWW: Web Portals: Netscape Netcenter: My Netscape Network: http://dmoz.org/Computers/Internet/WWW/Web_Portals/Netscape_Netcenter/My_Netscape_Network/

Internet Alchemy : RSSMaker: http://internetalchemy.org/rss/index.phtml
xmlTree - The leading directory of XML content on the Web: http://www.xmltree.com/rss/index.htm

XML.COM - Standards List Sorted by Date: http://www.xml.com/xml/pub/standate/
W3C Scalable Vector Graphics (SVG): http://www.w3.org/Graphics/SVG/
VML - the Vector Markup Language: http://www.w3.org/TR/1998/NOTE-VML-19980513
Vector (infinitely zoomable) graphics for the Web, with implications especially for maps and technical diagrams.
News Industry Text Format: http://www.nitf.org/
Meta Content Framework Using XML: http://www.w3.org/TR/NOTE-MCF-XML/
'Content about content' - i.e. information for search and indexing engines and other software agents which must make some sense of the document.
Audio, Video, and Synchronized Multimedia: http://www.w3.org/AudioVideo/
The SMIL standard. I believe SMIL has implications not just for the Web, but for all sorts of presentation media including digital television.
XHTML 1.0: The Extensible HyperText Markup Language: http://www.w3.org/TR/WD-html-in-xml/
Backwards compatibility: implementing HTML in XML. Only very well written HTML is going to work!
XML Catalog proposal: http://www.ccil.org/~cowan/XML/XCatalog.html
XHTML 1.0: The Extensible HyperText Markup Language: http://www.w3.org/TR/xhtml1/
Template Resolution in XML/HTML: http://www-uk.hpl.hp.com/people/ak/doc/trix.html
eXtensible Server Pages (XSP) Layer 1: http://java.apache.org/cocoon/xsp/WD-xsp.html
Workflow Management Coalition: http://www.aiim.org/wfmc/mainframe.htm
DSML.ORG: The Standards Effort to Link Directories with XML: http://www.dsml.org/

Tutorials

Info for Newcomers to XML at XMLINFO: http://www.xmlinfo.com/newcomers/
Producing HTML tables with XSLT: http://www.cogsci.ed.ac.uk/~dmck/xslt-tutorial.html
A Tutorial in XML and XSL Authoring: http://pdbeam.uwaterloo.ca/~rlander/XML_Tutorial/
Java & XML: 1 + 1 > 2: http://www.sun.com.au/sjug/pres/xml/JavaAndXML/seminar.html#Slide3
The WDVL: XML Tutorials: http://www.wdvl.com/Authoring/Languages/XML/Tutorials/
Generally Markup: XML Resources: http://pdbeam.uwaterloo.ca/~rlander/XML_Tutorial/
developerWorks : XML : Education: http://www.software.ibm.com/developer/education/xmlintro/xmlintro.html
SGML/XML: Using Elements and Attributes: http://www.oasis-open.org/cover/elementsAndAttrs.html
Welcome to XML School: http://www.refsnesdata.no/xml/
Practical XML : An introduction to XML and XSL stylesheets: http://www.kst.com/articles/2000/January/practical_xml1/index.php
Crane Softwrights Ltd. - Training: http://www.CraneSoftwrights.com/training/index.htm#ptux-dl
developerWorks : XML : Education: http://www-4.ibm.com/software/developer/education/xmlintro/xmlintro.html
RSS Tutorial: http://my.netscape.com/publish/help/mnn20/quickstart.html#rsssyntax
XML DTD Tutorial: http://www.xml101.com/dtd/

Software resources

Editors

Editing SGML with Emacs and PSGML - Table of Contents: http://rainbow.ldeo.columbia.edu/documentation/programs/psgml/psgml_toc.html#SEC2
A GNU Emacs mode for SGML files: http://www.lysator.liu.se/projects/about_psgml.html
This is what I use and recommend (I personally use XEmacs rather than GNU Emacs)
SoftQuad XMetaL: http://www.softquad.com/index_main.html
Mulberry Technologies -- tdtd Emacs Major Mode for SGML and XML DTDs: http://www.mulberrytech.com/tdtd/
Download Morphon XML Editor 1.0b41: http://www.lunatech.com/products/morphon-xml-editor/download/

Browsers

Jumbo: http://ala.vsms.nottingham.ac.uk/vsms/java/jumbo/
Doczilla: http://www.doczilla.com/download/index.html
XML Viewer : another alphaWorks technology: http://www.alphaworks.ibm.com/tech/xmlviewer
InDelv: http://www.indelv.com/

XML to HTML on the fly

IBM XML Web Site, Education - Accessing XML on the Client: http://www.software.ibm.com/xml/education/client/client.html
Apache Cocoon: http://xml.apache.org/cocoon/
Apache is the world's most widely used Web server. This is the Apache project's server-side XML to HTML conversion strategy, important for serving XML documents while many browsers are still unable to interpret them. It is implemented as a Java Servlet, so it may work with other Servlet-enabled Web servers (but then does anyone serious use anything other than Apache anyway?)

XML Database integration

DB2XML A tool for transforming relational databases into XML documents: http://www.informatik.fh-wiesbaden.de/~turau/DB2XML/index.html
Tamino - The Information Server for Electronic Business, Software AG: http://www.softwareag.com/tamino/
A database which claims to store XML directly. Whether this means that it's really an object-oriented database underneath I'm not sure.
ODBC2XML: Merging ODBC data into XML documents: http://members.xoom.com/_XOOM/gvaughan/odbc2xml.htm
pgxml homepage: http://www.morinel.demon.nl/pgxml/
My favourite database engine, Postgres,
XML Lightweight Extractor : another alphaWorks technology: http://alphaworks.ibm.com/tech/xle

Conversion tools and filters

RTF2XML: http://www.xmeta.com/omlette/
Tool for converting RTF to XML, written in Omnimark
OmniMark Technologies Corporation: http://www.omnimark.com/
A programming language for manipulating data streams, useful in writing conversion filters from other formats into XML.

Quick ways to produce DTDs

DTDGenerator Frontend: http://www.pault.com/Xmltube/dtdgen.html
DB2XML A tool for transforming relational databases into XML documents: http://www.informatik.fh-wiesbaden.de/~turau/DB2XML/index.html
schematron: http://www.ascc.net/xml/resource/schematron/schematron.html
Widely recommended as a very powerful and elegant solution; it knows about schemas as well as DTDs.
XMLschema.com: http://apps.xmlschema.com/

Structured Search tools

Downloading sgrep: http://www.cs.helsinki.fi/~jjaakkol/sgrep/download.html
Probably the most powerful simple tool for manipulating SGML and XML documents

Software collections and directories

xml.apache.org: http://xml.apache.org/
XMLSOFTWARE.COM: The XML Software Site: http://www.xmlsoftware.com/
This (commercial) site tries to keep track of the XML-related software tools which are available. It is unlikely to index open source tools effectively in the longer term.
Free XML software: http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html#SC_XSL

IBM Developers: XML : Overview: http://www.ibm.com/developer/xml/
eXtensible Server Pages (XSP) Layer 1: http://java.apache.org/cocoon/xsp/WD-xsp.html
OpenXML: http://www.openxml.org/
Major open source project to provide XML tools in Java
PHP3: Manual: XML Parser Functions: http://www.php.net/manual/ref.xml.php3
PHP is a server-side scripting language -- probably the best of the open source ones available. This manual section shows how the PHP project intends to handle XML at the server side, and is thus an alternative to Apache's Cocoon technology.
XML Authority Product Overview: http://www.extensibility.com/xml_authority/xml_ath_specs.htm
eidon products - Solutions for Structured Documents: http://www.eidon-products.com/
Dynamic XML for Java : another alphaWorks technology: http://www.alphaworks.ibm.com/tech/dynamicxmlforjava
XML Products Evaluation Form: http://www.bluestone.com/scripts/SaApps/SaCGI.exe/XMLevaluate.class
XML Script - XML tools for E-commerce: http://www.xmlscript.org/
SAX: The Simple API for XML: http://www.megginson.com/SAX/
Activated Intelligence Rocks Your Java World!: http://www.activated.com/
W4F, the World Wide Web Wrapper Factory: Welcome: http://db.cis.upenn.edu/W4F/
JDOM: Who We Are: http://www.jdom.org/credits/index.html

Commentary and background

XML, Java, and the future of the Web: ftp://sunsite.unc.edu/pub/sun-info/standards/xml/why/xmlapps.html
Scientific American: Feature Article: XML and the Second Generation Web: May 1999: http://www.scientificamerican.com/1999/0599issue/0599bosak.html
An extremely clear and well written article
DevEdge Online - Metadata: http://developer.netscape.com/tech/metadata/index.html
Netscape's official take on metadata.
XML.COM - XML support in IE5: http://www.xml.com/xml/pub/1999/03/ie5/first-x.html
XML.com sets out to be a newsletter on XML and related developments. Its contributors are in general exceptionally well informed. In this article Tim Bray (who works closely with Netscape) reviews Microsoft IE5's XML compatibility.
CNET News.com - Taking sides on XML: http://www.news.com/News/Item/0,4,37072,00.html
XML Namespaces: http://www.jclark.com/xml/xmlns.htm
The Last Page: XML's Achilles Heel (Web Techniques, June 1999): http://www.webtechniques.com/archives/1999/06/lastpage/

XML EDI and e-Commerce stuff

A number of competing proposals are being developed for the automatic business-to-business transfer of invoices, orders, et cetera...

CNET.com - News - Services & Consulting - Big-name chemical firms join business e-commerce trend: http://news.cnet.com/news/0-1008-200-1579569.html?tag=st

Collaborative initiatives

The OBI Consortium: http://www.openbuy.org/
A solid business community consortium
Welcome to RosettaNet: http://www.rosettanet.org/
Probably the most incompetent and unprofessional Web site I've ever seen. This organisation claims to be the hub of EDI in XML development, but their Web site gives no comfort whatever regarding their competence.
Biztalk - Letting computers speak the language of business: http://www.biztalk.org/
Microsoft's tame e-Commerce consortium.
FpML.org: http://www.fpml.org/
JP Morgan - PriceWaterhouseCoopers initiative, apparently mainly aimed at financial services.
Electronic Business XML (ebXML) Home Page: http://www.ebXML.org/

Suppliers

DEDIOUX - Dynamic EDI Objects Using XML: http://www.americancoders.com/OpenBusinessObjects
ariba.com - welcome: http://www.ariba.com/
Welcome To OpenLink Software: http://www.openlinksw.com/virtuoso/

Stories

XML Applications Stand Up To EDI: http://www.techweb.com/wire/story/TWB19990416S0002
XML Applications Stand Up To EDI: http://www.techweb.com/se/directlink.cgi?INW19990419S0014
News story about Dell Computer's XML
CNET News.com - IBM links business software, e-commerce: http://www.news.com/News/Item/0,4,35128,00.html
News story about IBM's XML e-Commerce

WAP/WML

WAP Binary XML (WBXML) Encoding Specification: http://www.w3.org/TR/wbxml/
wml-tools: http://www.pwot.co.uk/wml/
www.kannel.org: http://www.kannel.org/

XML Icon Gallery: http://www.iol.ie/~alank/xml/icons.htm
