Mailing List Archive



Re: [tlug] OT: XML?



>>>>> "Jean-Christian" == Jean-Christian Imbeault <jean_christian@example.com> writes:

    Jean-Christian> Can anyone point me to any resource that explains
    Jean-Christian> *why* and *how* XML can be put to good use on a
    Jean-Christian> rather simple web site?

XML is just (pretty close, anyway) a specific subset of SGML.  That
means it's simply about structuring data in a way both humans and
machines can parse.  (The ISO 8879 document misspells "structured" as
S-T-A-N-D-A-R-D. ;-)  The structure part is simple but fairly
flexible:  it's just Lisp with fat flavored fuzzy parentheses.
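
To make the Lisp comparison concrete, here is a minimal sketch that
prints an XML tree as nested parenthesized lists.  The sample document
and the function name to_sexp are made up for illustration, and the
sketch deliberately ignores attributes and mixed content.

```python
import xml.etree.ElementTree as ET

def to_sexp(elem):
    """Render an element as a Lisp-style list: (tag "text" child ...)."""
    parts = [elem.tag]
    # Keep only non-blank leading text; attributes and tail text are ignored.
    if elem.text and elem.text.strip():
        parts.append('"%s"' % elem.text.strip())
    for child in elem:
        parts.append(to_sexp(child))
    return "(" + " ".join(parts) + ")"

# Hypothetical sample document.
doc = ET.fromstring("<page><title>Home</title><body><p>Hello</p></body></page>")
sexp = to_sexp(doc)
print(sexp)  # (page (title "Home") (body (p "Hello")))
```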

So you're basically limited to hierarchical structures.  But that's
the way people think.  If your web site has a hierarchical structure,
it can be described with an XML DTD or schema.  If you have classes of
documents, each document of a class having similar hierarchical
structure, you can use sub-DTDs or sub-schemas.  It's very likely that
you do, even for a "simple" web site.  If you write the structure out
as schemas, then you'll know when you're breaking the rules:  your
tools will squawk.  Go ahead, break the rules as necessary (or rather,
rewrite them).  That's the first half of the "why."
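
A rough sketch of the "tools will squawk" idea follows.  It is not a
real DTD or schema validator, just a hand-rolled stand-in that checks
which child elements each element may contain.  The element names and
the RULES table are assumptions invented for the example; a real setup
would more likely use a validating parser with an actual DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a tiny site: the child tags each element may contain.
RULES = {
    "page": {"title", "section"},
    "section": {"heading", "para"},
    "title": set(),
    "heading": set(),
    "para": set(),
}

def violations(elem, path=""):
    """Return a list of complaints about elements that break RULES."""
    here = path + "/" + elem.tag
    if elem.tag not in RULES:
        return [here + ": unknown element"]
    found = []
    for child in elem:
        if child.tag not in RULES[elem.tag]:
            found.append("%s: <%s> not allowed here" % (here, child.tag))
        else:
            found.extend(violations(child, here))
    return found

good = ET.fromstring(
    "<page><title>Home</title><section><para>Hi</para></section></page>")
bad = ET.fromstring("<page><para>stray</para></page>")

print(violations(good))  # []
print(violations(bad))   # ['/page: <para> not allowed here']
```

When the complaint is wrong rather than the document, the fix is to
edit RULES, which is the "rewrite them" case.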

The how is simple.  Your favorite programming language(s) will have
libraries for handling XML.  Some are suitable for presentational
manipulation and surface structure transformations (often based on the
SAX model).  Others can do arbitrarily complex deep structure
manipulation (the DOM-based systems).
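
As an illustration of the two styles, here is a small sketch using
Python's standard library (xml.sax and xml.dom.minidom).  The sample
document and the ParaCounter class are made up for the example.  The
SAX half only reacts to events as they stream past; the DOM half holds
the whole tree in memory and can rearrange it.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
SOURCE = "<page><title>Home</title><para>one</para><para>two</para></page>"

class ParaCounter(xml.sax.ContentHandler):
    """SAX style: see each start tag once, keep only a running count."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "para":
            self.count += 1

counter = ParaCounter()
xml.sax.parseString(SOURCE.encode("utf-8"), counter)
print(counter.count)  # 2

# DOM style: the whole tree is available, so nodes can be moved around.
dom = minidom.parseString(SOURCE)
page = dom.documentElement
first_para = page.getElementsByTagName("para")[0]
page.appendChild(first_para)  # re-appending moves the node to the end
order = [p.firstChild.data for p in page.getElementsByTagName("para")]
print(order)  # ['two', 'one']
```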

The second half of "why" is that you can use these libraries to create
abbreviations that expose the common structure of your site, while
hiding the boilerplate.  I.e., you generate the presentation from the
source.  You can automatically verify syntax, and often large parts of
the semantics as well.  And you can use the deep structure to
generate varying views of your content from the same sources.
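
A minimal sketch of "varying views from the same sources":  one
hypothetical site description is turned into both an HTML navigation
list and a plain-text index.  The <site>/<page> vocabulary and the
function names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical site description; the element names are invented.
SITE = ET.fromstring(
    "<site>"
    "<page href='index.html'><title>Home</title></page>"
    "<page href='faq.html'><title>FAQ &amp; Help</title></page>"
    "</site>"
)

def nav_html(site):
    """View 1: an HTML navigation list."""
    items = [
        '<li><a href="%s">%s</a></li>'
        % (escape(p.get("href")), escape(p.findtext("title")))
        for p in site.findall("page")
    ]
    return "<ul>" + "".join(items) + "</ul>"

def index_text(site):
    """View 2: a plain-text index of the same pages."""
    return "\n".join(
        "%s -> %s" % (p.findtext("title"), p.get("href"))
        for p in site.findall("page")
    )

print(nav_html(SITE))
print(index_text(SITE))
```

Adding a page means editing the source once; both views pick it up.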

Remember, much SGML/XML processing is from one DTD to another DTD,
e.g., DocBook-SGML to HTML.  And much of the rest is from SGML/XML to
another markup language (such as LaTeX).  If you've ever wondered why
the first 1kB of all your web pages has to be the same, the answer is
"it doesn't:  substitute another processing stage to insert the
redundancy the browsers need."
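
Such a processing stage might look roughly like the sketch below.  The
short source holds only what is unique to the page, and the stage adds
the repeated header and footer.  The <page>/<title>/<para> vocabulary,
the HEADER and FOOTER strings, and the render function are all
assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Boilerplate shared by every page; it lives here, not in the sources.
HEADER = ('<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
          "<title>%s</title></head><body>\n")
FOOTER = "\n</body></html>\n"

def render(source_xml):
    """Expand a short XML page source into a full HTML page."""
    page = ET.fromstring(source_xml)
    title = escape(page.findtext("title") or "")
    paras = "\n".join(
        "<p>%s</p>" % escape(p.text or "") for p in page.findall("para")
    )
    return HEADER % title + "<h1>%s</h1>\n" % title + paras + FOOTER

html_out = render("<page><title>About</title><para>Short source.</para></page>")
print(html_out)
```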

The other thing about using XML at this stage is that the tools are
available to everybody you might need to communicate with.  Show them
your DTD/schemas, and they can talk directly to your sources.  So this
provides a path for growth as your "simple little" web site gets more
complex and bigger.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
 My nostalgia for Icon makes me forget about any of the bad things.  I don't
have much nostalgia for Perl, so its faults I remember.  Scott Gilbert c.l.py

