Mailing List Archive


Re: [tlug] CrossOver Office



On Sat, Apr 20, 2002 at 11:51:20PM +0900, Jonathan Byrne wrote:

> > clients request is for Excel, or rarely Word. Is there an open publishing
> > standard like xhtml for word processing? An open XML standard could be
> 
> Microsoft is going (has gone as of Office XP?) to XML as its internal format
> for MS Office documents.  Now, I wouldn't be shocked to learn that it's
> an "embraced and extended" (AKA "non-standard and broken") XML, but 
> achieving file level compatibility with even a broken XML standard will
> likely be easier than achieving it with the wholly proprietary file
> formats that have been used up to now.

I had occasion to closely examine the MS Office XML formats a while back
when I was working in a Microsoft shop. We were discussing putting all
of our course content into an XML content management system, and I was
trying to determine how much work it would take to convert the existing
files, which were all in Word 2000 and PowerPoint 2000. My observations:

 * The document formats were compliant with the letter (except as noted
   below) of XML 1.0, but not the spirit. To be specific, they were 
   well-formed, but they were at best barely human-readable, because
   they were jam-packed with application-specific formatting crap.
 * XML document bodies were embedded in HTML in a non-standard way. This
   didn't seem to be a major problem, though, because this was only done
   at the top level of the document hierarchy, so as long as you knew 
   what to expect, it would be easy to strip off the HTML shell and put
   the remaining XML through a standard parser.
 * The XML elements seemed to be completely documented in WinHelp, but
   there was no DTD (nor any other type of schema) available. In fact,
   the documents may not be describable by a DTD at all, since the
   content models appeared to vary according to context.
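
To make the second point concrete, here is a rough sketch (in Python
rather than VB) of stripping the HTML shell and handing what is left to
a standard parser. Everything specific in it is an assumption for
illustration: the sample markup, the element names, the namespace URI,
and the helper names extract_islands and flatten are all made up, and
the real Office files will certainly look messier than this.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: an HTML shell wrapping one XML island. The element
# names and the namespace URI are invented, not the real Office ones.
SAMPLE = """<html>
<head>
<xml>
 <doc:Properties xmlns:doc="urn:example:doc">
  <doc:Author>A. Author</doc:Author>
  <doc:Pages>3</doc:Pages>
 </doc:Properties>
</xml>
</head>
<body><p>Course content goes here.</p></body>
</html>"""

# Matches each <xml>...</xml> block sitting inside the HTML shell.
ISLAND = re.compile(r"<xml\b[^>]*>(.*?)</xml>", re.DOTALL | re.IGNORECASE)


def extract_islands(html_text):
    """Return one parsed XML root per <xml> island found in the HTML."""
    roots = []
    for match in ISLAND.finditer(html_text):
        # Wrap the island so it has a single root element even when it
        # holds several siblings, then use the ordinary XML parser.
        wrapped = "<island>" + match.group(1) + "</island>"
        roots.append(ET.fromstring(wrapped))
    return roots


def flatten(root):
    """Map local element names to text for every leaf element."""
    out = {}
    for el in root.iter():
        if len(el) == 0 and el.text and el.text.strip():
            local = el.tag.split("}", 1)[-1]  # drop the {namespace} part
            out[local] = el.text.strip()
    return out


if __name__ == "__main__":
    for island in extract_islands(SAMPLE):
        print(flatten(island))
```

Note that this only works as written if each island declares its own
namespaces. If the prefixes are declared on the outer HTML element
instead, the declarations would have to be copied onto the wrapper
before parsing.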

I concluded that it would be easier to write a VB app that converted between
the Word/PowerPoint/etc. object models and human-readable XML than to 
figure out all the intricacies of the MS-style XML.
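
Since I never wrote the app, the following is only a sketch of the idea,
and in Python rather than VB. The Document and Paragraph classes are
hypothetical stand-ins for whatever the Word object model would hand
back; nothing here talks to Office. It covers one direction only (object
model to XML) and keeps the structural role of each paragraph while
dropping the application-specific formatting.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class Paragraph:
    style: str  # hypothetical style name, e.g. "Heading 1" or "Normal"
    text: str


@dataclass
class Document:
    title: str
    paragraphs: list = field(default_factory=list)


def to_xml(doc):
    """Serialize the stand-in document model as plain, readable XML."""
    root = ET.Element("document")
    ET.SubElement(root, "title").text = doc.title
    for para in doc.paragraphs:
        # Keep only the structural role; fonts, spacing, etc. are dropped.
        tag = "heading" if para.style.startswith("Heading") else "para"
        ET.SubElement(root, tag).text = para.text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    course = Document(
        "Course 101",
        [Paragraph("Heading 1", "Introduction"),
         Paragraph("Normal", "Welcome to the course.")],
    )
    print(to_xml(course))
```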

Not long after that I was laid off, so I never got around to writing the
app, but I'd still like to do it at some point. Sort of a way to promote
interoperability in spite of MS's devious tricks.

BTW, it seems to me I read somewhere about 6 months ago that a Microsoft
spokesman was explaining why it was OK to embed proprietary binary data
in XML. I forget the gist of the argument, but you can be sure it was the
usual twisted MicroCruftian logic. 

-- 
Matt Gushee
Englewood, Colorado, USA
mgushee@example.com
http://www.havenrock.com/

