Friday, February 2, 2007

XSLT 2.0 is Fantastic, but there are some hurdles

When XSLT 1.0 became a W3C Recommendation back in 1999, I thought it was the coolest thing out there. Oh, the things I could do with XML + XSLT 1.0 + Xalan/Saxon! Later on, when I wanted to do things like grouping and writing to more than one result file, I realized that neither was built in. Even now, I can't fully wrap my head around the Muenchian Method for grouping, and to output multiple result files I had to rely on processor-specific XSLT extensions. That meant my stylesheets were now bound to a particular XSLT processor, which sent shivers up my spine. The whole idea behind XSLT, in my (perhaps idealistic, naive) view, was that you should be able to take an XML file and any compliant XSLT engine and produce an output result (set). Still, despite the warts and shortcomings, XSLT 1.0 proved to be a faithful companion to my XML content.
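For anyone who hasn't run into it, here is a rough sketch of what Muenchian grouping looks like in XSLT 1.0. The input shape (a books element holding book elements with a genre attribute and a title child) and the key name books-by-genre are made up for illustration; only the xsl:key / generate-id() pattern is the actual technique.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input: <books><book genre="..."><title>...</title></book>...</books> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Index every book by its genre attribute -->
  <xsl:key name="books-by-genre" match="book" use="@genre"/>

  <xsl:template match="/books">
    <genres>
      <!-- Select one representative per genre: a book qualifies only if it
           is the very first node the key returns for its own genre -->
      <xsl:for-each select="book[generate-id() =
                                 generate-id(key('books-by-genre', @genre)[1])]">
        <genre name="{@genre}">
          <!-- Now pull in the titles of all books sharing that genre -->
          <xsl:copy-of select="key('books-by-genre', @genre)/title"/>
        </genre>
      </xsl:for-each>
    </genres>
  </xsl:template>

</xsl:stylesheet>
```

It works, and it is portable across XSLT 1.0 engines, but the generate-id() comparison is exactly the part that is hard to read at a glance.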

Enter XSLT 2.0. In so many ways it is so much better than its predecessor! Built-in grouping and multiple result documents are now part of the specification! Huzzah!
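To give a feel for those two features, here is a minimal sketch that groups items and writes each group to its own file using only standard XSLT 2.0 instructions. The input shape (books/book with a genre attribute and a title child) and the output file naming are assumptions for illustration, and the sketch assumes genre values are safe to use as file names.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input: <books><book genre="..."><title>...</title></book>...</books> -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/books">
    <!-- Group the books by genre: no keys, no generate-id() comparisons -->
    <xsl:for-each-group select="book" group-by="@genre">
      <!-- Write each group to its own result document, named after the genre -->
      <xsl:result-document href="{current-grouping-key()}.xml"
                           method="xml" indent="yes">
        <genre name="{current-grouping-key()}">
          <xsl:copy-of select="current-group()/title"/>
        </genre>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

No extension elements are involved, so in principle any conforming XSLT 2.0 processor should handle it the same way.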

But wait! There's more! In-memory temporary trees (very nice!), user-defined functions (very handy), XPath 2.0 with a data model and function library shared with XQuery, unstructured-text processing (very handy for things like embedding CSS stylesheets or processing CSV files), and better string manipulation functions, including regular expressions. This is just a taste of what's in the latest version.
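As a small taste of a few of these together, the sketch below reads a CSV file with unparsed-text(), splits it with the regex-based tokenize(), and wraps the cell splitting in a user-defined function. The my: namespace, the my:cells function, the csv-uri parameter and its default file name data.csv, and the main entry template are all hypothetical names chosen for this example. The comma split is deliberately naive and would not cope with quoted fields that contain commas.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical parameter: location of the CSV file to read -->
  <xsl:param name="csv-uri" select="'data.csv'"/>

  <!-- User-defined function: split one CSV line into trimmed cell values -->
  <xsl:function name="my:cells" as="xs:string*">
    <xsl:param name="line" as="xs:string"/>
    <xsl:sequence select="for $c in tokenize($line, ',')
                          return normalize-space($c)"/>
  </xsl:function>

  <!-- Entry point: there is no XML source document, so start from a named template -->
  <xsl:template name="main">
    <rows>
      <!-- Read the file as plain text, split it into lines, skip blank ones -->
      <xsl:for-each select="tokenize(unparsed-text($csv-uri), '\r?\n')[normalize-space()]">
        <row>
          <xsl:for-each select="my:cells(.)">
            <cell><xsl:value-of select="."/></cell>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>

</xsl:stylesheet>
```

Doing the same in XSLT 1.0 would have meant recursive templates for the string splitting and an extension function just to get the text file in.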

It just became a W3C Recommendation (along with XQuery and XPath 2.0) last month. Yeah! Finally!

Still, this latest version has major obstacles to overcome before it can enjoy widespread adoption. There's only one notable XSLT 2.0-compliant engine: Saxon 8 by Dr. Michael Kay. It is developed in Java, but there is a .NET port (via the IKVM libraries).

Not that I have anything against Saxon. It is outstanding. Yet where is Xalan? MSXSL? Why haven't they come to the party? Scouring the blogs and mailing lists, I don't see any activity on Xalan toward an XSLT 2.0 implementation. Microsoft's current priority is XLinq, and it has decided that it will support XQuery, but not XSLT 2.0 or XPath 2.0.

Microsoft's decision not to implement XSLT 2.0 and XPath 2.0 could have an unfortunate effect on adoption of these standards. While XQuery is extremely powerful (and wicked fast) and can do all the things that XSLT can do, I wouldn't necessarily recommend trying to create XQuery scripts to transform a DocBook XML instance (the XSLT is already complex enough).

I would rather write templates that match the appropriate elements than attempt to write a long, complex set of switch cases to handle the complex content model. That said, it could be done, but it won't be a trivial task.
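To illustrate the template-matching style I mean, here is a minimal sketch covering a handful of DocBook elements. It assumes un-namespaced DocBook 4 markup and simple HTML output, and the element-to-HTML mapping is my own simplification, not what the real DocBook XSL stylesheets do.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- One small rule per element; the processor picks the best match -->
  <xsl:template match="section">
    <div class="section"><xsl:apply-templates/></div>
  </xsl:template>

  <xsl:template match="section/title">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <xsl:template match="emphasis">
    <em><xsl:apply-templates/></em>
  </xsl:template>

  <!-- The predicate gives this pattern a higher default priority,
       so it wins over the plain "emphasis" rule for bold emphasis -->
  <xsl:template match="emphasis[@role = 'bold']">
    <strong><xsl:apply-templates/></strong>
  </xsl:template>

  <xsl:template match="itemizedlist">
    <ul><xsl:apply-templates select="listitem"/></ul>
  </xsl:template>

  <xsl:template match="listitem">
    <li><xsl:apply-templates/></li>
  </xsl:template>

</xsl:stylesheet>
```

Each new element or special case is just another template. Nothing has to be threaded through one big dispatch expression, which is where the switch-case approach gets painful as the content model grows.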

XSLT 2.0 is amazingly powerful, with many of the features that were lacking in the 1.0 Recommendation. In fact, the DocStandards Interop Framework intends to use XSLT 2.0 to take advantage of many of these new features to support things like generating topic maps or bookmaps from the interchange format. It looks like Saxon will be the de facto engine of choice, though that's not a hard choice to make.
