Jim's Thoughtspot
Over time this blog will provide my thoughts and ideas about technology, particularly in (but not limited to) the area of Information Management, focusing on XML, XSLT, XPath and XQuery.
Jim Earley

It's been a while (2016-01-09)
It's been a while since I've posted on this blog. A lot has happened in the intervening months (years). Mostly, I've moved forward and backward in the tech world. I've hummed and hawed over the direction of my career. I've also been somewhat distracted by local events that required my attention. You can label it a higher calling; a change in priorities from a completely geeky world that I have embraced as my own to one that encompasses the future of my geeks-in-training. <br />
<br />
Needless to say, I haven't abandoned the world of XML, XSLT, XPath, XQuery entirely. I've evolved. I've had a gap year (or two) and seen the world outside the comfortable confines of angle brackets and FLWOR statements, and it has changed me - a bit. <br />
<br />
For those who've read some of my posts, I drank the kool-aid in the 90's, and wanted everyone else to share from the same cup. What the last few years have shown is that kool-aid is for kids. It's time to grow up. The technology and content worlds have changed, and I need to change with them. <br />
<br />
Primarily, what has changed is my thinking about the role of XML technologies in the landscape. XML has a place, and honestly, it's a very important player in that wide-ranging landscape - just not in the way I perceived it 5, 10, or even 15 years ago. In fact, I'm not sure that Sir Tim Berners-Lee would have envisioned the path that markup languages have taken. Nonetheless, it's time to embrace these changes for what they are.<br />
<br />
<ol>
<li>XML is here to stay. It's mature, it's lived up to its promise of extensibility, and it won't go away.</li>
<li>XML technologies are stable. There is little in the way of implementation variability among different providers now. Whether you are using Java, a classic 'P' (Python, PHP, Perl) language, or any one of the newer languages, they all must honor XML in order to be complete.</li>
<li>Incremental changes in XML technologies are principally to support scale. DOMs are nice, elegant, and easy-to-use structures, but quickly turn into boat-anchors when we attempt to embrace internet-scale data within them. Streaming XML is the new sexy.</li>
<li>Virtually any data model can be represented in XML for a myriad of business purposes with self-describing semantics and the capability to flex its node hierarchy based on the data. For this reason alone, XML has been, and will continue to be, a workhorse. Think about Spring, one of the web's most successful Java frameworks. XML is the underlying data for nearly every part of it. </li>
<li>As a data persistence layer, XML plays well with tabular, relational, and hierarchical structures. With its rich semantics and vendor-agnostic format, XML technologies are powerful, flexible, and scalable. Yes, it's also a great storage model for pernicious content models - like DITA, DocBook, and, gulp, OOXML (I'll shower later for that). </li>
<li>From XML, I can deliver and/or display content/data in virtually any format imaginable, even to the point of backward compatibility to legacy formats (ask me about HTML 3.2/EDGAR transformations sometime)</li>
</ol>
<div>
With all that XML has going for it, what can go wrong? Well, depending on who you ask, the answer will vary. Some criticize XML for not living up to the hype of Web 2.0. XML's initial purpose was to be the "SGML for the web." To some degree, it is, but it is far from ubiquitous. That isn't to say that we didn't try. From XML Data Islands to <code>XMLHttpRequest</code> objects in JavaScript, XML was given first class status on the web. The problem was (and is) that extracting data from an XML DOM often relied on a lot of additional code to recurse through the content. For some, the browser's tools felt like a blunt instrument when finer grained precision was needed. Eventually, JSON became the lingua franca for web data, and rightfully so. </div>
<br />
Perhaps its biggest limitation or failure is the countless attempts to make XML usable for the masses. I'll admit that I was one of the biggest evangelists. I honestly believed that we could build authoring tools that were intuitive and easy-to-use, backed by powerful semantic markup. We would be able to enrich the web (and by proxy, the world) with content that had <i>meaning</i> - it could be searched intelligently, reused, repurposed, translated, and delivered anywhere. As one of my friends and mentors, Eric Severson, said, XML has the capability of making content personal, designed for a wide audience and personalized for an "audience of one."<br />
<br />
Intrinsically, I still have some faith in the idea, but the <i>implementation</i> never lived up to the hype. For over twenty years, we've tried to build tools that could manage XML authoring workflows from creation to delivery. Back in the late 90's and early 2000's, I remember evangelizing for XML authoring solutions to a group of technical writers for a big technology firm. I was surprised by the resistance and push back I got. Despite the benefits of XML authoring, the tools were still too primitive, and instead of making the writers more productive, they slowed them down. Nevertheless, I kept evangelizing like Linus in the pumpkin patch. <br />
<br />
Eventually, the tools <i>did</i> improve. They <i>did</i> make authoring easier... for some. What we often glossed over was the level of effort required to <i>make</i> the tools easier to use. Instead of being tools that could be used by virtually anybody who didn't want to see angle brackets (tools for the masses), we made built-for-purpose applications. For folks like me who understood the magical incantations and sorcery behind these applications, they were fantastic. They were powerful. They also came with a hefty price tag. And, because they were often heavily customized, users were locked into the tools, the content model, and the processes designed to support it. <br /><br />Even if we attempted to <i>standardize</i> on the grammar to enable greater interchange, it still required high priests and wizards to make it work. The bottom line is that the cost of entry is just too high for many. The net result is that XML authoring is a niche, specialized craft left to highly trained technical writers and the geekiest of authors.<br /><br />Years ago, I read Thomas Kuhn's <i>The Structure of Scientific Revolutions</i>. The main premise is that we continue to practice our crafts within the framework of well-accepted theory. Over time, through the course of repeated testing, anomalies emerge. Initially, we discard these anomalies, but as they continue to accumulate, we realize that we can't ignore them anymore. New theories emerge. However, we reject these new ideas and vigorously argue that the old theories are still valid, until enough evidence disproves them entirely. At that moment, a new paradigm emerges.<br />
<br />
We are at that moment of paradigmatic shift. No longer can XML be thought of as a universal theory of information and interchange. Instead, we need to reshape our thinking to accept that XML solves many difficult problems, and has a place in our toolbox of technology, but other technologies and ideas are emerging that offer easier, cheaper, faster methods for content authoring. For many, the answers to "intelligent content" aren't about embedding semantics <i>within</i> the content, but rather about extending it with rich metadata <i>about</i> the content that lives as a wrapper <i>on</i> the content - metadata that can be dynamic, contextual, and mutable. <br />
<br />
Before I'm labeled a heretic, let me be clear. XML isn't going away, nor is it inherently a failed technology. Quite the opposite. Its genius is in its relative simplicity and flexibility to be widely used in a vast number of technologies in an effective manner. The difference is that we've learned that we could never get enough momentum behind the idea of XML as a universal data model for content authoring, and it was too cumbersome for web browsers to manipulate. We have other tools for that.<br />
Intellectual Property Affects K-12 Students Too (2013-10-25)
<p>My oldest daughter was asked to enter a national contest through her school with some photos she created, sponsored by the <a href="http://www.pta.org/" target="_new">National Parent Teacher Association</a>. Last night they sent home a waiver form that I had to sign. After my daughter read the waiver, <em>she</em> was concerned and asked me to look at it. After looking it over, I was a bit alarmed. The provision that raised red flags for me was:</p>
<blockquote>
I grant to PTA an irrevocable, unlimited license to display, copy, sublicense, publish, and create and sell derivative works from my work submitted for the Reflections Program.
</blockquote>
<p>
OK. I'm not under any delusion that my daughter or any other student should be paid or recompensed for submitting to a contest, nor am I arguing that the PTA shouldn't have the right to redistribute or derive works from the submissions. What I am contesting is that there isn't a single provision in the waiver stating that they will do so on the condition of proper attribution to the author for the original and any derivative works. Let me go on the record by saying that I don't believe that the PTA would ever act in a malicious way, nor are they trying to profit from students' creative work. In fact, the opposite is quite true - they are encouraging kids to be creative, and I applaud that heartily. Nonetheless, after working with numerous publishers on IP Rights issues, this is a sticky issue. My main point is that the fact that my daughter is in the K-12 school system and participating in a school function doesn't mean that any creative endeavor she pursues shouldn't be protected.
</p>
<p>
The way out, in my view, is that the PTA should seriously consider that any waiver for this activity be governed by the <a href="http://creativecommons.org/licenses/by/3.0/us/" target="_new">Creative Commons License</a>. It basically states that the author of the work grants others the rights to use, sell, and derive the work, <em>provided that the user <u>must include proper attribution to the author of that work</u></em>. It gives the PTA broad rights on how it can use these creative works, without the kids (my daughter) giving up all their rights to the work entirely.
</p>
<p>
For me, this is just another indicator that IP Rights are becoming more and more important, and that we need technology (ODRL, and other platforms) to support them. We've built such technology.
</p>
XML Schemas and the KISS Principle (2013-09-24)
<p>I recently had the opportunity to work on an interesting XML schema.  The intent
was to create an HTML 5 markup grammar for producing digital content, primarily for EPUB
and the web, then ultimately for print.  The primary design goal is to create an
authoring grammar that facilitates some level of semantic tagging and that is natively
HTML 5 compliant, i.e., there is no transformation required to move between the
authoring format and HTML5.</p>
<p>What is interesting about this particular schema is that it resembles the design
patterns used for <a href="http://microformats.org/">microformats</a>.  The markup
semantics for typographic structures such as a bibliography or a figure are tagged with
standard HTML elements, with additional typographic semantics expressed using the
<i>class</i> attribute.  For example, a figure heading structure must look like
the following:</p>
<pre><figure>
<h2><span class="caption">Figure </span>
<span class="caption_number">1.1 </span>Excalibur and the Lady of the Lake</h2>
</figure></pre>
<p>Notice the <span> tags.  From the perspective of describing our typographic
semantics (figures must have captions and captions must have a number), this isn’t too
bad.  However, from a schema perspective, it’s much more complex, because the
underlying HTML5 grammar is quite complex at the level of <div>, <h2> and
<span> elements.  In addition to the required <i>“caption”</i> and
<i>“caption_number”</i> semantics applied to the <span> tag, the
<h2> element also allows text, other inline flow elements, such as <strong>,
<em>, and, of course, other <span> tags that apply other semantics.</p>
<p>To enforce the mandate that a figure heading must have a label and number as the first
two nodes of the <h2> element, we can use <a
href="http://www.w3.org/TR/xmlschema11-1/#cAssertions">XML Schema 1.1 assertions</a>.
Assertions allow us to apply business rules to the markup that cannot be
expressed directly in the content model sequences.  Assertions allow us to use a
limited subset of XPath axes and functions that return a boolean result.</p>
<p>Alternately, Schematron could be used independently (or in addition to assertions) as a
means of enforcing the business rules in the markup. The issue here is that a Schematron
rule set resides outside of the XML schema, therefore requiring additional tooling
integration from the authoring environment to apply these rules. </p>
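<p>For illustration, a minimal Schematron sketch of the same figure-heading rule might look like the following (the pattern id and assertion messages are mine, not taken from the project):</p>
<pre><sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron" id="figure-heading">
  <sch:rule context="figure/h2">
    <!-- the first node must be the caption label -->
    <sch:assert test="node()[1][self::span][@class='caption']">
      A figure heading must begin with a span whose class is "caption".
    </sch:assert>
    <!-- the caption label must be followed immediately by the caption number -->
    <sch:assert test="span[@class='caption']/following-sibling::node()[1][self::span][@class='caption_number']">
      The caption label must be followed immediately by a span whose class is "caption_number".
    </sch:assert>
  </sch:rule>
</sch:pattern></pre>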
<p>So, for our
heading above, we must apply the following assertion:</p>
<pre><xs:assert test="child::h2/node()[1][@class='caption']/following-sibling::span[@class='caption_number']"/></pre>
<p>In this case, the assertion is stating that the <h2> element’s first node must have
a <i>class </i>attribute value of “caption”, followed immediately by an element with its
<i>class</i> attribute value of “caption_number.”  After that, any acceptable
text or inline element defined by the HTML5 grammar is allowed. </p>
<p>This is a very simple example of how the existing HTML5 grammar alone cannot enforce the
semantic structure we wish to express.  There are numerous other examples within
the content model that would leverage the same design pattern.</p>
<p>We have done
several successful projects with this approach and the value of having a single
authoring/presentation grammar (HTML 5) is very appealing. However, there can be issues
and difficulties with this approach. Consider:</p>
<ol>
<li>Microformats are clever applications that give semantic meaning to simple HTML
formatting tags.  It’s valid HTML by virtue of tags and attributes, with
additional semantics expressed through the value of certain attributes such as the
class attribute.  In general, these microformat markup documents are small,
discrete documents, as they are intended to be machine readable to give the
application its functionality.  From an authoring perspective, it’s relatively
simple to create a form that captures the essential data that is processed by
machine to generate the final microformat data (or for the markup and microformat
savvy, create it by hand – but we are in the minority). Think of microformat
instances as small pieces of functionality embedded as a payload within a larger
document that are only accessed by applications with a strong understanding of the
format. If we take the notion of microformats and use them throughout a document, we
can run into tooling issues, because we’re now asking a broader range of
applications (e.g. XML editors) to understand our microformat. </li>
<li>The “concrete” structural semantics (how to model figures and captions) are
specified with “abstract” formatting HTML tags. Conflating presentation and
structural semantics in this way is contrary to a very common design principle in
use today in many languages and programming frameworks, namely to separate the semantics/structure from the
formatting of content. </li>
<li>The schema’s maintainability is decreased by the vast number of assertions that must
be enforced for each typographical structure.  Any changes to any one structure
may have ripple effects to other content classes. </li>
<li>Not all XML authoring tools are created equal.  Some don’t honor assertions.
Others do not support XML Schema 1.1 at all.  Consequently, this means that
your holistic XML strategy becomes significantly more complex to implement.  It
might mean maintaining two separate schemas, <i>and</i> it might also mean
additional programming is required to enforce the structural semantics that we wish
to be managed in the authoring tool. </li>
<li>A corollary to the previous point, creating a <i><u>usable</u>
</i>authoring experience will require significant development overhead to ensure
users can apply the right typographical structures with the correct markup.  It
could be as simple as binding templates with menus or toolbars, but it could easily
extend into much more.  Otherwise, the alternative is to make sure you invest
in authors/editors who are trained extensively to create the appropriate
markup.  Now consider point #3.  Any changes to the schema have ripple
effects to the user experience also. </li>
<li>Instead of simplifying the transformation process, tag overloading can have the
reverse effect.  You end up having to create templates for each and every class
value, and it’s not difficult to end up with so many permutations that an ambiguous
match results in the wrong output (see the sketch after this list).  Having gone down this road with another
transformation pipeline for another client, I can tell you that unwinding this is
not a trivial exercise (I’ll share this in another post). </li>
<li>Assertion violation messages coming from the XML parser are extremely cryptic: <pre>cvc-assertion: Assertion evaluation ('child::node()[1]/@class='label'') for element 'summary' on schema type 'summary.class' did not succeed.</pre>
<p>For any non-XML savvy practitioners, this kind of message is the precursor to
putting their hands up and calling tech support.  Even if you use something
like Schematron on the back end to validate and provide more friendly error
messages, you’ve already made the system more complex.</p>
</li>
<li>It violates the KISS principle.   The schema, at first glance, appears to
be an elegant solution.  If used correctly, it mitigates what is a big problem
for publishers:  How do I faithfully render the content as prescribed
by the source?  Theoretically, this schema would only require very light
transformation to achieve the desired effect. Yet, it trades one seemingly
intractable problem for several others that I’ve described above. </li>
</ol>
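<p>To make point #6 concrete, here is a small sketch of the template-per-class-value pattern (the output markup is arbitrary; the class names come from the figure example above). Both patterns below carry the same default priority, so a <span> whose class is "caption_number" matches both, and the ambiguity is typically resolved by taking whichever template appears last in the stylesheet, which may not be the one you intended:</p>
<pre><xsl:template match="span[@class='caption_number']">
  <span class="fig-number"><xsl:apply-templates/></span>
</xsl:template>

<!-- a later, looser rule added for another structure also matches class="caption_number" -->
<xsl:template match="span[contains(@class, 'caption')]">
  <em><xsl:apply-templates/></em>
</xsl:template></pre>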
<p>Several years ago, I recommended using microformats as an interoperability format for
managing content between DITA, DocBook, and other XML markups.  The purpose of the
format was specifically to be <i>generated and read</i> by a set of XSLT
stylesheets that do the heavy lifting of converting between standards.  The real benefit
is that you create a transformation once for each input and output, rather than building
“one-off” transformations for each version of the standard.  Once in the converted
markup, the content could leverage its transformations to produce the desired output. </p>
<p>I think the key distinction is that the XML Interoperability Framework was never intended to
be an authoring medium.  Users would create content in the target format, using the
tools designed for that format.  This schema’s strategy is to author directly into
the interop, and the unintended consequences described above only make the complexity of
implementing, using, and maintaining it far greater than it needs to be. 
Sometimes, cutting out the middle man is not cheaper or easier.</p>
<p>Here’s another alternative to consider:</p>
<ol>
<li>A meaning for everything:  create a schema with clear, discrete semantics with
specific content models for each structure.  Yes, it explicitly means you have
to create stylesheets with some greater degrees of freedom to support the output
styling you want, and perhaps it’s always a one-off effort, but overall, it’s easier
to manipulate a transformation with overrides or parameters than trying to overload
semantics. <p>For example, consider our example above: If we want to mandate that a
figure heading must have a caption label and a caption number, then semantically
tagging them as such gives you greater freedom for your inline tagging markup
like <code><span></code>. Using this principle, I could see a markup like
the following:</p>
<pre><figure>
<figtitle>
<caption_label>Figure</caption_label>
<caption_number>1.1</caption_number>
Excalibur and the Lady of the Lake
</figtitle>
</figure> </pre>
<p>Which might be rendered in HTML5 as:</p>
<pre><figure>
<h2>
<span class="caption">Figure </span>
<span class="caption_number">1.1 </span>
Excalibur and the Lady of the Lake
</h2>
</figure></pre>
<p>That also allows me to also distinguish from other types of headings that have
different markup requirements. For example, a section title might not have the
same caption and numbering mandate: </p>
<pre><section>
<title>The Relationship Between Arthur and Merlin</title>
<subtitle>Merlin as Mentor</subtitle>
...
</section></pre>
<p>Which might be rendered in HTML5 as: </p>
<pre><section>
<h1>The Relationship Between Arthur and Merlin</h1>
<h2>Merlin as Mentor</h2>
...
</section></pre>
<p>Notice that in both cases we’re not throwing all the HTML5 markup overboard
(figure and section are HTML5 elements), we’re just providing more explicit
semantics that model our business rules more precisely. Moreover, it’s
substantially easier to encapsulate and enforce these distinctive models in the
schema, without assertions or Schematron rules, unless there are specific
business rules within the text or inline markup that must be enforced
independently from the schema. </p>
<p>Of course, if you change the schema, you may also have to make changes in the
authoring environment and/or downstream processing. However, that would be true
in either case. And, irrespective of whether I use an HTML 5-like or a
semantically-explicit schema, I still need to apply some form of transformation
on content written against earlier versions of the schema to update to the most
current version. The key takeaway is that there is little in the way of
development savings with the HTML5 approach. </p>
</li>
<li>Design the system with the author as your first priority.  For example, most
XML authoring tools make it easy by inserting the correct tags for required markup
(e.g., our figure heading), especially when each tag’s name is distinct. Many of
these same tools also provide functionality to “hide” or “alias” the tags in a way
that’s more intuitive to use. Doing this in an overloaded tagging approach will
require a lot more development effort to provide the same ease of use. Without that
effort, and left to their own devices, authors are going to struggle to create valid
content, and you are almost certain to have a very difficult time with adoption. </li>
<li>Recognize that tools change over time. The less you have to customize to make the
authoring experience easy, the more likely you can take advantage of new features
and functionality without substantial rework, which also means lower TCO and
subsequently, higher ROI. </li>
<li>Back end changes are invisible to authors. By all means, it’s absolutely vital to
optimize your downstream processes to deliver content more efficiently and to an
ever-growing number of digital formats. However, the tradeoffs for over-simplifying
the backend might end up costing more. </li>
</ol>
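<p>To illustrate the first point above, here is a minimal sketch of how the figure-title mandate could be expressed directly in the content model (plain XSD 1.0, reusing the element names from the example; everything else is illustrative):</p>
<pre><xs:element name="figtitle">
  <xs:complexType mixed="true">
    <xs:sequence>
      <!-- the label and number are required, in this order; the title text follows as mixed content -->
      <xs:element name="caption_label" type="xs:string"/>
      <xs:element name="caption_number" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element></pre>
<p>The ordering of the label and number is enforced by the sequence itself; no assertion or Schematron rule is needed for that part of the rule.</p>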
<p>HTML5 will become the base format for a wide range of digital media, ranging from EPUB to
mobile and the web. On the surface, it would appear that using HTML5 makes sense as both
a source format and a target format. The idea has a lot of appeal particularly because
of the numerous challenges that still persist today with standard or custom markup
grammars that have impacted both authoring and backend processes.</p>
<p>Microformats’ appeal is the ability to leverage a well-known markup (HTML) to create
small, discrete semantic data structures targeted for applications with a strong
understanding of the format. Leveraging the simplicity of HTML5, we had hoped to create
a structured markup that was easy to use for content creation, and with little to no
overhead on the back end to process and deliver the content. However, I discovered that
it doesn’t scale well when we try applying the same design pattern to a larger set of
rich semantic structures within a schema designed for formatting semantics.</p>
<p>Instead, the opposite appears to be true: I see greater complexity in the schema design
due to the significant overloading of the class attribute to imply semantic meaning. I
also see limitations in current XML authoring tools to support a schema with that level
of complexity, without incurring a great deal of technical debt to implement and support
a usable authoring environment. </p>
<p>I also discussed how implementing an HTML5 schema with overloaded class attributes likely
won’t provide development savings compared to more semantically-explicit schemas when
changes occur. In fact, the HTML5 schema may incur greater costs due to its dependency
on assertions or Schematron to enforce content rules. </p>
<p>Rather than overloading tags with different structural semantics, an alternative might be
the use of a “blended” model. Leverage HTML5 tags where it makes sense: article,
section, figure, paragraphs, lists, inline elements, and so on. Where there are content
model variations or the need for more constrained models, use more explicit semantics.
The advantage of this kind of approach is that it takes advantage of built-in features and
functionality available in today’s XML authoring tools, and mitigates the level of
programming or training required. Also, the underlying schema is much easier to maintain
long term. Of course, there are trade-offs in that back-end processing pipelines must
transform the content. However, with the right level of design, the transformations can
be made flexible and extensible enough to support most output and styling scenarios.
With this in mind, this type of tradeoff is acceptable if the authoring experience isn’t
compromised.</p>
Enumerated Constants in XQuery (2012-07-24)
I’ve been working on a little project that allows me to merge my love of baseball with my knowledge of XML technologies. In the process of working through this project, I am creating XQuery modules that encapsulate the logic for the data. Part of the data that I’m looking at must account for different outcomes during the June amateur draft.<br />
<br />
It turns out that the MLB June Amateur draft is quite interesting in that drafting prospects is a big gamble. Draftees may or may not sign in any given year, and remain eligible for the draft in subsequent years. If they don’t sign during that year, they could be drafted by another team in following years. Alternately, they could be selected by the same team and signed. However, even if they do sign, there’s no guarantee that they’ll make it to the big leagues. And even if they do, they might not make it with the same team they signed with initially (in other words, they were traded before reaching the MLB). <br />
<br />
In effect there are several scenarios, depending on how the data is aggregated or filtered. However, these scenarios are well defined and constrained to a finite set of possibilities:
<br />
<ul>
<li>All draft picks </li>
<li>All signed draft picks </li>
<li>All signed draft picks who never reach the MLB (the vast majority don’t) </li>
<li>All signed draft picks who reached the MLB with the club that signed them </li>
<li>All signed draft picks who reached the MLB with another club </li>
<li>All unsigned draft picks </li>
<li>All unsigned draft picks who reached the MLB with a different club </li>
<li>All unsigned draft picks who reach the MLB with the same club, but at a later time </li>
<li>All unsigned draft picks who never reach the MLB</li>
</ul>
All of these scenarios essentially create subsets of information that I can work with, depending on whether I’m interested in analyzing a single draft year, or all draft years in a range. They’re essentially the same queries, with minor variations to filter to meet a specific scenario. <br />
<br />
Working with various strongly typed languages like C# or Java, I would use a construct like an <span style="color: #c0504d; font-family: Consolas;"><strong>enum</strong></span> to encapsulate these possibilities into one object. Then I can pass this into a single method that will allow me to conditionally process the data based on the specified enum value. Pretty straightforward. For example, in C# or Java I would write:<br />
<blockquote>
<pre>public enum DraftStatus {
ALL, //All draft picks (signed and unsigned)
UNSIGNED, //All unsigned draft picks
UNSIGNED_MLB, //All unsigned picks who made it to the MLB
SIGNED, //All signed draft picks
SIGNED_NO_MLB, //Signed but never reached the MLB
SIGNED_MLB_SAME_TEAM, //signed and reached MLB with the same team
SIGNED_MLB_DIFF_TEAM //signed and reached with another club
};</pre>
</blockquote>
The important aspect of enumerations is that each item in an enumeration can be descriptive and also map to a constant integer value. For example <code>UNSIGNED</code> is much more intuitive and meaningful than <code>1</code>, even though they are equivalent.<br />
<br />
Working with XQuery, I don’t have the luxury of an enumeration. Well, at least in the OOP sense. I could write separate functions for each of the scenarios above and perform the specific query and return the desired subset I need. But that’s just added maintenance down the road. <br />
<br />
At first I toyed with the idea of using an XML fragment containing a list of elements that mapped the element name to an integer value: <br />
<blockquote>
<pre><draftstates>
<ALL>0</ALL>
<UNSIGNED>1</UNSIGNED>
<UNSIGNED_MLB>2</UNSIGNED_MLB>
<SIGNED>3</SIGNED>
<SIGNED_NO_MLB>4</SIGNED_NO_MLB>
<SIGNED_MLB>5</SIGNED_MLB>
<SIGNED_MLB_SAME_TEAM>6</SIGNED_MLB_SAME_TEAM>
<SIGNED_MLB_DIFF_TEAM>7</SIGNED_MLB_DIFF_TEAM>
</draftstates></pre>
</blockquote>
And then using a variable declaration in my XQuery:<br />
<blockquote>
<pre>module namespace ds="http://ghotibeaun.com/mlb/draftstates";
declare variable $ds:draftstates := collection("/mlb")/draftstates;</pre>
</blockquote>
To use it, I need to cast the element value to an integer. Using an example, let's assume that I want all signed draftees who reached the MLB with the same team:<br />
<blockquote>
<pre>declare function gb:getDraftPicksByState($draftstate as xs:integer, $team as xs:string) as item()* {
let $picks :=
if ($draftstate =
xs:integer($ds:draftstates/SIGNED_MLB_SAME_TEAM)) then
let $results :=
/drafts/pick[Signed="Yes"][G != 0][Debut_Team=$team]
return $results
(: more cases... :)
else ()
return $picks
};
(:call the function:)
let $sameteam :=
gb:getDraftPicksByState(xs:integer($ds:draftstates/SIGNED_MLB_SAME_TEAM),
"Rockies")
return $sameteam</pre>
</blockquote>
It works, but it’s not very elegant. Every value in the XML fragment has to be extracted through the <code>xs:integer()</code> function, which is added logic and makes the code less readable. Add to that, IDEs like <a href="http://www.oxygenxml.com/" target="_blank">Oxygen</a> that enable code completion (and code hinting) don’t work with this approach. <br />
<br />
What does work well (at least in Oxygen, and I suspect in other XML/XQuery IDEs) is code completion for variables and functions, which led me to another idea. Prior to Java 5, there were no enum structures. Instead, enumerated constants were created through the declaration of constants encapsulated in a class:<br />
<blockquote>
<pre>public class DraftStatus {
public static final int ALL = 0;
public static final int UNSIGNED = 1;
public static final int UNSIGNED_MLB = 2;
public static final int SIGNED = 3;
public static final int SIGNED_NO_MLB = 4;
public static final int SIGNED_MLB = 5;
public static final int SIGNED_MLB_SAME_TEAM = 6;
public static final int SIGNED_MLB_DIFF_TEAM = 7;
}</pre>
</blockquote>
This allowed static access to the constant values via the class, e.g., <code>DraftStatus.SIGNED_MLB_SAME_TEAM</code>.<br />
The same principle can be applied to XQuery. Although there isn’t the notion of object encapsulation by class, we do have <em>encapsulation by namespace</em>. Likewise, XQuery supports code modularity by allowing little bits of XQuery to be stored in individual files, much like .java files. To access class members, you (almost always) have to import the class into the current class. The same is true in XQuery. You can import various modules into a current module by declaring the referenced module’s namespace and location. <br />
Using this approach, we get the following:<br />
<strong><span style="font-size: small;"><br /></span></strong><br />
<strong><span style="font-size: small;">mlbdrafts-draftstates.xqy</span></strong><br />
<blockquote>
<pre>xquery version "1.0";
module namespace ds="http://ghotibeaun.com/mlb/draftstates";
declare variable $ds:ALL as xs:integer := 0;
declare variable $ds:UNSIGNED as xs:integer := 1;
declare variable $ds:UNSIGNED_MLB as xs:integer := 2;
declare variable $ds:SIGNED as xs:integer := 3;
declare variable $ds:SIGNED_NO_MLB as xs:integer := 4;
declare variable $ds:SIGNED_MLB as xs:integer := 5;
declare variable $ds:SIGNED_MLB_SAME_TEAM as xs:integer := 6;
declare variable $ds:SIGNED_MLB_DIFF_TEAM as xs:integer := 7;</pre>
</blockquote>
Now we reference this in another module:<br />
<blockquote>
<pre>import module namespace ds="http://ghotibeaun.com/mlb/draftstates" at "mlbdrafts-draftstates.xqy";</pre>
</blockquote>
Which gives us direct access to all the members, like an enumeration:<br />
<a href="http://lh3.ggpht.com/-m5eBDdPkAQY/UA7df6mlPAI/AAAAAAAAAqk/G-CCvcnw_Mw/s1600-h/xqueryconstants-autocomplete%25255B5%25255D.png"><img alt="xqueryconstants-autocomplete" border="0" height="170" src="http://lh4.ggpht.com/-LHDQqbL2lO0/UA7dgbAIJqI/AAAAAAAAAqs/9O1dSRhuH4E/xqueryconstants-autocomplete_thumb%25255B3%25255D.png?imgmax=800" style="background-image: none; border-bottom-width: 0px; border-left-width: 0px; border-right-width: 0px; border-top-width: 0px; display: inline; padding-left: 0px; padding-right: 0px; padding-top: 0px;" title="xqueryconstants-autocomplete" width="545" /></a><br />
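As a minimal sketch (reusing the function and document structure from the earlier examples), the lookup function can now compare against the constants directly, and callers can pass them by name without any <code>xs:integer()</code> casting:<br />
<blockquote>
<pre>import module namespace ds="http://ghotibeaun.com/mlb/draftstates" at "mlbdrafts-draftstates.xqy";

declare function gb:getDraftPicksByState($draftstate as xs:integer, $team as xs:string) as item()* {
    if ($draftstate = $ds:SIGNED_MLB_SAME_TEAM) then
        /drafts/pick[Signed="Yes"][G != 0][Debut_Team=$team]
    (: more cases... :)
    else ()
};

(: call the function with the constant -- no casting needed :)
let $sameteam := gb:getDraftPicksByState($ds:SIGNED_MLB_SAME_TEAM, "Rockies")
return $sameteam</pre>
</blockquote>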
The bottom line is that this approach has worked really well for me. I can use descriptive constant names that map to specific values throughout my code, and it shows how you can add a little rigor to your XQuery coding.
A First Look at ODRL v2 (2012-01-17)
With other things taking high priority over the last 6 months, this is the first opportunity I’ve had to look at the progression of ODRL Version 2.0 and to evaluate where it has improved from the earlier versions. <br />
<br />
First things first, ODRL has migrated to the W3C as a <a href="http://www.w3.org/community/odrl/" target="_blank">Community Working Group</a>. Overall, this is a good thing. It opens it up to the wider W3C community, gives greater credence to the effort and more importantly, more exposure. Well done. <br />
<br />
On to my first impressions: <br />
<br />
<strong>1. The model has been greatly simplified.</strong> With ODRL 1.x, it was possible to express the same rights statement in several different ways. The obvious implication was that it was virtually impossible to build a generalized API for processing IP Rights, save running <a href="http://jaxb.java.net/jaxb20-ea/docs/xjc.html" target="_blank">XJC</a> on the schema, which isn't necessarily always what I want. It wasn’t all bad news, though: the 1.x extension model was extremely flexible and enabled the model to support additional business-specific rights logic. <br />
<br />
<strong>2. Flexible Semantic Model.</strong> The 2.0 model has a strong RDF-like flavor to it. Essentially, all of the entities, assets, parties (e.g., rightsholders, licensees), permissions, prohibitions, and constraints are principally URI-based resource pointers that imply semantics for each of the entities. Compared to 1.x, this is a vast improvement over its tag-based semantics, which meant that you were invariably extending either the ODRL content model, the data dictionary, or both.<br />
<br />
<strong>3. Needs More Extensibility. </strong> The <a href="http://www.w3.org/community/odrl/two/xml/#appendix-a" target="_blank">current normative schema</a>, still in draft, does need some additional design. Out-of-the-box testing with Oxygen shows that only one element is exposed (<em>policy</em>). All of the other element definitions are embedded within the complexType models, which makes it difficult to extend the model with additional structural semantics. This is extremely important on a number of fronts:<br />
<ul>
<li>The current model exposes assets as explicit members of a permission or prohibition. Each “term” (i.e., permission or prohibition) is defined by an explicit action (print, modify, sell, display). It’s not uncommon to have a policy that covers dozens or hundreds of assets. So for each <em>term</em>, I have to explicitly call out each asset. This seems a little redundant. The 1.x model had the notion of <em>terms</em> that applied to all declared assets at the beginning of the policy (or in the 1.x semantics, <em>rights</em>). I’d like to see this brought back into the 2.0 model. </li>
<li>The constraint model is too flat. The new model is effectively a tuple of: <em>constraintName, operator, operand.</em> This works well for simple constraints like the following pseudo-code: “print”, “less than”, “20000”, but doesn’t work well for situations where exceptions may occur (e.g., I have exclusive rights to use the asset in the United States until 2014, except in the UK; or I have worldwide rights to use the asset in print, except for North Korea, and the Middle East). Instead, I have to declare the same constraint twice: once within a <em>permission,</em> and a second time as a <em>prohibition.</em> I’d like the option to extend the constraint model to enable more complex expressions like the ones above. <br /><br />Additionally, list values within constraints are expressed as tokenized strings within the <em>rightOperand</em> attribute. While it is completely valid to store values this way, I have a nit against these types of token lists, especially if the set of values is particularly long, as it can be for things like countries using ISO 3166 codes. </li>
</ul>
I shouldn’t have to extend the whole complexType declaration in order to extend the model with my own semantics. However, the current schema is structured that way. Instead, I’d like to see each entity type exposed as an “abstract” element, bound to a type, which ensures that my extension elements would have to at least conform to the base model. <br />
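As a rough sketch of what I have in mind (the element names, type names, and namespace prefixes below are illustrative, not taken from the draft schema), the base entities would be exposed as abstract, typed elements that an extension vocabulary could substitute:<br />
<pre><!-- base vocabulary: an abstract, globally declared constraint bound to its type -->
<xs:element name="constraint" type="o:ConstraintType" abstract="true"/>

<!-- extension vocabulary: a substitutable element whose type must derive from the base -->
<xs:element name="regionConstraint" type="ex:RegionConstraintType"
            substitutionGroup="o:constraint"/>

<xs:complexType name="RegionConstraintType">
  <xs:complexContent>
    <xs:extension base="o:ConstraintType">
      <xs:sequence>
        <!-- room for richer expressions, e.g., exceptions to a worldwide grant -->
        <xs:element name="exception" type="xs:anyURI" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType></pre>
With that pattern, extensions have to conform to the base model, and the schema itself (rather than convention) enforces it.<br />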
<br />
<strong>Takeaways</strong><br />
<strong><br /></strong><br />
I’m looking forward to using this with our Rights Management platform. The model is simple and clean and has a robust semantics strategy modeled on an RDF-like approach. This will make it easier to use the out-of-the-box model. That said, it’s missing some key structures that would make it easier to use and extend if I have to, but that can be addressed with a few modifications to the schema. (I have taken a stab at refactoring to test this theory – it’s pretty clean and I’m able to add my “wish list” extensions with very little effort.)<br />
<br />
<b>Link: </b><a href="http://dl.dropbox.com/u/29013483/odrl-v2-proposed.xsd">http://dl.dropbox.com/u/29013483/odrl-v2-proposed.xsd</a>
Parallels between Punk and Anonymous (2011-12-31)
<strong>Prologue:</strong> Before starting my career in the tech world 15+ years ago, I was a graduate student in Sociology studying political movements and economies. <br />
<br />
At any rate, what’s intriguing about technology is not only the 0s and 1s, data structures, angle brackets, optimized queries or distributed architectures (don’t get me wrong, I love elegant code and design as much as any other geek) – it’s also the intended and unintended consequences it has on society at large. As the automobile and large manufacturing re-shaped our society a hundred years ago, the internet and all of the emerging technologies are transforming our social interactions today. <br />
<br />
<hr />
2011 was a landmark year. We saw “Arab Spring” unfold before us in large part because of mobile devices and social media (granted, the other necessary ingredients – anger, resentment, disenfranchisement, chronic poverty and unemployment – have been brewing for many years). The “Occupy” movement harnessed the same political, social, economic, and technological ingredients along with a sprinkling of hyper-aggressive tactics of the NYPD and transformed a seemingly innocuous protest into a worldwide meme. WikiLeaks, rightly or not, also changed the way we view government, particularly when sensitive or embarrassing information is exposed. And to that end, this year demonstrated that the combination of mobile and social technology meant that information could spread virally, beyond the full control of any one entity. This has spurred new tensions between individuals who interact with data and entities who provide and/or control the data.<br />
<br />
Against this backdrop, I see many interesting parallels between the Punk subculture of the 1970s and early 1980s and the nascent subculture of Anonymous that is growing today. Both emerged during periods of economic turmoil, and both have a strong anti-authoritarian sentiment and a willingness to challenge the current establishment. <br />
<br />
I love the Sex Pistols (and the Smiths, the Cure, The Damned, Siouxsie and the Banshees, and so on, and on, etc.). I can listen to “Anarchy in the UK”, “God Save the Queen”, or “Pretty Vacant” any time. It’s loud and raucous. It’s fun. It’s… well, rebellious. Johnny Rotten’s menacing, sarcastic vocals epitomized the political, social and philosophical undertones of the Punk subculture of the mid-to-late 1970s. <br />
<br />
From many accounts, the Punk subculture, particularly in the UK, emerged during the mid-1970s in part because of the poor economy. Disenfranchised youths with few economic prospects gravitated to a style of music and dress that was non-conformist by nature and expressed their anger and frustration against society and government. <br />
<br />
The ethos, or ideology of Punk is well described here (source: <a href="http://www.bunnysneezes.net/page192.html">http://www.bunnysneezes.net/page192.html</a>):<br />
<blockquote>
It is passionate, preferring to encounter hostility rather than complacent indifference; working class in style and attitude if not in actual socio-economic background; defiant, unconventional, bizarre, shocking; starkly realistic, anti- euphemism, anti-hypocrisy, anti-bullshit, anti-escapist, happy to rub people's noses in realities they don't wish to acknowledge; angry, aggressive, confrontational, tough, willing to fight — yet this stance is derived from an underlying vulnerability, for the archetypal Punk is young, small, poor, and powerless, and he knows it very well; sceptical, especially of authority, romance, business, school, the mass media, promises, and the future; socially critical, politically aware, pro-outlaw, anarchistic, anti-military; expressive of feelings which polite society would censor out; anti-heroic, anti-"rock star" ("Every musician a fan and every fan in a band!"); disdainful of respectability and careerism; night-oriented; with a strong, ironic, satirical (often self-satirical), put-on-loving sense of humor, which is its saving grace; stressing intelligent thinking and deriding stupidity; frankly sexual, frequently obscene; apparently devoted to machismo, yet welcoming "tough" females as equals (and female Punks are often as defiant of the males as of anyone else) and welcoming bisexuals, gays, and sexual experimentation generally; hostile to established religions but sometimes deeply spiritual; disorganized and spontaneous, but highly energetic; above all, it is honest.</blockquote>
Compare this to the first two parts of Quinn Norton’s (Wired Magazine) well-done analysis of Anonymous in “Anonymous: Beyond the Mask” (Part 1 here: <a href="http://www.wired.com/threatlevel/2011/11/anonymous-101/all/1">http://www.wired.com/threatlevel/2011/11/anonymous-101/all/1</a>; Part 2 here: <a href="http://www.wired.com/threatlevel/2011/12/anonymous-101-part-deux/">http://www.wired.com/threatlevel/2011/12/anonymous-101-part-deux/</a>). One of the first things this series does incredibly well is to identify Anonymous for what it is – a culture, or more accurately, a counter-culture. <br />
<br />
As with Punk, Quinn goes on to describe the Anonymous culture:<br />
<blockquote>
The birthplace of Anonymous is a website called <a href="http://www.4chan.org/">4chan</a> founded in 2003, that developed an “anything goes” random section known as the /b/ board. <br />
…<br />
Like <a href="http://en.wikipedia.org/wiki/V_(comics)">Alan Moore’s character V</a> who inspired Anonymous to adopt the Guy Fawkes mask as an icon and fashion item, you’re never quite sure if Anonymous is the hero or antihero. The trickster is attracted to change and the need for change, and that’s where Anonymous goes. But they are not your personal army – that’s Rule 44 – yes, <a href="http://knowyourmeme.com/memes/rules-of-the-internet#.Trhw3PSXuso">there are rules</a>. And when they do something, it never goes quite as planned. The internet has no neat endings.</blockquote>
What’s more, both are media savvy in their own ways, leveraging them for their own purpose. Obviously, in the ‘70s and ‘80s, the internet wasn’t even a twinkle in our eyes yet, so they relied on <a href="http://en.wikipedia.org/wiki/Punk_zine">print</a> and radio (typically either on small, low-band college stations or on pirate radio stations since mainstream radio stations wouldn’t give them airplay) to get their message out. Anonymous, however, have the luxury of the internet and search engines, where information is easily accessible and available: <br />
<blockquote>
But to be historical, let’s start with 4chan.org, a wildly popular board for sharing images and talking about them, and in particular, 4chan’s <a href="http://boards.4chan.org/b/">/b/ board</a> (Really, really, NSFW). /b/ is a web forum where posts have no author names and there are no archives and it’s explicitly about anything at all. This technological format meeting with the internet in the early 21st Century gave birth to Anonymous, and it remains the mother’s teat from which Anonymous sucks. (Rule 22) </blockquote>
Both follow its own rules, many of which run counter to conventionally accepted protocols, and frequently meant to shock, ridicule and otherwise laugh at mainstream society. <br />
<blockquote>
/b/ is the id of the internet, the collective unconscious’s version of the place from which the base drives arise. There is no sophistication in the slurs, sexuality, and destruction in the savage landscape of /b/ — it is the natural state of networked man. </blockquote>
<blockquote>
In this, it has a kind of innocence and purity. Terms like ‘nigger’ and ‘faggot’ are common, but not there because of racism and bigotry – though racism and bigotry are easily found there. Their use is there to keep you out. These words are heads on pikes warning you that further in it gets much worse, and it does. </blockquote>
<blockquote>
Nearly any human appetite is acceptable, nearly any flaw exploited, and probably photographed with a time stamp. But /b/ reminds us that the id is the seat of creative energy. Much of it, hell even most of it, is harmless or even sweet. People reach out for help on /b/, and they find encouragement and advice. The id and /b/ are the foxholes of those who feel powerless and disenfranchised.</blockquote>
And like Punk, Anonymous never intended to be overtly political. Rather, the circumstances and events of the time instigated it. “The Guns of Brixton”, written by The Clash, which presaged the <a href="http://en.wikipedia.org/wiki/1981_Brixton_riot">1981 Brixton Riots</a>, is one of many examples. For Anonymous, its forays into political protest were spurred on by the collective belief that Julian Assange and WikiLeaks were wrongfully targeted by governments and large, multinational corporations, and that fellow “compatriots” at the BitTorrent site The Pirate Bay were wrongfully attacked. In all cases, the common thread was a belief of suppression by the establishment. <br />
<br />
Where they differ, however, is in their means of expression. Punk is analog. It could only reach those in proximity to a radio signal (or the <a href="http://www.youtube.com/watch?v=p25SdQEnhHI">occasional TV appearance</a>), a concert venue, or a “zine”. Its effect and impact on society at large could only scale to the number of members it could congregate in any one physical location, which meant that it could remain largely contained and isolated. On the other hand, Anonymous is digital. Its reach is unbounded and its impact on society much more significant. The virtual nature of Anonymous means that they are able to challenge mainstream society more directly with near impunity. With tools like the Low Orbit Ion Cannon for DDoS attacks, and with more talented hacker members able to break into corporate and government servers and steal sensitive information from them, governments and corporations see them as a real threat. <br />
<br />
At its essence, the Punk subculture provided its members a means of “flipping off” mainstream culture, through its music, dress, art, literature, and language. Yet, it was easy for mainstream society to ignore early punk youth, since their access to media was relatively limited. Anonymous shares this same “f--- you” attitude along with the same antipathy toward authority, yet they have the means to express their views more dramatically, and with greater reach, particularly because the internet, social media, and mobile devices enable members of Anonymous to be anywhere, or anyone. <br />
<br />
Punk has evolved over the decades. The music has changed; the aesthetics are different, and to some extent, what was considered shocking then is widely accepted now. Yet, the <em>idea</em> of Punk is still here. Anonymous is just the latest manifestation of it, and it could potentially have even greater impact on society-at-large.
SOPA Will Be Our Generation’s McCarthy Witch Hunt (2011-12-14)
<p>In the late 1940s and early 1950s, Joseph McCarthy was determined to root out communism during the <a href="http://en.wikipedia.org/wiki/Red_Scare">Red Scare</a> by accusing numerous Americans of treason and of being communists. The era saw many actors <a href="http://en.wikipedia.org/wiki/Hollywood_blacklist">blacklisted</a>, and produced the now infamous question to the “Hollywood Ten” from the House Committee on Un-American Activities – “Are you now or have you ever been a member of the Communist Party?” They exercised their 5th Amendment rights and refused to answer the question, principally because they felt their First Amendment rights were being impinged.</p> <p>In its current form, the “Stop Online Piracy Act” (SOPA) would allow the Department of Justice and copyright holders to seek injunctions against websites that are accused of enabling, facilitating or engaging in copyright infringement. It doesn’t stop there: It would force search engines to remove all indexes for that site, mandate that ISPs block access to the site, and bar 3rd party sites like PayPal from engaging or transacting with the offending website. All because the copyright holder (or DOJ) makes an accusation. The burden of proof is on the ISPs, the search engines and the 3rd party vendors to show that the “offending website” is not violating any copyright (so perhaps Congress should consult the <a href="http://en.wikipedia.org/wiki/Sixth_Amendment_to_the_United_States_Constitution">6th Amendment</a>). The implications are severe even for websites that reference these infringing sites. They could be shut down too. </p> <p>Let’s be clear: I’m not condoning piracy of any kind. Intellectual Property vis-à-vis copyright is the coin of the realm for many companies, even whole industries like Publishing, Media, Software, and yes, the Entertainment world, and they should protect their assets. They should derive value and profit from their IP. An author who pours their heart into a publication, or an artist whose performance I like, should be paid. Likewise, content producers – studios, publishers, media companies – should be able to garner payment for their role in providing content. But they are looking at the whole piracy issue the wrong way. </p> <p>Brute-force tactics to protect copyright have been epic failures. DRM approaches don’t work. In fact, they <a href="http://gigaom.com/2011/12/14/what-louis-ck-knows-that-most-media-companies-dont/">incite piracy</a>, and worse, they <a href="http://www.antipope.org/charlie/blog-static/2011/11/cutting-their-own-throats.html">harm the very companies they try to protect</a>. In 2007, Radiohead released their album <a href="http://boingboing.net/2007/10/09/radioheads-new-downl.html">“In Rainbows” DRM-free</a>. A year later, <a href="http://weblogs.variety.com/thesetlist/2008/10/radioheads-publ.html">they had sold over 1.75 <em>million</em> copies and 1.2 <em>million</em> fans would buy tickets to their show</a>. Bottom line: Locking down content doesn’t protect copyright holders. 
Instead, DRM tactics will end up frustrating consumers who legally purchase content but can’t use it or copy it to a new device and, as a result, <em>diminish revenue. </em>And at that point, the <a href="http://en.wikipedia.org/wiki/Opportunity_cost">opportunity cost</a> of future purchases with the same DRM constraints will grow higher and higher. Media, publishing and entertainment executives know that DRM has failed, and feel that their only recourse is through SOPA.</p> <p>There will always be a small percentage of consumers who will use pirated content. But it needn’t be a negative sum game. In some cases, it should be written off as a business cost in order to generate more revenue: a pirated song might lead the offending consumer to purchase a ticket to a concert, or to the next movie because they can’t wait. Yet, to prevent wholesale piracy, technology exists today that can protect copyrighted content: XMP (even <a href="http://odrl.net">ODRL</a> can be serialized into XMP), digital fingerprinting for starters. By using these, along with other tools that can scan the internet for matching assets, asset producers can identify and isolate pirated copies. Then they can go after the offending sites directly. </p> <p>SOPA won’t stop piracy, but it will impact everyone’s access on the Internet. And in that vein, SOPA legitimizes the piracy of 1st Amendment rights, much in the same way that McCarthyism censored free thought in the 1950s, simply by accusation of copyright infringement. </p> <p><strong>NOTE: </strong>The views expressed in this post and on this blog are my own. They do not reflect the views of my employer, its employees or its partners. </p>
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-12907100490993887172011-11-21T17:01:00.001-07:002011-11-21T18:22:35.192-07:00Note to Fanboys: Don’t Hate the Player, Hate the Game…This is a bit of a rant. I get tired of hearing and reading fanboy comments that go along the lines of: “X rules, Y[,Z] drools…”, “You’re just a hater…” Blah. Blah. Blah. It’s like listening to reverb on a PA system.<br /><br />
My irritation stems from an article I read recently about the potential repercussions of <a href="http://www.infoworld.com/d/application-development/will-adobes-html5-strategy-really-help-developers-178568">Adobe’s move to stop development of Flash for mobile devices</a>. The article, in my opinion, was well balanced and made the point that while Flash is on the decline, there’s plenty of room for Adobe to maneuver and claim a stake in the RIA/HTML5 world. What struck me, though, were the comments. Several of them were antagonistic, claiming the author was biased against Flash.<br /><br />
The comments also struck a chord with me in that I recently ran into a buzzsaw-like argument with a client over implementing and deploying a NoSQL data solution versus trying to do the same thing in one of the big RDBMSs. Their position was that there wasn’t anything the intended NoSQL system did that their current RDBMS couldn’t do. Sure, their system could do those things, but it didn’t do the specific kinds of things they wanted from the NoSQL system nearly as well. In fact, some of those things were bolted on with the technical equivalent of baling wire and duct tape, and in the long run cost them more in overhead and maintenance. <br /><br />
After the debate, I took some time to reflect on their argument. The underlying theme that occurred to me was this: they <em>understood</em> RDBMS; they <em>didn’t understand</em> the NoSQL system we recommended they implement. Bottom line: Go with what you know. <br /><br />
Yet I’ve seen this kind of resistance to various technologies throughout my career. I’ve seen the esoteric debates between the DocBook and DITA content models and architecture, the religious orthodoxy of Windows vs. Linux vs. Mac, and more recently, the pissing contests of iOS vs. Android. The main contention between camps always seems to boil down to “mine is bigger/better/faster/cooler than yours.” My 5-year-old twins do it better than anyone, but to hear it from grown-up professionals is like listening to a murder of cackling crows. <br /><br />
If we’re intellectually honest, all of these arguments/dogmatic disputes boil down to the same time-tested axiom: all of us will tend to gravitate to tools/technologies/practices that we’re familiar with, understand, are (reasonably) good at, scratch a particular (set of) itch(es), or just think are cool. Any variance from these, or any suggestion that something is better/faster/cooler than what exists in our comfort zone, warrants unabashed trolling, simply because it doesn’t fit within our particular paradigm. <br /><br />
Tools and technology are applied to solve a specific set of problems, under a specific, finite set of assumptions. Don’t like the “evil empire” Microsoft, but appreciate commodity hardware? Here comes Linux. Like beautiful form and closed but controlled functionality? Mac seems a good fit. Need structured data without a lot of noise? JSON might be a good fit; if your data is rich and deeply structured, XML is game for it. Want a single seamless experience for your smartphone? iPhone. Want an open-source mobile platform with many choices of devices? Android. <br /><br />
The point is this: when a problem veers away from these binding assumptions, or new assumptions are introduced, either the tool or technology must be modified/enhanced to fit these assumptions, or other technologies will be built to replace it. <br /><br />
I’m not entrenched in the idea that “<a href="http://drmacros-xml-rants.blogspot.com/2006/02/all-tools-suck.html">all tools suck</a>, some worse than others”. Every tool and technology has limitations - we need look no further than Joel Spolsky’s seminal work, “<a href="http://www.joelonsoftware.com/articles/LeakyAbstractions.html">The Law of Leaky Abstractions</a>.” For instance, we rely heavily on virtualized environments for our development work. They work great for most Linux and Windows environments, but you’re out of luck for Macs. Does that mean Macs suck? For working in the virtualized environment we have, it’s a buzzkill; but overall, no. We also do a lot of work with XML standards like DITA and DocBook. DITA’s great for its flexibility and reusability, but DocBook still has a place too, especially for longer content components where minimalism is not applicable.<br /><br />
But now we can begin to boil tools and technology down to their real “suck factor”: <br /><br />
In the grand scheme, the evolution of technology plays out very much as <a href="http://en.wikipedia.org/wiki/The_Structure_of_Scientific_Revolutions">Thomas Kuhn’s seminal work</a> describes. In many cases, it doesn’t build on older work; rather, there is a creative destruction and replacement with new technology. During that process, there is a polarization between the two technical/philosophical camps. Eventually, as the new technology attains enough momentum through adoption, the older technology recedes (perhaps not into complete obscurity; sometimes it lives on as a small, niche player).<br /><br />
As mentioned above, all tools and technology are constrained by the underlying assumptions they were built on, and within the bounding box of a specific problem set. <em>Assumptions are rarely ever static</em> – they evolve over time, and when they do, the underlying premise on which a particular tool or technology is built will start to falter. <br /><br />
For example, Flash works pretty damn well on my laptop with Firefox or Chrome – it works reasonably well on my Android phone, even though it does eat up my battery. Flash basically did things that HTML + Javascript could never do (well). Along comes HTML5: the underlying assumptions are changing, and specifications are being built into the standard that will make it possible to create rich internet applications natively (<a href="http://jims-thoughtspot.blogspot.com/2011/07/html5-well-maybe.html">though not right away</a>). <br /><br />
Additionally, smart mobile devices are increasingly becoming users’ primary access to the internet, meaning that lightweight, small-footprint applications are incredibly important. Combine these with a sprinkle of animosity/frustration/angst/whatever from Steve Jobs and Apple, and the foundations on which your technology is built will inevitably weaken.<br /><br />
Throw in some market forces, and what you think is the greatest thing since Gutenberg’s press turns out to be yet another Edsel on the trash heap of “other great ideas”. Case in point: we can argue ‘til the cows come home that BetaMax was far superior to VHS, but that and a couple of dollars will buy you a cup of coffee. <br /><br />
So now that I’ve gone on a somewhat random dissertation of my original rant, I’ll leave any fanboys with the key message: <a href="http://www.youtube.com/watch?v=fcIH5DQY6U8">Don’t hate the player, hate the game</a>. Technology comes and goes. Assumptions change constantly. Try to keep an open mind and recognize when you’re falling into the familiarity trap. Improvise and adapt, or you’ll be left behind like yesterday’s news.<br /><br />
<strong>Full Disclosure</strong><br />
In full disclosure, and keeping with the theme of intellectual honesty:<br /><br />
I own an Android phone, because my carrier didn’t support iPhone at the time. I like my Android and continue to go with what I know, and like that it’s built on open source software. I think the latest generation of iPhones with Siri are pretty amazing though.<br /><br />
I’ve used several Linux variants throughout my career, but do most of my work on Windows because that’s what’s on my laptop, and it works well with the tools I use everyday. My last real experience with Mac was back in 1997-1998 when I was in grad school. So I won’t claim any real knowledge here. <br /><br />
I use Eclipse plus numerous plugins for Java development, Microsoft Visual Studio for .NET development (though SharpDevelop is pretty cool too!), and Oxygen for XML development. I prefer Notepad++ over TextPad, and I like Chrome over Firefox and use IE only when I have to. <br /><br />
I use JSON when I’m working with jQuery, Dojo or YUI, and I use XML for structured authoring and when I work with XML databases, XSLT, and XQuery and for things like Rights Management. I like Flex for building UIs quickly for prototypes (hey, demos are in controlled environments, right? :), but recognize its limitations when it comes to device support and will consider my options carefully in a production environment.<br /><br />
I like REST over SOAP over other RPC protocols. RESTEasy rocks for simple apps; Spring for bigger implementations. <a href="http://guide.couchdb.org/draft/consistency.html">Eventual Consistency</a> is in; <a href="http://en.wikipedia.org/wiki/ACID">ACID</a> is out.<br /><br />
I still think HTML5 is a work in progress and needs maturity among the “Big Three” browsers (Firefox, IE, and Chrome/Safari – OK, that’s four, but I lump Chrome and Safari together for their use of WebKit), and I think Flash is still a few years from replacement. While it’s still very early, I’m eager to see if Google Dart has legs and can displace Javascript (I’m not a big fan of debugging others’ JS code when it comes to determining data types or scope). <br /><br />
I’m still trying to grok my way through XProc pipelines and tend to use XSLT 2.0 in somewhat creative ways that it wasn’t intended for, and use Ant for processing pipelines even though I know that it is IO-bound. <br /><br />
And finally, I’m truly into Spanish Riojas right now, and only drink Merlots or Cabernets when I have to :)
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com1tag:blogger.com,1999:blog-5052457920976107820.post-83449743966649584992011-07-23T16:41:00.001-06:002011-07-23T17:10:23.289-06:00HTML5: Well, Maybe.<p>I just finished reading an <a href="http://www.businessinsider.com/roger-mcnamee-video-2011-7?op=1" target="_blank">article</a> about Roger McNamee’s bold predictions about social media. Aside from some of the interesting business predictions (e.g., don’t invest in new social media startups: that train has left the station), which I mostly agree with, he strongly emphasizes the emergence of HTML5 as the technology that will drive application development in the future. On this point, I’m not ready to throw my FlexBuilder, Visual Studio, Eclipse and Android SDK development environments in the dust bin just yet. Forget about scrapping my Notepad++ or Oxygen environments; those are keepers for the long, long term.</p> <p>Yeah, HTML5 definitely has much promise: Canvas alone is just cool. I’ve seen some really interesting things done with this, and it can only get better from there. Yet one cool enhancement for a browser isn’t enough to keep my attention long term, nor is it a game-changer that will revolutionize how users interact with application interfaces. </p> <p>So what kinds of things will keep me saying, “you had me at hello”? The big deal for me is looking at the world through the publishing industry’s collective eye: Many of the big publishers are in the midst of what can be considered a paradigmatic shift – while print will still be a prominent part of their business model, it won’t be the dominant model. This is a significant change. Publishers will transform themselves from <em>content designers</em> to <em>media conduits</em>. </p> <p>OK, so what will HTML5 need to have to be compelling for publishers to adopt it? I see three things, all of which are requirements for the browser vendors to reconcile:</p> <ol> <li>Media Codec Standardization <li>Support other key technical standards (EPUB, MathML, etc.) <li>Form-factor scaling</li></ol> <p><strong>Media Codec Standardization</strong></p> <p>Right now, there is a myriad of audio and video standards like H.264, Ogg (Theora for video; Vorbis for audio), MP3, Speex, AAC, WAV, and so on and on and on. The problem is that none of the current browsers support a common set of these, and even when they do support them, their support varies. Until they figure that out, HTML5 will not be able to leverage its full capability and publishers will be reluctant to adopt it.</p> <p>References: <a href="http://en.wikipedia.org/wiki/Comparison_of_layout_engines_(HTML5_Media)">http://en.wikipedia.org/wiki/Comparison_of_layout_engines_(HTML5_Media)</a></p> <p><strong>Native Support for other Standards</strong></p> <p>OK, this one is a big, huge stretch and probably not going to happen anytime really soon. Well, OK. Ever. That said, these are the types of challenges that publishers have to face currently as well as going forward. EPUB is the smallest stretch, only because it leverages HTML (and ZIP compression) anyway, but the capability to embed EPUB in an HTML container would be a big win. 
Yet for technical publishers, i.e., engineering, science and math publishers, there hasn’t been a good solution for displaying all manner of math equations in browsers – they’ve had to rely on either transforming the equation to a raster image (and only recently to vector images like SVG) or relying on plugins to render the equation. More recently, we’ve seen developments like MathJax (<a href="http://mathjax.org">http://mathjax.org</a>) that rely on Javascript libraries to consume LaTeX scripts and display equations. A bit better, but not quite as elegant as leveraging structural markup (see the small MathML fragment at the end of this post). </p> <p>The bottom line is that this requirement is probably more of a “nice to have,” but for STM publishers, it’s key to their business. </p> <p><strong>Form Factor</strong></p> <p>This is probably the most significant limitation today. It would be one thing if all applications/browsers were bound to desktops and laptops. The reality is that mobile devices come in many dimensions, ranging from relatively small smartphones to tablets, which means that application interfaces face added challenges to support these different form factors. Today, I would be hard-pressed to recommend HTML5 UI libraries over native mobile OS UI controls. </p> <p><strong>The Future</strong></p> <p>Will HTML5 become the preeminent technology platform? My magic 8-ball on my smart phone says “Ask again later…” That resonates with me. I’m hopeful that HTML5 can live up to the promise and can become the common technology platform for all applications. But right now, there are just too many holes in the various browser engines to make it practical. Don’t expect browser vendors to patch these holes quickly. In the meantime, several factors will impede HTML5 adoption: Flash, warts and all, is still largely ubiquitous. Its influence is slowly diminishing, but it won’t go away anytime soon. In addition, Javascript libraries like JQuery, YUI and Dojo are maturing, but I think we’ll need to see how they shake out over time. I’ll defer to Javascript experts to tell me which of these will become integral for HTML5 applications.</p> <p>Lastly, HTML5 won’t be promoted to true standard status for <a href="http://www.techrepublic.com/blog/programming-and-development/html-5-editor-ian-hickson-discusses-features-pain-points-adoption-rate-and-more/718" target="_blank">another 10 – 11 years</a>. This is a lifetime, almost an epoch, for technology. Lots can happen in that time. It’s hard to predict right now what emerging technologies will come along that will impact content and media, but chances are something will.</p> <p><strong>Update (7/23/2011 05:08 PM MST):</strong> Even more articles are coming out suggesting HTML5 will be a boom industry (see <a href="http://gigaom.com/2011/07/22/the-html5-boom-is-coming-fast/">http://gigaom.com/2011/07/22/the-html5-boom-is-coming-fast/</a>): Could be real, but could be a bubble. I’m not convinced yet that browsers are up to the task – yet.</p>
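<p>For reference, here is the kind of structural markup at stake. This is a trivial, hand-written MathML presentation fragment for the expression b² − 4ac, embedded directly in the page rather than delivered as an image or a script:</p><pre>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <msup><mi>b</mi><mn>2</mn></msup>
    <mo>-</mo>
    <mn>4</mn><mi>a</mi><mi>c</mi>
  </mrow>
</math>
</pre><p>Browsers that understand the markup can search it, style it, and scale it like any other text; that is exactly what raster images and script-rendered equations give up.</p>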
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-74223530538694293272011-06-23T10:49:00.003-06:002011-06-23T11:27:19.864-06:00IPRM != DRMOver the last year, I've been developing strategies that allow publishers to define and identify IP Rights. The big difference between <q>digital rights management</q> (DRM) and <q>IP rights management</q> (IPRM) is that DRM is about locking down assets to mitigate piracy. IPRM is about identifying and calculating clearance to use assets for any given context, and enabling publishers to make informed decisions about using specific assets.<br /><br /><a href="http://odrl.net/">ODRL</a>, or <q>Open Digital Rights Language</q>, is a well-established, robust, extensible XML markup designed specifically for this purpose. At its core is the ability to define relationships between parties, assets, and permissions (e.g., print, display, execute). But its real power is the ability to express complex permissions that include conditions and constraints. For example, "<i>a licensee can use an asset in a printed book, but the print run is limited to 2,000 copies, and the asset creator must be given proper attribution and will receive two copies of the book prior to its release"</i>, or "<i>the asset can be used in print, except that it can't be distributed in North Korea"</i>.<br /><br />This is powerful, and gives publishers the capability to monitor and evaluate rights clearance while the product is in development. Using an XML database and XQuery, it's relatively trivial to calculate clearances for all assets in a product and to display the information in a dashboard (see the sketch at the end of this post). Editors can monitor the progress of rights clearances against all assets and determine whether to acquire additional rights to use assets that haven't been cleared, or to use other assets instead. Publishers can also track asset usage to ensure that the proper royalties are paid. It also helps publishers in "what if" scenarios: they can easily determine the cost and feasibility of adapting a product for a different market, which will tell them how many of the existing assets are cleared for use in that market and how many remain that either need additional clearance or should be replaced with other assets. <div><br /></div><div>Another scenario we're working on is using ODRL for wholly-owned assets. Publishers frequently commission third parties to produce photos, images, and other rich media to which the publisher retains the rights. They want to reuse these assets for obvious cost savings; however, they don't want to over-expose assets. Frequently, editorial teams are primarily focused on one project or program, and have little insight into what others are doing, so it's quite possible that an image could be used by more than one product at the same time. Not that this is always a bad thing, but it can lead to over-exposure. Using ODRL to manage access to assets, with embargo dates and other usage information, editorial groups can quickly make informed decisions whether to use an asset or look for another. </div><div><br /></div><div>Pretty cool stuff.</div>
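<div><p>To make the dashboard idea concrete, here is a minimal XQuery sketch. It assumes a hypothetical database layout (a product manifest under /db/products and ODRL-style agreements under /db/rights) and uses simplified, invented element names; real ODRL has its own namespaces and a much richer permission model:</p><pre>
xquery version "1.0";

(: Hypothetical clearance check: for each asset in a product,
   report whether at least one rights agreement grants 'print'
   for the target market. Paths and element names are invented
   for illustration only. :)
declare variable $market external;

for $asset in doc('/db/products/widget-guide.xml')//asset
let $grants :=
  collection('/db/rights')//agreement
    [asset/@id = $asset/@id]
    [permission/print]
    [not(constraint/territory[@exclude = $market])]
return
  <clearance asset="{$asset/@id}" cleared="{exists($grants)}"/>
</pre><p>A dashboard then only needs to aggregate the <i>cleared</i> flags per product.</p></div>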
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com3tag:blogger.com,1999:blog-5052457920976107820.post-47042944390563174792011-06-23T09:03:00.002-06:002011-06-23T09:09:38.934-06:00Using DITA for Genealogical Data<p>I’ve been working on putting my family history together for the last few years. Most of the genealogical applications have some pretty nice features, but none seemed to have all of the features I wanted. I wanted the ability to manage all of the summary information and relationships (all applications do this), but also to cross-reference the factual data with individual biographies. And I wanted to be able to display the information in different ways and formats – not just the ones supported by any particular application.</p> <p>I started looking at the format most genealogy programs store their data in. With few exceptions, they all use <a href="http://homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gctoc.htm" target="_blank">GEDCOM</a>, or <strong>GE</strong>nealogical <strong>D</strong>ata <strong>COM</strong>munication. The standard was developed by the Church of Jesus Christ of Latter-day Saints as a means of creating a portable data format to express information about individuals, families and sources (bibliography). </p> <p>GEDCOM is a line-delimited field format that identifies the start of a new record with the number 0. Fields within a record are identified with an incremented number. For example, a first-level field line starts with the number 1. A subfield (e.g., the given name within a person’s full name) starts with the next higher number. The following is an example record for an individual:</p> <ol><pre> 0 @1@ INDI<br /> 1 NAME <b> Robert Eugene/Williams/</b><br /> 1 SEX <b> M</b><br /> 1 BIRT<br /> 2 DATE <b> 02 OCT 1822</b><br /> 2 PLAC <b> Weston, Madison, Connecticut</b><br /> 2 SOUR <b> @6@</b><br /> 3 PAGE <b> Sec. 2, p. 45</b><br /> 3 EVEN <b> BIRT </b><br /> 4 ROLE <b> CHIL</b><br /> 1 DEAT<br /> 2 DATE <b> 14 APR 1905</b><br /> 2 PLAC <b> Stamford, Fairfield, CT</b><br /> 1 BURI<br /> 2 PLAC <b> Spring Hill Cem., Stamford, CT</b><br /> 1 RESI<br /> 2 ADDR <b> 73 North Ashley</b><br /> 3 CONT <b> Spencer, Utah UT84991</b><br /> 2 DATE <b> from 1900 to 1905</b><br /> 1 FAMS <b> @4@</b><br /> 1 FAMS <b> @9@</b></pre></ol><br /><p> </p><br /><p>Other than the line/sequencing delimiters, the data structures are pretty free-form and parser-dependent. Even the field names, outside the common set supplied by GEDCOM, are parser-dependent. So if you use one genealogy tool, it can understand these fields, but if you try to load the data in another, it blows up. Gah! Add to that, GEDCOM just isn’t that great for handling rich content like pictures in a biography. </p><br /><p>This sounds like a job for XML. </p><br /><p>So the first question I had to address was how to model this. I’ve looked at some of the GEDCOM XML sites, and they suffer from the same problems the text data structure does. Just not enough rich data. 
</p><br /><p>The answer I came up with was to use DITA, which has several things going for it:</p><br /><ol><br /><li>I can easily mimic GEDCOM’s data structure with a specialized map <br /><li>I can extend the model to support other potentially valuable metadata <br /><li>I can easily model rich biographical content as a topic specialization <br /><li>DITA’s numerous linking mechanisms work well for the various types of links I would need: internal references within a map, rel-tables, cross-references, external hyperlinks to third-party websites and content.</li></ol><br /><p>The first thing I did was to model and create a map specialization that mimics the GEDCOM data. For the sake of brevity, I’ll show a sample of a specialized map. If you want more information, ping me:</p><pre>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE familytree SYSTEM "file:/opt/dita/1.2/dtd1.2/genealogy/dtd/familytree.dtd">
<familytree>
  <title>Schmoe Family Tree</title>

  <individual id="I1" keys="I000001" gender="male">
    <vitals>
      <personname>
        <firstname>Joseph</firstname>
        <firstname type="nickname">Joe</firstname>
        <middlename>Aloysius</middlename>
        <lastname>Schmoe</lastname>
        <generationidentifier>III</generationidentifier>
      </personname>
      <birth>
        <date>
          <day>1</day>
          <month>1</month>
          <year>1968</year>
        </date>
        <location>
          <placename>The Stork Factory</placename>
          <addressdetails>
            <locality>Anytown</locality>
            <administrativearea>Anystate</administrativearea>
            <country>USA</country>
          </addressdetails>
        </location>
      </birth>
    </vitals>
  </individual>

  <individual id="I2" keys="I000002" gender="male">
    <vitals>
      <personname>
        <firstname>John</firstname>
        <firstname type="nickname">Jack</firstname>
        <middlename>Michael</middlename>
        <lastname>Schmoe</lastname>
      </personname>
      <birth>
        <date><day>1</day><month>1</month><year>1948</year></date>
      </birth>
    </vitals>
  </individual>

  <individual id="I000003" keys="I000003" gender="female">
    <vitals>
      <personname>
        <firstname>Jane</firstname>
        <middlename/>
        <lastname type="maidenname">Doe</lastname>
        <lastname type="marriedname">Schmoe</lastname>
      </personname>
      <birth>
        <date>
          <day>1</day>
          <month>1</month>
          <year>1947</year>
        </date>
      </birth>
    </vitals>
  </individual>

  <family id="f1" keys="F1">
    <familymeta>
      <marriage>
        <date>
          <day>1</day>
          <month>1</month>
          <year>1967</year>
        </date>
      </marriage>
    </familymeta>
    <child keyref="I000001"/>
  </family>

  <familyreltable>
    <record>
      <indi><personref keyref="I000001"/></indi>
      <famc><familyref keyref="F1"/></famc>
      <fams/>
    </record>
    <record>
      <indi><personref keyref="I000002"/></indi>
      <famc/>
      <fams><familyref keyref="F1"/></fams>
    </record>
    <record>
      <indi><personref keyref="I000003"/></indi>
      <famc/>
      <fams><familyref keyref="F1"/></fams>
    </record>
  </familyreltable>
</familytree>
</pre>
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-67662447625999031092010-08-15T16:06:00.002-06:002010-08-15T17:49:07.114-06:00The Butterfly Effect of Oracle’s Lawsuit Against Google<p>No one knows with any degree of certainty what the outcome will be from <a href="http://www.infoworld.com/d/the-industry-standard/oracle-sues-google-over-java-use-in-android-852" target="_blank">Oracle’s patent infringement lawsuit against Google</a>. While the impetus of the lawsuit is largely focused on <a href="http://www.dalvikvm.com/" target="_blank">Dalvik</a>, Google’s mobile VM based on Java, the consequences of legal action will likely reverberate through the larger Java world. </p> <p>I don’t have a dog in this fight. Yet. On the one hand, Oracle appears to be hell bent on reversing Sun’s decision to release Java to the open source community, according to an <a href="http://www.infoworld.com/t/languages-and-standards/oracle-launches-scorched-earth-fight-profit-java-875?page=0,1" target="_blank">InfoWorld article</a>:</p> <blockquote> <p>It's no secret that Larry Ellison wants to make money from Java, something Sun's execs, whom Ellison <a href="http://blogs.barrons.com/techtraderdaily/2010/05/13/oracles-ellison-sun-execs-were-astonishingly-bad-managers/">held in contempt</a>, was never able to do. It may be that Oracle wants nothing more than a cut of Google's Android revenue -- IDC's Will Stofega told Bloomberg News that the case will probably end with <a href="http://www.bloomberg.com/news/2010-08-13/oracle-says-google-s-android-operating-system-infringes-its-java-patents.html">Google agreeing to pay to license Oracle's patents</a>. </p></blockquote> <p>On the flip side, it sure looks like <a href="http://perens.com/blog/d/2010/8/13/33/" target="_blank">Google didn't do any favors for itself</a> either. </p> <p>Whatever the case, there are potential <a href="http://www.zdnet.com/blog/open-source/oracle-google-suit-challenges-open-source-establishment/7142?tag=content;feature-roto" target="_blank">far reaching implications of Java in the open source world</a>. For one, it will <a href="http://techcrunch.com/2010/08/07/why-we-need-to-abolish-software-patents/" target="_blank">reinvigorate the debate around software patents</a>. More specific to my interests here, XML was, and is, heavily influenced by Java and open source. Think <a href="http://xerces.apache.org/" target="_blank">Xerces</a> and <a href="http://xalan.apache.org/" target="_blank">Xalan</a>, <a href="http://xalan.apache.org/" target="_blank">Saxon</a>, <a href="http://xmlgraphics.apache.org/fop/" target="_blank">FOP</a>, <a href="http://ant.apache.org" target="_blank">Ant</a>, and more recently <a href="http://xmlcalabash.com/" target="_blank">Calabash</a> and <a href="https://community.emc.com/community/edn/xmltech" target="_blank">Calumet</a>. It also factors into related technologies like the APIs for <a href="http://exist.sourceforge.net/" target="_blank">eXist</a> and <a href="http://www.marklogic.com" target="_blank">MarkLogic</a>, and fuels reference implementations of standards like DITA (see the <a href="http://ditaot.sourceforge.net" target="_blank">DITA Open Toolkit</a>).</p> <p>My immediate reaction is the cat’s already out of the bag: Java is already open source, which means nothing will change for XML technologies. For now. 
In the short term, it’s very likely the only ones affected are Android application developers. Longer term, however, Oracle’s behavior might well impact a wide array of technologies, including XML, by deterring developers from using Java in the first place. That would be a huge detriment to XML across many technologies and industries. Let’s hope it doesn’t get to that point.</p>
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com1tag:blogger.com,1999:blog-5052457920976107820.post-83222829981745666232010-02-27T13:40:00.003-07:002010-02-27T14:42:56.871-07:00Cancer SucksMy grandparents died from it. My mother had it (in remission), but now she needs to have a full mastectomy because a genetic marker indicates she's at higher risk to get it again (she has surgery on March 7). And now, a friend and colleague of mine has it. This guy is healthier than 95% of other men his age, has like zero body fat, and is always encouraging everyone to ride with him (more like follow from a long distance). That's just hard to take in and process. While he's the one that has his world rocked, cancer also has a nasty side effect on family and friends. To put it bluntly: Cancer sucks.<br /><br />Few other spoken words, regardless of language or dialect, can evoke as much emotion as 'cancer'. Even as scary as influenza and other diseases like AIDS are, and as much devastation as they can bring to an individual, a family, a community, and even the world, there's something about cancer that is so viscerally scary to us. For me it's because cancer is, in my non-medical, simpleton view, a mutation of cells. It's not like a virus or an infection where a foreign organism is making you sick, and you can take medication to kill off the nasty invaders - it's your own cells that, for some reason, have started going haywire. That's just downright frightening to me. And that cancer is so seemingly random and unpredictable just makes it that much harder to take - you could be in the best possible physical shape and still get it.<br /><br />Science and medicine have come so far when it comes to curing many forms of cancer. Many people have a good chance of living long, healthy lives if they catch the disease early. But often they have to go down through the depths of hell physically and emotionally to get to the finish line: cancer-free. Yet, they keep their eyes on the prize.<br /><br />I'm pretty sure my friend will beat this. His odds are really high, and his outlook on life and his current illness are positive, which also increases his chances. <br /><br />If nothing else, it's a reminder that life is too short, so live it to its fullest. It's so easy to become burdened with frustration, stress, angst, even hatred. In other words, we lose sight of the forest for the trees. I'm probably the most susceptible to this. <br /><br />In short: Work hard. Play harder. Laugh more. Celebrate with your family and friends. Find something every day that is good. A sunrise or sunset, a laugh with your kids, a hug or a kiss with your spouse or significant other, a walk in the park with your dog. It's your life - own it.
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com1tag:blogger.com,1999:blog-5052457920976107820.post-57657920805500538002010-01-27T07:35:00.002-07:002010-01-27T07:38:21.195-07:00Apple to Release Tablet PC Today: e-Books are One Target Market<p><a href="http://news.cnet.com/8301-31021_3-10440931-260.html?tag=contentMain;contentBody">Apple is releasing their tablet PC today.</a>  According to <a href="http://www.npr.org/templates/story/story.php?storyId=122994968">NPR</a>, they plan to include an e-reader.  As I mentioned in a <a href="http://jims-thoughtspot.blogspot.com/2010/01/tablet-pcs-what-does-this-mean-for-e.html">previous post,</a> I thought that the tablet was the perfect medium for e-books, and would supplant the Kindle, Nook and other current e-book devices.  Only time will tell for sure.</p> <p>The move by Apple is brilliant.  It already has a first-class distribution model with iTunes pushing out applications and music for the iPhone and iPod.  e-Books are just a natural progression.  </p> <p>From an XML publishing standpoint, EPUB is a relatively easy format to render. The DocBook XSLT stylesheets have an EPUB format built in (see the sketch below).  I'm not aware of one for DITA yet, but I can't imagine that it would be too far behind, or difficult to build.</p>
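<p>As a rough illustration of how little is needed to get started, a minimal customization layer over the DocBook XSL stylesheets might look like the following sketch. The import path is an assumption; point it at wherever your docbook-xsl distribution lives:</p><pre>
<?xml version="1.0" encoding="UTF-8"?>
<!-- my-epub.xsl: a hypothetical customization layer over the
     DocBook XSL distribution's EPUB stylesheet -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Pull in the stock EPUB rendering (path is an assumption) -->
  <xsl:import href="/opt/docbook-xsl/epub/docbook.xsl"/>
  <!-- Override stock parameters as needed -->
  <xsl:param name="chunk.section.depth" select="1"/>
  <xsl:param name="html.stylesheet">epub.css</xsl:param>
</xsl:stylesheet>
</pre><p>The stylesheet writes out the package and XHTML content files; you still have to zip them into the .epub container yourself.</p>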
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-476062463241037932010-01-18T21:39:00.001-07:002010-01-18T21:39:38.712-07:00Tablet PCs - What does this mean for e-Readers?<p><a href="http://www.pcworld.com/article/186160/hps_multitouch_tablet_previewed_arrives_later_2010.html">HP previewed their new tablet PC at the 2010 Consumer Electronics Show</a>.  Apple will release their version later this spring.  Sounds to me like this could supplant the Kindle, Nook and other e-Reader devices.  It might also be the impetus for EPUB distribution.  e-Readers are nice, but they're one-trick ponies.  </p> <p>From an education publishing perspective, this could really open the door to new revenue models for schools at all levels to distribute published content.  It also opens the door for new models for authors writing content.  </p> <p>More on these ideas later.</p>
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-54081788054686661932009-11-29T10:30:00.002-07:002009-11-29T11:26:01.924-07:00Fun With XProc<p>I've been so busy with clients over the last 6 months that I haven't had much time to tinker with XProc. I took the Thanksgiving holiday week off with the hope of having a little time to dabble with the language. Up until yesterday, I didn't open my computer once (has to be a new record) since we were busy with other things. A side note: If you're in Denver before February 7th, I highly recommend you see the <a href="http://www.dmns.org/gk/">Genghis Khan</a> exhibit at the Museum of Nature and Science.</p> <p>As I often do, I had already done some preliminary reading beforehand. <a href="http://www.wordsinboxes.com/">James Sulak's blog</a> is a must-read. Another very useful and informative website is from EMC: <a href="https://community.emc.com/docs/DOC-3337">"XProc: Step By Step"</a>, originally authored by Vojtech Toman. Even the <a href="http://www.w3.org/TR/xproc/">W3C specification</a> is generally helpful.</p> <p>The biggest hurdle for me was to stop thinking of XProc working in the same way I think of Ant. While Ant <em>does </em>process XML content, it isn't the tool's principal focus - Ant was principally designed as a Java implementation of make tools. For that purpose, Ant has become the de facto standard. Before XProc was conceived, many of us used Ant as a way to control the sequencing of complex XML publishing pipelines. I worked on XIDI, a DocBook-based publishing system at HP, which was principally based on Ant scripts; the DITA Open Toolkit is an Ant-based build environment. For the most part Ant works admirably, but there are limitations. The biggest limitation is the <strong>xslt </strong>task's static parameter declarations, and the indirect nature by which parameter values are passed to an XSL Transformation through property values. It works, but it can get kludgy pretty fast for complex stylesheets. More importantly, Ant is primarily a developer's tool that acts like a Swiss Army knife that has a tool for just about every purpose. Most of these tools work very well for very specific tasks, but they aren't intended for highly specialized ones. For those, you'll need to create custom Ant tasks or use other specialized tools. XProc is one of these specialized tools that is designed specifically for XML processing.</p> <p>So the biggest conceptual difference to grok in XProc (I like this…) is how <em>steps</em> are connected together to form a complete pipeline process. Rather than using target dependencies and explicit target calls like you do in Ant, XProc uses the concept of <em>pipes</em> to connect the output of one step to the input of the next step. It's very much like Unix shell or DOS command line pipelines. For example:</p> <p><span style="font-family:Courier New;">ps -ax | tee processes.txt | more</span></p> <p>Since many steps (including the <em>p:xslt</em> step) can have more than one input and one or more outputs (think of xsl:result-document in XSLT 2.0), we need to explicitly bind uniquely named output streams to input streams of subsequent steps. 
It's very analogous to plumbing, and another way that XProc is different from Ant: Ant's tasks are very dependent on the file system to process inputs and outputs; XProc pipelines are in-memory input and output streams until you explicitly serialize to the file system. </p> <p>With this I was able to create my first "real" XProc pipeline to generate <a href="http://doxsl.sourceforge.net/">Doxsl</a> output. Here it is:</p><pre>
<p:declare-step name="doxsl" type="dxp:generate-doxsl-docs"
                psvi-required="false"
                xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:dxp="urn:doxsl:xproc-pipeline:1.0">
  <p:input port="source" kind="document" primary="true" sequence="false"/>
  <p:input port="parameters" kind="parameter" primary="false" sequence="true"/>
  <p:output port="result" primary="true" sequence="false">
    <p:pipe step="transform" port="result"/>
  </p:output>
  <p:output port="secondary" primary="false" sequence="true">
    <p:pipe step="transform" port="secondary"/>
  </p:output>
  <p:option name="format" select="'dita'"/>
  <p:choose name="select-stylesheet">
    <p:when test="$format='dita'">
      <p:output port="result" primary="true" sequence="false">
        <p:pipe step="load-dita-stylesheet" port="result"/>
      </p:output>
      <p:load name="load-dita-stylesheet">
        <p:with-option name="href" select="'../../dita/doxsl.xsl'">
          <p:empty/>
        </p:with-option>
      </p:load>
    </p:when>
    <p:when test="$format='html'">
      <p:output port="result" primary="true" sequence="false">
        <p:pipe port="result" step="load-html-stylesheet"/>
      </p:output>
      <p:load name="load-html-stylesheet">
        <p:with-option name="href" select="'../../html/doxsl.xsl'"/>
      </p:load>
    </p:when>
  </p:choose>
  <p:xslt name="transform">
    <p:input port="source">
      <p:pipe step="doxsl" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:pipe step="select-stylesheet" port="result"/>
    </p:input>
    <p:input port="parameters">
      <p:pipe port="parameters" step="doxsl"/>
    </p:input>
    <p:with-param name="debug" select="'true'"/>
  </p:xslt>
</p:declare-step>
</pre><p>Here's a diagram, built with EMC's <a href="http://137.69.120.115:8080/designer-20090703-1510/">XProc Designer</a>.  This tool is a great way to visualize and start your XProc scripts:</p><a href="http://1.bp.blogspot.com/_aw9JIJGuHBo/SxK7FeO3wfI/AAAAAAAAABc/nlA7GeQ7X6I/s1600/doxsl-xproc.PNG"><img style="width: 320px; height: 253px;" src="http://1.bp.blogspot.com/_aw9JIJGuHBo/SxK7FeO3wfI/AAAAAAAAABc/nlA7GeQ7X6I/s320/doxsl-xproc.PNG" border="0" alt="Doxsl XProc pipeline diagram" /></a><br /><br /><p>Essentially, I used the <em>p:declare-step</em> declaration so that I can declare it as a custom step (dxp:generate-doxsl-docs), which will allow it to be integrated into other pipelines. 
It has one option, <em>format,</em> which is used to specify which output format to generate (for Doxsl, 'html' and 'dita' are supported). The first step ("select-stylesheet") evaluates the <em>format</em> option and loads the appropriate stylesheet into the step's "result" output stream. This is used by the second step's ("transform") stylesheet port. The transform's source file (the XSLT stylesheet to be documented) is bound to the root step's source port, as is the parameters port. I also set the stylesheet's 'debug' parameter to true to inject output to the "transform" step's result port.</p> <p>All of this is done in memory and not serialized to the file system. This is intentional so that other pipelines can integrate this custom step; a hypothetical calling pipeline follows at the end of this post. </p> <p>I've tested this with <a href="http://xmlcalabash.com/">Calabash</a>. I still need to evaluate with <a href="https://community.emc.com/docs/DOC-4242#comment-2527">Calumet</a>.</p> <p>Right now, these are baby steps. I think that XProc has a lot of potential. I think the next big task is to consider an XProc implementation for DITA XML processing.</p>
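<p>For illustration, here is what such a calling pipeline could look like. It imports the step declaration above, runs it in HTML mode, and only serializes at the very end; the import href and output filename are invented for this sketch:</p><pre>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:dxp="urn:doxsl:xproc-pipeline:1.0"
                version="1.0">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>

  <!-- Pull in the dxp:generate-doxsl-docs declaration;
       the file name is an assumption -->
  <p:import href="generate-doxsl-docs.xpl"/>

  <!-- Run the custom step, overriding the default format option -->
  <dxp:generate-doxsl-docs format="html"/>

  <!-- Only now does anything touch the file system -->
  <p:store href="doxsl-docs.html"/>
</p:declare-step>
</pre>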
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-9792199486525621972009-05-10T16:18:00.001-06:002009-05-10T16:18:15.233-06:00DITA’s New Keys and the 80/20 Rule<p>Have you ever used the lorem() function in Microsoft Word? How about the rand() function? Do you know all the <a href="http://support.microsoft.com/KB/211982">function keys</a>? Most of us have used Microsoft Word for countless years and don’t know about all of the “hidden” functionality that it offers. Chances are, you’ll know a few of these, but you won’t know all of them simply because you’ve never needed them. Many of these functions are extremely powerful utilities that make Word a versatile application beyond a standard formatted text editor. But they’re available if you ever have the need.</p> <p>The same is true with some of the new functionality being made available in the forthcoming DITA 1.2 release currently being worked on. Of particular interest is the introduction of <em>keys</em>. Keys provide a way for authors to create addresses to resources through the use of a named identifier rather than a specific URI pointer. In other words, I can create an easy-to-remember key, like “ms-word-functions”, that actually resolves to the URL “<em>http://support.microsoft.com/kb/211982”</em> and link to this URL using the key name in my DITA topic.</p> <p>Here’s an example of how it works. In my map, I define a topicref and set the <em>keys </em>attribute with an identifier. I also set my <em>href</em> to the physical location of the resource I want to reference.</p> <p><font face="Consolas"><map> <br />    <topicref <strong>keys=</strong>"ms-word-functions" <br />        href="http://support.microsoft.com/kb/211982" <br />        scope="external"/> <br /></map></font></p> <p>In my topic file, I can reference the key that's defined in my map:</p> <p><font face="Consolas"><topic id="my.topic"> <br />    <title>Sample Topic</title> <br />    <body> <br />        <p> <br />            Lorem ipsum dolor sit amet, <br />            consectetuer adipiscing elit. Maecenas <br />            porttitor congue massa. Fusce posuere, magna <br />            sed pulvinar ultricies, purus <br /><strong><xref keyref="ms-word-functions">lectus</xref></strong> <br />            malesuada libero, sit amet commodo magna eros <br />            quis urna. <br />        </p> <br />   </body> <br /></topic></font></p> <p>Now, when the topic is rendered, it will resolve itself to the Microsoft URL defined in my map. Pretty cool stuff. And powerful too. This has many potential uses: localizers can create translated versions of a resource using the same key reference and resolve the link to a locale-specific version of the reference. Consumers can be directed to different resources based on their profile or context within a website.</p> <p>From an authoring perspective, there's another neat user story: I can reference a "yet-to-be-determined" resource via a key, and when that resource has been created, the key's definition in the map file will resolve the key reference.</p> <p>Technically, a key definition doesn't need to reside directly in the map that references that topic. It can live in an "ancestor" map that pulls in the topic indirectly by way of the map referencing that topic. 
In fact, key values can be overridden: Let's assume that I define a key called "company-website" in Map A that points to "www.company-a.com", and in Map B, I define the same key as "www.company-b.com". Map B also references Topic-1.dita, which contains a keyref to "company-website". Map A references Map B. When Topic-1.dita is rendered in the context of Map B as the primary map, the keyref will resolve to "www.company-b.com"; when Map A is the primary map, the same topic will reference www.company-a.com (see the map sketch at the end of this post).</p> <ul> <li>Map A <br />Key: company-website = "www.company-a.com" <ul> <li>Map B <br />Key: company-website = “www.company-b.com” <ul> <li>Topic-1.dita <br />keyref: company-website <br /><em>resolves to: www.company-a.com</em></li> </ul> </li> </ul> </li> <li>Map B <br />Key: company-website = “www.company-b.com” <ul> <li>Topic-1.dita <br />keyref: company-website <br /><em>resolves to: www.company-b.com</em></li> </ul> </li> </ul> <p>With all great power comes even greater responsibility. Any time a topic makes use of a key reference, that topic is explicitly binding itself to a map (or many maps), meaning that a topic is no longer a unit of information that is completely independent of any particular context in which it is assembled. You could make the argument that any reference defined in a topic to an external resource (e.g., an image or a cross-reference to another topic) by definition creates a dependency for that topic. And arguably, the <em>referenced</em> resource (the endpoint) is unaware of the object that is referencing it, regardless of whether it's a topic reference or a cross-reference. But there is an additional dependency in the case of keys: Any map that references a topic with a key reference must define the key. So in a sense, not only does the map (or an ancestor map) need to know <em>about</em> the topic, it needs to discover what the topic is <em>about</em>, specifically related to any key references it points to. Consequently, somewhere along the line, at least one map must define the keys used by a topic.  Did you get all that?  Imagine what your XML authoring tools, CMS systems, and rendering platforms will need to do to manage this.</p> <p>This is pretty sophisticated and powerful functionality.  But the question is, do you need to use <em>keys</em> and <em>keyrefs</em> in order to use DITA?  More importantly, will your tools need to support keys to take advantage of DITA's other capabilities?  The short answer is <em>no.</em>  In fact, I would expect that <em>keys/keyref</em>-enabled DITA support is still a way off for most DITA-enabled tools.  Nevertheless, you can still use DITA with the current tools and get most, if not all, of what you need.  Just as with Microsoft Word features like Mail Merge, keys and keyref will be there if you need them, but chances are, you can get by without them for most content without ever knowing you missed it. </p> <p>Finally, the possibility of defining indirect links opens the door to many different possibilities for dynamically driven, profile- and locale-specific content.  This is very cool stuff - the kind of thing XML folks like me get excited about.  But from a practical standpoint, there are potential downsides too.  Keys and key references add another layer of complexity to planning the authoring, deployment and management of DITA content.  In reality, most tools aren't ready for this complexity just yet.  So while the standard is ahead of the game, the rest of the industry will be playing catch up.  
Still, Ride the Wave.  </p>
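<p>For the record, here is roughly what that Map A/Map B override could look like in markup. This is a sketch against the draft 1.2 design; the file names are invented:</p><pre>
<!-- map-b.ditamap: defines the key and pulls in the topic -->
<map>
  <topicref keys="company-website" href="http://www.company-b.com"
            scope="external" format="html"/>
  <topicref href="Topic-1.dita"/>
</map>

<!-- map-a.ditamap: redefines the same key and references Map B.
     With Map A as the primary map, its definition wins, so the
     keyref in Topic-1.dita resolves to www.company-a.com. -->
<map>
  <topicref keys="company-website" href="http://www.company-a.com"
            scope="external" format="html"/>
  <topicref href="map-b.ditamap" format="ditamap"/>
</map>
</pre>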
Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-49879873907075625192009-04-29T13:56:00.001-06:002009-04-29T13:56:04.446-06:00Content Management Strategies/DITA North America Conference Review<p>I wasn’t able to attend many of the sessions since I was manning the Flatirons Solutions booth.  Yet from talking with the attendees who visited with us, here are some of the key takeaways:</p> <ul> <li><strong>DITA is here to stay</strong>. This is not news, but the key point here is that organizations are adopting the standard in earnest, as evidenced by the 150-200 attendees who came despite a bad economy, and discretionary budgets being whittled to next to nothing.  This means that organizations are thinking about DITA as an integral part of their long-term strategy. </li> <li><strong>DITA’s scope is not only Technical Publications.</strong>  Again, not earth-shattering news.  With specializations like Machine Industry, Learning Objects, and gobs of others, DITA is extending its reach to whole industries that haven’t been able to take advantage of XML before now.  At the conference, I spoke to attendees in a wide range of industries including bio-tech and manufacturing.</li> <li><strong>Shifting focus from Content Authoring to Content Management and Content Delivery Services.</strong>  This is a fundamental shift.  Eric Severson emphasized this point when he demonstrated that Microsoft Word <em>could</em> be used to create DITA for a <em>specific</em> class of users that aren’t the primary audience for more conventional XML authoring solutions.  Obviously this raised a few eyebrows in the audience, but the point is that DITA’s architecture is such that even casual contributors, given a few minor constraints in Word, can certainly provide content that can be easily turned into DITA. </li> <li><strong>DITA will live in Middleware.</strong> This is a key point. While the focus of the conference was centered around DITA and content management, there’s more here than meets the eye.  I had the opportunity to sit in on the open forum that discussed upcoming v1.2 features.  Many of these features are centered around link handling (things like <em>keyref</em>, <em>conref push,</em> and <em>conref keys [conkeyref]</em>).  There will be greater emphasis on managing all kinds of linking, including indirect links that could have significant implications for vendors’ existing architectures.  While it still will be possible to manage small projects with simple file management strategies (including things like Subversion), larger projects and enterprise-wide implementations, particularly those that want to take advantage of these new features, will need more sophisticated applications (read: a content management system) to manage the myriad of link strategies being made available.  <br /> <br />Even rendering tools will need to be more sophisticated to support these new features.  The DITA Open Toolkit is currently working on a new version (1.5) to support these.  Other rendering applications will need to start thinking about how they plan to support these features. <br /> <br />I’ll have more thoughts on this particular topic later.   Suffice it to say that there are some key assumptions that current DITA adopters take for granted that may impact how they design and create content in the future. 
</li> <li><strong>XML Authoring tools will get more complex.  </strong>To support all the new features coming in DITA 1.2, DITA-aware XML authoring tools will need to be <u>tightly</u> integrated into middleware systems, particularly the CMS.  There will also be a strong emphasis for authoring tools to handle a wide variety of link and referencing strategies.  I anticipate that these applications will be more process-intensive, with larger footprints on a user’s PC.  I also anticipate that the level of sophistication required to “operate” these tools will be much higher.  So XML authoring tool vendors will have to focus on both features and usability.  </li> </ul> <p>This conference was illuminating on many different levels.  Even the vendors I spoke to seemed to realize that DITA is a truly <em>disruptive technology</em> that has changed the way the entire industry thinks about XML. In the current economic reality, this is the perfect time to be thinking about what this all means and how organizations can take advantage of these innovations in their environment.  Ride the wave.</p>
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script></div>Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com1tag:blogger.com,1999:blog-5052457920976107820.post-88004554232775338842009-04-25T15:50:00.002-06:002009-04-25T15:51:05.688-06:00XProc and DITA: Random Thoughts<p>I’ve been following <a href="http://www.wordsinboxes.com/">James Sulak's Blog</a>. He has some pretty impressive detailed discussions about using <a href="http://www.w3.org/TR/xproc/">XProc</a>. XProc is an XML pipeline processing language, specifically designed to provide instructions for processing XML content. The Recommendation specifies many different kinds of “steps” that can be assembled in virtually any order to control the sequencing and output from one step to another.</p> <p>Right now, DITA’s <em>reference implementation</em>, the DITA Open Toolkit (DITA OT) uses Apache Ant and custom tasks to process DITA XML content. One of the principle limitations with the DITA OT is its reliance on XSLT 1.0 and extensions (particularly the Idiom FO Plugin) to handle the rendering. </p> <p>With XProc-enabled tools like <a href="http://www.xmlcalabash.com/">Calabash</a>, it seems like DITA could easily processed using XProc, along with an upgrade of the stylesheets to 2.0. </p><div class="blogger-post-footer"><a href="http://blogsearch.google.com/ping?name=Jim%27s+Thoughtspot&url=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2F&changesURL=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2Fatom.xml"></a>
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script></div>Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com1tag:blogger.com,1999:blog-5052457920976107820.post-21529836463929115182009-04-25T08:56:00.001-06:002009-04-25T08:59:45.598-06:00Content Management Strategies/DITA North America Conference<p>I’ll be attending the conference in St. Petersburg, FL.  Come visit the Flatirons Solutions booth while you’re there.  It should be a very interesting conference.  </p> <p>Eric Severson, CTO of Flatirons Solutions will be presenting a potentially “game-changing” presentation that speaks to lowering the “barrier to entry” into XML authoring. I recommend seeing this one.</p> <div class="blogger-post-footer"><a href="http://blogsearch.google.com/ping?name=Jim%27s+Thoughtspot&url=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2F&changesURL=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2Fatom.xml"></a>
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script></div>Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-55638330876875086742009-03-28T09:33:00.002-06:002009-03-28T10:24:06.074-06:00DocBook Going Modular<p>Scott Hudson, Dick Hamilton, Larry Rowland and I (AKA, “The Colorado DocBook Coalition”) recently drafted a proposal to support “modular” DocBook and presented it to the DocBook TC yesterday. In general, this proposal is in response to huge demand for DITA-like capabilities for DocBook. </p> Many core business factors are driving DocBook in this direction: <br /> <ul> <li><strong>more distributed authoring</strong>: authors are responsible for specific content areas rather than whole manuals. Content could be authored by many different authors, even some in different organizations altogether. </li> <li><strong>content reuse</strong>: This has long been a "holy grail" of information architects: write content once, reuse in many different contexts </li> <li><strong>change management</strong>: isolate the content that has changed. This is a key driver for companies that have localization needs. By modularizing their content, they can drive down costs by targeting only the changed content for translation. </li> </ul> <p>Additionally, there are additional downstream opportunities for modularized content:</p> <ul> <li><strong>dynamic content assembly</strong>: create "publications" on the fly using an external assembly file that identifies the sequence and hierarchy of modular components rather than creating a single canonical instance. </li> </ul> <p>The following excerpts from the proposal detail the preliminary features (Important: these are not yet set in stone and are subject to change). The final version will be delivered with the 5.1 release. </p> <p><strong>Assemblies</strong></p> <p>The principle metaphor for Modular DocBook is the “assembly”. An <em>assembly</em> defines the resources, hierarchy and relationships for a collection of DocBook components. The <span style="font-family:Consolas;"><assembly></span> element can be the structural equivalent of any DocBook component, such as <br />a book, a chapter, or an article. Here’s the proposed content model in RelaxNG Compact mode:</p> <p><span style="font-family:Consolas;">db.assembly = <br /> element assembly { <br /> db.info?, db.toc*, db.resources+, db.relationships* <br /> }</span></p> <p><strong>Resources</strong></p> <p>The <resources> element is high-level container that contains one or more resource objects that are managed by the <assembly>. An <assembly> can contain 1 or more <resources> containers to allow users to organize content into logical groups based on profiling attributes.</p> <p>Each <resources> element must contain 1 or more <resource> elements. </p> <p><span style="font-family:Consolas;">db.resources = <br /> </span><span style="font-family:Consolas;">element resources { <br /> db.common.attributes, db.resource+ <br /> }</span></p> <p><strong>Specifying Resources</strong></p> <p>The <resource> element identifies a "managed object" within the assembly. Typically, a <resource> will point to a content file that can be identified by a valid URI. However a <resource> can also be a 'static' text value that behaves similarly to a text entity. 
</p> <p>Every <resource> MUST have a unique ID value within the context of the entire <assembly>.</p> <p><span style="font-family:Consolas;">db.resource = <br /> element resource { <br /> db.common.attributes, <br /> attribute fileref { text }?, <br /> attribute resid {text}?, <br /> text? <br /> }</span></p> <p>Content-based resources can also be content fragments within a content file, similar to a URI fragment: <em>file.xml/#ID</em>. </p> <p>Additionally, a resource can point to another resource. This allows users to create a "master" resource that can be referenced in the current assembly and that indirectly points to the underlying resource it identifies.</p> <p>For example:</p> <p><span style="font-family:Consolas;"><resource <br /> id="master.resource" <br /> fileref="errormessages.xml"/> <br /><resource <br /> id="class.not.found" <br /> resid="{master.resource}/#classnotfound"/> <br /><resource <br /> id="null.pointer" <br /> resid="{master.resource}/#nullpointer"/></span></p> <p>The added benefit of indirect references is that users can easily point the master resource to a different content file, provided that it uses the same underlying fragment IDs internally. Indirection could also be used to create locale-specific resources that reference the same resource ID. </p> <p>Text-based resources behave similarly to XML text entities. A content-based resource can reference a text resource, provided that both the text resource and the content resource are managed by the same assembly. </p> <p><em>assembly.xml:</em> </p> <p>... <br /><span style="font-family:Consolas;"><resource id="company.name">Acme Tech, Inc.</resource> <br /><resource id="company.ticker">ACMT</resource> <br /></span>... </p> <p><em>file1.xml:</em> </p> <p><span style="font-family:Consolas;"><para><phrase resid="company.name"/> (<phrase resid="company.ticker"/>) is a <br />publicly traded company...</para></span> </p> <p><strong>Organizing Resources into a Logical Hierarchy</strong></p> <p>The <toc> element defines the sequence and hierarchy of content-based resources that will be rendered in the final output. It behaves in a similar fashion to a DITA map and its <em>topicrefs</em>. However, instead of each <tocentry> pointing to a URI, it points to a resource in the <resources> section of the assembly: </p> <p><span style="font-family:Consolas;"><toc> <br /> <tocentry linkend="file.1"/> <br /> <tocentry linkend="file.2"> <br /> <tocentry linkend="file.3"/> <br /> </tocentry> <br /></toc></span> </p> <p><span style="font-family:Consolas;"><resources> <br /> <resource id="data.table" fileref="data.xml"/> <br /> <resource id="file.1" fileref="file1.en.xml"/> <br /> <resource id="file.2" fileref="file2.en.xml"/> <br /> <resource id="file.3" fileref="{data.table}/#table1"/> <br /></resources></span></p> <p><strong>Creating Relationships Between Resources</strong></p> <p>One of the more clever aspects of DITA’s architecture is the capability to specify relationships between topics within the context of the map (and independently of the topics themselves). The DocBook TC is currently considering several proposals that will enable resources to be related to each other within the assembly.</p> <h3>The Benefits of a Modular DocBook</h3> <p>There is a current mindset (whether it’s right or wrong is irrelevant) that DocBook markup is primarily targeted at “monolithic” manuscripts. 
With this proposal, I think there are many more possibilities for information architects to create new types of content: websites, true help systems, mashups, dynamically assembled content based on personalized facets (Web 2.0/3.0 capabilities), and a simplified localization strategy like the one advocated in DITA.</p> <p>What’s more, the design places no constraints on the type of content resources referenced in an assembly. In fact, they can be of any type: sections, chapters, images, even separate books (or assemblies) to mimic DocBook’s <em>set</em> element.</p> <p>The design takes into account existing DocBook content that currently exists as “monolithic” instances, but is flexible enough to support other applications like IMS manifests for SCORM-compliant content, making it easy to create e-Learning content.</p> <p>Since this is the first draft of the proposal, I expect there will be changes between now and the final spec. Yet the core of the proposal should remain relatively intact. If you would like to get involved or have other ideas, let me know. Stay tuned.</p>
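<p>To pull these pieces together, here is a rough end-to-end sketch of what a complete assembly instance might look like under the draft content model above. The file names and titles are invented, and the final 5.1 design may well change:</p>
<p><span style="font-family:Consolas;"><assembly> <br />  <info><title>Widget User Guide</title></info> <br />  <!-- the rendered sequence and hierarchy --> <br />  <toc> <br />    <tocentry linkend="intro"/> <br />    <tocentry linkend="install"> <br />      <tocentry linkend="install.linux"/> <br />    </tocentry> <br />  </toc> <br />  <!-- the managed objects the toc entries point to --> <br />  <resources> <br />    <resource id="company.name">Acme Tech, Inc.</resource> <br />    <resource id="intro" fileref="intro.xml"/> <br />    <resource id="install" fileref="install.xml"/> <br />    <resource id="install.linux" fileref="install.xml/#linux"/> <br />  </resources> <br /></assembly></span></p>
<p>Retargeting <em>install.xml</em> to a translated file, or to a different fragment, then becomes a one-line change in the resources section, which is exactly the localization and reuse story described above.</p>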
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script></div>Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com7tag:blogger.com,1999:blog-5052457920976107820.post-18363290821227166562009-02-13T10:32:00.001-07:002009-02-13T10:32:50.183-07:00XMetaL Reviewer Webinar<p>I attended a webinar yesterday hosted by Just Systems for their XMReviewer product.  The problem space is that conventional reviewing processes are cumbersome and inefficient, particularly when there are multiple reviewers that need to review a document concurrently.  In general, most review processes rely on either multiple draft copies being sent out, one to each reviewer, and then it’s up to the author to “merge” the comment feedback into the source.</p> <p>With XMReviewer, the entire review process is centralized on the XM Reviewer server:  Reviewers simply access the document online, provide their comments and submit. What’s really cool is that reviewers are notified in almost real time when another reviewer has submitted their comments and can integrate their fellow reviewer’s comments into their own.</p> <p>The real advantage is that authors have all reviewer comments integrated and merged into a single XML instance, and <em>in context</em>. Very Nice.  </p> <p>There’s also a web service API that allows you to integrate XMReviewer with other systems including a CMS that can automatically deploy your content to the XMReviewer server.</p> <p>There are some nice general reporting/auditing features built in as well.  However, I didn’t see anything that would allow me to customize the reports or to manipulate the data, but I wouldn’t consider that a show stopper.</p> <p>For folks used to “offline” reviews, e.g., providing comments at home, or on a plane, this won’t work for you as it is a server application.  Nonetheless, having the ability to have full <em>control</em> and <em>context</em> for review comments far outweighs the minor inconvenient requirement of being online and getting access to the server (most companies these days have VPN, so it’s not a showstopper).  Though, I can envision the possibility of the server downloading and installing a small-footprint application that would allow users to review the document “offline” and being able to “submit” the comments back to the server when the reviewer is back online.  </p> <p>The only other limitation right now is that XMReviewer doesn’t support DITA map-level reviews in which you can provide comments on multiple topics within a map.  This is currently in development for a future release – stay tuned.</p> <p>Overall, XMReviewer looks great and can simplify your content review process.  Check it out.</p> <div class="blogger-post-footer"><a href="http://blogsearch.google.com/ping?name=Jim%27s+Thoughtspot&url=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2F&changesURL=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2Fatom.xml"></a>
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script></div>Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-88916431431770835892009-02-11T11:39:00.001-07:002009-02-11T11:39:31.127-07:00Microsoft Live Writer Convert<p>After reading a few blogs here and there, I’ve seen a few posts about Microsoft’s Live Writer for creating blog posts.  Always on the lookout for new toys and tools, I decided to download it and try it out. I gotta admit, I’m sold.  This is a pretty nice application that allows me to work offline to write and edit my posts and when I am ready and able to connect, I simply push the “Publish” button and away it goes.  Sweet.</p> <p>It’s simple to install, and simple to configure to point to virtually any blog host out there.  In short: <em>It just works.</em>  </p> <p>This is what software should be like.  It should solve a particular set of problems and only those problems well without requiring massively complex installation and configuration steps.  The interface should be intuitive (Live Writer is <em>wickedly intuitive</em>) and should help rather than hinder me in my productivity.  This tool does that.  Well done, Microsoft!</p> <div class="blogger-post-footer"><a href="http://blogsearch.google.com/ping?name=Jim%27s+Thoughtspot&url=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2F&changesURL=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2Fatom.xml"></a>
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script></div>Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-47643742532445194502009-02-09T21:27:00.001-07:002009-02-09T21:27:50.526-07:00Implementing XML in a Recession<p>With the economic hard times, a lot of proposed projects that would allow companies to leverage the real advantages of XML are being shelved until economic conditions improve.  Obviously, in my position, I would love to see more companies pushing to using XML throughout the enterprise. We’ve all heard of the advantages of XML: reuse, repurposing, distributed authoring, personalized content, and so on. These are underlying <em>returns on investment</em> for implementing an XML solution.  The old business axiom goes, “you have to spend money to make money.”  A corollary to that might suggest that getting the advantages of XML must mean spending <em>lots</em> of money.</p> <p>However, here’s the reality: implementing an Enterprise-wide XML strategy <em>doesn’t have to break the bank.</em> In fact, with numerous XML standards that are ready to use out of the box, like DITA and DocBook for publishing and XBRL for business, the cost of entry is reduced dramatically compared to a customized grammar.  </p> <p>And while no standard is always a 100 percent perfect match for any organization’s business needs, at least one is likely to support at least 80 percent.  We often consult our clients to use a standard directly out of the box (or with very little customization) until they have a good “feel” of how well it works in their environment before digging into the real customization work.  Given that funding for XML projects is likely to be reduced, this is the perfect opportunity to begin integrating one of these standards into your environment, try it on for size while the economy is slow, and when the economy improves, <em>then</em> consider how to customize your XML content to fit your environment.</p> <p>Any XML architecture must encompass the ability to create content and to deliver it, even one on a budget.  Here again, most XML authoring tools available on the market have built-in support for many of these standards, with little to no effort, you can use these authoring environments out of the box and get up to speed.  </p> <p>On the delivery side, these same standards, and in many cases the authoring tools have prebuilt rendering implementations that can be tweaked to deliver high-quality content, with all of the benefits that XML offers.  In this case, you might want to spend a little more to hire an expert in XSLT.  But it doesn’t have to break the bank to make it look good.</p> <p>The bottom line: A recessionary economy is a golden opportunity to introduce XML into the enterprise. In the short term, keep it simple, leverage other people’s work and industry best practices and leave your options open for when you <em>can</em> afford to do more.  Over time when funding returns, then you can consider adding more “bells and whistles” that will allow you to more closely align your XML strategy with your business process.</p> <div class="blogger-post-footer"><a href="http://blogsearch.google.com/ping?name=Jim%27s+Thoughtspot&url=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2F&changesURL=http%3A%2F%2Fjims-thoughtspot.blogspot.com%2Fatom.xml"></a>
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script></div>Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com0tag:blogger.com,1999:blog-5052457920976107820.post-74605394003733921792009-02-06T23:22:00.007-07:002009-02-07T08:19:36.013-07:00DOXSL: Reflexive Code Documentation and Testing, and other random XSLT thoughtsOne of the cool things about Doxsl is that I can test it on itself. Since Doxsl is an XSLT application (v2.0), I can create documentation using itself. I'll be posting these on the Sourceforge project website soon - when I finish documenting my own code. Hmmm... walking the talk and eating your own dogfood at the same time - who woulda thunk it?<div><br /></div><div>There's something about reflexive tools that is just pretty cool. I built another application to document the DocBook RelaxNG schemas into DocBook. <div><br /></div><div>The Doxsl DocBook stylesheets are coming along. If I can manage to get some free time at night, I might be able to finish these in about a week. The one thing I really need to do is check out <a href="http://code.google.com/p/xspec/w/list">xspec</a> to see if I can write test cases against the code. I've tried XMLUnit about a year ago, but the critical difference is that it tests the artifact of the transform, rather than the code itself. Implicit testing is better than no testing at all, but it doesn't mean that it's optimal. I <span class="Apple-style-span" style="font-style: italic;">love</span> JUnit and NUnit for testing my Java and .NET code, and it's great for the large enterprise-wide projects I work on. While Doxsl is just a teeny, tiny little application (tool is more like it), there is enough code right now that even simple changes can cause big problems. I'll let you know what I think about xspec when I've had a chance to tinker with it.</div><div><br /></div><div>Another XSL application I've been working over the last year or so is an alternative to the DITA Open Toolkit. The OT is OK as a reference implementation, but it can be a bear to work with even to handle minor customizations. Part of the problem, in my opinion, is that the OT's stylesheets are dependent on the Ant scripts that drive it. In fact, it takes some fancy footwork to get the stylesheets to run outside of the ant environment. And here again, Ant <span class="Apple-style-span" style="font-style: italic;">is</span> the tool for creating a consistent and reliable sequence of build steps for a development environment. Where it falls short is dealing with sophisticated XSLT applications that have lots of parameters (optional or otherwise). The parameters have to be "hardcoded" into the XSLT task. Not my idea of extensible.</div><div><br /></div><div>Add to that: the stylesheets are still using XSLT 1.0 - ehhh. I'll use 1.0 if I <span class="Apple-style-span" style="font-style: italic;">have</span> to (thanks Microsoft). There's just so much more that 2.0 provides that makes stylesheet development much, much easier. At any rate, I've been working on my own implementation of DITA using XSLT 2.0 and with relying on Ant. HTML and CHM are working, FO is the hard part. What I find interesting is that I can process a map containing over 160 topics into HTML in about 20 seconds with my stylesheets. It takes over 2 minutes with the OT! 
The results are anecdotal, and I haven't really tested the stylesheets on anything really big, but I like what I see so far (in fact, the <a href="http://doxsl.sourceforge.net/">DOXSL</a> website uses DITA and my stylesheets to render it).</div></div>
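<p>On the xspec idea above, here's a rough sketch of what a template-level test might look like (quoting the XSpec namespace from memory, and with a made-up stylesheet and markup). The point is that the scenario exercises the templates directly instead of diffing a finished output artifact:</p>
<p><span style="font-family:Consolas;"><x:description xmlns:x="http://www.jenitennison.com/xslt/xspec" <br />               stylesheet="mystylesheet.xsl"> <br />  <!-- apply the stylesheet's templates to a small source fragment --> <br />  <x:scenario label="when processing a para element"> <br />    <x:context> <br />      <para>Some text</para> <br />    </x:context> <br />    <!-- assert against the result with an XPath expression --> <br />    <x:expect label="it should become an HTML p" <br />              test="/p = 'Some text'"/> <br />  </x:scenario> <br /></x:description></span></p>
<p>The appeal is that the assertions sit right next to the templates' intent, so a refactor that quietly changes the output shape fails a test instead of just producing a slightly different artifact.</p>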
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script></div>Jim Earleyhttp://www.blogger.com/profile/05803711165788676445noreply@blogger.com4