It turns out that the MLB June Amateur Draft is quite interesting in that drafting prospects is a big gamble. Draftees may or may not sign in any given year, and those who don’t sign remain eligible for drafts in subsequent years. If they don’t sign during that year, they could be drafted by another team in following years. Alternatively, they could be selected again by the same team and signed. However, even if they do sign, there’s no guarantee that they’ll make it to the big leagues. And even if they do, they might not make it with the same team they signed with initially (in other words, they were traded before reaching the MLB).
In effect, there are several scenarios, depending on how the data is aggregated or filtered. However, these scenarios are well defined and constrained to a finite set of possibilities:
- All draft picks
- All signed draft picks
- All signed draft picks who never reach the MLB (the vast majority don’t)
- All signed draft picks who reached the MLB with the club that signed them
- All signed draft picks who reached the MLB with another club
- All unsigned draft picks
- All unsigned draft picks who reached the MLB with a different club
- All unsigned draft picks who reached the MLB with the same club, but at a later time
- All unsigned draft picks who never reach the MLB
Working with various strongly typed languages like C# or Java, I would use a construct like an enum to encapsulate these possibilities into one object. Then I can pass this into a single method that will allow me to conditionally process the data based on the specified enum value. Pretty straightforward. For example, in C# or Java I would write:
```java
public enum DraftStatus {
    ALL,                  // All draft picks (signed and unsigned)
    UNSIGNED,             // All unsigned draft picks
    UNSIGNED_MLB,         // All unsigned picks who made it to the MLB
    SIGNED,               // All signed draft picks
    SIGNED_NO_MLB,        // Signed but never reached the MLB
    SIGNED_MLB_SAME_TEAM, // Signed and reached the MLB with the same team
    SIGNED_MLB_DIFF_TEAM  // Signed and reached the MLB with another club
}
```

The important aspect of enumerations is that each item in an enumeration can be descriptive and also map to a constant integer value. For example, `UNSIGNED` is much more intuitive and meaningful than `1`, even though they are equivalent.

Working with XQuery, I don’t have the luxury of an enumeration. Well, at least in the OOP sense. I could write separate functions for each of the scenarios above, each performing its specific query and returning the subset I need. But that’s just added maintenance down the road.
At first I toyed with the idea of using an XML fragment containing a list of elements that mapped the element name to an integer value:
```xml
<draftstates>
    <ALL>0</ALL>
    <UNSIGNED>1</UNSIGNED>
    <UNSIGNED_MLB>2</UNSIGNED_MLB>
    <SIGNED>3</SIGNED>
    <SIGNED_NO_MLB>4</SIGNED_NO_MLB>
    <SIGNED_MLB>5</SIGNED_MLB>
    <SIGNED_MLB_SAME_TEAM>6</SIGNED_MLB_SAME_TEAM>
    <SIGNED_MLB_DIFF_TEAM>7</SIGNED_MLB_DIFF_TEAM>
</draftstates>
```

And then using a variable declaration in my XQuery:

```xquery
module namespace ds="http://ghotibeaun.com/mlb/draftstates";

declare variable $ds:draftstates := collection("/mlb")/draftstates;
```

To use it, I need to cast the element value to an integer. Using an example, let's assume that I want all signed draftees who reached the MLB with the same team:
```xquery
declare function gb:getDraftPicksByState($draftstate as xs:integer,
                                         $team as xs:string) as item()* {
    let $picks :=
        if ($draftstate = xs:integer($ds:draftstates/SIGNED_MLB_SAME_TEAM)) then
            let $results := /drafts/pick[Signed="Yes"][G != 0][Debut_Team=$team]
            return $results
        (: more cases... :)
        else ()
    return $picks
};

(: call the function :)
let $sameteam := gb:getDraftPicksByState(xs:integer($ds:draftstates/SIGNED_MLB_SAME_TEAM), "Rockies")
return $sameteam
```

It works, but it’s not very elegant. Every value in the XML fragment has to be extracted through the `xs:integer()` function, which adds logic and makes the code less readable. On top of that, code completion (and code hinting) in IDEs like Oxygen doesn’t work with this approach. What does work well (at least in Oxygen, and I suspect in other XML/XQuery IDEs) is code completion for variables and functions, which led me to another idea.

Prior to Java 5, there weren’t enum structures. Instead, enumerated constants were created through the declaration of constants encapsulated in a class:
```java
public class DraftStatus {
    public static final int ALL = 0;
    public static final int UNSIGNED = 1;
    public static final int UNSIGNED_MLB = 2;
    public static final int SIGNED = 3;
    public static final int SIGNED_NO_MLB = 4;
    public static final int SIGNED_MLB = 5;
    public static final int SIGNED_MLB_SAME_TEAM = 6;
    public static final int SIGNED_MLB_DIFF_TEAM = 7;
}
```

This allowed static access to the constant values via the class, e.g., `DraftStatus.SIGNED_MLB_SAME_TEAM`.

The same principle can be applied to XQuery. Although there isn’t the notion of object encapsulation by class, we do have encapsulation by namespace. Likewise, XQuery supports code modularity by allowing little bits of XQuery to be stored in individual files, much like .java files. In Java, to access class members you (almost always) have to import the class into the current class. The same is true in XQuery: you can import various modules into the current module by declaring the referenced module’s namespace and location.
Using this approach, we get the following:
mlbdrafts-draftstates.xqy

```xquery
xquery version "1.0";

module namespace ds="http://ghotibeaun.com/mlb/draftstates";

declare variable $ds:ALL as xs:integer := 0;
declare variable $ds:UNSIGNED as xs:integer := 1;
declare variable $ds:UNSIGNED_MLB as xs:integer := 2;
declare variable $ds:SIGNED as xs:integer := 3;
declare variable $ds:SIGNED_NO_MLB as xs:integer := 4;
declare variable $ds:SIGNED_MLB as xs:integer := 5;
declare variable $ds:SIGNED_MLB_SAME_TEAM as xs:integer := 6;
declare variable $ds:SIGNED_MLB_DIFF_TEAM as xs:integer := 7;
```

Now we reference this in another module:

```xquery
import module namespace ds="http://ghotibeaun.com/mlb/draftstates" at "mlbdrafts-draftstates.xqy";
```

This gives us direct access to all the members, like an enumeration.
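As a rough sketch of what that access could look like, here is the earlier function rewritten against the `$ds:` constants, so the `xs:integer()` casts disappear. Some assumptions: the `gb` namespace URI is made up for illustration (the original snippets never declare it), the `$ds:SIGNED` branch is only an illustrative second case built from the same `Signed` predicate, and the path expressions are carried over unchanged from the earlier function.

```xquery
xquery version "1.0";

(: Import the constants module; the location is resolved relative to this file. :)
import module namespace ds="http://ghotibeaun.com/mlb/draftstates"
    at "mlbdrafts-draftstates.xqy";

(: Hypothetical namespace URI for the local functions (an assumption, not from the original). :)
declare namespace gb="http://ghotibeaun.com/mlb/functions";

declare function gb:getDraftPicksByState($draftstate as xs:integer,
                                         $team as xs:string) as item()* {
    (: Same path expression as the earlier function; it assumes the processor
       can resolve a leading "/" inside a function body. :)
    if ($draftstate = $ds:SIGNED_MLB_SAME_TEAM) then
        /drafts/pick[Signed="Yes"][G != 0][Debut_Team=$team]
    (: Illustrative second case: all signed picks, regardless of team. :)
    else if ($draftstate = $ds:SIGNED) then
        /drafts/pick[Signed="Yes"]
    (: more cases... :)
    else ()
};

(: The call site now names the state directly, with no cast. :)
gb:getDraftPicksByState($ds:SIGNED_MLB_SAME_TEAM, "Rockies")
```

Because the constants are ordinary typed variables, an editor that offers completion for variables should list them after typing `$ds:`, which was the point of moving away from the XML fragment.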
The bottom line is that this approach has worked really well for me. I can use descriptive constant names that map to specific values throughout my code, and it shows how you can add a little rigor to your XQuery coding.