Order in XML

There is a lot of controversy regarding how and why one can order items in an XML file.





Generally speaking, specifying the order of items in XML should not
differ from specifying it in a relational database or in any other
representation. But because XML is so readable, certain misconceptions
take hold among those who want to a) keep information about item order
in XML, and b) keep it simple.





First, several definitions.





If X is a set, we define a partial order on X as a binary relation R
that is reflexive, transitive and antisymmetric, that is, for all x, y
and z in X:





Axiom a) xRx;


Axiom b) xRy and yRz => xRz;


Axiom c) xRy and yRx => x = y.





See http://en.wikipedia.org/wiki/Partial_order for reference.





A total order, or linear order, is a partial order in which all elements are comparable (see http://en.wikipedia.org/wiki/Totally_ordered_set):





Axiom d) xRy or yRx.
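
The four axioms can be checked mechanically on a small finite set. The following is only a sketch: the helper names (is_partial_order, is_total_order) and the sample relations are illustrative assumptions, not part of any library.

```python
from itertools import product


def is_partial_order(elements, related):
    """Check axioms a)-c) for a relation given as related(x, y) -> bool."""
    elements = list(elements)
    # Axiom a) reflexivity: xRx
    reflexive = all(related(x, x) for x in elements)
    # Axiom b) transitivity: xRy and yRz => xRz
    transitive = all(
        related(x, z)
        for x, y, z in product(elements, repeat=3)
        if related(x, y) and related(y, z)
    )
    # Axiom c) antisymmetry: xRy and yRx => x = y
    antisymmetric = all(
        x == y
        for x, y in product(elements, repeat=2)
        if related(x, y) and related(y, x)
    )
    return reflexive and transitive and antisymmetric


def is_total_order(elements, related):
    """A partial order that also satisfies axiom d): xRy or yRx."""
    elements = list(elements)
    comparable = all(
        related(x, y) or related(y, x)
        for x, y in product(elements, repeat=2)
    )
    return is_partial_order(elements, related) and comparable


numbers = [1, 2, 3, 4, 6, 12]


def divides(x, y):
    return y % x == 0  # "x divides y"


def at_most(x, y):
    return x <= y


print(is_partial_order(numbers, divides))  # True
print(is_total_order(numbers, divides))    # False: 4 and 6 are not comparable
print(is_total_order(numbers, at_most))    # True
```

Divisibility is a partial order that is not total, while the usual "less than or equal" on the same numbers is total.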





We encounter partial order when we need to specify the order of
actions: some of them have to follow each other, some of them can be
executed in parallel. In programming languages there is no easy way to
specify parallelism; statements are written sequentially and are
perceived as following each other (although they may actually be
reordered when executed).





It is natural to place totally ordered items sequentially. In XML, for
example, if items specify certain actions, and if item B
lexicographically follows item A (that is, appears after it in the
file), action B is perceived as following action A. Linear order,
though, is not what one expects to use when describing processes.
That's why, for instance, in Ant one has to specify target dependencies
in the depends attribute, for example:





<target name="clean" depends="init, cleanReport">

</target>




Note that the order in which init and cleanReport are listed in depends does not define the dependency: this is a set of targets on which the target named clean depends. (Ant does try to run the listed targets from left to right, but no ordering between init and cleanReport is being declared.) This is how the actual order is specified, not by placing clean lexicographically after init and cleanReport. The order here is a partial order; it is not linear, and it cannot be expressed by the position of an item in the file.
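
To see why position in the file cannot carry this information, here is a rough sketch that models only the dependency relation from the example above. It uses Python's standard graphlib module (Python 3.9+); the dictionary layout and the respects helper are assumptions made for illustration, not Ant's actual algorithm.

```python
from graphlib import TopologicalSorter

# Hypothetical model of the build file: target -> set of targets it depends on.
depends = {
    "init": set(),
    "cleanReport": set(),
    "clean": {"init", "cleanReport"},
}


def respects(order, depends):
    """True if every target appears after all of its prerequisites."""
    position = {name: i for i, name in enumerate(order)}
    return all(
        position[dep] < position[target]
        for target, deps in depends.items()
        for dep in deps
    )


# One linearization compatible with the partial order.
order = list(TopologicalSorter(depends).static_order())
print(order)

# init and cleanReport are not comparable, so both sequences are acceptable...
print(respects(["init", "cleanReport", "clean"], depends))
print(respects(["cleanReport", "init", "clean"], depends))
# ...but clean may never come first.
print(respects(["clean", "init", "cleanReport"], depends))
```

More than one sequence satisfies the same set of dependencies, which is exactly what distinguishes a partial order from a linear one.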





In a sense, it is easier when we deal with relational databases: we
know that these do not have any predefined order at all, so to specify
an order we just have to introduce an additional table that stores
this binary relation.
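
As a minimal sketch of such an additional table, the snippet below uses an in-memory SQLite database through Python's sqlite3 module. The table and column names (target, depends_on, prerequisite) are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (name TEXT PRIMARY KEY);

    -- One row per pair of the relation: "target" depends on "prerequisite".
    CREATE TABLE depends_on (
        target       TEXT NOT NULL REFERENCES target(name),
        prerequisite TEXT NOT NULL REFERENCES target(name),
        PRIMARY KEY (target, prerequisite)
    );

    INSERT INTO target VALUES ('init'), ('cleanReport'), ('clean');
    INSERT INTO depends_on VALUES ('clean', 'init'), ('clean', 'cleanReport');
""")

# The rows of depends_on carry the order; the physical order of rows in
# the target table carries none.
rows = conn.execute(
    "SELECT prerequisite FROM depends_on WHERE target = ?",
    ("clean",),
).fetchall()
prerequisites = {name for (name,) in rows}
print(prerequisites)
```

The result is read back as a set on purpose: nothing in the schema says which prerequisite comes first.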





It is curious that dependencies in Ant build.xml are specified in an
attribute. This does not look like an attribute; moreover, to produce
the list, one has to parse the value of the attribute, which seems to
me to dig deeper into the data than should be necessary. Why not have
a dependency subelement, and list all dependencies there, something
like this:





<target name="clean">
  <depends>

    <target name="init"/>

    <target name="cleanReport"/>

  </depends>

</target>




Looks very natural - but unfortunately we usually cannot do this, and
all thanks to the peculiar features of SAX, the API that some people
use for retrieving information from XML sources. When you use SAX, what
you get in your handler is the element name (in our case it is target)
and its attributes; there is no information regarding the context in
which it is used. So, unless the handler keeps track of the context
itself, in SAX we won't be able to tell the difference between the item <target name="init"/> within the dependency element and outside, where this target is actually defined.
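
For illustration, here is a sketch using Python's standard xml.sax module. The startElement callback indeed receives nothing but the element name and its attributes; a common workaround, which goes beyond what is described above, is for the handler to keep its own stack of open elements. The document layout with a <depends> subelement is the hypothetical one from this post, not real Ant syntax, and the class name is invented.

```python
import xml.sax

# Hypothetical build file using the <depends> subelement proposed above.
DOC = """
<project>
  <target name="init"/>
  <target name="cleanReport"/>
  <target name="clean">
    <depends>
      <target name="init"/>
      <target name="cleanReport"/>
    </depends>
  </target>
</project>
"""


class ContextTrackingHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.path = []  # names of the currently open elements
        self.seen = []  # (role, target name) in document order

    def startElement(self, name, attrs):
        # SAX supplies only the name and the attributes; the parent has to
        # be looked up in the stack that this handler maintains itself.
        if name == "target":
            inside_depends = bool(self.path) and self.path[-1] == "depends"
            role = "reference" if inside_depends else "definition"
            self.seen.append((role, attrs["name"]))
        self.path.append(name)

    def endElement(self, name):
        self.path.pop()


handler = ContextTrackingHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
for role, name in handler.seen:
    print(role, name)
```

Without the path stack the two kinds of target elements would be indistinguishable in the callback, which is the difficulty described above.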





Interesting. The only other case where you cannot use a name as is
outside its declaration is in shell scripts, where you declare a
variable, and then have to prepend its name with $, the dollar sign,
every time you use it.





(This weakness of SAX, the inability to determine an element's context,
is the cause of introducing a feature that otherwise would be totally
useless: namespaces. While namespaces would not help in a situation
like the one described above, they help to find out what kind of
element we are parsing if the document contains elements of different
origin that happen to have the same name. Notice that, if the parser
knows the context, there is hardly any need to use namespaces, since it
is always obvious what the element is about.)





So, is there any real-life case where the physical, lexicographic order
of elements in XML would bear any meaning? I don't know. Do you? Please
let me know if you do.