XML namespace control — who can alter a namespace?
by ZetaGecko | XML
An issue has arisen in the Atom working group that I wouldn't have thought we'd need to discuss: who can add elements to the Atom namespace? Technically, the namespace spec doesn't speak to this issue, but the spirit of it seems plenty clear to me. Should the Atom spec spell the answer out explicitly? Given that not everyone appears to agree with me about the obviousness of the answer, it wouldn't hurt.
Robert Sayre asked the question thusly: "Do we have a situation similar to HTTP headers and methods, where you can extend it as needed, but you had better coordinate if you want to make sure everyone interprets it the same way? That's fine. Or do we have an IETF-administered registry?"
XML namespaces create a very different situation from HTTP headers and methods. In HTTP, if two people invent new headers or methods and give them the same name, there's no way to distinguish them. Let's say, for example, that two people invent new methods, both named REFORMAT, both used for remote editing operations. One of them instructs the webserver to translate the document to a different character encoding. The other instructs the webserver to convert between different HTML or XHTML versions, using a specified set of transformations for tags that appear in one version and not the other. The first line of either REFORMAT request would look like this:
REFORMAT /path/to/the/document HTTP/1.1
So which kind of REFORMAT is this? In this example, you could probably figure it out by looking at the headers that followed, and it may even be possible to have one REFORMAT request handle both operations, but not every new method or header name collision would be so easy to resolve. And even in this case, the code that dispatches the request to a handler would have to wait until all the request headers had been read (do you send it to this handler, that handler, or one after the other?), making the operation much more complex than it would be if there were only one meaning for REFORMAT.
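To make the collision concrete, here is a minimal Python sketch of a server that dispatches on the method name alone. Everything in it (the `handlers` table, `register`, and the two handler functions) is hypothetical and invented for illustration; it is not how any particular webserver works. It only shows that a bare name leaves no room for two meanings.

```python
# Hypothetical dispatch table keyed on the HTTP method name alone.
handlers = {}

def register(method, handler):
    """Register a handler; a bare method name can only be claimed once."""
    if method in handlers:
        raise ValueError(f"method {method!r} is already defined")
    handlers[method] = handler

# Two independent inventions that happen to share the name REFORMAT.
def reencode(path):
    return f"re-encode {path}"

def convert_html_version(path):
    return f"convert HTML version of {path}"

register("REFORMAT", reencode)
try:
    register("REFORMAT", convert_html_version)
    collided = False
except ValueError:
    # The second inventor has nowhere to go: the name is taken.
    collided = True

# The request line carries only the bare name, so the server
# cannot tell which REFORMAT the client meant.
request_line = "REFORMAT /path/to/the/document HTTP/1.1"
method, path, version = request_line.split()
result = handlers[method](path)
print(collided, result)
```

Whoever registered first wins, and a client that meant the other REFORMAT silently gets the wrong operation.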
Now let's consider how this kind of thing affects XML documents. Let's say we have an XML format that defines a document that looks like this:
<foo xmlns="http://foo.bar.com/foo">
<blar>I am the data</blar>
</foo>
Next, somebody decides to add an element named "bleep" that isn't defined in the specification that created the namespace:
<foo xmlns="http://foo.bar.com/foo">
<blar>I am the data</blar>
<bleep>This is some extra "bleep" type data</bleep>
</foo>
No big deal so far, right? But what if some other person also wants to add more data to the document, and they also choose the name "bleep" for their element, either not realizing that the name "bleep" has already been used, or not caring that they're reusing the name? First of all, we wouldn't be able to tell which type of "bleep" was being published in the above case. And second, if somebody wanted both types of "bleep" data in their document, we'd end up with the following:
<foo xmlns="http://foo.bar.com/foo">
<blar>I am the data</blar>
<bleep>This is some extra "bleep" type data</bleep>
<bleep>Here's the second kind of "bleep" data</bleep>
</foo>
...and we wouldn't know which type of "bleep" either of them was. Things get ugly. But using namespaces, each of the "bleeps" can be easily distinguished:
<foo xmlns="http://foo.bar.com/foo" xmlns:b1="http://b1.com/myns" xmlns:b2="http://b2.com/myns">
<blar>I am the data</blar>
<b1:bleep>This is some extra "bleep" type data</b1:bleep>
<b2:bleep>Here's the second kind of "bleep" data</b2:bleep>
</foo>
The first "bleep" is the kind defined in the namespace "http://b1.com/myns", and the second is the kind defined in the namespace "http://b2.com/myns".
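For anyone who wants to see the difference from a parser's point of view, here is a small sketch using Python's standard `xml.etree.ElementTree`, which reports each element's name as `{namespace-uri}localname`. The helper `child_names` and the two constant names are mine, made up for this illustration; the documents are the two samples from above.

```python
import xml.etree.ElementTree as ET

# Both "bleep" elements squat in the original namespace.
COLLIDING = """<foo xmlns="http://foo.bar.com/foo">
<blar>I am the data</blar>
<bleep>This is some extra "bleep" type data</bleep>
<bleep>Here's the second kind of "bleep" data</bleep>
</foo>"""

# Each "bleep" lives in its own namespace.
NAMESPACED = """<foo xmlns="http://foo.bar.com/foo" xmlns:b1="http://b1.com/myns" xmlns:b2="http://b2.com/myns">
<blar>I am the data</blar>
<b1:bleep>This is some extra "bleep" type data</b1:bleep>
<b2:bleep>Here's the second kind of "bleep" data</b2:bleep>
</foo>"""

def child_names(xml_text):
    """Return the expanded name of each child element of the root."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]

colliding_names = child_names(COLLIDING)
namespaced_names = child_names(NAMESPACED)

# Same expanded name: the parser cannot tell the two bleeps apart.
print(colliding_names[1] == colliding_names[2])    # True
# Different expanded names: each bleep is unambiguous.
print(namespaced_names[1] == namespaced_names[2])  # False
print(namespaced_names)
```

In the first document both elements come back as `{http://foo.bar.com/foo}bleep`; in the second they come back as `{http://b1.com/myns}bleep` and `{http://b2.com/myns}bleep`, so a consumer can pick exactly the one it understands.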
Okay, sorry if I went into too much detail for those of you who already understand XML namespaces. The point is that unlike HTTP methods and headers, XML has a built-in way to ensure that the full name of an element is unique (assuming a unique URI is chosen for the namespace name, which is easy). Given that there's a built-in method for avoiding naming collisions, would it make any sense at all for someone other than the person or organization that created a particular namespace to add, remove, or modify elements in that namespace? Is there any need to? No. Just create elements in your own namespace and mix them into the document. Altering someone else's namespace would defeat the entire reason for having namespaces.
This seems obvious enough to me that it shouldn't need to be spelled out explicitly. Until I remember that we live in a world where some dopes like to do everything that isn't explicitly forbidden, the more disruptive the better, even if it accomplishes nothing useful, and then stubbornly claim that they had every right to do it. So yeah, it should be explicitly forbidden.
But where? It probably should have been spelled out in the namespace specification: the creator of a namespace has complete authority over changes to that namespace. They can delegate that authority, but no one can take it without some form of explicit authorization.
But the namespace spec doesn't say that. So yeah, I guess the Atom spec should assert authority over its namespace and specify who can change it, namely the IETF.