What’s still wrong with Atom (and probably will be in 1.0)
by ZetaGecko
One needn't assume from the title of this post that I'm unhappy with Atom. For the most part, I think the specification is good. But if it were up to me, a few things would be different. That's probably true for everyone in the Working Group, and we probably all have different gripes. Here's my list:
1. No "Aggregation" document type
If a feed is generated by aggregating multiple feeds, it looks just like any other feed until you encounter the atom:source-feed elements inside the entries. This element contains the metadata from the source feed. If multiple entries in the aggregated feed come from the same source feed, they'll contain duplicate data. For many aggregated feeds, this won't be a problem, because they'll rarely contain multiple entries from the same source feed. But if one is aggregating a small set of feeds--say all of the feeds on one's intranet--duplication could be very common. I wrote a proposal to create an aggregation document type, which would contain multiple atom:feed elements, each with its own metadata and entries. This would enable grouping of all the entries from each feed, and eliminate duplication of the feed metadata. It would add one extra hierarchical layer to the document, which would require changes to how aggregators handle these documents vs. simple feeds. But the difference would be minimal: aggregators are already going to have to change for Atom, and they'll have to be able to handle atom:source-feed anyway, so what's the big deal? I'll probably spec a document type outside of the Atom namespace to do this.
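To make the duplication concrete, here is a rough Python sketch that regroups the entries of an aggregated feed by their atom:source-feed metadata, which is roughly what an aggregation document would do up front. The namespace URI, the sample feed, and the function name group_by_source are all made up for illustration; only the atom:source-feed element comes from the draft discussed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
ATOM = "http://example.org/atom-draft"
NS = {"atom": ATOM}

# Invented sample: two of the three entries repeat the same source metadata.
AGGREGATED = f"""
<feed xmlns="{ATOM}">
  <title>Intranet roundup</title>
  <entry>
    <title>Build server moved</title>
    <source-feed><id>urn:feed:ops</id><title>Ops News</title></source-feed>
  </entry>
  <entry>
    <title>Lunch menu</title>
    <source-feed><id>urn:feed:hr</id><title>HR Notices</title></source-feed>
  </entry>
  <entry>
    <title>VPN maintenance</title>
    <source-feed><id>urn:feed:ops</id><title>Ops News</title></source-feed>
  </entry>
</feed>
"""


def group_by_source(xml_text):
    """Map each source feed id to (feed title, [entry titles])."""
    root = ET.fromstring(xml_text)
    groups = {}
    for entry in root.findall("atom:entry", NS):
        source = entry.find("atom:source-feed", NS)
        feed_id = source.findtext("atom:id", namespaces=NS)
        feed_title = source.findtext("atom:title", namespaces=NS)
        entry_title = entry.findtext("atom:title", namespaces=NS)
        # The feed metadata is stored once per source; entries accumulate.
        groups.setdefault(feed_id, (feed_title, []))[1].append(entry_title)
    return groups


groups = group_by_source(AGGREGATED)
```

In the sample, the "Ops News" metadata appears twice in the flat feed but only once after grouping, which is the saving an aggregation document type would bake into the format.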
2. Ambiguous state of top level child of atom:content when the content is an XML type other than XHTML
Is the top level element something that the content author put there, or was it added by the publishing system as a container and to hold the namespace declaration, perhaps to change the default namespace? In the case of XHTML, we've solved the problem by requiring an xhtml:div to wrap the content; that div is not considered part of the content. I wrote that proposal, but I later wrote one to replace it. Instead of a required div for that case only, I proposed that an optional attribute be added to atom:content, which would indicate whether the content had a wrapper element inside it which was not part of the content the author created. This attribute would have given publishers of XHTML content the option of not using the div that we now require, as long as they used some other method of declaring the XHTML namespace.
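A consumer could act on such an attribute along these lines. This is only a sketch: the attribute name "wrapped", its "true" value, and the function authored_nodes are invented here, since the proposal's actual syntax isn't given above. It also looks only at child elements and ignores any text directly inside the content element.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"


def authored_nodes(content_xml):
    """Return the child elements the author actually wrote.

    "wrapped" is a hypothetical attribute name standing in for the
    optional attribute described in the text.
    """
    content = ET.fromstring(content_xml)
    children = list(content)
    if content.get("wrapped") == "true" and len(children) == 1:
        # The single child is a publisher-added container: skip past it.
        return list(children[0])
    # No wrapper declared: the top level children are the author's own.
    return children


with_wrapper = authored_nodes(
    f'<content wrapped="true" xmlns:x="{XHTML}">'
    "<x:div><x:p>Hello</x:p></x:div></content>"
)
without_wrapper = authored_nodes(
    f'<content xmlns:x="{XHTML}"><x:div><x:p>Hello</x:p></x:div></content>'
)
```

With the flag set, the div is treated as packaging and only the paragraph comes back; without it, the div itself is taken to be the author's content.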
3. No ability to define a Person construct once and refer to it from multiple places within the document
Thus, if many entries share one author, the author's metadata will likely be repeated many times. One could make that author the author of the feed, and not include atom:author elements in the entries they wrote (those entries would inherit from the feed), but if there are two authors who wrote many entries, this trick only works for one. I wrote a proposal that Person constructs have an optional id attribute, and that previously defined Person constructs could be referred back to using the syntax <atom:author ref="previously-defined-id" />.
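Resolving such references would not be much work for a consumer. The sketch below assumes the id and ref attributes from the proposal; the namespace URI, the sample feed, and the function name resolve_authors are invented for illustration, and it assumes every entry carries an atom:author element.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
ATOM = "http://example.org/atom-draft"
NS = {"atom": ATOM}

FEED = f"""
<feed xmlns="{ATOM}">
  <entry>
    <title>First post</title>
    <author id="alice"><name>Alice Example</name></author>
  </entry>
  <entry>
    <title>Second post</title>
    <author ref="alice" />
  </entry>
</feed>
"""


def resolve_authors(xml_text):
    """Return (entry title, author name) pairs, following ref attributes."""
    root = ET.fromstring(xml_text)
    defined = {}  # id -> Person construct seen earlier in the document
    result = []
    for entry in root.findall("atom:entry", NS):
        author = entry.find("atom:author", NS)
        ref = author.get("ref")
        if ref is not None:
            # Refer back to an earlier definition; KeyError if it is missing.
            author = defined[ref]
        elif author.get("id") is not None:
            defined[author.get("id")] = author
        title = entry.findtext("atom:title", namespaces=NS)
        name = author.findtext("atom:name", namespaces=NS)
        result.append((title, name))
    return result


pairs = resolve_authors(FEED)
```

Because a reference may only point at a previously defined Person construct, a single forward pass over the document is enough.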
4. Feeds and entries must link to an alternate representation of themselves
...but what if there isn't one? Most of the time there will be, but not always. RSS and Atom are possibly the only common data formats that routinely have alternative representations. But is this not simply an artefact of how feeds came into existence, as methods of pointing to web pages? I can think of no justification for requiring this. [Correction: entries must only link to an alternate if they don't have a content element, making that a much smaller problem...but still an imperfection.]
5. Neither feeds nor entries can have more than one author
They can have multiple contributors, but only one author. This makes processing easier in some cases (where there's only room to list one author), but it doesn't match reality.
6. Entries must contain a summary if they don't have a content element
...but what if the title and link are the entire content of the entry, for example, if one publishes a feed that just links to others' content? In such cases, one is required to do this: <atom:summary />. We're arguing about this right now. This requirement could possibly get removed.
7. The element for feed logos is named "image"
I'd prefer "logo".
8. Entries don't have an "image" element
...only feeds, and it's really a logo, not an "image".
9. The definition of the atom:link element is too open ended
I proposed that it be limited to hyperlinks, meant to be traversed in response to explicit user interaction. This would enable generic handling of links with unrecognized rel (relation or relationship) attribute values.
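As a rough illustration of what that generic handling might look like, the sketch below presents any link with an unrecognized rel value as an ordinary hyperlink for the user to click. The table of known values, the label wording, and the function name link_label are all invented for the example.

```python
# Illustrative table of rel values this hypothetical reader understands.
KNOWN_RELS = {"alternate": "Open the web version"}


def link_label(rel, href):
    """Build a label for a link the user may choose to follow."""
    label = KNOWN_RELS.get(rel)
    if label is None:
        # Unrecognized rel: still safe to show as a plain hyperlink,
        # provided every atom:link is a user-traversed hyperlink.
        label = f"Related link ({rel})"
    return f"{label}: {href}"


known = link_label("alternate", "http://example.org/post/1")
unknown = link_label("x-something-new", "http://example.org/extra")
```

The fallback branch is only safe if the spec guarantees that every link is meant to be followed on explicit user request, which is the point of the proposed restriction.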
I could come up with a few other little nit-picks, but that covers everything of significance that I found while looking over the spec.