Prefetching linked content: good or evil?
by ZetaGecko | Atom/RSS
All digest formats support some method for referring to content that isn't delivered within the feed. Linked content falls into two categories: content to which the feed refers, and content intended to be displayed within the feed. When a feed reader loads a digest, it can prefetch the linked content. But is this desirable?
The argument in favor of prefetching is obvious: if the content has already been fetched, it can display more quickly when the user views the item/entry that links to it. Why would this not be a good idea?
It's all a question of bandwidth. You've never heard me say that before, have you? A person who subscribes to more digests than they can keep up with reading will prefetch lots of content that they'll never see. If many people subscribing to a particular feed do this, they would waste a significant amount of the publisher's bandwidth. Even a person who does keep up with all the feeds they subscribe to won't want to read every item, so any content prefetched for items that go unread is wasted.
Sometimes prefetching can cause bandwidth problems not only for the publisher, but also for the client. Say, for example, a digest links to a 10 MB video. If you're browsing the web with your feed reader running in the background, and it starts downloading the video, your web browsing will likely slow down significantly. There's also the potential issue of disk space usage. Hard drives are getting bigger all the time, but caching too much internet content can still use too much space, especially if a large disk has been divided into small partitions.
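Finally, prefetching raises the potential for malicious exploitation of the issues noted above. A feed could, for example, deliberately link to very large files in order to eat up its subscribers' bandwidth and disk space.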
Weighing the advantages against the disadvantages, one needn't simply decide for or against prefetching: a wide range of policies is possible. First, prefetching content intended to be displayed in the feed is probably a better idea than prefetching content that the feed simply links to. Second, the user should be able to set a size threshold over or under which content isn't prefetched. Ideally, this should be configurable on a digest-by-digest basis.
Why did I say over or under? Over is the more obvious: if somebody puts a huge video in their feed, you may not want to prefetch it. On the other hand, if you subscribe to a feed that always contains videos, and you always want to see them, you may want to always prefetch them. On the under side, there may be no need to prefetch data that is under a particular size, because presumably it would load quickly enough when its associated item is displayed. If you don't mind a few seconds' wait when you view a particular item, you may want to save everyone the bandwidth burden by never loading content for the items you'll never read.
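To make the over/under idea concrete, here is a minimal Python sketch of what such a policy might look like in a feed reader. Everything in it is hypothetical: the PrefetchPolicy fields, the should_prefetch and policy_for functions, the example feed URL, and the threshold values are invented for illustration and aren't taken from any real feed reader. It also assumes that content of unknown size is simply not prefetched, which is only one possible choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrefetchPolicy:
    """Hypothetical per-digest prefetch settings (all names are illustrative)."""
    prefetch_embedded: bool = True    # content meant to be displayed within the feed
    prefetch_linked: bool = False     # content the feed merely refers to
    min_bytes: int = 0                # "under" threshold: smaller content loads fast on demand
    max_bytes: Optional[int] = None   # "over" threshold: None means no upper limit


def should_prefetch(policy: PrefetchPolicy, size: Optional[int], embedded: bool) -> bool:
    """Decide whether to prefetch one piece of content; size is None when unknown."""
    allowed = policy.prefetch_embedded if embedded else policy.prefetch_linked
    if not allowed:
        return False
    if size is None:
        # Assumption: when the size is unknown, take the conservative route and skip.
        return False
    if size < policy.min_bytes:
        return False
    if policy.max_bytes is not None and size > policy.max_bytes:
        return False
    return True


# A default policy plus digest-by-digest overrides (values chosen arbitrarily).
DEFAULT_POLICY = PrefetchPolicy(min_bytes=50_000, max_bytes=5_000_000)
FEED_POLICIES = {
    # A video feed whose content the user always wants to have ready.
    "https://example.com/videos.xml": PrefetchPolicy(prefetch_linked=True, max_bytes=None),
}


def policy_for(feed_url: str) -> PrefetchPolicy:
    """Return the override for this digest, or the default if there is none."""
    return FEED_POLICIES.get(feed_url, DEFAULT_POLICY)
```

With these settings, a 10 MB video is skipped under the default policy but prefetched for the video feed, and a 10 KB image is skipped everywhere the default applies because it falls under the lower threshold.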
One last comment: many feed elements that link to external content provide no way to specify the size of that content within the feed, and those that do may not report it accurately. When testing linked content against a size threshold, one might trust an advisory size attribute when it says the content is too large, and skip it. But before actually loading content that claims to be small enough, one may wish to check its real size by issuing an HTTP HEAD request.
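The check described above might be sketched like this in Python, using only the standard library's urllib. The function names (parse_content_length, head_content_length, within_limit) and the probe parameter are hypothetical, introduced just for this example. The sketch assumes the server answers HEAD requests and reports a Content-Length header, which not every server does; when the size can't be determined, it treats the content as not within the limit.

```python
import urllib.request
from typing import Callable, Optional


def parse_content_length(value: Optional[str]) -> Optional[int]:
    """Parse a Content-Length header value; return None if missing or malformed."""
    if value is None:
        return None
    value = value.strip()
    if not (value.isascii() and value.isdigit()):
        return None
    return int(value)


def head_content_length(url: str, timeout: float = 10.0) -> Optional[int]:
    """Issue an HTTP HEAD request and return the reported size, or None if unknown."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return parse_content_length(response.headers.get("Content-Length"))
    except (OSError, ValueError):
        # Network failures, HTTP errors, and malformed URLs all count as "unknown".
        return None


def within_limit(
    url: str,
    advisory_size: Optional[int],
    max_bytes: int,
    probe: Callable[[str], Optional[int]] = head_content_length,
) -> bool:
    """Return True only if the content is confirmed to be no larger than max_bytes."""
    # Trust the feed's own claim when it says the content is too big:
    # that lets us skip the content without making any request at all.
    if advisory_size is not None and advisory_size > max_bytes:
        return False
    # Otherwise, don't trust a small (or missing) claim; ask the server.
    actual = probe(url)
    # Assumption: an unknown size is treated as "not within the limit".
    return actual is not None and actual <= max_bytes
```

The probe parameter exists only so the decision logic can be exercised without a network connection, for instance by passing a function that returns a fixed size. Note that the HEAD response is itself only a claim by the server, so a cautious reader might still enforce the limit while downloading.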