Yesterday's English-to-Korean-to-English experience notwithstanding, an automated system to translate newsfeeds between languages could be extremely useful. Publishers and readers alike would benefit as the audience for newsfeeds expanded. I'm not aware of any existing service to translate newsfeeds, but another option, imperfect as it may be, does exist: the venerable, if sometimes humorous, Babelfish.
Using a tool like CaRP, a newsfeed (or aggregate of multiple newsfeeds) could be converted to HTML, which could then be run through Babelfish. The end result wouldn't be RSS, but it would be the next best thing. While technically it shouldn't be difficult to convert the translation back to RSS, that would probably violate the Babelfish terms of service.
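To make the feed-to-HTML step concrete, here is a minimal Python sketch of the idea. It is not how CaRP itself works; the function name `feed_to_html` and the sample feed are made up for illustration, and it assumes a plain RSS 2.0 feed whose descriptions are plain text rather than embedded HTML.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample feed, used only for illustration.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Hello &amp; welcome.</description>
    </item>
  </channel>
</rss>"""


def feed_to_html(rss_text):
    """Render an RSS 2.0 feed as a bare-bones HTML page."""
    channel = ET.fromstring(rss_text).find("channel")
    feed_title = html.escape(channel.findtext("title", ""))
    parts = [
        '<html><head><meta charset="utf-8">',
        "<title>%s</title></head><body>" % feed_title,
    ]
    for item in channel.findall("item"):
        # Escape everything so the page stays well formed.
        title = html.escape(item.findtext("title", ""))
        link = html.escape(item.findtext("link", ""), quote=True)
        desc = html.escape(item.findtext("description", ""))
        parts.append('<h2><a href="%s">%s</a></h2>' % (link, title))
        parts.append("<p>%s</p>" % desc)
    parts.append("</body></html>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(feed_to_html(SAMPLE_RSS))
```

The resulting page could be saved somewhere public and its URL handed to a web page translator.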
If a system were developed to translate newsfeeds, what features would it need? Here are a few that come to mind:
* It would need to ensure that its output was valid XML. Since many translation programs pass words they don't understand through unchanged, care would need to be taken to ensure that the resulting text all fit within one character set. Unicode would definitely be the way to go.
* It should prominently state that the text is a machine translation, and that its accuracy is open to question.
* It would need to know which parts of a feed to translate and which to leave as they are. Element and attribute names should obviously remain unchanged, but what about attribute values? Some should not be translated, but there are probably others that should. And there are probably elements and attributes that should be transliterated rather than translated. A careful look should be taken at the file format specification to ensure that the right thing is done in each case.
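The three points above can be sketched in a few lines of Python. This is only a rough illustration under several assumptions: the feed is RSS 2.0, only `title` and `description` elements are treated as translatable (a real system would need to check the format specification element by element), and `fake_translate` is a stand-in that merely uppercases text, since no real translation engine is wired in. The names `TRANSLATABLE`, `NOTICE`, and `translate_feed` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: only these RSS 2.0 elements carry text meant for human readers.
TRANSLATABLE = {"title", "description"}

# Notice prepended to the channel description so readers are warned.
NOTICE = "[Machine translation; accuracy not guaranteed] "


def fake_translate(text):
    """Stand-in for a real translation engine: it just uppercases the text."""
    return text.upper()


def translate_feed(rss_text, translate=fake_translate):
    """Translate selected element text and return the feed as UTF-8 bytes."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")

    # Element names, attributes, links, and dates are left untouched;
    # only the text of whitelisted elements is passed to the translator.
    for elem in channel.iter():
        if elem.tag in TRANSLATABLE and elem.text:
            elem.text = translate(elem.text)

    # State prominently that the text is a machine translation.
    desc = channel.find("description")
    if desc is None:
        desc = ET.SubElement(channel, "description")
    desc.text = NOTICE + (desc.text or "")

    # Serializing through the XML library escapes markup characters and
    # emits a single encoding (UTF-8) for the whole document.
    return ET.tostring(root, encoding="utf-8")
```

Swapping `fake_translate` for a function that calls an actual translation service would be the obvious next step, subject to that service's terms of use.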
I wonder how long it will be before some big aggregator adds translated newsfeeds to their offerings.
May 12th, 2004 at 11:36 am
This is basically exactly what I suggested to the creator of SharpReader, back when that was my primary aggregator. Now I'm using Bloglines, and the several foreign-language feeds I read would be a lot easier to deal with if Bloglines had something built in.
All that's required would be to output an HTML page including the new items, and pass that page through a translator. Anyone doing so is welcome to use my own meta-translation service, http://www.faganfinder.com/translate/ , in order to achieve the most language combinations possible.