Tweeting new items from an RSS (or ATOM) feed

Date: Fri Apr 25 2014

I'm developing a new site and decided it would be cool if every new posting were turned into a tweet on twitter.com. Actually, it would be cool to post announcements to several social networking sites, but Twitter is a decent place to start. The site collects blog and podcast posts from other sites, so making such announcements fits its purpose. I have yet to figure out a proper use for Twitter, but I do have www.7gen.com configured to tweet all new postings, and that has led several people to follow the 7genblog Twitter account (http://twitter.com/7genblog). I want to build a similar following around wwwatts.net (http://twitter.com/wwwatts).

That site, http://wwwatts.net, is developed using the RSS/Atom feed aggregation toolchain I've developed. The site is built by Groovy scripts that spit out HTML files. Hence, for wwwatts.net to tweet its new postings, it needs a command-line tool rather than one built into a content management system, as is done for www.7gen.com.

While pondering this, I found a list of applications and services for twittering on the Twitter site. Yahoogling "rss to twitter" also turned up several scripts. There are services such as pingvine which do the job I wanted very nicely, and I almost signed up with it until I had an inspiration for developing the tool myself.

My feed aggregation toolchain already does almost all of the required processing; it took only a little extra tweaking to develop an rss-to-twitter tool.

It's simple to detect new postings on a feed if you do one thing: archive the old postings from the feed. Then, upon retrieving a new instance of the feed, it's a simple matter to step through the entries and check whether each one is in the archived feed. Any entries not in the archive are to be tweeted.
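The comparison itself can be sketched in a few lines of Groovy. This is only an illustration, not the toolchain's actual code: the Entry class and the findNewEntries method are made-up names, and it assumes that an entry's link is enough to identify it.

```groovy
// Hypothetical stand-in for a feed entry; the real toolchain wraps Rome's classes.
class Entry {
    String title
    String link
}

// Return the entries of `current` whose link does not appear in `archived`.
// Assumption: the link uniquely identifies an entry.
List<Entry> findNewEntries(List<Entry> current, List<Entry> archived) {
    Set<String> seen = archived.collect { it.link } as Set
    current.findAll { !(it.link in seen) }
}

def archivedEntries = [new Entry(title: 'Old post', link: 'http://example.com/1')]
def currentEntries = [
    new Entry(title: 'Old post', link: 'http://example.com/1'),
    new Entry(title: 'New post', link: 'http://example.com/2')
]

// Only the entry missing from the archive is reported.
def fresh = findNewEntries(currentEntries, archivedEntries)
fresh.each { println "new: ${it.title} ${it.link}" }
```

Collecting the archived links into a Set first keeps each lookup cheap even when the archive holds a long history.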

The resulting script, feedtweet.groovy, is pretty darn short and to the point.

The first requirement is met by a simple method in the twitter.com REST API that makes it trivial to post a tweet with the curl command.

Let's step through the script:

def theFeed = new aggregation.feed(args[0]);
File archiveFile = new File(args[1]);
theFeed.get();

The script takes two command-line arguments: the feed to check for new entries (args[0]) and the archive feed (args[1]). The feed class is a wrapper around Rome's SyndFeed class that makes it somewhat groovylike. It reads any news feed and turns it into a Java object hierarchy.

The archiveFeed method simply takes a newly retrieved feed and saves new entries into an archive copy of the feed. The original intent of archiveFeed was to support keeping a long history of items that had been posted on a feed, even if the items fall off the end of the feed.

A simple example of normal archiveFeed usage is:

feed theFeed = new feed(args[0]);        // the feed to archive
File archiveFile = new File(args[1]);    // the archive copy of that feed
def maxAge = 6 * 30;
theFeed.archive(archiveFile, maxAge);

For this rss-to-twitter service I added a new feature to archiveFeed: the ability to pass a closure that is called for each new item on the feed.

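def i = 0;
theFeed.archive(archiveFile, maxAge, { entry ->
    // Called once per new entry; the counter caps a single run at nine tweets.
    if (++i < 10) {
        def surl = utils.shortURL(entry.link);
        println "# to tweet: ${theFeed.title} ${entry.title} ${entry.link} ${surl}"
        // Post the status update by running curl as an external process.
        [ "curl", "-u", tweetID, "-d", "status=${surl} ${entry.title}", tweetURL ].execute();
    }
})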
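One thing to watch with the curl call is that a title containing characters like & or = can garble the posted form data. Here is a hedged sketch of building the argument list with curl's --data-urlencode option, which makes curl encode the value itself. The buildTweetCommand name, the credentials and the endpoint URL are placeholders for illustration, not values from the real script.

```groovy
// Build the curl argument list for a status update without running it.
// `credentials` and `endpoint` are placeholders supplied by the caller.
List<String> buildTweetCommand(String credentials, String endpoint, String shortUrl, String title) {
    String status = "${shortUrl} ${title}"
    // --data-urlencode asks curl to URL-encode the value after the '=' sign,
    // so characters such as '&' in a title do not break the form data.
    ['curl', '-u', credentials, '--data-urlencode', "status=${status}".toString(), endpoint]
}

def cmd = buildTweetCommand('user:password', 'http://example.invalid/update',
                            'http://sho.rt/abc', 'Solar & wind news')
println cmd.join(' ')
// Calling cmd.execute() would actually launch curl.
```

Keeping the command construction in its own method also makes it easy to print the command for a dry run before tweeting for real.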

This is really simple: the code above is executed for each new item in the feed. It keeps a counter to limit the number of tweets sent at a time, because twitter.com warns against sending too many tweets, no doubt to avoid overloading its servers. The curl invocation at the bottom is simply the command documented on the Twitter REST API page, transmogrified a bit into Groovy syntax.
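To make the callback idea concrete, here is a rough sketch of how an archive step might hand each new entry to a closure. It is not the real archiveFeed implementation: archiveNewEntries is a made-up name, entries are plain maps, and an in-memory list stands in for the archive file.

```groovy
// Hypothetical sketch: record unseen entries in the archive and report each
// one to the caller through the `onNew` closure.
def archiveNewEntries(List<Map> current, List<Map> archive, Closure onNew) {
    def seen = archive.collect { it.link } as Set
    current.each { entry ->
        if (!(entry.link in seen)) {
            archive << entry      // remember it so it is not reported again
            seen << entry.link
            onNew(entry)          // the caller decides what to do with it
        }
    }
}

def archiveList = []
def feedEntries = (1..12).collect {
    [title: "Post ${it}".toString(), link: "http://example.com/${it}".toString()]
}

def maxTweets = 9   // same effect as the `++i < 10` test in the script
def sent = []
archiveNewEntries(feedEntries, archiveList) { entry ->
    if (sent.size() < maxTweets) {
        sent << entry.title   // the real script runs curl here
    }
}

println "would tweet ${sent.size()} of ${archiveList.size()} new entries"
```

Note that every new entry is archived even when the cap stops it from being tweeted, so entries beyond the cap are not retried on the next run. The code in the script above appears to behave the same way.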

That's it, except that setting up the feed aggregation tools is not yet documented. (Hint: it uses Gant to compile and requires the Rome and JDOM jars to be available.)