Francis-Bot goes Fully Automatic!
January 4th, 2009 by David
It lives! It lives! francishates, my Twitter bot that I’ve been working on over the holidays, has finally gained sentience!
Well, not exactly sentience. Poor guy can’t even handle grammar properly. But he does know what he hates, and he’ll tell you if you want to listen. Every 6 hours he will update his Twitter status with something new that he hates.
I did a lot of thinking about the ‘best’ way to make his posts random, funny and at the same time relevant to a wide audience. My first plan was just to have a script pull random words from the dictionary and use those. Whilst this would be funny, the chances are most words picked would be meaningless to most people most of the time. So the dictionary idea folded. My second idea was to scrape feeds from various sites and analyze those for suitable content. The problem with that was that if I chose the feeds, all Francis would ever talk about would be technology, video games and the BBC. I settled on Digg as something of a lowest common denominator. The popular stories on there have already been vetted by the general public and so are far more likely to be relevant. The people, places and things within those stories stand a higher chance of being familiar to readers. Also, it has a cool API that I wanted to try out.
Mashing the Digg stories into a single Twitter post was accomplished with a really nice little POS tagger written in Python that I came across. It tags all the words in the text I give it with their parts of speech: adjectives, verbs, nouns, that sort of thing. Then I do a little bit of magic to select what is (hopefully) a coherent sentence, tack ‘I hate’ on the front and spit it out into Twitter via their (awesome) API. The code is far from perfect and will probably get tweaked a little over time to make it more coherent in more cases.
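To give a flavour of the kind of ‘magic’ involved, here is a minimal sketch in Python. It is not Francis’s actual code: it assumes the tagger hands back (word, tag) pairs using Penn Treebank-style tags (JJ for adjectives, NN for nouns), and the function names and the "longest run of adjectives and nouns" rule are made up purely for illustration.

```python
from typing import List, Optional, Tuple

# Assumption: the tagger yields (word, tag) pairs with Penn Treebank-style tags.
TaggedWord = Tuple[str, str]

MAX_POST_LENGTH = 140  # Twitter's character limit at the time of writing


def longest_noun_phrase(tagged: List[TaggedWord]) -> List[str]:
    """Return the longest run of adjectives/nouns that ends on a noun."""
    best: List[str] = []
    current: List[TaggedWord] = []
    # The sentinel at the end flushes whatever run is still open.
    for word, tag in tagged + [("", "END")]:
        if tag.startswith("JJ") or tag.startswith("NN"):
            current.append((word, tag))
            continue
        # The run has ended: drop trailing adjectives so it ends on a noun.
        while current and not current[-1][1].startswith("NN"):
            current.pop()
        if len(current) > len(best):
            best = [w for w, _ in current]
        current = []
    return best


def make_hate_post(tagged: List[TaggedWord]) -> Optional[str]:
    """Build an 'I hate ...' post, or None if no noun phrase was found."""
    phrase = longest_noun_phrase(tagged)
    if not phrase:
        return None
    return ("I hate " + " ".join(phrase))[:MAX_POST_LENGTH]


# Hypothetical headline, already tagged.
headline = [
    ("Giant", "JJ"), ("robots", "NNS"), ("invade", "VBP"), ("the", "DT"),
    ("quiet", "JJ"), ("English", "JJ"), ("countryside", "NN"),
]
print(make_hate_post(headline))  # I hate quiet English countryside
```

Picking the longest run is a crude stand-in for "most interesting", and it explains the dodgy grammar: nothing here checks articles or plurals, so the output is only coherent when the source phrase happens to be.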
So with all the technical stuff out of the way, I hope you will choose to follow Francis on his crusade of hate and find yourself amused by it. Enjoy!




