The Twitter streaming API gives access to tweets as they are created. I decided to experiment by writing a data-mining script that collects geo-located tweets from the streaming API in which people say that they ‘like’ something. For this post I have created two visualisations that aim to make sense of the data collected so far and display it in an easily digestible way.
The script has been running for about 5 days in total and has collected 1873 tweets. It has parsed a lot more tweets than that, but a tweet is only added when it meets two criteria: it is geo-located in the UK and it contains a phrase such as ‘I like’, ‘I love’, ‘I am fond of’, etc. From the data, the second most ‘liked’ thing is Twitter itself, which is unsurprising; ‘My Life’ is 4th and ‘My vagina’ is 32nd. The most liked things on Twitter are ‘That song’ and ‘This song’, which provide a very uninformative insight into the people on Twitter.
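As a rough illustration, the two criteria could be sketched in Python like this. This is not the actual script: the bounding box values, the function names, the `coordinates` and `text` field names (modelled on a GeoJSON-style point with longitude first) and the rule of keeping the first two words after the phrase are all assumptions made for the example.

```python
import re

# Approximate bounding box for the UK (assumed values, not from the original script).
UK_BOUNDS = {"min_lon": -8.65, "max_lon": 1.77, "min_lat": 49.86, "max_lat": 60.86}

# Phrases that signal a 'like'; the capture group holds whatever follows the phrase.
LIKE_PATTERN = re.compile(r"\bI (?:like|love|am fond of)\s+(.+)", re.IGNORECASE)


def in_uk(tweet):
    """Return True if the tweet carries a point that falls inside the bounding box."""
    point = tweet.get("coordinates")
    if not point:
        return False  # not geo-located at all
    lon, lat = point["coordinates"]  # assumed order: [longitude, latitude]
    return (UK_BOUNDS["min_lon"] <= lon <= UK_BOUNDS["max_lon"]
            and UK_BOUNDS["min_lat"] <= lat <= UK_BOUNDS["max_lat"])


def extract_like(text, max_words=2):
    """Return the first few words after a 'like' phrase, lower-cased, or None."""
    match = LIKE_PATTERN.search(text)
    if not match:
        return None
    words = re.findall(r"[\w']+", match.group(1))[:max_words]
    return " ".join(words).lower() if words else None


def liked_thing(tweet):
    """Apply both criteria: geo-located in the UK and containing a 'like' phrase."""
    if not in_uk(tweet):
        return None
    return extract_like(tweet.get("text", ""))
```

Cutting the match off after a couple of words is crude, and it is one way that entries such as ‘That song’ or a bare ‘The’ could end up in the results.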
These two images are visualisations of the data. The first image is a ‘Tree Map’, which was quite simple to put together thanks to Google Chart Tools. The larger the square, the more likes, and vice versa. The data isn’t perfect (for example, ‘The’ is one of the most popular things), but I think overall it works quite well at extracting ‘likes’. The second visualisation is a Processing sketch that plots a user’s location along with what they have said they like.
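To give an idea of the step between the collected likes and the tree map, here is a minimal Python sketch that counts each liked thing and lays the totals out as rows of node, parent and size, which is the general shape a tree map chart expects. The function name `treemap_rows`, the root label and the `top_n` cut-off are made up for this example and are not taken from the real script.

```python
from collections import Counter


def treemap_rows(likes, root="All likes", top_n=50):
    """Count liked things and return [node, parent, size] rows for a tree map.

    `root` is a hypothetical label for the single parent node; it should not
    clash with any liked thing, since node names need to be unique.
    """
    counts = Counter(likes)
    # Header row, then the root node, which has no parent and no size of its own.
    rows = [["Thing", "Parent", "Count"], [root, None, 0]]
    for thing, count in counts.most_common(top_n):
        rows.append([thing, root, count])
    return rows
```

Each row under the root becomes one square, sized by its count, so the rows can be serialised to JSON and handed to the charting code.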
It doesn’t really provide any geographical insights yet, but it’s a starting point for something that could have a lot more potential for spotting geographical and cultural trends. For now, I will leave the script running for another month or so and then post again with some updated visualisations and hopefully some more interesting data from Twitter.