Nobody's Papers
“Why We Twitter: Understanding Microblogging Usage and Communities” by Java et al.

A year and a half ago I spent a lot of time reading the foundational papers on Web Information Retrieval, in which the first analyses of how people use search engines (the main information retrieval systems for the Web) were performed.

Those papers showed that the average web user searches for information very differently from users of classic information retrieval systems: barely describing their goals, visiting few alternatives and spending little time.

Some months ago I was thinking about the quality and quantity of information that a search engine shows users to help them accomplish their goals. Then a discussion about Twitter with a colleague of mine made me think that a study of how and why people use Twitter would be very interesting.

Obviously you cannot figure out the reason why every single user is tweeting, but neither can we figure out the real motivation behind every single query submitted to a search engine, and we are still trying to understand that better.

My first thought was to apply Broder’s taxonomy to tweets, but it’s clear that Broder’s taxonomy can only be applied to Information Retrieval actions, not to Information Creation actions (such as a tweet). An equivalent taxonomy would have to be found, and that path led too far from my current interests.

Over the last few days I was thinking again about the information shown by a search engine, and Dani recommended this paper to me:

Why We Twitter: Understanding Microblogging Usage and Communities by Java et al. (Akshay Java, Xiaodan Song, Tim Finin and Belle Tseng) http://doi.acm.org/10.1145/1348549.1348556

In this paper the authors analyze a Twitter dataset that they collected over a period of two months, focusing on the network properties, the geographic distribution of the users and user intention.

Regarding user intention, the authors apply the HITS algorithm to find authorities (users with a high number of followers) and hubs (users who follow a lot of authorities). This way they identify three kinds of users:

  1. Those who share information.
  2. Those who seek information.
  3. Those who want to be connected with friends.

A user can belong to more than one of these categories. For example, I have a Twitter account because I want to stay in touch with my friends and I want to receive information about my interests.

Categories 1 and 2 can be handled with the HITS algorithm, but category 3 demands a community analysis, which is also performed in this paper, showing some examples of overlapping communities.
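To make the hub/authority idea a bit more concrete, here is a minimal sketch of HITS on a toy follower graph. The user names and the `hits` function are my own illustration, not the authors' code or data, and I'm assuming the usual formulation where an edge goes from the follower to the user being followed.

```python
import math

def hits(edges, iterations=50):
    """Toy HITS. `edges` is a list of (follower, followed) pairs."""
    nodes = sorted({user for edge in edges for user in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    def normalize(scores):
        # Scale the scores to unit Euclidean length so they stay bounded.
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {n: v / norm for n, v in scores.items()}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of the user's followers.
        new_auth = {n: 0.0 for n in nodes}
        for follower, followed in edges:
            new_auth[followed] += hub[follower]
        auth = normalize(new_auth)

        # Hub score: sum of the authority scores of the users one follows.
        new_hub = {n: 0.0 for n in nodes}
        for follower, followed in edges:
            new_hub[follower] += auth[followed]
        hub = normalize(new_hub)

    return hub, auth

# Hypothetical follower graph: (who follows, who is followed).
follows = [
    ("ana", "news"), ("bob", "news"), ("cat", "news"),
    ("ana", "tech"), ("bob", "tech"),
    ("ana", "bob"),
]
hub, auth = hits(follows)
top_authority = max(auth, key=auth.get)  # candidate "information source"
top_hub = max(hub, key=hub.get)          # candidate "information seeker"
```

In this toy graph the account followed by everybody ends up as the top authority, and the user who follows the most authorities ends up as the top hub, which roughly matches categories 1 and 2 above.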

A study of term trends (in fact, this is the part related to my current research) was also performed, using a log-likelihood measure to find descriptive terms for a day.
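As a rough illustration of how such a measure can pick out the terms of a day, here is a small sketch. I'm assuming a common two-corpus form of the log-likelihood statistic (the day's tweets against all the other tweets); the exact formula used in the paper may differ, and the function names and the sample tokens are made up.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Log-likelihood score for a term seen `a` times in a corpus of
    `c` tokens and `b` times in a reference corpus of `d` tokens."""
    # Expected counts if the term were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / e1)
    if b > 0:
        score += b * math.log(b / e2)
    return 2 * score

def descriptive_terms(day_tokens, other_tokens, top=3):
    """Rank the terms that are over-represented on the given day."""
    day, other = Counter(day_tokens), Counter(other_tokens)
    c, d = len(day_tokens), len(other_tokens)
    scores = {}
    for term, a in day.items():
        b = other[term]
        # Keep only terms relatively more frequent on this day.
        if a / c > b / d:
            scores[term] = log_likelihood(a, b, c, d)
    return sorted(scores, key=scores.get, reverse=True)[:top]

# Made-up tokens: one day versus the rest of the collection.
day_tokens = "earthquake earthquake earthquake coffee work".split()
other_tokens = "coffee work coffee lunch work coffee work lunch".split()
terms = descriptive_terms(day_tokens, other_tokens)
```

Everyday words like "coffee" or "work" are not over-represented on that day, so only the unusual term survives the filter.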

I think this is a very interesting paper and I’m quite sure that we’re going to see more studies about Twitter and other microblogging communities.

Akshay Java, Xiaodan Song, Tim Finin, & Belle Tseng (2007). Why we twitter: understanding microblogging usage and communities. Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, 56-65. DOI: 10.1145/1348549.1348556
