Published 2010 | Version v2
Journal article

You Are Where You Tweet: A Content-Based Approach to Geo-locating Twitter Users

Description

We propose and evaluate a probabilistic framework for estimating a Twitter user's city-level location based purely on the content of the user's tweets, even in the absence of any other geospatial cues. By augmenting the massive human-powered sensing capabilities of Twitter and related microblogging services with content-derived location information, this framework can overcome the sparsity of geo-enabled features in these services and enable new location-based personalized information services, the targeting of regional advertisements, and so on. Three of the key features of the proposed approach are: (i) its reliance purely on tweet content, meaning no need for user IP information, private login information, or external knowledge bases; (ii) a classification component for automatically identifying words in tweets with a strong local geo-scope; and (iii) a lattice-based neighborhood smoothing model for refining a user's location estimate. The system estimates k possible locations for each user in descending order of confidence. On average we find that the location estimates converge quickly (needing just 100s of tweets), placing 51% of Twitter users within 100 miles of their actual location.
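
To make the idea of content-only, top-k location estimation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple scoring rule in which each word in a user's tweets votes for cities in proportion to how often that word was seen in each city in labeled training data, weighted by the word's share of the user's tweets. The function names (`train_word_city_counts`, `top_k_cities`, `miles_between`), the whitespace tokenization, and the toy data are all hypothetical. The local-word classifier and the lattice-based smoothing described above are omitted.

```python
import math
from collections import Counter, defaultdict


def train_word_city_counts(labeled_users):
    """Count how often each word appears in tweets from each city.

    labeled_users: iterable of (city, list_of_tweet_strings) pairs (assumed format).
    Returns a dict mapping word -> Counter of city -> count.
    """
    counts = defaultdict(Counter)
    for city, tweets in labeled_users:
        for tweet in tweets:
            for word in tweet.lower().split():  # naive tokenizer (assumption)
                counts[word][city] += 1
    return counts


def top_k_cities(tweets, counts, k=3):
    """Rank cities for one user, highest score first.

    score(city) = sum over words of p(city | word) * p(word in user's tweets).
    This scoring rule is an illustrative assumption.
    """
    words = Counter(w for t in tweets for w in t.lower().split())
    total = sum(words.values())
    scores = defaultdict(float)
    for word, n in words.items():
        city_counts = counts.get(word)
        if not city_counts:
            continue  # word never seen in training data
        word_total = sum(city_counts.values())
        for city, c in city_counts.items():
            scores[city] += (c / word_total) * (n / total)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


def miles_between(a, b):
    """Great-circle (haversine) distance in miles between (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))  # 3958.8 = mean Earth radius in miles


# Toy training data, invented purely for illustration.
training = [
    ("Houston", ["howdy rockets tonight", "rodeo traffic"]),
    ("Seattle", ["rain and coffee", "seahawks game tonight"]),
]
counts = train_word_city_counts(training)
ranked = top_k_cities(["rockets rodeo tonight"], counts, k=2)
```

An estimate could then be counted as correct when `miles_between` applied to the top-ranked city's coordinates and the user's actual coordinates is at most 100, which mirrors the 100-mile criterion reported above.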

Details

Title You Are Where You Tweet: A Content-Based Approach to Geo-locating Twitter Users
Authors
  • Cheng, Zhiyuan
  • Caverlee, James
  • Lee, Kyumin
Publisher CIKM '10: Proceedings of the 19th ACM International Conference on Information and Knowledge Management
Year of publication 2010