Published 2015 | Version v2
Open dataset

Yelp Business Review & Images Dataset

Creators

Yelp, Inc.

Description

The Yelp dataset is a subset of Yelp's business, review, and user data, provided for personal, educational, and academic use. It contains 6.9M online reviews of 150k businesses, along with more than 200,000 images related to the reviews.

The data consists of multiple sub datasets:

  1. Yelp Business data: Contains business data including location data, attributes, and categories.
  2. Yelp Review data: Contains full review text data including the user_id that wrote the review and the business_id the review is written for.
  3. Yelp User data: User data including the user's friend mapping and all the metadata associated with the user.
  4. Yelp Checkin data: Checkins on a business.
  5. Yelp Tip data: Tips written by a user on a business. Tips are shorter than reviews and tend to convey quick suggestions.
  6. Yelp Photo data: Contains photo data including the caption and classification (one of "food", "drink", "menu", "inside" or "outside").
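Because reviews carry both a user_id and a business_id, records from the sub datasets can be linked on those keys. The following is a minimal Python sketch of that idea, not an official loader: the sample records, the short ids, and the helper names `mean_review_stars` and `attach_names` are hypothetical and only mimic the fields listed under Variables.

```python
from collections import defaultdict

# Hypothetical sample records shaped like the business and review sub datasets.
# Real ids are 22-character strings; short ids are used here for readability.
businesses = [
    {"business_id": "b1", "name": "Taco Place"},
    {"business_id": "b2", "name": "Burger Spot"},
]
reviews = [
    {"review_id": "r1", "user_id": "u1", "business_id": "b1", "stars": 4},
    {"review_id": "r2", "user_id": "u2", "business_id": "b1", "stars": 5},
    {"review_id": "r3", "user_id": "u1", "business_id": "b2", "stars": 2},
]


def mean_review_stars(review_records):
    """Group review star ratings by business_id and average them."""
    grouped = defaultdict(list)
    for review in review_records:
        grouped[review["business_id"]].append(review["stars"])
    return {bid: sum(stars) / len(stars) for bid, stars in grouped.items()}


def attach_names(business_records, averages):
    """Replace business_id keys with business names where a match exists."""
    name_by_id = {b["business_id"]: b["name"] for b in business_records}
    return {
        name_by_id[bid]: avg
        for bid, avg in averages.items()
        if bid in name_by_id
    }


averages_by_name = attach_names(businesses, mean_review_stars(reviews))
```

The same pattern applies to user_id when linking reviews or tips to user records.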

Available as JSON files, the dataset can be used to teach students about databases, to learn NLP, or as sample production data while you learn how to make mobile apps.
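As a starting point for working with the files, here is a small Python sketch for reading them record by record. It assumes each file stores one JSON object per line (JSON Lines), which this page does not state, so check the downloaded files first. The helper name `iter_records` and the file name in the comment are hypothetical.

```python
import json
from typing import Iterator


def iter_records(path: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of a file.

    Assumption: the file holds one JSON object per line (JSON Lines).
    Reading line by line avoids loading a multi-gigabyte file into memory.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Example usage with a hypothetical file name:
# for business in iter_records("business.json"):
#     print(business["name"], business["stars"])
```

Streaming records this way keeps memory use low, which matters given the 8.9 GB total size listed under Details.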

Variables

Name                Description
business_id         22-character unique string business id
name                The business's name
address             The full address of the business
city                The city where the business is
state               2-character state code, if applicable
postal_code         The postal code of the business
latitude            Latitude of the reviewed business
longitude           Longitude of the reviewed business
stars               Star rating of the business, rounded to half-stars
review_count        Number of reviews of the business
is_open             0 or 1 for closed or open business, respectively
attributes          An object mapping business attributes to values, e.g., RestaurantsTakeOut and BusinessParking
categories          An array of strings of business categories, e.g., "Mexican", "Burgers", "Gastropubs"
hours               An object mapping each day to its hours, e.g., "Monday": "10:00-21:00"
review_id           22-character unique review id
user_id             22-character unique user id
stars               Star rating given in the review
date                Review date, formatted YYYY-MM-DD
text                The review itself
useful              Number of useful votes received
funny               Number of funny votes received
cool                Number of cool votes received
name                The user's first name
review_count        The number of reviews the user has written
yelping_since       When the user joined Yelp, formatted like YYYY-MM-DD
friends             An array of the user's friends as user_ids
useful              Number of useful votes sent by the user
funny               Number of funny votes sent by the user
cool                Number of cool votes sent by the user
fans                Number of fans the user has
elite               The years the user was elite
average_stars       Average rating of all reviews provided by a user
compliment_hot      Number of hot compliments received by the user
compliment_more     Number of more compliments received by the user
compliment_profile  Number of profile compliments received by the user
compliment_cute     Number of cute compliments received by the user
compliment_list     Number of list compliments received by the user
compliment_note     Number of note compliments received by the user
compliment_plain    Number of plain compliments received by the user
compliment_cool     Number of cool compliments received by the user
compliment_funny    Number of funny compliments received by the user
compliment_writer   Number of writer compliments received by the user
compliment_photos   Number of photo compliments received by the user
date                A comma-separated list of timestamps for each checkin, each with format YYYY-MM-DD HH:MM:SS
text                Text of the tip
date                When the tip was written, formatted like YYYY-MM-DD
compliment_count    How many compliments a tip has
photo_id            22-character unique photo id
caption             The photo caption, if any
label               The category the photo belongs to, if any, e.g., "food"
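Two of the variables above pack structured values into strings: the checkin date field (a comma-separated list of timestamps) and the hours values (an open-close range). The Python sketch below shows one way to parse them, assuming the formats are exactly as described in the table. The function names and sample values are hypothetical.

```python
from datetime import datetime, time
from typing import List, Tuple


def parse_checkins(date_field: str) -> List[datetime]:
    """Split a checkin date string into datetime objects.

    Assumes timestamps are comma-separated and formatted YYYY-MM-DD HH:MM:SS.
    """
    return [
        datetime.strptime(part.strip(), "%Y-%m-%d %H:%M:%S")
        for part in date_field.split(",")
        if part.strip()
    ]


def parse_hours(value: str) -> Tuple[time, time]:
    """Turn an hours string such as "10:00-21:00" into (opens, closes)."""
    opens, closes = value.split("-")
    return (
        datetime.strptime(opens.strip(), "%H:%M").time(),
        datetime.strptime(closes.strip(), "%H:%M").time(),
    )


# Hypothetical sample values in the documented formats.
checkins = parse_checkins("2016-04-26 19:49:16, 2016-08-30 18:36:57")
monday = parse_hours("10:00-21:00")
```

Once parsed, the values can be compared, sorted, or bucketed by hour or weekday like any other datetime data.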

Details

Resource type Open dataset
Title Yelp Business Review & Images Dataset
Creators
  • Yelp, Inc.
Research Fields Business Administration, Economics, Psychology, Sociology, Political Science, Economic & Social History, Communication Sciences, Educational Research
Size 8.9 GB; 6,990,280 reviews; 200,100 pictures
Formats JSON format (.json), Comma-separated values (CSV) (.csv)
License(s) Custom License by Yelp
External Resource https://www.yelp.com/dataset/download
Companies Yelp
Industries Social Media