Published September 24, 2023 | Version v2

The Economist Historical Advertisements - Master Dataset

  • 1. Universität Mannheim
  • 2. KIT

Description

This dataset contains metadata for 512,599 historical advertisements from all 8,840 issues of The Economist magazine published between 1843 and 2014. It is part of a series of datasets related to The Economist Historical Archive (https://www.gale.com/intl/c/the-economist-historical-archive). You will need this Master Dataset if you want to work with any of the related datasets.

Files

Name: TheEconomistHistoricalArchives-MasterDataset.csv
Size: 195.4 MB
md5: 29c9c81af2e47a3b1ec6590cd1fc8f36
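
The MD5 checksum listed for the file can be used to confirm that a download is complete and uncorrupted. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_expected` are illustrative and not part of the dataset, and the local file path is an assumption about where the CSV was saved.

```python
import hashlib

# Checksum as published on this page for the master dataset CSV.
EXPECTED_MD5 = "29c9c81af2e47a3b1ec6590cd1fc8f36"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for a file of roughly 195 MB.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("TheEconomistHistoricalArchives-MasterDataset.csv")` on the downloaded file should return `True` if the file is intact, assuming it was saved under its original name in the working directory.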

Variables

Name: Description
Filename: Unique identifier of this advertisement.
URLs TheEconomistPageScans: Comma-separated list of URLs to JPG image files of the scanned The Economist pages containing this ad. For multi-page ads this can be multiple URLs.
Date of Issue: Date of The Economist issue (Year-Month-Day).
Bounding Box relative X1: X coordinate of the top-left corner of a rectangle identifying the ad on the page, expressed relative to the pixel dimensions of the image from column 2 ("URLs …"). Multiply this value by the width of the image to get the absolute x coordinate. If the ad is a multi-page ad, the images from column 2 have to be horizontally concatenated first.
Bounding Box relative Y1: Y coordinate of the top-left corner.
Bounding Box relative X2: X coordinate of the bottom-right corner.
Bounding Box relative Y2: Y coordinate of the bottom-right corner.
Brand: Brand name of the advertiser.
Brand is generic (e.g. 'Notices'): If "True", this ad does not represent a single brand but a category of ad-like content. The most common categories are "Notices", "Appointments", and "Courses".
OCR GoogleVision: Advertisement text, obtained by running text recognition with the Google Vision API (2021) on the full ad image.
Text Class GoogleVision: Classification of the ad based on its OCR text, using the Google Vision API (2021). See full list of categories. This column contains a JSON string with a list of text classes and their class probabilities.
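
The relative bounding box columns can be converted to absolute pixel coordinates as in the sketch below. It rests on several assumptions that the description does not state explicitly: that the Y values are scaled by the image height in the same way the X values are scaled by the width, that pages of a multi-page ad are concatenated left to right in the order of their URLs, and that the concatenated canvas is as tall as the tallest page. The function name `to_absolute_box` and the `page_sizes` parameter are hypothetical.

```python
def to_absolute_box(x1, y1, x2, y2, page_sizes):
    """Convert a relative bounding box to absolute pixel coordinates.

    page_sizes is a list of (width, height) tuples, one per page image,
    in the order the URLs are listed. For a multi-page ad the pages are
    treated as horizontally concatenated, so the widths are summed.
    Assumption: Y values scale with the image height.
    """
    if not page_sizes:
        raise ValueError("page_sizes must contain at least one (width, height)")
    total_width = sum(width for width, _ in page_sizes)
    height = max(page_height for _, page_height in page_sizes)
    return (
        round(x1 * total_width),
        round(y1 * height),
        round(x2 * total_width),
        round(y2 * height),
    )


# Single-page ad on a 1000 x 1500 pixel scan.
single = to_absolute_box(0.1, 0.2, 0.5, 0.6, [(1000, 1500)])

# Two-page ad: both scans are placed side by side before scaling.
double = to_absolute_box(0.25, 0.1, 0.75, 0.9, [(1000, 1500), (1000, 1500)])
```

The resulting (left, top, right, bottom) tuple follows the ordering that common image libraries expect for cropping, for example Pillow's `Image.crop`.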

Details

Resource type: Open dataset
Title: The Economist Historical Advertisements - Master Dataset
Creators:
  • Kluge, Stefan (1)
  • Gehrmann, Leonie (1)
  • Stahl, Florian (1)
Contributors:
  • Kluge, Stefan (1)
  • Gehrmann, Leonie (1)
  • Stahl, Florian (1)
  • Knäble, Merlin (2)
  • Mädche, Alexander (2)
Research Fields: Economic & Social History
Size: 195.4 MB
Formats: Comma-separated values (CSV) (.csv)
License(s): Creative Commons Attribution 4.0 International
Companies: GALE
Industries: Banking, Tobacco, Watches, Airlines, Automotive, Insurance
Countries: United States
Dates of collection: October 23, 2023