Published October 23, 2023 | Version v2

The Economist Historical Advertisements - Industry Subset: Banking

  • 1. Universität Mannheim

Description


This dataset contains metadata for 92,592 historical banking-industry advertisements drawn from all 8,840 issues of The Economist magazine published between 1843 and 2014. It is part of a series of datasets related to The Economist Historical Archive (https://www.gale.com/intl/c/the-economist-historical-archive).

Files

TheEconomistHistoricalArchives-IndustrySubset-Banking.csv

Files (136.0 MB)

Variables

Name Description
Filename Unique identifier of this advertisement
URLs TheEconomistPageScans Comma-separated list of URLs to JPG image files of the scanned The Economist pages containing this ad. For multi-page ads, this can be multiple URLs.
Date of Issue Date of The Economist issue (Year-Month-Day)
Ad size (pages) e.g. 1 = one full page, 0.75 = 3/4 of a page, 2 = two pages
Ad size < 1/4 1 if Ad covers less than 25% of the page; 0 if Ad does not cover less than 25% of the page
1/4 <= Ad size < 2/4 1 if Ad covers at least 25% of the page, but less than 50%; 0 if Ad size is not in this range
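2/4 <= Ad size < 3/4 1 if Ad covers at least 50% of the page, but less than 75%; 0 if Ad size is not in this range
3/4 <= Ad size < 4/4 1 if Ad covers at least 75% of the page, but less than a full page; 0 if Ad size is not in this range
4/4 <= Ad size < 8/4 1 if Ad covers at least one full page, but less than two pages; 0 if Ad size is not in this range
8/4 <= Ad size 1 if Ad covers two pages or more; 0 if Ad covers less than two pages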
Bounding Box relative X1 X coordinate of the top-left corner of a rectangle identifying the ad on the page, relative to the pixel dimensions of the image from column 2 ("URLs …"). Multiply this value by the width of the image to get the absolute x coordinate. For a multi-page ad, the images from column 2 have to be horizontally concatenated first.
Bounding Box relative Y1 Y coordinate of the top-left corner of the rectangle, relative to the image height.
Bounding Box relative X2 X coordinate of the bottom-right corner of the rectangle, relative to the image width.
Bounding Box relative Y2 Y coordinate of the bottom-right corner of the rectangle, relative to the image height.
Feature Complexity (JPG file size in kb / Ad Size) More complex images will have higher values.
JPG File Size (Byte) e.g. 186609
OCR GoogleVision Advertisement text, based on text recognition using Google Vision API (2021) of the full ad image.
Brand Brand name of advertiser
Brand is generic (e.g. 'Notices') If "True" then this ad doesn't represent a single brand, but a category of ad-like content. Most common categories are "Notices", "Appointments", "Courses".
Text Class GoogleVision Classification of the ad based on its OCR text, using the Google Vision API (2021). This column contains a JSON string with a list of text classes and their class probabilities.
Category most confident, Level 1 Top level category from Google Vision text analysis for this ad. E.g. "/Finance"
Category most confident, Level 2 e.g. "/Finance/Banking"
Category most confident, Level 3 e.g. "/Finance/Banking/B2B"
Colorfulness (Hasler & Suesstrunk, 2003) Colorfulness of the ad, computed following Hasler & Suesstrunk (2003).
Color variety (Ke et al., 2006) Color variety of the ad, computed following Ke et al. (2006).
Brightness_Mean Mean of brightness values of all pixels in ad.
Brightness_SD Standard deviation of brightness values of all pixels.
Red_Mean Mean value of redness of all pixels.
Red_SD Standard deviation of redness of all pixels.
Green_Mean Mean value of greenness of all pixels.
Green_SD Standard deviation of greenness of all pixels.
Blue_Mean Mean value of blueness of all pixels.
Blue_SD Standard deviation of blueness of all pixels.
Text readability Gunning Fog Text readability measure according to the Gunning Fog index.
Text readability SMOG Text readability measure according to the SMOG index.
Text readability Flesch Reading Ease Text readability measure according to the Flesch Reading Ease score.
Text readability Dale Chall Text readability measure according to the Dale-Chall formula.
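
The URL list, the relative bounding box, and the ad-size indicator columns described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' processing code: the helper names (split_page_urls, to_pixel_box, size_bucket) are hypothetical, the exact CSV header strings are not assumed, the y coordinates are assumed to scale with the image height in the same way the x coordinates scale with the width, and the height of a concatenated multi-page image is assumed to be the tallest page.

```python
from typing import List, Sequence, Tuple


def split_page_urls(cell: str) -> List[str]:
    """Split the comma-separated URL cell into a list of page-scan URLs."""
    return [url.strip() for url in cell.split(",") if url.strip()]


def to_pixel_box(
    rel_box: Tuple[float, float, float, float],
    page_sizes: Sequence[Tuple[int, int]],
) -> Tuple[int, int, int, int]:
    """Convert relative (x1, y1, x2, y2) into absolute pixel coordinates.

    page_sizes holds one (width, height) pair per page scan, in reading order.
    For multi-page ads the pages are treated as horizontally concatenated,
    so the widths are summed. Using the tallest page as the height is an
    assumption of this sketch.
    """
    x1, y1, x2, y2 = rel_box
    total_width = sum(width for width, _ in page_sizes)
    height = max(height for _, height in page_sizes)
    return (
        round(x1 * total_width),
        round(y1 * height),
        round(x2 * total_width),
        round(y2 * height),
    )


def size_bucket(ad_size_pages: float) -> str:
    """Return the label of the size indicator that would be 1 for this ad size."""
    upper_bounds = [
        (0.25, "Ad size < 1/4"),
        (0.50, "1/4 <= Ad size < 2/4"),
        (0.75, "2/4 <= Ad size < 3/4"),
        (1.00, "3/4 <= Ad size < 4/4"),
        (2.00, "4/4 <= Ad size < 8/4"),
    ]
    for upper, label in upper_bounds:
        if ad_size_pages < upper:
            return label
    return "8/4 <= Ad size"
```

For example, a bounding box of (0.25, 0.5, 0.75, 1.0) on a single 1000 x 1400 pixel scan maps to the pixel rectangle (250, 700, 750, 1400), which could then be passed to an image library's crop function.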

Details

Resource type Open dataset
Title The Economist Historical Advertisements - Industry Subset: Banking
Creators
  • Kluge, Stefan¹
Contributors
  • Kluge, Stefan¹
  • Gehrmann, Leonie¹
  • Stahl, Florian¹
Research Fields Economic & Social History
Size 136.0 MB
Companies GALE
Industries Banking
Countries United States
Dates of collection October 23, 2023