Tweet Annotation Sensitivity Experiment 1
Creators
- LMU Munich
- Munich Center for Machine Learning
- University of Maryland
- RTI International
Description
We drew a stratified sample of 20 tweets that had been pre-annotated as Hate Speech / Offensive Language / Neither in a study by Davidson et al. (2017). The sample was stratified by majority-voted class and level of annotator disagreement.
We then recruited 1,000 Prolific workers, each of whom annotated all 20 tweets. Annotators were randomly assigned to one of six experimental conditions (versions A-F, recorded in the version variable), in which they assigned the labels Hate Speech / Offensive Language / Neither.
In addition, we collected a set of demographic variables (e.g. age and gender) and paradata (e.g. total task duration and duration per screen).
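The same kind of per-tweet summary used for the stratification (majority label and level of disagreement) can also be computed from the collected annotations. The sketch below is a minimal illustration, assuming the data are loaded into a pandas DataFrame with the tw1-20 label columns documented in the Variables table below; the disagreement measure (share of annotations deviating from the majority label) is an illustrative choice, not necessarily the one used for stratification.

```python
# Minimal sketch (not the authors' code): per-tweet majority label and a
# simple disagreement measure, computed from the collected annotations.
# Assumes a pandas DataFrame with one row per annotator and label columns
# tw1 ... tw20 as documented in the Variables table below.
import pandas as pd

def tweet_summary(df: pd.DataFrame, n_tweets: int = 20) -> pd.DataFrame:
    rows = []
    for i in range(1, n_tweets + 1):
        labels = df[f"tw{i}"].dropna()        # NA = missing or "don't know"
        counts = labels.value_counts()
        rows.append({
            "tweet": i,
            "majority_label": counts.idxmax(),              # ties: first wins
            "disagreement": 1 - counts.max() / counts.sum(),
            "n_annotations": int(counts.sum()),
        })
    return pd.DataFrame(rows)
```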
Variables
| Name | Description |
|---|---|
| id | annotator ID |
| age | Age |
| gender | Gender (1: Female, 2: Male, 3: Something Else, 4: Prefer not to say) |
| afam | African-American (0: No, 1: Yes) |
| asian | Asian-American (0: No, 1: Yes) |
| hispanic | Hispanic (0: No, 1: Yes) |
| white | White (0: No, 1: Yes) |
| race_other | Other race/ethnicity (0: No, 1: Yes) |
| race_not_say | Prefer not to say race/ethnicity (0: No, 1: Yes) |
| education | Highest educational attainment (1: Less than high school; 2: High school; 3: Some college; 4: College graduate; 5: Master's degree or professional degree (Law, Medicine, MPH, etc.); 6: Doctoral degree (PhD, DPH, EdD, etc.)) |
| sexuality | Sexuality (1: Gay or Lesbian, 2: Bisexual, 3: Straight, 4: Something Else) |
| english | English first language? (0: No, 1: Yes) |
| tw_use | Twitter use (1: Most days; 2: Most weeks, but not every day; 3: A few times a month; 4: A few times a year; 5: Less often; 6: Never) |
| social_media_use | Social media use (1: Most days; 2: Most weeks, but not every day; 3: A few times a month; 4: A few times a year; 5: Less often; 0: Never) |
| prolific_hours | Prolific hours worked last month |
| task_fun | Coding work was: fun (0: No, 1: Yes) |
| task_interesting | Coding work was: interesting (0: No, 1: Yes) |
| task_boring | Coding work was: boring (0: No, 1: Yes) |
| task_repetitive | Coding work was: repetitive (0: No, 1: Yes) |
| task_important | Coding work was: important (0: No, 1: Yes) |
| task_depressing | Coding work was: depressing (0: No, 1: Yes) |
| task_offensive | Coding work was: offensive (0: No, 1: Yes) |
| another_tweettask | Likelihood to do another tweet-related task (not at all: Not at all likely, somewhat: Somewhat likely, very: Very likely) |
| another_hatetask | Likelihood to do another hate speech-related task (not at all: Not at all likely, somewhat: Somewhat likely, very: Very likely) |
| page_history | Order in which annotator saw pages |
| date_of_first_access | Datetime of first access |
| date_of_last_access | Datetime of last access |
| duration_sec | Task duration in seconds |
| version | Version of the annotation task (A, B, C, D, E, F) |
| tw1-20 | Label assigned to Tweets 1-20 (hate speech: Hate Speech, offensive language: Offensive Language, neither: Neither HS nor OL, NA: Missing or "don't know") |
| tw_duration_1-20 | Annotation duration for Tweets 1-20, in milliseconds |
| num_approvals | Prolific data: number of previous task approvals of annotator |
| num_rejections | Prolific data: number of previous task rejections of annotator |
| prolific_score | Annotator quality score by Prolific |
| countryofbirth | Prolific data: Annotator country of birth |
| currentcountryofresidence | Prolific data: Annotator country of residence |
| employmentstatus | Prolific data: Annotator employment status (Full-time; Part-time; Unemployed (and job-seeking); Due to start a new job within the next month; Not in paid work (e.g. homemaker, retired or disabled); Other; DATA EXPIRED) |
| firstlanguage | Prolific data: Annotator first language |
| nationality | Prolific data: Nationality |
| studentstatus | Prolific data: Student status (Yes, No, DATA EXPIRED) |
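For analysis, the coded values above can be mapped back to their labels. A minimal sketch with pandas is given below; the value mappings are copied from the table, while the local file name is a placeholder (the hosted CSV is linked under External Resource in the Details section).

```python
# Sketch: decode a few of the coded columns using the value labels from the
# table above. The local file name is a placeholder; the hosted CSV is linked
# under External Resource in the Details section.
import pandas as pd

df = pd.read_csv("dataset_1st_study.csv")

value_labels = {
    "gender": {1: "Female", 2: "Male", 3: "Something Else", 4: "Prefer not to say"},
    "education": {1: "Less than high school", 2: "High school", 3: "Some college",
                  4: "College graduate", 5: "Master's or professional degree",
                  6: "Doctoral degree"},
    "tw_use": {1: "Most days", 2: "Most weeks, but not every day",
               3: "A few times a month", 4: "A few times a year",
               5: "Less often", 6: "Never"},
}

for column, mapping in value_labels.items():
    df[column + "_label"] = df[column].map(mapping)

print(df[["id", "version", "gender_label", "education_label", "tw_use_label"]].head())
```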
Details
| Field | Value |
|---|---|
| Resource type | Funded research project dataset |
| Title | Tweet Annotation Sensitivity Experiment 1 |
| Creators | LMU Munich, Munich Center for Machine Learning, University of Maryland, RTI International |
| Research Fields | Other Psychology |
| Size | 1.09 MB |
| Formats | Comma-separated values (CSV) (.csv) |
| External Resource | https://huggingface.co/datasets/soda-lmu/tweet-annotation-sensitivity-1/blob/main/dataset_1st_study.csv |
| Countries | United States |
| Dates of collection | December 2021 |
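The file listed under External Resource can also be retrieved programmatically. Below is a minimal sketch using huggingface_hub and pandas; the repository id and file name are taken from the link above, and treating the file as a plain CSV follows the Formats entry.

```python
# Sketch: download the CSV named in the External Resource link from the
# Hugging Face dataset repository and read it with pandas.
from huggingface_hub import hf_hub_download
import pandas as pd

path = hf_hub_download(
    repo_id="soda-lmu/tweet-annotation-sensitivity-1",
    filename="dataset_1st_study.csv",
    repo_type="dataset",
)
df = pd.read_csv(path)
print(df.shape)
```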
Additional Details
Related works
- Is cited by:
  - Conference paper: 10.1007/978-3-031-21707-4_19 (DOI)