AmazonQA
- Carnegie Mellon University
Description
We introduce a new dataset and propose a method that combines information retrieval techniques for selecting relevant reviews (given a question) with "reading comprehension" models for synthesizing an answer (given a question and a review). Our dataset consists of 923k questions, 3.6M answers, and 14M reviews across 156k products. Building on the well-known Amazon dataset, we collect additional annotations that mark each question as either answerable or unanswerable based on the available reviews.
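
To make the retrieval step concrete, here is a minimal sketch of selecting review snippets for a question. It uses plain word overlap as a stand-in scoring function; this is an assumption chosen for illustration, not the scoring used by the dataset authors. The names `tokenize` and `rank_snippets` are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_snippets(question: str, snippets: List[str], k: int = 10) -> List[str]:
    """Return up to k snippets that share at least one token with the question.

    Snippets are ordered by the number of shared tokens (descending);
    ties keep their original order. Word overlap is only a stand-in
    for a real IR score.
    """
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(s)), i, s) for i, s in enumerate(snippets)]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [s for score, _, s in scored[:k] if score > 0]
```

A reading comprehension model would then take the question together with the selected snippets and produce an answer; that step is outside the scope of this sketch.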
Variables
| Name | Description |
|---|---|
| questionText | String. The question. |
| questionType | String. Either "yesno" for a boolean question, or "descriptive" for a non-boolean question. |
| review_snippets | List of strings. Extracted review snippets relevant to the question (at most ten). |
| answerText | String. The text for the answer. |
| answerType | String. Type of the answer. |
| helpful | List of two integers. The first integer indicates the number of users who found the answer helpful. The second integer indicates the total number of responses. |
| asin | String. Unique product ID for the product the question pertains to. |
| qid | Integer. Unique question ID (unique across the entire dataset). |
| category | String. Product category. |
| top_review_wilson | String. The review with the highest Wilson score. |
| top_review_helpful | String. The review voted as most helpful by the users. |
| is_answerable | Boolean. Output of the answerability classifier indicating whether the question is answerable using the review snippets. |
| top_sentences_IR | List of strings. The top sentences (at most ten), ranked by their information retrieval (IR) score against the question. |
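
As an illustration of how these variables might be used, the sketch below parses a single record and computes the share of helpful votes from the `helpful` field. It assumes a flat JSON object whose keys match the names in the table; the actual files may be laid out differently (for example, one object per line or answers nested under each question). The sample values and the helper name `helpful_fraction` are invented for this example.

```python
import json
from typing import Optional

# Hypothetical record: keys follow the variable table, values are invented.
SAMPLE = """
{
  "questionText": "Does this fit a 15 inch laptop?",
  "questionType": "yesno",
  "review_snippets": ["Fits my 15 inch laptop with room to spare."],
  "answerText": "Yes, it does.",
  "helpful": [3, 4],
  "qid": 1,
  "is_answerable": true
}
"""


def helpful_fraction(record: dict) -> Optional[float]:
    """Return helpful votes divided by total responses, or None if there are no responses."""
    helpful_votes, total_responses = record["helpful"]
    if total_responses == 0:
        return None
    return helpful_votes / total_responses


record = json.loads(SAMPLE)
fraction = helpful_fraction(record)
```

Returning `None` when there are no responses keeps unrated answers distinguishable from answers that received only unhelpful votes.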
Details
| Resource type | Open dataset |
| Title | AmazonQA |
| Creators | |
| Research Fields | Business Administration, Economics, Psychology, Sociology, Political Science, Economic & Social History, Communication Sciences, Educational Research, Other |
| Size | 4 GB |
| Formats | JSON format (.json) |
| License(s) | Undefined |
| External Resource | https://github.com/amazonqa/amazonqa |
| Companies | Amazon |