Published 2019 | Version v3
Open dataset
MUStARD: Multimodal Sarcasm Detection Dataset
Description
We release the MUStARD dataset, a multimodal video corpus for research in automated sarcasm discovery. The dataset is compiled from popular TV shows, including Friends, The Golden Girls, The Big Bang Theory, and Sarcasmaholics Anonymous. MUStARD consists of audiovisual utterances annotated with sarcasm labels. Each utterance is accompanied by its context: the preceding utterances, which provide additional information about the scenario in which the utterance occurs.
Files
bert-input.txt__100lines.txt
Total size: 11.9 kB
Checksum | Size |
---|---|
md5:acd065175cb3b51f7880790d1575f20b | 8.4 kB |
md5:2014df5b783f6fc7be77c3c51b44df1c | 3.6 kB |
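The `md5:` values in the file listing can be used to check that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_hex` and `md5_matches` are hypothetical, and any local file path is an assumption, since the listing does not show file names.

```python
import hashlib


def md5_hex(path, chunk_size=8192):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, listed):
    """Compare a file's digest with a listed value such as 'md5:acd0...'.

    The 'md5:' prefix is optional; comparison is case-insensitive.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_hex(path) == expected
```

For example, `md5_matches("downloaded_file", "md5:acd065175cb3b51f7880790d1575f20b")` would return `True` only if the local copy (here a placeholder path) has exactly the listed checksum.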
Variables
Name | Description |
---|---|
utterance | The text of the target utterance to classify. |
speaker | Speaker of the target utterance. |
context | List of utterances (in chronological order) preceding the target utterance. |
context_speakers | Respective speakers of the context utterances. |
sarcasm | Binary label indicating whether the target utterance is sarcastic. |
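To make the field layout concrete, here is a minimal sketch of reading records in Python. The sample record is invented for illustration and is not taken from the dataset. The top-level layout, a JSON object mapping an utterance ID to a record with the fields above, is an assumption that this page does not confirm; if the file turns out to be a plain list, iterate over it directly instead of calling `.values()`.

```python
import json

# Invented sample in the ASSUMED layout {utterance_id: record}.
# Only the field names come from the Variables table.
SAMPLE_JSON = """
{
  "example_1": {
    "utterance": "Oh great, another meeting.",
    "speaker": "PERSON_A",
    "context": ["We have a meeting at five.", "It should only take three hours."],
    "context_speakers": ["PERSON_B", "PERSON_B"],
    "sarcasm": true
  }
}
"""


def iter_examples(data):
    """Yield (turns, label) pairs.

    `turns` is a list of (speaker, text) tuples: the context in
    chronological order, followed by the target utterance.
    `label` is 1 for sarcastic, 0 otherwise.
    """
    for record in data.values():
        turns = list(zip(record["context_speakers"], record["context"]))
        turns.append((record["speaker"], record["utterance"]))
        yield turns, int(record["sarcasm"])


data = json.loads(SAMPLE_JSON)
for turns, label in iter_examples(data):
    for speaker, text in turns:
        print(f"{speaker}: {text}")
    print("sarcasm label:", label)
```

Pairing `context` with `context_speakers` by position relies on the two lists being aligned, which is what "respective speakers" in the table suggests.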
Details
Resource type | Open dataset |
Title | MUStARD: Multimodal Sarcasm Detection Dataset |
Creators | |
Size | 11.9 kB |
Formats | JSON format (.json) |
License(s) | no license information available |
External Resource | https://github.com/soujanyaporia/MUStARD#mustard-multimodal-sarcasm-detection-dataset |