Offensive Language Identification Dataset - OLID

This is the homepage for the Offensive Language Identification Dataset (OLID) by Zampieri et al. (2019).
 
OLID contains a collection of tweets annotated using a model that encompasses the following three levels:
 
A: Offensive Language Detection
B: Categorization of Offensive Language
C: Offensive Language Target Identification
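 
As a rough illustration of how the three levels could be represented together, here is a minimal Python sketch of one annotated tweet. It assumes the levels are hierarchical (B applies only to offensive tweets, C only to targeted ones) and uses the tags OFF/NOT, TIN/UNT and IND/GRP/OTH; the class name, field names, and the consistency check are hypothetical and are not part of any official OLID tooling, so please check the paper and the released files for the actual format.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed label sets, for illustration only; verify against the paper.
LEVEL_A = {"OFF", "NOT"}         # A: offensive vs. not offensive
LEVEL_B = {"TIN", "UNT"}         # B: targeted vs. untargeted offense
LEVEL_C = {"IND", "GRP", "OTH"}  # C: individual, group, or other target


@dataclass
class Annotation:
    """Hypothetical record for one tweet and its three-level labels."""
    text: str
    level_a: str
    level_b: Optional[str] = None
    level_c: Optional[str] = None

    def is_consistent(self) -> bool:
        """Check the assumed hierarchy: B needs A == OFF, C needs B == TIN."""
        if self.level_a not in LEVEL_A:
            return False
        if self.level_a == "NOT":
            # Non-offensive tweets carry no B or C label.
            return self.level_b is None and self.level_c is None
        if self.level_b not in LEVEL_B:
            return False
        if self.level_b == "UNT":
            # Untargeted offenses carry no C label.
            return self.level_c is None
        return self.level_c in LEVEL_C
```

Under these assumptions, `Annotation("some tweet", "OFF", "TIN", "IND").is_consistent()` holds, while a record labeled NOT at level A with any level B label does not.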
 
 
The complete dataset will be available for download from this webpage soon.
 
More information about OLID can be found in the NAACL 2019 paper "Predicting the Type and Target of Offensive Posts in Social Media".
 
If you use OLID, please cite this paper:
 

@inproceedings{zampierietal2019, 
    title={{Predicting the Type and Target of Offensive Posts in Social Media}}, 
    author={Zampieri, Marcos and Malmasi, Shervin and Nakov, Preslav and Rosenthal, Sara and Farra, Noura and Kumar, Ritesh}, 
    booktitle={Proceedings of NAACL}, 
    year={2019}
}