NELA-GT-2018: A Large Multi-Labelled News Dataset for the Study of Misinformation in News Articles

Authors

  • Jeppe Nørregaard, Technical University of Denmark
  • Benjamin D. Horne, Rensselaer Polytechnic Institute
  • Sibel Adalı, Rensselaer Polytechnic Institute

DOI:

https://doi.org/10.1609/icwsm.v13i01.3261

Abstract

In this paper, we present a dataset of 713k articles collected between 02/2018 and 11/2018. These articles were collected directly from 194 news and media outlets, including mainstream, hyper-partisan, and conspiracy sources. We incorporate ground truth ratings of the sources from 8 different assessment sites covering multiple dimensions of veracity, including reliability, bias, transparency, adherence to journalistic standards, and consumer trust. The NELA-GT-2018 dataset can be found at https://doi.org/10.7910/DVN/ULHLCB.

Published

2019-07-06

How to Cite

Nørregaard, J., Horne, B. D., & Adalı, S. (2019). NELA-GT-2018: A Large Multi-Labelled News Dataset for the Study of Misinformation in News Articles. Proceedings of the International AAAI Conference on Web and Social Media, 13(01), 630-638. https://doi.org/10.1609/icwsm.v13i01.3261