Release 2016-11-22 of the xLiMe Brexit dataset.

== Dataset description ==

This dataset consists of metadata about:
 - one thousand tv programs
 - 14 million social-media posts and
 - 277 thousand news articles
which are (very likely) related to the "Brexit" EU referendum held on June 23rd in the UK. The media items cover various languages (primarily English, German and Spanish). The dataset was gathered in June 2016 as part of the xLiMe project [1].

Besides the metadata about the media items, the dataset has also been enriched with:
 - 48 thousand ASR annotations (i.e. transcriptions) for 886 of the tv programs (those in English and German)
 - 289 flag annotations (either the EU or the UK flag) for 12 tv programs
 - around 55 million entity annotations from texts to DBpedia entities:
   + 41.4 million annotations for social media posts
   + 17.9 million annotations for news articles
   + 171 thousand annotations for ASR transcriptions

== Format and loading instructions ==

The dataset consists of various MongoDB [2] dumps. To import them, install MongoDB (we have used version 3.2) and execute `mongorestore` commands for each of the datadumps you want to load. E.g. for loading only the metadata about the 277 thousand NewsArticleBeans, you should execute the following command:

    mongorestore --db xlime-brexit --collection NewsArticleBean --gzip brexit-no-fulltext/NewsArticleBean.bson.gz

== Contents ==

 - readme.txt: this file
 - brexit-no-fulltext: folder containing the mongoDB dumps for 14 million microposts and 277 thousand news articles. Due to copyright issues, we are not able to provide the full-text for microposts and news articles. Please contact us [1] if you want to request access to the full-text.
 - brexit-social-entity-anns: MongoDB dumps for the 41.4 million DBpedia entity annotations of the 14 million microposts (entities linked to their full texts).
 - brexit-news-entity-anns: MongoDB dumps for the 17.9 million DBpedia entity annotations of the 277 thousand news articles (entities linked to their full texts).
 - brexit-tv-asr-anns: MongoDB dumps for the one thousand tv programs, the 48 thousand ASR annotations and the 171 thousand DBpedia entity annotations (entities linked to the 48 thousand ASR transcriptions).
 - brexit-flag-anns: MongoDB dumps for the 289 flag annotations (for 12 of the tv programs).

[1] http://xlime.eu
[2] https://www.mongodb.com/
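
The single `mongorestore` command from the loading instructions can be repeated for every dump in a folder. The following is only a minimal sketch, not part of the release: it assumes that each dump file is named <Collection>.bson.gz (as in the NewsArticleBean example) and that the collection name should match the file name. The names `restore_dir`, `DUMP_DIR` and `RESTORE` are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sketch: restore every gzipped BSON dump found in one dataset folder.
# Assumption: dumps are named <Collection>.bson.gz, as in the
# NewsArticleBean example; other file layouts are not handled.

DB="xlime-brexit"                            # database name from the readme example
DUMP_DIR="${DUMP_DIR:-brexit-no-fulltext}"   # folder to load (hypothetical override)
RESTORE="${RESTORE:-mongorestore}"           # hypothetical override, e.g. RESTORE=echo

restore_dir() {
    for dump in "$1"/*.bson.gz; do
        [ -e "$dump" ] || continue           # skip when the glob matches nothing
        # Derive the collection name from the file name.
        collection=$(basename "$dump" .bson.gz)
        $RESTORE --db "$DB" --collection "$collection" --gzip "$dump"
    done
}

restore_dir "$DUMP_DIR"
```

Setting RESTORE=echo prints the commands instead of running them, which may be useful for checking the derived collection names before importing several million documents.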