A Comprehensive Dataset of Citations with Identifiers from English Wikipedia (2023)
| Creators | |
|---|---|
| Publication date | 22-05-2023 |
| Description | This is a dataset of 40,664,485 citations extracted from the English Wikipedia dump of 20 February 2023 (https://dumps.wikimedia.org/enwiki/20230220/). The dataset is based purely on information from Wikipedia; labelled and annotated datasets will be added in follow-up versions. The source code used to extract the citations is available at https://github.com/albatros13/wikicite. That code is a fork of an earlier project on Wikipedia citation extraction: https://github.com/Harshdeep1996/cite-classifications-wiki. |
| Publisher | Zenodo |
| Organisations | |
| Document type | Dataset |
| Related publication | Wikipedia citations: A comprehensive data set of citations with identifiers extracted from English Wikipedia |
| DOI | https://doi.org/10.5281/zenodo.7958486 |
| Other links | https://zenodo.org/record/7958486 |
