Computer Science > Computation and Language
Title: Automatic Quality Assessment of Wikipedia Articles -- A Systematic Literature Review
Abstract: Wikipedia is the world's largest online encyclopedia, but maintaining article quality through open collaboration is challenging. Wikipedia designed a quality scale, but because assessment is manual, many articles remain unassessed. We review existing methods for automatically measuring the quality of Wikipedia articles, identifying and comparing the machine learning algorithms, article features, quality metrics, and datasets used across 149 distinct studies, and exploring commonalities and gaps among them. The literature is extensive, and the approaches follow past technological trends. However, machine learning is still not widely used by Wikipedia, and we hope that our analysis helps future researchers change that reality.
Systematic Literature Reviews: Home
A systematic review, or systematic literature review, is a type of literature review that uses systematic methods to collect secondary data, critically appraise research studies, and synthesize findings qualitatively or quantitatively. Systematic reviews formulate research questions that are broad or narrow in scope, and identify and synthesize studies that directly relate to the systematic review question. They are designed to provide a complete, exhaustive summary of current evidence, published and unpublished, that is "methodical, comprehensive, transparent, and replicable."
An understanding of systematic reviews and how to implement them in practice is highly recommended for professionals involved in the delivery of health care, public health, and public policy. Systematic reviews of randomized controlled trials are key to the practice of evidence-based medicine, and a review of existing studies is often quicker and cheaper than embarking on a new study. By contrast, systematic reviews of observational studies rank lower in the evidence-based hierarchy. Another important factor affecting the quality of the evidence, however, is the rigour of the methodological design and execution of the systematic review itself.
While systematic reviews are often applied in the biomedical or healthcare context, they can be used in other areas where an assessment of a precisely defined subject would be helpful. For example, systematic reviews are becoming increasingly common in management, accounting and finance. Systematic reviews may examine clinical tests, public health interventions, environmental interventions, social interventions, adverse effects and economic evaluations.
'Systematic review' (2020) Wikipedia. Available at: https://en.wikipedia.org/wiki/Systematic_review (Accessed: 20 October 2020).
Assistance offered by the Library
The Library staff are available to assist with your systematic reviews:
- we can advise on the best databases to use for your particular topic
- we provide guidance on searching techniques
- we can show you how to set up search alerts so that you keep up to date with the latest research in your field
- we provide assistance in the use of reference management tools
The staff to contact are:
Science & Pharmacy Principal Faculty Services Librarian: Thandiwe Menze Email: [email protected]
Commerce & Law Principal Faculty Services Librarian: Jill Otto Email: [email protected]
- Last Updated: Jun 6, 2022 2:35 PM
- URL: https://ru.za.libguides.com/c.php?g=1086489
Excavating the mother lode of human-generated text: A systematic review of research that uses the Wikipedia corpus
- Department of Applied Mathematics and Computer Science
- Cognitive Systems
- Elon University
- Concordia University
- University of Oulu
Research output: Contribution to journal › Journal article › Research › peer-review
- Information retrieval
- Information extraction
- Natural language processing
- Literature review
T1 - Excavating the mother lode of human-generated text: A systematic review of research that uses the Wikipedia corpus
AU - Mehdi, Mohamad
AU - Okoli, Chitu
AU - Mesgari, Mostafa
AU - Nielsen, Finn Årup
AU - Lanamäki, Arto
N2 - Although primarily an encyclopedia, Wikipedia’s expansive content provides a knowledge base that has been continuously exploited by researchers in a wide variety of domains. This article systematically reviews the scholarly studies that have used Wikipedia as a data source, and investigates the means by which Wikipedia has been employed in three main computer science research areas: information retrieval, natural language processing, and ontology building. We report and discuss the research trends of the identified and examined studies. We further identify and classify a list of tools that can be used to extract data from Wikipedia, and compile a list of currently available data sets extracted from Wikipedia.
KW - Information retrieval
KW - Information extraction
KW - Natural language processing
KW - Ontologies
KW - Wikipedia
KW - Literature review
U2 - 10.1016/j.ipm.2016.07.003
DO - 10.1016/j.ipm.2016.07.003
M3 - Journal article
SN - 0306-4573
JO - Information Processing & Management
JF - Information Processing & Management