Journal article

Excavating the mother lode of human-generated text: A systematic review of research that uses the Wikipedia corpus

In Information Processing and Management — 2017, Volume 53, Issue 2, pp. 505-529

By Mehdi, Mohamad¹; Okoli, Chitu¹; Mesgari, Mostafa²; Nielsen, Finn Årup³,⁴; Lanamäki, Arto⁵

From

Concordia University¹

Elon University²

Department of Applied Mathematics and Computer Science, Technical University of Denmark³

Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark⁴

University of Oulu⁵

Abstract

Although primarily an encyclopedia, Wikipedia’s expansive content provides a knowledge base that has been continuously exploited by researchers in a wide variety of domains. This article systematically reviews the scholarly studies that have used Wikipedia as a data source, and investigates the means by which Wikipedia has been employed in three main computer science research areas: information retrieval, natural language processing, and ontology building.

We report and discuss the research trends of the identified and examined studies. We further identify and classify a list of tools that can be used to extract data from Wikipedia, and compile a list of currently available data sets extracted from Wikipedia.

Language:	English
Year:	2017
Pages:	505-529
ISSN:	1873-5371 and 0306-4573
Types:	Journal article
DOI:	10.1016/j.ipm.2016.07.003
ORCIDs:	0000-0001-5574-7572 and Nielsen, Finn Årup

Keywords

Information extraction; Information retrieval; Literature review; Natural language processing; Ontologies; Wikipedia
