WebKnox: Web Knowledge Extraction

David Urbansky

Author

David Urbansky

Title

WebKnox: Web Knowledge Extraction

Citable URL:

https://nbn-resolving.org/urn:nbn:de:bsz:14-qucosa-23766

Date of submission

20.01.2009

Date of defense

26.01.2009

Abstract (EN)

This thesis focuses on entity and fact extraction from the web. Different knowledge representations and techniques for information extraction are discussed before the design of a knowledge extraction system, called WebKnox, is introduced. The main contributions of this thesis are the trust ranking of extracted facts with a self-supervised learning loop and the extraction system itself, which combines known and refined extraction algorithms. Compared to the chosen baseline approaches, the techniques used improve precision and recall in most cases for entity and fact extraction.

Keywords (DE)

Informationsextraktion, Wissensbasis, Ontologien, Faktextraktion, Entitätenextraktion, Web Mining

Keywords (EN)

information extraction, web mining, entity extraction, fact extraction

Classification (DDC)

004

Classification (RVK)

ST 515

Reviewer