Data point selection for genre-aware parsing

In the NLP literature, adapting a parser to new text with properties different from the training data is commonly referred to as domain adaptation. In practice, however, the differences between texts from different sources often reflect a mixture of domain and genre properties, and it is by no means clear what impact each of those has on statistical parsing. In this paper, we investigate how differences between articles in a newspaper corpus relate to the concepts of genre and domain and how they influence parsing performance of a transition-based dependency parser. We do this by applying various similarity measures for data point selection and testing their adequacy for creating genre-aware parsing models.

Metadata
Author:Ines Rehbein, Felix Bildhauer
URN:urn:nbn:de:bsz:mh39-71193
URL:http://aclweb.org/anthology/W/W17/W17-7614.pdf
ISBN:978-80-88132-04-2
Parent Title (English):Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories (TLT16), Prague, Czech Republic, January 23–24, 2018
Publisher:Charles University
Place of publication:Prague
Editor:Jan Hajič
Document Type:Conference Proceeding
Language:English
Year of first Publication:2017
Date of Publication (online):2018/02/15
Contributing Corporation:Faculty of Mathematics and Physics, Charles University
Publication state:Published version
Review state:Peer-reviewed
Tag:dependency parsing; genre and register variation; parser adaptation
GND Keyword:Automatic language analysis; Corpus (linguistics); Language statistics; Syntactic analysis; Text type
First Page:95
Last Page:105
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Leibniz-Classification:Language, Linguistics
Linguistics-Classification:Computational linguistics
Linguistics-Classification:Corpus linguistics
Program areas:Grammar
Program areas:Digital Linguistics
Licence (English):Creative Commons - Attribution 4.0 International