The Induction of Phonological Structure

Date
2012
Open Access publication
Open Access Green
Publication type
Dissertation
Publication status
Published
Abstract

This dissertation explores to what extent phonological structure can be inferred from the distribution of sounds within words. For this purpose, a typologically oriented computational approach is pursued, which rests on techniques from the fields of computational linguistics, data mining and visual analytics. The methods that are presented are considered to be procedural universals which can be applied to any natural language in the same way even though they yield different results for individual languages.
The basic assumption that underlies all methods is that the co-occurrence of sounds in relevant contexts within words of a language is constrained. The restrictions on combinations of sounds lead to a given distribution, which in turn can be used to induce distinctions among the sounds of the language that can be related to natural classes and features in phonological theory. The focus of the present approach is not so much on the statistical methods that are necessary to induce the latent structures, but on the linguistically motivated contexts which manifest the existing constraints most clearly.


The induction of phonological structure from language data is an interesting research topic for various reasons. First of all, it is remarkable that phonological features, which are mostly defined in terms of articulatory or acoustic properties, are also reflected in the distribution of sounds in a language. In this thesis, I complement previous work on learning phonological categories (e.g., Ellison 1994; Goldsmith and Xanthos 2009) with an approach to infer place of articulation distinctions in consonants. The method is based on the principle of similar place avoidance (SPA; Pozdniakov and Segerer 2007), which states that consonants in CVC sequences tend to exhibit different place features. I contribute to earlier work in this research area by showing that this principle is not only active in Semitic languages (with a study of Maltese verbal roots) but also holds for West Germanic languages (with an investigation of the entries in the CELEX database for English, German and Dutch) and a worldwide sample of word forms from the ASJP dataset (Dryer test for universality), leading to the conclusion that it is a statistical universal. Using this principle to infer place distinctions in consonants yields almost perfect results for the ASJP data and the list of Maltese verbal roots. The automatically generated dendrograms closely correspond to the hierarchical structures for natural classes that have been postulated in the phonological literature (e.g., Rice 1994; McCarthy 1994).
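
As a rough illustration of how similar place avoidance can show up in counts, the following Python sketch computes observed/expected ratios for pairs of place classes in CVC windows. It is a minimal sketch under stated assumptions, not the procedure used in the thesis: the `PLACE` mapping, the vowel set, the one-character-per-sound transcription and the function names are all hypothetical, and the thesis induces place classes rather than presupposing them.

```python
from collections import Counter
from itertools import product

# Hypothetical place classes for a few consonants (assumed for illustration).
PLACE = {"p": "labial", "b": "labial", "m": "labial",
         "t": "coronal", "d": "coronal", "n": "coronal",
         "k": "dorsal", "g": "dorsal"}
VOWELS = set("aeiou")  # assumed vowel inventory, one character per sound


def cvc_pairs(word):
    """Yield (C1, C2) for every consonant-vowel-consonant window in word."""
    for c1, v, c2 in zip(word, word[1:], word[2:]):
        if c1 in PLACE and v in VOWELS and c2 in PLACE:
            yield c1, c2


def observed_expected(words):
    """Return observed/expected ratios for each pair of place classes.

    The expected count assumes that C1 and C2 are chosen independently.
    A ratio below 1 means the combination is rarer than chance predicts.
    """
    pairs = Counter()
    for w in words:
        for c1, c2 in cvc_pairs(w):
            pairs[PLACE[c1], PLACE[c2]] += 1
    total = sum(pairs.values())
    first, second = Counter(), Counter()
    for (p1, p2), n in pairs.items():
        first[p1] += n
        second[p2] += n
    ratios = {}
    for p1, p2 in product(first, second):
        expected = first[p1] * second[p2] / total
        ratios[p1, p2] = pairs[p1, p2] / expected
    return ratios
```

Under this setup, similar place avoidance would predict ratios below 1 for same-place pairs such as `("labial", "labial")`. A matrix of such ratios could then be fed to a hierarchical clustering step to obtain a dendrogram; that step is not shown here.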


In addition, the present thesis complements previous work on the machine learning of phonological structure with a novel method to automatically discriminate vowels and consonants in a language that is not based on N-gram statistics. The substitution approach relies on how frequently sounds occur as the discriminating segments in minimal pairs. Although the method does not achieve the same level of accuracy as earlier approaches in this area (e.g., Sukhotin 1962; Ellison 1994; Goldsmith and Xanthos 2009; Kim and Snyder 2013), it shows that a distinction between vowels and consonants can also be inferred from the relation of sounds in absentia.
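
The counting step behind such a substitution approach can be sketched as follows. This is a minimal Python sketch under assumptions, not the implementation from the thesis: the function name `substitution_counts` is hypothetical, each character is treated as one sound, and only same-length minimal pairs (words differing in exactly one position) are considered.

```python
from collections import Counter, defaultdict


def substitution_counts(words):
    """Count how often two sounds are the only difference between two words."""
    # Group sounds by the frame they occur in: the word with one position
    # removed, stored as (prefix, suffix). "pat" and "bat" share ("", "at").
    frames = defaultdict(set)
    for w in set(words):
        for i, seg in enumerate(w):
            frames[w[:i], w[i + 1:]].add(seg)

    # Any two sounds sharing a frame form a minimal pair.
    counts = Counter()
    for segs in frames.values():
        ordered = sorted(segs)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                counts[a, b] += 1
    return counts
```

If vowels mostly substitute for vowels and consonants for consonants, the resulting counts should be high within each class and low across classes. Splitting the sounds into two groups on that basis would require a further clustering step that is not shown here.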


Second, the induction of phonological structure is considered in the present work as a way to explore a large amount of language data in search of phonotactic constraints. To this end, I present a visual analytics approach for the detection of vowel harmony patterns. It is intended as a proof of concept that a graphically enhanced statistical analysis can make potentially interesting patterns in the data more accessible to human perception. As the matrix visualizations show, languages exhibiting patterns of vowel harmony (or similar phenomena) can be distinguished at a glance from languages without such constraints. The visualization approach can easily be extended to other related phenomena, e.g., consonant harmony (Hansson 2010), synharmonism (Trubetzkoy 1939 [1967]) or any kind of (statistical) phonotactic constraint. The statistical measure on which the vowel harmony visualizations are based can also serve as a typological measure by which languages can be compared. The ranking of languages according to this measure approximately reflects the intuition about which languages show conspicuous harmony patterns.
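
The abstract does not name the statistical measure behind the matrix visualizations. Purely as an illustration, the Python sketch below swaps in pointwise mutual information (PMI) between successive vowels within a word. The vowel set, the one-character-per-sound transcription and the function name `vowel_pmi` are assumptions, not taken from the thesis.

```python
import math
from collections import Counter

VOWELS = set("aeiou")  # assumed vowel inventory, one character per sound


def vowel_pmi(words):
    """Return PMI (in bits) for each observed ordered pair of successive vowels.

    Consonants are skipped, so "kolu" contributes the pair ("o", "u").
    Positive values mark vowel sequences that are more frequent than
    expected if the two vowels were chosen independently.
    """
    pair_counts = Counter()
    for w in words:
        vs = [ch for ch in w if ch in VOWELS]
        pair_counts.update(zip(vs, vs[1:]))
    total = sum(pair_counts.values())
    first, second = Counter(), Counter()
    for (v1, v2), n in pair_counts.items():
        first[v1] += n
        second[v2] += n
    return {
        (v1, v2): math.log2(n * total / (first[v1] * second[v2]))
        for (v1, v2), n in pair_counts.items()
    }
```

Arranging such values in a vowel-by-vowel matrix and colouring the cells by sign and magnitude would give a display in the spirit of the visualizations described above. In a language with harmony, the positive cells would be expected to cluster within harmony classes.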

Subject (DDC)
400 Linguistics
Cite
ISO 690
MAYER, Thomas, 2012. The Induction of Phonological Structure [Dissertation]. Konstanz: University of Konstanz
BibTex
@phdthesis{Mayer2012Induc-26229,
  year={2012},
  title={The Induction of Phonological Structure},
  author={Mayer, Thomas},
  address={Konstanz},
  school={Universität Konstanz}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/26229">
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dc:contributor>Mayer, Thomas</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:title>The Induction of Phonological Structure</dcterms:title>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/26229"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/26229/1/Mayer_262292.pdf"/>
    <dcterms:issued>2012</dcterms:issued>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-02-05T06:58:21Z</dcterms:available>
    <dcterms:abstract xml:lang="eng">This dissertation explores to what extent phonological structure can be inferred from the distribution of sounds within words. For this purpose, a typologically oriented computational approach is pursued, which rests on techniques from the fields of computational linguistics, data mining and visual analytics. The methods that are presented are considered to be procedural universals which can be applied to any natural language in the same way even though they yield different results for individual languages.&lt;br /&gt;The basic assumption that underlies all methods is that the co-occurrence of sounds in relevant contexts within words of a language is constrained. The restrictions of combinations of sounds lead to a given distribution, which in turn can be used to induce a distinction in the sounds of the language that can be related to natural classes and features in phonological theory. The focus of the present approach is not so much on the statistical methods that are necessary to induce the latent structures, but on the linguistically motivated contexts which manifest the existing constraints most clearly.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The induction of phonological structure from language data is an interesting research topic for various reasons. First of all, it is remarkable that phonological features, which are mostly defined in terms of articulatory or acoustic properties, are also reflected in the distribution of sounds in a language. In this thesis, I complement previous work on learning phonological categories (e.g., Ellison 1994; Goldsmith and Xanthos 2009) with an approach to infer place of articulation distinctions in consonants. The method is based on the principle of similar place avoidance (SPA; Pozdniakov and Segerer 2007), which states that consonants in CVC sequences tend to exhibit different place features. 
I contribute to earlier work in this research area by showing that this principle is not only active in Semitic languages (with a study of Maltese verbal roots) but also holds for West Germanic languages (with an investigation of the entries in the CELEX database for English, German and Dutch) and a worldwide sample of word forms from the ASJP dataset (Dryer test for universality), leading to the conclusion that it is a statistical universal. Using this principle to infer place distinctions in consonants yields almost perfect results for the ASJP data and the list of Maltese verbal roots. The automatically generated dendrograms closely correspond to the hierarchical structures for natural classes that have been postulated in the phonological literature (e.g., Rice 1994; McCarthy 1994).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In addition, the present thesis complements previous work on the machine learning of phonological structure with a novel method to automatically discriminate vowels and consonants in a language that is not based on N-gram statistics. The substitution approach relies on the frequency of sounds to occur as the discriminating segments in minimal pairs. Although the method does not achieve the same level of accuracy as earlier approaches in this area (e.g., Sukhotin 1962; Ellison 1994; Goldsmith and Xanthos 2009; Kim and Snyder 2013), it shows that a distinction of vowels and consonants can also be inferred from the relation of sounds in absentia.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Second, the induction of phonological structure is considered in the present work as a way to explore a large amount of language data in search for the presence of phonotactic constraints. To this end, I present a visual analytics approach for the detection of vowel harmony patterns that is intended as a proof of concept that a graphically enhanced statistical analysis can make potentially interesting patterns in the data more accessible to human perception. 
As the matrix visualizations show, languages exhibiting patterns of vowel harmony (or similar phenomena) can be distinguished from languages without such constraints at a glance. The visualization approach can easily be extended to other related phenomena, e.g., consonant harmony (Hansson 2010), synharmonism (Trubetzkoy 1939 [1967]) or any kind of (statistical) phonotactic constraints. The statistical measure on which the vowel harmony visualizations are based can also serve as a typological measure on the basis of which languages can be compared. The ranking of languages according to this measure approximately reflects the intuition about which languages show conspicuous harmony patterns.</dcterms:abstract>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/26229/1/Mayer_262292.pdf"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dc:language>eng</dc:language>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Mayer, Thomas</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-02-05T06:58:21Z</dc:date>
  </rdf:Description>
</rdf:RDF>
Date of dissertation examination
February 9, 2012