Guideline for improving the reliability of Google Ngram studies : Evidence from religious terms

Files
Younes_2-bavyrw308i741.pdf (Size: 1.17 MB, Downloads: 341)
Date
2019
Open Access publication
Open Access Gold
Publication type
Journal article
Publication status
Published
Published in
PLoS one. 2019, 14(3), e0213554. eISSN 1932-6203. Available under: doi: 10.1371/journal.pone.0213554
Abstract

The Google Books Ngram Viewer (Google Ngram) is a search engine that charts word frequencies from a large corpus of books and thereby allows for the examination of cultural change as it is reflected in books. While the tool’s massive corpus of data (about 8 million books or 6% of all books ever published) has been used in various scientific studies, concerns about the accuracy of results have simultaneously emerged. This paper reviews the literature and serves as a guideline for improving Google Ngram studies by suggesting five methodological procedures suited to increase the reliability of results. In particular, we recommend the use of (I) different language corpora, (II) cross-checks on different corpora from the same language, (III) word inflections, (IV) synonyms, and (V) a standardization procedure that accounts for both the influx of data and unequal weights of word frequencies. Further, we outline how to combine these procedures and address the risk of potential biases arising from censorship and propaganda. As an example of the proposed procedures, we examine the cross-cultural expression of religion via religious terms for the years 1900 to 2000. Special emphasis is placed on the situation during World War II. In line with the strand of literature that emphasizes the decline of collectivistic values, our results suggest an overall decrease of religion’s importance. However, religion regains importance during times of crisis such as World War II. By comparing the results obtained through the different methods, we illustrate that applying and, in particular, combining our suggested procedures increases the reliability of results and prevents authors from deriving wrong assumptions.
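Procedure (V) in the abstract can be illustrated with a small sketch. The abstract does not spell out the exact formula, so the following assumes a common approach: raw yearly counts are divided by the yearly corpus size (to account for the influx of data), each word's series is converted to z-scores (so that frequent and rare words carry equal weight), and the z-scores are averaged across a set of inflections or synonyms. The function names and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def relative_frequencies(counts, totals):
    """Divide raw yearly counts by yearly corpus size (adjusts for data influx)."""
    return [c / t for c, t in zip(counts, totals)]


def z_scores(series):
    """Rescale a series to mean 0 and standard deviation 1."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no trend information.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def composite_index(word_counts, totals):
    """Average the per-year z-scores over all words in the set."""
    standardized = [
        z_scores(relative_frequencies(counts, totals))
        for counts in word_counts.values()
    ]
    return [mean(year_values) for year_values in zip(*standardized)]


# Made-up counts for three years; NOT real Google Ngram data.
totals = [1000, 2000, 4000]
word_counts = {
    "pray": [10, 16, 24],       # rare word
    "prayer": [100, 180, 320],  # common word
}
index = composite_index(word_counts, totals)
print(index)
```

In this toy example both words rise in raw counts but fall relative to corpus size, and after standardization the rare and the common word contribute equally to the combined index.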

Subject (DDC)
150 Psychology
Cite
ISO 690
YOUNES, Nadja, Ulf-Dietrich REIPS, 2019. Guideline for improving the reliability of Google Ngram studies : Evidence from religious terms. In: PLoS one. 2019, 14(3), e0213554. eISSN 1932-6203. Available under: doi: 10.1371/journal.pone.0213554
BibTeX
@article{Younes2019Guide-45613,
  year={2019},
  doi={10.1371/journal.pone.0213554},
  title={Guideline for improving the reliability of Google Ngram studies : Evidence from religious terms},
  number={3},
  volume={14},
  journal={PLoS one},
  author={Younes, Nadja and Reips, Ulf-Dietrich},
  note={Article Number: e0213554}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45613">
    <dc:creator>Younes, Nadja</dc:creator>
    <dc:contributor>Reips, Ulf-Dietrich</dc:contributor>
    <dc:creator>Reips, Ulf-Dietrich</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Younes, Nadja</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/43"/>
    <dcterms:issued>2019</dcterms:issued>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/45613"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/45613/1/Younes_2-bavyrw308i741.pdf"/>
    <dc:language>eng</dc:language>
    <dcterms:abstract xml:lang="eng">The Google Books Ngram Viewer (Google Ngram) is a search engine that charts word frequencies from a large corpus of books and thereby allows for the examination of cultural change as it is reflected in books. While the tool’s massive corpus of data (about 8 million books or 6% of all books ever published) has been used in various scientific studies, concerns about the accuracy of results have simultaneously emerged. This paper reviews the literature and serves as a guideline for improving Google Ngram studies by suggesting five methodological procedures suited to increase the reliability of results. In particular, we recommend the use of (I) different language corpora, (II) cross-checks on different corpora from the same language, (III) word inflections, (IV) synonyms, and (V) a standardization procedure that accounts for both the influx of data and unequal weights of word frequencies. Further, we outline how to combine these procedures and address the risk of potential biases arising from censorship and propaganda. As an example of the proposed procedures, we examine the cross-cultural expression of religion via religious terms for the years 1900 to 2000. Special emphasis is placed on the situation during World War II. In line with the strand of literature that emphasizes the decline of collectivistic values, our results suggest an overall decrease of religion’s importance. However, religion regains importance during times of crisis such as World War II. By comparing the results obtained through the different methods, we illustrate that applying and, in particular, combining our suggested procedures increases the reliability of results and prevents authors from deriving wrong assumptions.</dcterms:abstract>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2019-04-10T11:20:53Z</dcterms:available>
    <dcterms:title>Guideline for improving the reliability of Google Ngram studies : Evidence from religious terms</dcterms:title>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/43"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2019-04-10T11:20:53Z</dc:date>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/45613/1/Younes_2-bavyrw308i741.pdf"/>
    <dc:rights>Attribution 4.0 International</dc:rights>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/"/>
  </rdf:Description>
</rdf:RDF>
University bibliography
Yes
Peer-reviewed
Yes