TF-IDuF : A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections

Files
Beel_2-nd5ei0v2m07d0.pdf (Size: 3.02 MB, Downloads: 66)
Date
2017
Authors
Beel, Joeran
Langer, Stefan
Gipp, Bela
Open access publication
Open Access Green
Publication type
Contribution to conference proceedings
Publication status
Published
Published in
Proceedings of the iConference 2017, Wuhan, China, 2017. Urbana-Champaign: University of Illinois, 2017, pp. 452-459. Available under: doi: 10.9776/17217
Abstract

TF-IDF is one of the most popular term-weighting schemes, and is applied by search engines, recommender systems, and user modeling engines. With regard to user modeling and recommender systems, we see two shortcomings of TF-IDF. First, calculating IDF requires access to the document corpus from which recommendations are made. Such access is not always given in a user-modeling or recommender system. Second, TF-IDF ignores information from a user’s personal document collection, which could – so we hypothesize – enhance the user modeling process. In this paper, we introduce TF-IDuF as a term-weighting scheme that does not require access to the general document corpus and that considers information from the users’ personal document collections. We evaluated the effectiveness of TF-IDuF compared to TF-IDF and TF-Only and found that TF-IDF and TF-IDuF perform similarly (click-through rates (CTR) of 5.09% vs. 5.14%), and both are around 25% more effective than TF-Only (CTR of 4.06%) for recommending research papers. Consequently, we conclude that TF-IDuF could be a promising term-weighting scheme, especially when access to the document corpus for recommendations is not possible, and thus classic IDF cannot be computed. It is also notable that TF-IDuF and TF-IDF are not exclusive, so that both metrics may be combined to a more effective term-weighting scheme.
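
The abstract does not state the TF-IDuF formula, so the following is only a minimal sketch of the idea it describes: weighting terms using the user's personal document collection instead of the recommendation corpus. It assumes that IDuF mirrors classic IDF, i.e. log(N_u / n_u(t)), where N_u is the number of documents in the user's collection and n_u(t) is the number of those documents containing term t. The function name `tf_iduf_weights` and the toy collection are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def tf_iduf_weights(user_docs):
    """Return a term -> weight mapping for one user's document collection.

    Assumption (not from the abstract): IDuF = log(N_u / n_u(t)), the
    analogue of classic IDF computed over the user's own documents only.
    Each document is a list of already-tokenized terms.
    """
    n_docs = len(user_docs)
    term_freq = Counter()  # total occurrences of each term in the collection
    doc_freq = Counter()   # number of user documents containing each term
    for doc in user_docs:
        term_freq.update(doc)
        doc_freq.update(set(doc))
    return {
        term: tf * math.log(n_docs / doc_freq[term])
        for term, tf in term_freq.items()
    }


# Hypothetical personal collection of three tokenized documents.
user_docs = [
    ["recommender", "systems", "tfidf"],
    ["recommender", "user", "modeling"],
    ["recommender", "citation", "citation"],
]
weights = tf_iduf_weights(user_docs)
top_term = max(weights, key=weights.get)
```

Under this assumption, a term that occurs in every document of the user's collection ("recommender" here) receives a weight of zero, while a term concentrated in a single document ("citation") is ranked highest. No access to the general recommendation corpus is needed, which is the property the abstract emphasizes.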

Subject (DDC)
004 Computer science
Keywords
term weighting, user modeling, tf-idf, tf-iduf, recommender systems
Conference
iConference 2017 : March 22-25, 2017, Wuhan, China : Effect, Expand, Evolve : Global Collaboration across the Information Community, March 22, 2017 - March 25, 2017, Wuhan, China
Cite
ISO 690
BEEL, Joeran, Stefan LANGER, Bela GIPP, 2017. TF-IDuF : A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections. iConference 2017 : March 22-25, 2017, Wuhan, China : Effect, Expand, Evolve : Global Collaboration across the Information Community. Wuhan, China, March 22, 2017 - March 25, 2017. In: Proceedings of the iConference 2017, Wuhan, China, 2017. Urbana-Champaign: University of Illinois, 2017, pp. 452-459. Available under: doi: 10.9776/17217
BibTeX
@inproceedings{Beel2017TFIDu-41879,
  year={2017},
  doi={10.9776/17217},
  title={TF-IDuF : A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections},
  url={http://hdl.handle.net/2142/96756},
  publisher={University of Illinois},
  address={Urbana-Champaign},
  booktitle={Proceedings of the iConference 2017, Wuhan, China, 2017},
  pages={452--459},
  author={Beel, Joeran and Langer, Stefan and Gipp, Bela}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/41879">
    <dc:contributor>Langer, Stefan</dc:contributor>
    <dcterms:issued>2017</dcterms:issued>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/41879/1/Beel_2-nd5ei0v2m07d0.pdf"/>
    <dc:creator>Beel, Joeran</dc:creator>
    <dcterms:abstract xml:lang="eng">TF-IDF is one of the most popular term-weighting schemes, and is applied by search engines, recommender systems, and user modeling engines. With regard to user modeling and recommender systems, we see two shortcomings of TF-IDF. First, calculating IDF requires access to the document corpus from which recommendations are made. Such access is not always given in a user-modeling or recommender system. Second, TF-IDF ignores information from a user’s personal document collection, which could – so we hypothesize – enhance the user modeling process. In this paper, we introduce TF-IDuF as a term-weighting scheme that does not require access to the general document corpus and that considers information from the users’ personal document collections. We evaluated the effectiveness of TF-IDuF compared to TF-IDF and TF-Only and found that TF-IDF and TF-IDuF perform similarly (click-through rates (CTR) of 5.09% vs. 5.14%), and both are around 25% more effective than TF-Only (CTR of 4.06%) for recommending research papers. Consequently, we conclude that TF-IDuF could be a promising term-weighting scheme, especially when access to the document corpus for recommendations is not possible, and thus classic IDF cannot be computed. It is also notable that TF-IDuF and TF-IDF are not exclusive, so that both metrics may be combined to a more effective term-weighting scheme.</dcterms:abstract>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:rights>terms-of-use</dc:rights>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/41879/1/Beel_2-nd5ei0v2m07d0.pdf"/>
    <dc:contributor>Beel, Joeran</dc:contributor>
    <dc:creator>Gipp, Bela</dc:creator>
    <dc:creator>Langer, Stefan</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/41879"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-03-21T11:18:08Z</dcterms:available>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:language>eng</dc:language>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:contributor>Gipp, Bela</dc:contributor>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-03-21T11:18:08Z</dc:date>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:title>TF-IDuF : A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections</dcterms:title>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
  </rdf:Description>
</rdf:RDF>
URL check date
2018-03-20
University bibliography
Yes