A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space

Lade...
Vorschaubild
Dateien
PODS97.pdf
PODS97.pdfGröße: 215.99 KBDownloads: 1019
Datum
1997
Autor:innen
Berchtold, Stefan
Böhm, Christian
Kriegel, Hans-Peter
Herausgeber:innen
Kontakt
ISSN der Zeitschrift
Electronic ISSN
ISBN
Bibliografische Daten
Verlag
Schriftenreihe
Auflagebezeichnung
ArXiv-ID
Internationale Patentnummer
Angaben zur Forschungsförderung
Projekt
Open Access-Veröffentlichung
Open Access Green
Core Facility der Universität Konstanz
Gesperrt bis
Titel in einer weiteren Sprache
Forschungsvorhaben
Organisationseinheiten
Zeitschriftenheft
Publikationstyp
Beitrag zu einem Konferenzband
Publikationsstatus
Published
Erschienen in
Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems - PODS '97. New York, New York, USA: ACM Press, 1997, pp. 78-86. ISBN 0-89791-910-6. Available under: doi: 10.1145/263661.263671
Zusammenfassung

In this paper, we present a new cost model for nearest neighbor search in high-dimensional data space. We first analyze different nearest neighbor algorithms, present a generalization of an algorithm which has been originally proposed for Quadtrees [13], and show that this algorithm is optimal. Then, we develop a cost model which - in contrast to previous models - takes boundary effects into account and therefore also works in high dimensions. The advantages of our model are in particular: Our model works for data sets with an arbitrary number of dimensions and an arbitrary number of data points, is applicable to different data distributions and index structures, and provides accurate estimates of the expected query execution time. To show the practical relevance and accuracy of our model, we perform a detailed analysis using synthetic and real data. The results of applying our model to Hilbert and X-tree indices show that it provides a good estimation of the query performance, which is considerably better than the estimates by previous models especially for highdimensional data.

Zusammenfassung in einer weiteren Sprache
Fachgebiet (DDC)
004 Informatik
Schlagwörter
Nearest Neighbor Search, Cost Model, Multidimensional Searching, Multidimensional Index Structures, High-Dimensional Data Space
Konferenz
the sixteenth ACM SIGACT-SIGMOD-SIGART symposium, 11. Mai 1997 - 15. Mai 1997, Tucson, Arizona, United States
Rezension
undefined / . - undefined, undefined
Zitieren
ISO 690BERCHTOLD, Stefan, Christian BÖHM, Daniel A. KEIM, Hans-Peter KRIEGEL, 1997. A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space. the sixteenth ACM SIGACT-SIGMOD-SIGART symposium. Tucson, Arizona, United States, 11. Mai 1997 - 15. Mai 1997. In: Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems - PODS '97. New York, New York, USA: ACM Press, 1997, pp. 78-86. ISBN 0-89791-910-6. Available under: doi: 10.1145/263661.263671
BibTex
@inproceedings{Berchtold1997Model-5738,
  year={1997},
  doi={10.1145/263661.263671},
  title={A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space},
  isbn={0-89791-910-6},
  publisher={ACM Press},
  address={New York, New York, USA},
  booktitle={Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems  - PODS '97},
  pages={78--86},
  author={Berchtold, Stefan and Böhm, Christian and Keim, Daniel A. and Kriegel, Hans-Peter}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/5738">
    <dcterms:title>A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space</dcterms:title>
    <dc:contributor>Keim, Daniel A.</dc:contributor>
    <dcterms:issued>1997</dcterms:issued>
    <dc:creator>Berchtold, Stefan</dc:creator>
    <dc:creator>Keim, Daniel A.</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T15:59:42Z</dc:date>
    <dc:rights>Attribution-NonCommercial-NoDerivs 2.0 Generic</dc:rights>
    <dc:contributor>Kriegel, Hans-Peter</dc:contributor>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5738/1/PODS97.pdf"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T15:59:42Z</dcterms:available>
    <dc:contributor>Böhm, Christian</dc:contributor>
    <dc:creator>Böhm, Christian</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/5738"/>
    <dc:format>application/pdf</dc:format>
    <dcterms:abstract xml:lang="eng">In this paper, we present a new cost model for nearest neighbor search in high-dimensional data space. We first analyze different nearest neighbor algorithms, present a generalization of an algorithm which has been originally proposed for Quadtrees [13], and show that this algorithm is optimal. Then, we develop a cost model which - in contrast to previous models - takes boundary effects into account and therefore also works in high dimensions. The advantages of our model are in particular: Our model works for data sets with an arbitrary number of dimensions and an arbitrary number of data points, is applicable to different data distributions and index structures, and provides accurate estimates of the expected query execution time. To show the practical relevance and accuracy of our model, we perform a detailed analysis using synthetic and real data. The results of applying our model to Hilbert and X-tree indices show that it provides a good estimation of the query performance, which is considerably better than the estimates by previous models especially for highdimensional data.</dcterms:abstract>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5738/1/PODS97.pdf"/>
    <dc:creator>Kriegel, Hans-Peter</dc:creator>
    <dc:language>eng</dc:language>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-nc-nd/2.0/"/>
    <dcterms:bibliographicCitation>First publ. in: Proceedings / ACM PODS Symposium on Principles of Database Systems, Tucson, AZ, 1997, pp. 78-86</dcterms:bibliographicCitation>
    <dc:contributor>Berchtold, Stefan</dc:contributor>
  </rdf:Description>
</rdf:RDF>
Interner Vermerk
xmlui.Submission.submit.DescribeStep.inputForms.label.kops_note_fromSubmitter
Kontakt
URL der Originalveröffentl.
Prüfdatum der URL
Prüfungsdatum der Dissertation
Finanzierungsart
Kommentar zur Publikation
Allianzlizenz
Corresponding Authors der Uni Konstanz vorhanden
Internationale Co-Autor:innen
Universitätsbibliographie
Nein
Begutachtet
Diese Publikation teilen