Please use this identifier to cite or link to this resource: http://dx.doi.org/10.18419/opus-3400
Author(s): Mayer, Christian
Title: Scalable data retrieval in a mobile environment
Issue date: 2014
Document type: Thesis (Diplom)
URI: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-96047
http://elib.uni-stuttgart.de/handle/11682/3417
http://dx.doi.org/10.18419/opus-3400
Abstract: Retrieving multidimensional data from distributed systems is becoming increasingly important. Applications of these systems, however, are often interested in more than data vectors that match certain queries: many of them demand data of high quality. In this thesis, we design a distributed system that applications can use to retrieve high-quality data for arbitrary multidimensional queries. The major challenges of quality-based data retrieval are to (1) find an appropriate formalization of data quality, (2) design query routing algorithms that are robust under high dynamics in both the participants of the system and the data they hold, and (3) handle heterogeneous and high-dimensional data in the system. To retrieve data of high quality, we propose (1) a confidence measure for a query that is based on clusters of data. When a participant of the system finds that its confidence for a query is high, it assumes that it possesses data of high quality for that query. (2) Further, we design and implement routing strategies that route queries to nodes that can answer them with high confidence. Maintaining exact routing tables for every possible query would be infeasible, so nodes model the data that can be reached via their neighbours in routing models. This modelling is based on structural properties of the data, such as how well the data can be clustered. (3) In high-dimensional space, we have to overcome the curse of dimensionality: the structure of the data can become invisible in higher dimensions. We address this problem with a dimensionality reduction method that reduces the dimensions with the highest data variance. The evaluation shows a high accuracy of query routing, even though our approaches avoid scalability bottlenecks such as flooding the query or flooding routing information. Further, we show that using dimensionality reduction in routing has a positive influence on routing accuracy. We think that the methods in our approach can be useful instruments whenever the task of retrieving data of high quality has to be outsourced to a distributed system.
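The idea of cluster-based confidence and confidence-driven routing can be illustrated with a minimal sketch. It is not the thesis's implementation: the confidence formula, the representation of a cluster as a (centroid, radius) pair, and the names `confidence` and `next_hop` are all assumptions chosen for illustration. Each node compares its own confidence with the confidence predicted by its per-neighbour routing models and forwards the query to a single neighbour, so no flooding is involved.

```python
import math

def distance(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def confidence(query, clusters):
    """Hypothetical confidence in [0, 1] (an assumption, not the thesis's measure).

    `clusters` is a list of (centroid, radius) pairs. A query inside a
    cluster scores 1.0; outside, the score decays as radius / distance.
    The best score over all clusters is returned.
    """
    best = 0.0
    for centroid, radius in clusters:
        d = distance(query, centroid)
        score = 1.0 if d <= radius else radius / d
        best = max(best, score)
    return best

def next_hop(query, local_clusters, neighbour_models):
    """Pick where the query should go next.

    `neighbour_models` maps a neighbour name to the cluster summary that
    models the data reachable via that neighbour (a stand-in for a
    routing model). Returns None if the local node is at least as
    confident as every neighbour, otherwise the best neighbour's name.
    """
    best_name, best_conf = None, confidence(query, local_clusters)
    for name, clusters in neighbour_models.items():
        c = confidence(query, clusters)
        if c > best_conf:
            best_name, best_conf = name, c
    return best_name
```

With a local cluster around the origin and a neighbour model centred at (10, 10), a query near (9, 9) would be forwarded to that neighbour, while a query near the origin would be answered locally.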
Appears in collections: 05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in this resource:
File: DIP_3636.pdf
Size: 1.71 MB
Format: Adobe PDF


All resources in this repository are protected by copyright.