Extending the OLAP Technology for Social Media Analysis
Abstract
Contemporary decision support and information systems have been fundamental to the smooth operation and growth of successful businesses across the globe for over two decades now. Data warehousing and OLAP are at the core of these systems and have been instrumental in encyclopedic data analysis in multifarious domains like manufacturing industry, retail sector, financial services, transportation, telecommunications, utilities, healthcare, education, research and government. With the emergence of new data problems and domains, e.g., spatial, sequence and multimedia data, data warehouse systems and their underlying technology, methods and techniques have been extended to provide the same standard of performance they are known for.
A relatively new problem domain is social media, which has shaped the last couple of years of the 21st century. The revolution that social media has brought about has affected almost all walks of life. The ever-expanding Internet and cheap hand-held electronic devices have contributed to the popularity of social media and have added millions of users to these websites. Social media have been playing an important role in politics, disasters, sports, entertainment, health, education, government and business. These websites exist by virtue of their users and their activity. The user-generated content on these sites amounts to huge volumes, is generated at a high pace, and attracts considerable research and commercial interest.
The aim of this thesis is to extend the OLAP framework for social media analysis and to provide an enabling environment for social business intelligence. Data warehouses and OLAP operate on strictly structured data objects and the pre-established relationships among these objects in order to provide multidimensional analysis efficiently. Data originating from social media, in contrast, is semi-structured or unstructured and exhibits a degree of dynamism. In this thesis, we bridge the gap between OLAP and social media by enabling the former to handle the latter, proposing a set of methods that range from modeling to storing and querying user-generated data from social media.
We survey the data models of social media and propose corresponding transformations in the multidimensional data modeling landscape. Specifically, we obtain a multidimensional view of the data originating from social media based on its metadata. The underlying dataset is enriched using numerous methods from Natural Language Processing, Text Mining and Data Mining. These methods include language detection, sentiment analysis, named entity recognition, topic extraction and classical data mining algorithms like classification and clustering. The outcomes of these methods include objects like facts, dimensions, dimensional hierarchies, hierarchy levels and cubes. We resort to the X-DFM (Extended Dimensional Fact Model) because it supports the modeling of newly discovered and dynamic data elements in the dimensional landscape. Dimensional modeling is based on the principle of static dimensions and changing facts; social media, however, pose the challenge that dimensions change as well. We investigate proposals in the literature on storing, maintaining and querying such dynamic dimensions. Our recommendations are based on slowly changing dimensions (SCD), and we argue for their applicability with the help of examples. We further propose a three-layered business intelligence framework that obtains data from social media and stores it in the data warehouse along with the enterprise business data. The user-generated content from social media undergoes semantic enrichment and is then modeled in accordance with OLAP standards. Having social media data and enterprise data in this format makes provision for analysis specific to a single social medium, cross-media analysis and business analysis with respect to social media, e.g., Social OLAP and Social CRM.
Taming user-generated data from social media and integrating it into the OLAP environment allows for multidimensional analysis of social media and business from useful and newly discovered perspectives. To the best of our knowledge, other relevant works focus only on smaller, targeted problems, while our work addresses multiple problems and applications. However, we do not claim to have covered all aspects of this complex problem, and we acknowledge that doing so is not feasible within a single PhD thesis.
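The abstract recommends slowly changing dimensions (SCD) for dimensions that change over time but does not say which SCD variant the thesis uses. As a purely illustrative sketch, the following Python code assumes a Type 2 approach, in which an attribute change closes the current dimension row and appends a new version. The `UserDimRow` class, the `location` attribute, the user id `u42` and the dates are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class UserDimRow:
    """One version of a user in a hypothetical user dimension table."""
    surrogate_key: int
    user_id: str                      # natural key from the social media platform
    location: str                     # attribute that may change over time
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version


def apply_scd2(rows: List[UserDimRow], user_id: str,
               location: str, as_of: date) -> UserDimRow:
    """Record an attribute value by closing the current row and appending a new one."""
    current = next(
        (r for r in rows if r.user_id == user_id and r.valid_to is None), None
    )
    if current is not None:
        if current.location == location:
            return current            # nothing changed, keep the current version
        current.valid_to = as_of      # close the old version instead of overwriting it
    new_row = UserDimRow(len(rows) + 1, user_id, location, as_of)
    rows.append(new_row)
    return new_row


def version_at(rows: List[UserDimRow], user_id: str,
               when: date) -> Optional[UserDimRow]:
    """Return the dimension row that was valid on a given day, or None."""
    for r in rows:
        if (r.user_id == user_id and r.valid_from <= when
                and (r.valid_to is None or when < r.valid_to)):
            return r
    return None


# Hypothetical history: the same user reports two locations over time.
dim: List[UserDimRow] = []
apply_scd2(dim, "u42", "Konstanz", date(2014, 1, 1))
apply_scd2(dim, "u42", "Berlin", date(2014, 6, 1))
```

Because old versions are kept rather than overwritten, a fact such as a post can be joined to the dimension row that was valid when the post was written, e.g. via `version_at(dim, "u42", date(2014, 3, 1))`.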
Cite
ISO 690
REHMAN, Nafees Ur, 2015. Extending the OLAP Technology for Social Media Analysis [Dissertation]. Konstanz: University of Konstanz
BibTeX
@phdthesis{Rehman2015Exten-31013, year={2015}, title={Extending the OLAP Technology for Social Media Analysis}, author={Rehman, Nafees Ur}, address={Konstanz}, school={Universität Konstanz} }
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/31013"> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-05-21T11:03:55Z</dcterms:available> <foaf:homepage rdf:resource="http://localhost:8080/"/> <dcterms:issued>2015</dcterms:issued> <dc:creator>Rehman, Nafees Ur</dc:creator> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/31013/3/Rehman_0-290919.pdf"/> <dc:rights>terms-of-use</dc:rights> <dc:contributor>Rehman, Nafees Ur</dc:contributor> <dcterms:title>Extending the OLAP Technology for Social Media Analysis</dcterms:title> <dcterms:abstract xml:lang="eng">Contemporary decision support and information systems have been fundamental to the smooth operation and growth of successful businesses across the globe for over two decades now. Data warehousing and OLAP are at the core of these systems and have been instrumental in encyclopedic data analysis in multifarious domains like manufacturing industry, retail sector, financial services, transportation, telecommunications, utilities, healthcare, education, research and government. 
With the emergence of new data problems and domains, e.g., spatial, sequence and multimedia data, data warehouse systems and their underlying technology, methods and techniques have been extended to provide the same standard of performance they are known for.<br />A relatively new problem domain is social media, which has shaped the last couple of years of the 21st century. The revolution that social media has brought about has affected almost all walks of life. The ever-expanding Internet and cheap hand-held electronic devices have contributed to the popularity of social media and have added millions of users to these websites. Social media have been playing an important role in politics, disasters, sports, entertainment, health, education, government and business. These websites exist by virtue of their users and their activity. The user-generated content on these sites amounts to huge volumes, is generated at a high pace, and attracts considerable research and commercial interest.<br />The aim of this thesis is to extend the OLAP framework for social media analysis and to provide an enabling environment for social business intelligence. Data warehouses and OLAP operate on strictly structured data objects and the pre-established relationships among these objects in order to provide multidimensional analysis efficiently. Data originating from social media, in contrast, is semi-structured or unstructured and exhibits a degree of dynamism. In this thesis, we bridge the gap between OLAP and social media by enabling the former to handle the latter, proposing a set of methods that range from modeling to storing and querying user-generated data from social media.<br />We survey the data models of social media and propose corresponding transformations in the multidimensional data modeling landscape. Specifically, we obtain a multidimensional view of the data originating from social media based on its metadata. 
The underlying dataset is enriched using numerous methods from Natural Language Processing, Text Mining and Data Mining. These methods include language detection, sentiment analysis, named entity recognition, topic extraction and classical data mining algorithms like classification and clustering. The outcomes of these methods include objects like facts, dimensions, dimensional hierarchies, hierarchy levels and cubes. We resort to the X-DFM (Extended Dimensional Fact Model) because it supports the modeling of newly discovered and dynamic data elements in the dimensional landscape. Dimensional modeling is based on the principle of static dimensions and changing facts; social media, however, pose the challenge that dimensions change as well. We investigate proposals in the literature on storing, maintaining and querying such dynamic dimensions. Our recommendations are based on slowly changing dimensions (SCD), and we argue for their applicability with the help of examples. We further propose a three-layered business intelligence framework that obtains data from social media and stores it in the data warehouse along with the enterprise business data. The user-generated content from social media undergoes semantic enrichment and is then modeled in accordance with OLAP standards. Having social media data and enterprise data in this format makes provision for analysis specific to a single social medium, cross-media analysis and business analysis with respect to social media, e.g., Social OLAP and Social CRM.<br />Taming user-generated data from social media and integrating it into the OLAP environment allows for multidimensional analysis of social media and business from useful and newly discovered perspectives. To the best of our knowledge, other relevant works focus only on smaller, targeted problems, while our work addresses multiple problems and applications. 
However, we do not claim to have covered all aspects of this complex problem, and we acknowledge that doing so is not feasible within a single PhD thesis.</dcterms:abstract> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-05-21T11:03:55Z</dc:date> <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/31013"/> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/31013/3/Rehman_0-290919.pdf"/> <dc:language>eng</dc:language> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/> </rdf:Description> </rdf:RDF>