A multi modal approach to gesture recognition from audio and video data

Files
No files are associated with this document.
Date
2013
Publication type
Contribution to a conference proceedings
Publication status
Published
Published in
Proceedings of the 15th ACM on International conference on multimodal interaction - ICMI '13. New York, New York, USA: ACM Press, 2013, pp. 461-466. ISBN 978-1-4503-2129-7. Available under: doi: 10.1145/2522848.2532592
Abstract

We describe in this paper our approach to the multi-modal gesture recognition challenge organized by ChaLearn in conjunction with the ICMI 2013 conference. The competition's task was to learn a vocabulary of 20 types of Italian gestures performed by different persons and to detect them in sequences. We develop an algorithm to find the gesture intervals in the audio data, extract audio features from those intervals, and train two different models. We engineer features from the skeleton data and use the gesture intervals in the training data to train a model that we afterwards apply to the test sequences using a sliding window. We combine the models through weighted averaging. We find that this way of combining information from two different sources boosts the models' performance significantly.
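The abstract outlines a two-stream pipeline: an audio model trained on features from detected gesture intervals, a skeleton-feature model applied to test sequences with a sliding window, and late fusion of the two by weighted averaging. The Python sketch below illustrates only the sliding-window scoring and the weighted-average fusion; the paper publishes no code, so the window size, stride, fusion weight w_audio, and the scikit-learn-style predict_proba interface are illustrative assumptions, not the authors' implementation.

import numpy as np

N_GESTURES = 20  # gesture vocabulary size from the challenge

def sliding_window_scores(skeleton_features, model, window=30, stride=5):
    # Score every window of a test sequence with the skeleton model.
    # skeleton_features: (n_frames, n_dims) array of engineered features.
    # model: any classifier exposing predict_proba (scikit-learn style).
    # Returns an (n_windows, N_GESTURES) array of class probabilities.
    windows = [
        skeleton_features[start:start + window].ravel()
        for start in range(0, len(skeleton_features) - window + 1, stride)
    ]
    return model.predict_proba(np.asarray(windows))

def fuse_predictions(p_audio, p_skeleton, w_audio=0.5):
    # Late fusion: weighted average of the per-class probabilities
    # produced by the audio model and the skeleton model.
    return w_audio * p_audio + (1.0 - w_audio) * p_skeleton

# Example for one detected interval, given aligned score vectors:
# predicted_gesture = np.argmax(fuse_predictions(p_audio, p_skeleton, w_audio=0.6))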

Subject (DDC)
004 Computer science
Conference
15th ACM International Conference on Multimodal Interaction (ICMI '13), 9 Dec 2013 - 13 Dec 2013, Sydney, Australia
Cite
ISO 690
BAYER, Immanuel, Thierry SILBERMANN, 2013. A multi modal approach to gesture recognition from audio and video data. 15th ACM International Conference on Multimodal Interaction. Sydney, Australia, 9 Dec 2013 - 13 Dec 2013. In: Proceedings of the 15th ACM on International conference on multimodal interaction - ICMI '13. New York, New York, USA: ACM Press, 2013, pp. 461-466. ISBN 978-1-4503-2129-7. Available under: doi: 10.1145/2522848.2532592
BibTeX
@inproceedings{Bayer2013multi-26475,
  year={2013},
  doi={10.1145/2522848.2532592},
  title={A multi modal approach to gesture recognition from audio and video data},
  isbn={978-1-4503-2129-7},
  publisher={ACM Press},
  address={New York, New York, USA},
  booktitle={Proceedings of the 15th ACM on International conference on multimodal interaction - ICMI '13},
  pages={461--466},
  author={Bayer, Immanuel and Silbermann, Thierry}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/26475">
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/26475"/>
    <dc:language>eng</dc:language>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Silbermann, Thierry</dc:creator>
    <dcterms:title>A multi modal approach to gesture recognition from audio and video data</dcterms:title>
    <dcterms:issued>2013</dcterms:issued>
    <dcterms:bibliographicCitation>Proceedings of the 15th ACM International conference on multimodal interaction : Sydney, NSW, Australia ; December 09 - 13, 2013 / Julien Epps ... (eds.). - New York : ACM, 2013. - S. 461-466. - ISBN 978-1-4503-2129-7</dcterms:bibliographicCitation>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-02-25T09:52:37Z</dcterms:available>
    <dc:contributor>Silbermann, Thierry</dc:contributor>
    <dc:contributor>Bayer, Immanuel</dc:contributor>
    <dc:creator>Bayer, Immanuel</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-02-25T09:52:37Z</dc:date>
    <dcterms:abstract xml:lang="eng">We describe in this paper our approach to the multi-modal gesture recognition challenge organized by ChaLearn in conjunction with the ICMI 2013 conference. The competition's task was to learn a vocabulary of 20 types of Italian gestures performed by different persons and to detect them in sequences. We develop an algorithm to find the gesture intervals in the audio data, extract audio features from those intervals, and train two different models. We engineer features from the skeleton data and use the gesture intervals in the training data to train a model that we afterwards apply to the test sequences using a sliding window. We combine the models through weighted averaging. We find that this way of combining information from two different sources boosts the models' performance significantly.</dcterms:abstract>
  </rdf:Description>
</rdf:RDF>
University bibliography
Yes