Iqbal, Umar: Articulated Human Pose Estimation in Unconstrained Images and Videos. - Bonn, 2018. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5n-52928
@phdthesis{handle:20.500.11811/7685,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5n-52928},
author = {{Umar Iqbal}},
title = {Articulated Human Pose Estimation in Unconstrained Images and Videos},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2018,
month = dec,

note = {Understanding the articulated pose of the human body is of great interest in many scenarios. While humans have an unmatched ability to effortlessly extract and interpret such information in any unconstrained environment, developing computational methods with similar capabilities is a very challenging task. The developed methods have to handle scenes with complex backgrounds, an unknown number of potentially occluded and truncated people, large variations in scale, diverse lighting conditions, and the vast appearance variation due to complex body articulations and clothing. The noise introduced by lossy sensing modalities complicates the problem even further. While there has been a lot of work on human pose estimation in constrained environments, very few works in the literature have addressed these challenges. Further, the estimation of the articulated pose of small functional body parts such as hands has often been ignored in existing work. To this end, this thesis addresses the aforementioned challenges and presents efficient and robust computational methods for 2D and 3D articulated human body and hand pose estimation in unconstrained real-world scenarios.
First, we address the problem of 2D multi-person body pose estimation. We present an efficient approach that estimates the poses of people in groups or crowds. We demonstrate that the problem can be formulated as a set of local joint-to-person association problems which can be solved efficiently for each person in the image, while also handling occlusions and truncations.
Second, we introduce the challenging case of simultaneous multi-person pose estimation and tracking in videos. The approaches for multi-person pose estimation in images cannot be applied directly to this problem since it also requires solving person associations over time. To this end, we propose a novel method that jointly models both problems in a single formulation using a spatio-temporal graph. The optimization of the graph using integer linear programming directly provides plausible body pose trajectories for each person. The proposed method does not make any assumptions and performs pose estimation and tracking in fully unconstrained videos. We also present a large-scale dataset and a thorough evaluation protocol to evaluate the developed methods quantitatively. Further, we provide an extensive analysis of the performance of state-of-the-art methods and highlight their strengths and weaknesses.
Given the estimated, possibly noisy, 2D pose trajectory of a person, the third direction of this thesis focuses on refining the pose trajectory by exploiting information about human activities. We present an action-conditioned pictorial structure model that predicts and incorporates activity information for body pose refinement.
The fourth direction of this thesis concerns 3D human pose estimation from single images. Given the estimated 2D pose of a person, we present an approach to lift the 2D pose to 3D by using an efficient and robust method for 3D pose retrieval and reconstruction. Unlike existing works, the proposed approach does not require any training images with annotated 3D poses. Since we can estimate 2D poses from any unconstrained image, the proposed method can also reconstruct 3D poses in any unconstrained scenario.
The final part of the thesis concerns the estimation of 3D hand pose from an RGB input. We present a novel 2.5D pose representation which can be estimated reliably from an RGB image and allows the absolute 3D pose of the hand to be reconstructed using a novel 3D reconstruction approach. The proposed method can handle severe occlusions, complex hand articulations, and unconstrained images taken in the wild.},

url = {https://hdl.handle.net/20.500.11811/7685}
}

The following terms of use are associated with this resource:

In Copyright