Research Project
NEURAL CORRELATES OF MOTION AND STEREO VISION IN HUMAN POSE AND GAIT DETECTION
Publications
Human Pose Estimation by a Series of Residual Auto-Encoders
Publication . Farrajota, Miguel; Rodrigues, João; du Buf, Hans; Alexandre, L. A.; Sanchez, J. S.; Rodrigues, J. M. F.
Pose estimation is the task of predicting the pose of an object in an image or in a sequence of images. Here, we focus on articulated human pose estimation in scenes with a single person. We employ a series of residual auto-encoders to produce multiple predictions which are then combined to provide a heatmap prediction of body joints. In this network topology, features are processed across all scales, which captures the various spatial relationships associated with the body. Repeated bottom-up and top-down processing with intermediate supervision for each auto-encoder network is applied. We propose some improvements to this type of regression-based network to further increase performance, namely: (a) increase the number of parameters of the auto-encoder networks in the pipeline, (b) use stronger regularization along with heavy data augmentation, (c) use sub-pixel precision for more precise joint localization, and (d) combine the output heatmaps of all auto-encoders into a single prediction, which further increases body joint prediction accuracy. We demonstrate state-of-the-art results on the popular FLIC and LSP datasets.
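Steps (c) and (d) of the abstract could be sketched roughly as follows. This is a minimal illustration, not the paper's exact method: the averaging rule for combining heatmaps and the quarter-pixel offset scheme for sub-pixel localization are assumptions chosen for simplicity.

```python
import numpy as np

def combine_heatmaps(heatmaps):
    """Combine the per-auto-encoder heatmaps into a single
    prediction (here: a plain average, an illustrative choice)."""
    return np.mean(np.stack(heatmaps, axis=0), axis=0)

def subpixel_argmax(heatmap):
    """Locate the peak of a joint heatmap, then refine it with a
    quarter-pixel shift toward the stronger neighbour (one common
    sub-pixel scheme; the paper may use a different refinement)."""
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    fy, fx = float(y), float(x)
    h, w = heatmap.shape
    if 0 < x < w - 1:
        fx += 0.25 * np.sign(heatmap[y, x + 1] - heatmap[y, x - 1])
    if 0 < y < h - 1:
        fy += 0.25 * np.sign(heatmap[y + 1, x] - heatmap[y - 1, x])
    return fy, fx
```

A peak at integer cell (2, 2) whose right-hand neighbour responds more strongly than its left-hand one is thus nudged to x = 2.25.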
Multi-scale cortical keypoints for realtime hand tracking and gesture recognition
Publication . Farrajota, Miguel; Saleiro, Mário; Terzic, Kasim; Rodrigues, J. M. F.; du Buf, J. M. H.
Human-robot interaction is an interdisciplinary research area which aims at integrating human factors, cognitive psychology and robot technology. The ultimate goal is the development of social robots. These robots are expected to work in human environments and to understand the behavior of persons through gestures and body movements. In this paper we present a biological and realtime framework for detecting and tracking hands. This framework is based on keypoints extracted from cortical V1 end-stopped cells. Detected keypoints and the cells’ responses are used to classify the junction type. By combining annotated keypoints in a hierarchical, multi-scale tree structure, moving and deformable hands can be segregated, their movements can be obtained, and they can be tracked over time. By using hand templates with keypoints at only two scales, a hand’s gestures can be recognized.
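The final step, matching detected keypoints against hand templates at two scales, could be sketched as below. The one-directional Chamfer-style cost and the additive two-scale score are illustrative assumptions; the paper's actual matching over the multi-scale tree structure is more elaborate.

```python
import numpy as np

def match_cost(detected, template):
    """Mean distance from each template keypoint to its nearest
    detected keypoint (a simple one-directional Chamfer cost)."""
    d = np.linalg.norm(detected[None, :, :] - template[:, None, :], axis=2)
    return d.min(axis=1).mean()

def recognize_gesture(kp_coarse, kp_fine, templates):
    """Score each gesture template against keypoints detected at a
    coarse and a fine scale; return the best-matching gesture label.
    `templates` maps label -> (coarse keypoints, fine keypoints)."""
    best, best_cost = None, np.inf
    for label, (t_coarse, t_fine) in templates.items():
        cost = match_cost(kp_coarse, t_coarse) + match_cost(kp_fine, t_fine)
        if cost < best_cost:
            best, best_cost = label, cost
    return best
```

Using only two scales keeps the per-template cost cheap enough for the realtime setting the abstract describes.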
Human action recognition in videos with articulated pose information by deep networks
Publication . Farrajota, Miguel; Rodrigues, João; du Buf, J. M. H.
Action recognition is of great importance in understanding human motion from video. It is an important topic in computer vision due to its many applications such as video surveillance, human-machine interaction and video retrieval. One key problem is to automatically recognize low-level actions and high-level activities of interest. This paper proposes a way to cope with low-level actions by combining information from human body joints to aid action recognition. This is achieved by using high-level features computed by a convolutional neural network which was pre-trained on ImageNet, with articulated body joints as low-level features. These features are then used to feed a Long Short-Term Memory network to learn the temporal dependencies of an action. For pose prediction, we focus on articulated relations between body joints. We employ a series of residual auto-encoders to produce multiple predictions which are then combined to provide a likelihood map of body joints. In the network topology, features are processed across all scales, which captures the various spatial relationships associated with the body. Repeated bottom-up and top-down processing with intermediate supervision of each auto-encoder network is applied. We demonstrate state-of-the-art results on the popular FLIC, LSP and UCF Sports datasets.
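The pipeline of per-frame CNN features plus joint coordinates fed to an LSTM could be sketched as follows. The feature concatenation, the random weight initialization, and all shapes are illustrative assumptions; only the LSTM gate equations are the standard formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def frame_features(cnn_feat, joints):
    """Concatenate high-level CNN features with flattened 2-D
    body-joint coordinates for one frame (shapes are illustrative)."""
    return np.concatenate([cnn_feat, joints.ravel()])

def lstm_step(x, h, c, W, U, b):
    """One standard LSTM step: input, forget and output gates
    (i, f, o) plus the candidate cell update g."""
    z = W @ x + U @ h + b
    H = h.size
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def run_sequence(frames, H):
    """Unroll the LSTM over a video's per-frame feature vectors;
    weights are randomly initialized here, standing in for the
    trained parameters."""
    D = frames[0].size
    rng = np.random.default_rng(0)
    W = rng.standard_normal((4 * H, D)) * 0.1
    U = rng.standard_normal((4 * H, H)) * 0.1
    b = np.zeros(4 * H)
    h, c = np.zeros(H), np.zeros(H)
    for x in frames:
        h, c = lstm_step(x, h, c, W, U, b)
    return h  # final hidden state would feed an action classifier
```

The final hidden state summarizes the temporal dependencies of the action, which is what a classifier head would consume.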
A biological and real-time framework for hand gestures and head poses
Publication . Saleiro, Mário; Farrajota, Miguel; Terzic, Kasim; Rodrigues, J. M. F.; du Buf, J. M. H.
Human-robot interaction is an interdisciplinary research area that aims at the development of social robots. Since social robots are expected to interact with humans and understand their behavior through gestures and body movements, cognitive psychology and robot technology must be integrated. In this paper we present a biological and real-time framework for detecting and tracking hands and heads. This framework is based on keypoints extracted by means of cortical V1 end-stopped cells. Detected keypoints and the cells’ responses are used to classify the junction type. Through the combination of annotated keypoints in a hierarchical, multi-scale tree structure, moving and deformable hands can be segregated and tracked over time. By using hand templates with lines and edges at only a few scales, a hand’s gestures can be recognized. Head tracking and pose detection are also implemented, which can be integrated with detection of facial expressions in the future. Through combinations of head poses and hand gestures, a large number of commands can be given to a robot.
Funders
Funding agency
Fundação para a Ciência e a Tecnologia
Funding Award Number
SFRH/BD/79812/2011