Research Project
A neuro-dynamic framework for cognitive robotics: scene representations, behavioural sequences, and learning.
Publications
Face and object recognition by 3D cortical representations
Publication . Martins, Jaime Afonso do Nascimento Carvalho; du Buf, J.M.H.; Rodrigues, J.M.F.
This thesis presents a novel integrated cortical architecture with significant emphasis on low-level attentional mechanisms—based on retinal nonstandard cells and pathways—that can group non-attentional, bottom-up features present in V1/V2 into “proto-object” shapes. These shapes are first extracted using combinations of specific cell types that detect corners, bars/edges and curves, which works extremely well for geometrically shaped objects. Later, in the parietal pathway (probably in LIP), arbitrary shapes can be extracted from population codes of V2 (or even dorsal V3) oriented cells that encode the outlines of objects as “proto-objects”. Object shapes obtained at both cortical levels play an important role in bottom-up local object gist vision, which tries to understand scene context in less than 70 ms and is thought to use both global and local scene features.
Edge conspicuity maps detect the borders/edges of objects and assign them a weight based on their perceptual salience, using readily available retinal ganglion cell colour-opponency coding. Conspicuity maps are fundamental in building subsequent saliency maps, which are important both for bottom-up attention schemes and for Focus-of-Attention mechanisms that control eye gaze and object recognition.
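The combination of conspicuity maps into a single saliency map, whose peak drives the next fixation, can be sketched as follows. The map-range normalisation and simple averaging used here are generic assumptions in the style of classical saliency models, not necessarily the exact scheme of the thesis:

```python
import numpy as np

def normalise(m):
    """Scale a conspicuity map to [0, 1]; flat maps contribute nothing."""
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def saliency(conspicuity_maps):
    """Average the normalised conspicuity maps into one saliency map."""
    return np.mean([normalise(m) for m in conspicuity_maps], axis=0)

def focus_of_attention(sal):
    """Return (row, col) of the most salient location: the next fixation."""
    return tuple(int(i) for i in np.unravel_index(np.argmax(sal), sal.shape))

# Toy example: an edge-conspicuity map and a colour-opponency map that
# agree on one location, so attention is drawn there first.
edge = np.zeros((8, 8)); edge[2, 5] = 3.0
colour = np.zeros((8, 8)); colour[2, 5] = 1.0; colour[6, 1] = 0.5
sal = saliency([edge, colour])
print(focus_of_attention(sal))  # → (2, 5)
```

Normalising each map before averaging keeps a strongly responding feature channel from drowning out the others.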
Disparity maps are also a main focus of this thesis. They are built upon binocular simple and complex cells in quadrature, using a Disparity-Energy Model. These maps are fundamental for perceiving distance within a scene and close/far object relationships when segregating foreground from background.
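A minimal 1-D sketch of the disparity-energy idea: a binocular simple cell sums left- and right-eye Gabor responses, an even/odd quadrature pair is squared and summed into a complex-cell energy, and that energy peaks at the cell's preferred disparity. The Gabor parameters and the position-shift encoding of preferred disparity below are illustrative assumptions:

```python
import numpy as np

def gabor(x, sigma=2.0, freq=0.25, phase=0.0):
    """1-D Gabor: a Gaussian-windowed sinusoid (quadrature via phase)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def disparity_energy(left, right, d, sigma=2.0, freq=0.25):
    """Complex-cell energy tuned to disparity d (position-shift model):
    sum left/right simple-cell responses in quadrature, then square."""
    x = np.arange(len(left)) - len(left) // 2
    energy = 0.0
    for phase in (0.0, np.pi / 2):             # even / odd quadrature pair
        fl = gabor(x, sigma, freq, phase)      # left receptive field
        fr = gabor(x - d, sigma, freq, phase)  # right RF shifted by d
        s = fl @ left + fr @ right             # binocular simple cell
        energy += s**2
    return energy

# Toy stimulus: a bar at position 20 in the left eye and 23 in the
# right eye, i.e. a true disparity of 3 pixels.
left = np.zeros(41); left[20] = 1.0
right = np.zeros(41); right[23] = 1.0
best = max(range(-5, 6), key=lambda d: disparity_energy(left, right, d))
print(best)  # → 3
```

Reading out the disparity that maximises the energy across a bank of such cells, at every image position, is what yields a dense disparity map.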
The role of cortical disparity in 3D facial recognition was also explored by processing faces with very different facial expressions (even extreme ones), yielding state-of-the-art results when compared to other, non-biological computer vision algorithms.
Biological models for active vision: towards a unified architecture
Publication . Terzic, Kasim; Lobato, D.; Saleiro, Mário; Martins, Jaime; Farrajota, Miguel; Rodrigues, J. M. F.; du Buf, J. M. H.
Building a general-purpose, real-time active vision system completely based on biological models is a great challenge. We integrate a number of biologically plausible algorithms addressing different aspects of vision, such as edge and keypoint detection, feature extraction, optical flow and disparity, shape detection, object recognition and scene modelling, into a complete system. We present some of the experiments from our ongoing work, where our system leverages a combination of algorithms to solve complex tasks.
Local gist vision of man-made objects
Publication . Martins, J. C.; Rodrigues, J. M. F.; du Buf, J. M. H.
Attention is usually modelled by sequential fixation of peaks in saliency maps. Those maps code local conspicuity: complexity, colour and texture. Such features have no relation to entire objects, unless disparity and optical flow are also considered, which often segregate entire objects from their background. Recently we developed a model of local gist vision: which types of objects are about where in a scene. This model addresses man-made objects, which are dominated by a small shape repertoire: squares, rectangles, trapeziums, triangles, circles and ellipses. Exploiting only local colour contrast, the model can detect these shapes by a small hierarchy of cell layers devoted to low- and mid-level geometry. The model has been tested successfully on video sequences containing traffic signs and other scenes, and partial occlusions were not problematic.
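One ingredient of such a shape repertoire can be sketched generically: corner-selective cells respond at high-curvature points along an object outline, and the corner count already separates the main shape families. The turning-angle heuristic and threshold below are illustrative stand-ins for the paper's cell hierarchy, not its actual model:

```python
import numpy as np

def corner_count(contour, thresh=np.pi / 6):
    """Count high-curvature points (corner-cell responses) along a closed
    contour given as an (N, 2) array of ordered boundary points."""
    v = np.roll(contour, -1, axis=0) - contour            # edge vectors
    ang = np.arctan2(v[:, 1], v[:, 0])
    turn = np.diff(np.concatenate([ang, ang[:1]]))        # incl. wrap-around
    turn = np.abs((turn + np.pi) % (2 * np.pi) - np.pi)   # wrap to [-pi, pi]
    return int(np.sum(turn > thresh))

def shape_label(contour):
    """Map the corner count onto the small man-made shape repertoire."""
    c = corner_count(contour)
    if c == 0: return "circle/ellipse"
    if c == 3: return "triangle"
    if c == 4: return "square/rectangle/trapezium"
    return "other"

# Toy outline: a unit square sampled with 4 points per side.
t = np.arange(0, 1, 0.25)
square = np.concatenate([
    np.c_[t, np.zeros(4)],      # bottom edge
    np.c_[np.ones(4), t],       # right edge
    np.c_[1 - t, np.ones(4)],   # top edge
    np.c_[np.zeros(4), 1 - t],  # left edge
])
print(shape_label(square))  # → square/rectangle/trapezium
```

Distinguishing squares from rectangles and trapeziums would additionally need the corner angles and side lengths, which is where mid-level geometry layers come in.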
Real-Time Object Recognition Based on Cortical Multi-scale Keypoints
Publication . Terzic, Kasim; Rodrigues, J. M. F.; du Buf, J. M. H.
In recent years, a large number of impressive object categorisation algorithms have surfaced, both computational and biologically motivated. While results on standardised benchmarks are impressive, very few of the best-performing algorithms took run-time performance into account, rendering most of them useless for real-time active vision scenarios such as cognitive robots. In this paper, we combine cortical keypoints based on primate area V1 with a state-of-the-art nearest neighbour classifier, and show that such a system can approach state-of-the-art categorisation performance while meeting the real-time constraint.
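The pairing of image descriptors with a nearest-neighbour back-end can be sketched generically. The 2-D descriptors and class labels below are synthetic stand-ins for the paper's cortical V1 keypoint features; only the classification step is illustrated:

```python
import numpy as np

def nearest_neighbour(train_X, train_y, query):
    """Classify a query descriptor by the label of its nearest training
    descriptor (Euclidean distance): the core of an NN categoriser."""
    dists = np.linalg.norm(train_X - query, axis=1)
    return train_y[int(np.argmin(dists))]

# Synthetic stand-ins for per-image descriptors of two categories.
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
train_y = ["mug", "mug", "chair", "chair"]
print(nearest_neighbour(train_X, train_y, np.array([0.05, 0.1])))  # → mug
print(nearest_neighbour(train_X, train_y, np.array([4.8, 5.1])))   # → chair
```

Because the brute-force distance computation is a single vectorised pass over the training set, this back-end adds very little latency, which is what makes it compatible with a real-time constraint.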
A biological and real-time framework for hand gestures and head poses
Publication . Saleiro, Mário; Farrajota, Miguel; Terzic, Kasim; Rodrigues, J. M. F.; du Buf, J. M. H.
Human-robot interaction is an interdisciplinary research area that aims at the development of social robots. Since social robots are expected to interact with humans and understand their behavior through gestures and body movements, cognitive psychology and robot technology must be integrated. In this paper we present a biological and real-time framework for detecting and tracking hands and heads. This framework is based on keypoints extracted by means of cortical V1 end-stopped cells. Detected keypoints and the cells’ responses are used to classify the junction type. Through the combination of annotated keypoints in a hierarchical, multi-scale tree structure, moving and deformable hands can be segregated and tracked over time. By using hand templates with lines and edges at only a few scales, a hand’s gestures can be recognized. Head tracking and pose detection are also implemented, which can be integrated with detection of facial expressions in the future. Through the combinations of head poses and hand gestures a large number of commands can be given to a robot.
Funders
Funding agency: European Commission
Funding programme: FP7
Funding award number: 270247