Research Project

A neuro-dynamic framework for cognitive robotics: scene representations, behavioural sequences, and learning.


Publications

Face and object recognition by 3D cortical representations
Publication: Martins, Jaime Afonso do Nascimento Carvalho; du Buf, J. M. H.; Rodrigues, J. M. F.
This thesis presents a novel integrated cortical architecture with significant emphasis on low-level attentional mechanisms—based on retinal nonstandard cells and pathways—that can group non-attentional, bottom-up features present in V1/V2 into “proto-object” shapes. These shapes are first extracted using combinations of specific cell types for detecting corners, bars/edges and curves, which work extremely well for geometrically shaped objects. Later, in the parietal pathway (probably in LIP), arbitrary shapes can be extracted from population codes of V2 (or even dorsal V3) oriented cells that encode the outlines of objects as “proto-objects”. Object shapes obtained at both cortical levels play an important role in bottom-up local object gist vision, which tries to understand scene context in less than 70 ms and is thought to use both global and local scene features. Edge conspicuity maps are able to detect borders/edges of objects and attribute them a weight based on their perceptual salience, using readily available retinal ganglion cell colour-opponency coding. Conspicuity maps are fundamental in building posterior saliency maps, which are important both for bottom-up attention schemes and for Focus-of-Attention mechanisms that control eye gaze and object recognition. Disparity maps are also a main focus of this thesis. They are built upon binocular simple and complex cells in quadrature, using a Disparity-Energy Model. These maps are fundamental for the perception of distance within a scene and of close/far object relationships when segregating foreground from background. The role of cortical disparity in 3D facial recognition was also explored when processing faces with very different facial expressions (even extreme ones), yielding state-of-the-art results when compared to other, non-biological, computer vision algorithms.
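The disparity-energy model mentioned above can be illustrated with a minimal 1-D sketch: quadrature Gabor filters stand in for binocular simple cells, squared and summed responses for complex cells, and disparity is taken as the shift of the right row that maximises the binocular energy. This is a position-shift simplification under assumed parameters (filter size, frequency, search range), not the thesis's implementation.

```python
import numpy as np

def gabor_pair(size=21, freq=0.15, sigma=4.0):
    # Even (cosine) and odd (sine) Gabor filters in quadrature,
    # modelling a pair of simple-cell receptive fields.
    x = np.arange(size) - size // 2
    env = np.exp(-x**2 / (2.0 * sigma**2))
    return (env * np.cos(2 * np.pi * freq * x),
            env * np.sin(2 * np.pi * freq * x))

def estimate_disparity(left, right, max_disp=8):
    # Filter each 1-D image row with the quadrature pair (simple cells),
    # then pick the shift of the right row that maximises the summed
    # binocular energy (complex cells): E = (Le + Re)^2 + (Lo + Ro)^2.
    even, odd = gabor_pair()
    le = np.convolve(left, even, mode="same")
    lo = np.convolve(left, odd, mode="same")
    re = np.convolve(right, even, mode="same")
    ro = np.convolve(right, odd, mode="same")
    best_d, best_e = 0, -np.inf
    for d in range(-max_disp, max_disp + 1):
        e = np.sum((le + np.roll(re, -d))**2 + (lo + np.roll(ro, -d))**2)
        if e > best_e:
            best_d, best_e = d, e
    return best_d
```

On a synthetic stereo pair where the right row is the left row shifted by a few pixels, the energy peak recovers that shift; a full model would pool such estimates over many positions, scales and orientations to build a dense disparity map.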
Biological models for active vision: towards a unified architecture
Publication: Terzic, Kasim; Lobato, D.; Saleiro, Mário; Martins, Jaime; Farrajota, Miguel; Rodrigues, J. M. F.; du Buf, J. M. H.
Building a general-purpose, real-time active vision system completely based on biological models is a great challenge. We combine a number of biologically plausible algorithms, each addressing a different aspect of vision (edge and keypoint detection, feature extraction, optical flow and disparity, shape detection, object recognition and scene modelling), into a complete system. We present some of the experiments from our ongoing work, where our system leverages a combination of algorithms to solve complex tasks.
Local gist vision of man-made objects
Publication: Martins, J. C.; Rodrigues, J. M. F.; du Buf, J. M. H.
Attention is usually modelled by sequential fixation of peaks in saliency maps. Those maps code local conspicuity: complexity, colour and texture. Such features have no relation to entire objects, unless disparity and optical flow are also considered, which often segregate entire objects from their background. Recently we developed a model of local gist vision: which types of objects are approximately where in a scene. This model addresses man-made objects, which are dominated by a small shape repertoire: squares, rectangles, trapeziums, triangles, circles and ellipses. Exploiting only local colour contrast, the model can detect these shapes by a small hierarchy of cell layers devoted to low- and mid-level geometry. The model has been tested successfully on video sequences containing traffic signs and other scenes, and partial occlusions were not problematic.
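The "sequential fixation of peaks in saliency maps" that opens this abstract has a simple operational form: repeatedly pick the map's maximum and suppress a neighbourhood around it (inhibition of return) before the next fixation. The sketch below assumes an arbitrary suppression radius and fixation count, and is only an illustration of the fixation scheme, not the paper's model.

```python
import numpy as np

def fixation_sequence(saliency, n_fix=3, radius=2):
    # Sequentially fixate the highest remaining peak of a saliency map,
    # suppressing an inhibition-of-return window after each fixation.
    s = saliency.astype(float).copy()
    fixations = []
    for _ in range(n_fix):
        y, x = np.unravel_index(np.argmax(s), s.shape)
        fixations.append((y, x))
        s[max(0, y - radius):y + radius + 1,
          max(0, x - radius):x + radius + 1] = -np.inf  # inhibition of return
    return fixations
```

Given a map with a few isolated peaks, the function visits them in decreasing order of salience, which is the scanpath behaviour the abstract refers to.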
Real-Time Object Recognition Based on Cortical Multi-scale Keypoints
Publication: Terzic, Kasim; Rodrigues, J. M. F.; du Buf, J. M. H.
In recent years, a large number of impressive object categorisation algorithms have surfaced, both computational and biologically motivated. While results on standardised benchmarks are impressive, very few of the best-performing algorithms took run-time performance into account, rendering most of them useless for real-time active vision scenarios such as cognitive robots. In this paper, we combine cortical keypoints based on primate area V1 with a state-of-the-art nearest neighbour classifier, and show that such a system can approach state-of-the-art categorisation performance while meeting the real-time constraint.
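The categorisation stage described above pairs keypoint-based descriptors with a nearest-neighbour classifier. A bare 1-NN rule is sketched below; the descriptors here are placeholders, whereas in the paper they would be derived from multi-scale V1 keypoints, and the actual classifier used there may differ in distance measure and data structures.

```python
import numpy as np

def nn_classify(train_feats, train_labels, query):
    # 1-nearest-neighbour categorisation: assign the query descriptor
    # the label of the closest training descriptor (Euclidean distance).
    # Linear scan; real-time systems would use an approximate index.
    dists = np.linalg.norm(np.asarray(train_feats) - np.asarray(query), axis=1)
    return train_labels[int(np.argmin(dists))]
```

Run-time cost is one distance per stored descriptor, which is why the real-time constraint in the paper pushes towards compact descriptors and fast neighbour search.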
A biological and real-time framework for hand gestures and head poses
Publication: Saleiro, Mário; Farrajota, Miguel; Terzic, Kasim; Rodrigues, J. M. F.; du Buf, J. M. H.
Human-robot interaction is an interdisciplinary research area that aims at the development of social robots. Since social robots are expected to interact with humans and understand their behaviour through gestures and body movements, cognitive psychology and robot technology must be integrated. In this paper we present a biological and real-time framework for detecting and tracking hands and heads. This framework is based on keypoints extracted by means of cortical V1 end-stopped cells. Detected keypoints and the cells’ responses are used to classify the junction type. Through the combination of annotated keypoints in a hierarchical, multi-scale tree structure, moving and deformable hands can be segregated and tracked over time. By using hand templates with lines and edges at only a few scales, a hand’s gestures can be recognized. Head tracking and pose detection are also implemented, which can be integrated with detection of facial expressions in the future. Through combinations of head poses and hand gestures, a large number of commands can be given to a robot.
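The template-based gesture recognition step can be pictured as matching a segmented hand patch against a small bank of gesture templates. The toy sketch below uses plain normalised cross-correlation on same-size arrays; the paper's templates are line/edge representations at a few scales, so this is only an assumed, simplified stand-in with hypothetical template names.

```python
import numpy as np

def ncc(patch, template):
    # Normalised cross-correlation between a patch and a same-size
    # template; 1.0 means a perfect (contrast-invariant) match.
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p**2).sum() * (t**2).sum())
    return float((p * t).sum() / denom) if denom else 0.0

def best_gesture(patch, templates):
    # Return the name of the template that best matches the patch.
    return max(templates, key=lambda name: ncc(patch, templates[name]))
```

Mapping each recognised gesture (and head pose) to a robot command then reduces to a lookup table over the winning template names.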


Funders

Funding agency: European Commission
Funding programme: FP7
Funding award number: 270247