
Search Results

Now showing 1 - 10 of 19
  • Multi-scale keypoints in V1 and face detection
    Publication . Rodrigues, J. M. F.; du Buf, J. M. H.
    End-stopped cells in cortical area V1, which combine outputs of complex cells tuned to different orientations, serve to detect line and edge crossings (junctions) and points with a large curvature. In this paper we study the importance of the multi-scale keypoint representation, i.e. retinotopic keypoint maps which are tuned to different spatial frequencies (scale or Level-of-Detail). We show that this representation provides important information for Focus-of-Attention (FoA) and object detection. In particular, we show that hierarchically-structured saliency maps for FoA can be obtained, and that combinations over scales in conjunction with spatial symmetries can lead to face detection through grouping operators that deal with keypoints at the eyes, nose and mouth, especially when non-classical receptive field inhibition is employed. Although a face detector can be based on feedforward and feedback loops within area V1, such an operator must be embedded into dorsal and ventral data streams to and from higher areas for obtaining translation-, rotation- and scale-invariant face (object) detection.
  • Integrated multi-scale architecture of the cortex with application to computer vision
    Publication . Rodrigues, J. M. F.; du Buf, J. M. H.
    The main goal of this thesis is to try to understand the functioning of the visual cortex through the development of computational models. In the input layer V1 of the visual cortex there are simple, complex and end-stopped cells. These provide a multi-scale representation of objects and scenes in terms of lines, edges and keypoints. In this thesis we combine recent progress concerning the development of computational models of these and other cells with processes in higher cortical areas V2 and V4 etc. Three pertinent challenges are discussed: (i) object recognition embedded in a cortical architecture; (ii) brightness perception, and (iii) painterly rendering based on human vision. Specific aspects are Focus-of-Attention by means of keypoint-based saliency maps, the dynamic routing of features from V1 through higher cortical areas in order to obtain translation, rotation and size invariance, and the construction of normalized object templates with canonical views in visual memory. Our simulations show that the multi-scale representations can be integrated into a cortical architecture in order to model subsequent processing steps: from segregation, via different categorization levels, until final object recognition is obtained. As for real cortical processing, the system starts with coarse-scale information, refines categorization by using medium-scale information, and employs all scales in recognition. We also show that a 2D brightness model can be based on the multi-scale symbolic representation of lines and edges, with an additional low-pass channel and nonlinear amplitude transfer functions, such that object recognition and brightness perception are combined processes based on the same information. The brightness model can predict many different effects such as Mach bands, grating induction, the Craik-O’Brien-Cornsweet illusion and brightness induction, i.e. the opposite effects of assimilation (White effect) and simultaneous brightness contrast.
    Finally, a novel application is introduced: painterly rendering has been linked to computer vision, but we propose to link it to human vision because perception and painting are two processes which are strongly interwoven.
  • Painterly rendering using human vision
    Publication . du Buf, J. M. H.; Rodrigues, J. M. F.; Nunes, S.; Almeida, D.; Brito, Vera; Carvalho, J.
    Painterly rendering has been linked to computer vision, but we propose to link it to human vision because perception and painting are two processes that are interwoven. Recent progress in developing computational models makes it possible to establish this link. We show that completely automatic rendering can be obtained by applying four image representations in the visual system: (1) colour constancy can be used to correct colours, (2) coarse background brightness in combination with colour coding in cytochrome-oxidase blobs can be used to create a background with a big brush, (3) the multi-scale line and edge representation provides a very natural way to render finer brush strokes, and (4) the multi-scale keypoint representation serves to create saliency maps for Focus-of-Attention, and FoA can be used to render important structures. Basic processes are described, renderings are shown, and important ideas for future research are discussed.
  • Arquitectura do córtex visual com aplicações na visão por computador [Architecture of the visual cortex with applications in computer vision]
    Publication . du Buf, J. M. H.; Rodrigues, J. M. F.
    The study of human vision has attracted the interest of many scientists over the centuries, for example Newton in 1704 on colour vision and Helmholtz in 1910 on physiological optics. However, the first contributions to computer vision began about 40 years ago, when the first computers appeared. Around 1980, David Marr laid the foundations for the modern theory of computer vision.
  • Fine arts edutainment: the amateur painter
    Publication . Almeida, D.; Carvalho, Brito; Rodrigues, J. M. F.; du Buf, J. M. H.; Nunes, S.
    A new scheme for painterly rendering (NPR) has been developed. This scheme is based on visual perception, in particular the multi-scale line/edge representation in the visual cortex. The Amateur Painter (TAP) is the user interface on top of the rendering scheme. It allows users to create paintings from photographs (semi)automatically, with different types of brush strokes and colour manipulations. In contrast to similar painting tools, TAP has a set of menus that reflects the procedure followed by a normal painter. In addition, menus and options have been designed such that they are very intuitive, avoiding a jungle of sub-menus with options from image processing that children and laymen do not understand. Our goal is to create a tool that is extremely easy to use, with the possibility that the user becomes interested in painting techniques, styles, and fine arts in general.
  • Face normalization using multi-scale cortical keypoints
    Publication . Cunha, João; Rodrigues, J. M. F.; du Buf, J. M. H.
    Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extractions. Simple, complex and end-stopped cells tuned to different spatial frequencies (scales) and/or orientations provide input for line, edge and keypoint detection. This yields a rich, multi-scale object representation that can be stored in memory in order to identify objects. The multi-scale, keypoint-based saliency maps for Focus-of-Attention can be explored to obtain face detection and normalization, after which face recognition can be achieved using the line/edge representation. In this paper, we focus only on face normalization, showing that multi-scale keypoints can be used to construct canonical representations of faces in memory.
  • Building the what and where systems: multi-scale lines, edges and keypoints
    Publication . Rodrigues, J. M. F.; Almeida, D.; Nunes, S.; Lam, Roberto; du Buf, J. M. H.
    Computer vision for real-time applications requires tremendous computational power because all images must be processed from the first to the last pixel. Active vision by probing specific objects on the basis of already acquired context may lead to a significant reduction of processing. This idea is based on a few concepts from our visual cortex (Rensink, Visual Cogn. 7, 17-42, 2000): (1) our physical surround can be seen as memory, i.e. there is no need to construct detailed and complete maps, (2) the bandwidth of the what and where systems is limited, i.e. only one object can be probed at any time, and (3) bottom-up, low-level feature extraction is complemented by top-down hypothesis testing, i.e. there is a rapid convergence of activities in dendritic/axonal connections.
  • Object categorisations using templates constructed from multi-scale line and edge representations
    Publication . Nunes, S.; Almeida, D.; Rodrigues, J. M. F.; du Buf, J. M. H.
    Object categorisation is linked to detection, segregation and recognition. In the visual system, these processes are achieved in the ventral "what" and dorsal "where" pathways [3], with bottom-up feature extractions in areas V1, V2, V4 and IT (what) in parallel with top-down attention from PP via MT to V2 and V1 (where). The latter is steered by object templates in memory, i.e. in prefrontal cortex with a what component in PF46v and a where component in PF46d.
  • Cortical object segregation and categorization by multi-scale line and edge coding
    Publication . Rodrigues, J. M. F.; du Buf, J. M. H.
    In this paper we present an improved scheme for line and edge detection in cortical area V1, based on responses of simple and complex cells, truly multi-scale with no free parameters. We illustrate the multi-scale representation for visual reconstruction, and show how object segregation can be achieved with coarse-to-fine-scale groupings. A two-level object categorization scenario is tested in which pre-categorization is based on coarse scales only, and final categorization on coarse plus fine scales. Processing schemes are discussed in the framework of a complete cortical architecture.
  • Artistic rendering of the visual cortex
    Publication . Lam, Roberto; Rodrigues, J. M. F.; du Buf, J. M. H.
    In this paper we explain the processing in the first layers of the visual cortex by simple, complex and end-stopped cells, plus grouping cells for line, edge, keypoint and saliency detection. Three visualisations are presented: (a) an integrated scheme that shows activities of simple, complex and end-stopped cells, (b) artistic combinations of selected activity maps that give an impression of global image structure and/or local detail, and (c) NPR on the basis of a 2D brightness model. The cortical image representations offer many possibilities for non-photorealistic rendering.