Size | Format
---|---
7.79 MB | Adobe PDF
Authors
Advisor(s)
Abstract(s)
Contour-based object detection and recognition in complex scenes is one
of the most difficult problems in computer vision. Object contours in complex
scenes can be fragmented, occluded and deformed. Instances of the same
class can have a wide range of variations. Clutter and background edges
can account for more than 90% of all image edges. Nevertheless, our biological
vision system is able to perform this task effortlessly. On the other hand, the
performance of state-of-the-art computer vision algorithms is still limited in
terms of both speed and accuracy.
The work in this thesis presents a simple, efficient and biologically motivated
method for contour-based object detection and recognition in complex
scenes. Edge segments are extracted from training and testing images using
a simple contour-following algorithm at each pixel. Then a descriptor is calculated
for each segment using Shape Context, including an offset distance
relative to the centre of the object. A Bayesian criterion is used to determine
the discriminative power of each segment in a query image by means of
a nearest-neighbour lookup, and the most discriminative segments vote for
potential bounding boxes. The generated hypotheses are validated using the
k nearest-neighbour method in order to eliminate false object detections.
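
To make the descriptor step more concrete, the following is a minimal sketch of a log-polar Shape Context histogram for one edge segment, paired with an offset distance to the object centre. It is not the thesis implementation: the function names, the bin counts (`n_r`, `n_theta`), the radial limits and the choice of the segment centroid as reference point are all assumptions made for illustration.

```python
import math

def shape_context(points, ref, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Normalised log-polar histogram of `points` around `ref`.

    Bin counts and radial limits are illustrative assumptions.
    """
    # Vectors from the reference point to every other point.
    offsets = [(x - ref[0], y - ref[1]) for x, y in points if (x, y) != ref]
    if not offsets:
        return [0.0] * (n_r * n_theta)
    dists = [math.hypot(dx, dy) for dx, dy in offsets]
    mean_d = sum(dists) / len(dists)  # divide by mean distance for scale invariance
    log_min, log_max = math.log(r_min), math.log(r_max)
    hist = [0] * (n_r * n_theta)
    for (dx, dy), d in zip(offsets, dists):
        r = min(max(d / mean_d, r_min), r_max)  # clamp into the radial range
        r_bin = min(int((math.log(r) - log_min) / (log_max - log_min) * n_r), n_r - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
        hist[r_bin * n_theta + t_bin] += 1
    total = sum(hist)
    return [h / total for h in hist]

def describe_segment(points, object_centre):
    """Hypothetical helper: (histogram, offset distance to the object centre)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    offset = math.dist((cx, cy), object_centre)
    return shape_context(points, (cx, cy)), offset
```

In a pipeline of the kind described above, descriptors like these would be stored for all training segments and queried by nearest-neighbour lookup at test time.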
Furthermore, meaningful model segments are extracted by finding edge
fragments that appear frequently in training images of the same class. Only
2% of the training segments are employed in the models. These models
are used as a second approach to validate the hypotheses, using a distance-based
measure derived from nearest-neighbour lookups of each segment of the hypotheses.
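
One plausible reading of "segments that appear frequently" can be sketched as follows: count, for every training segment, how many other training images contain a similar descriptor, then keep the best-supported fraction. The function name, the matching radius and the use of Euclidean distance are assumptions; the abstract does not specify how frequency is measured.

```python
import math

def select_model_segments(images, radius=0.2, keep_fraction=0.02):
    """Keep the segments with the most support across training images.

    `images` is a list with one entry per training image, each entry being a
    list of descriptor vectors. `radius` and `keep_fraction` are illustrative.
    Returns a list of (support, image_index, descriptor) tuples.
    """
    scored = []
    for i, segments in enumerate(images):
        for desc in segments:
            # Support = number of OTHER images with at least one close match.
            support = sum(
                1
                for j, other in enumerate(images)
                if j != i and any(math.dist(desc, o) <= radius for o in other)
            )
            scored.append((support, i, desc))
    scored.sort(key=lambda item: item[0], reverse=True)
    n_keep = max(1, round(keep_fraction * len(scored)))
    return scored[:n_keep]
```

The default `keep_fraction=0.02` mirrors the 2% figure quoted above; the linear scan is quadratic in the number of segments and would need an index structure for realistic training sets.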
A review of shape coding in the visual cortex of primates is provided. The
shape-related roles of each region in the ventral pathway of the visual cortex
are described. A further step towards a fully biological model for contour-based
object detection and recognition is performed by implementing a model
for meaningful segment extraction and binding on the basis of two biological
principles: proximity and alignment.
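
As a rough illustration of the two principles, the sketch below decides whether two straight edge fragments could be bound: the gap between them must be small (proximity) and their orientations must be similar (alignment). The function name and both thresholds are assumptions, and the biological model in the thesis may work quite differently.

```python
import math

def can_bind(seg_a, seg_b, max_gap=5.0, max_angle_deg=20.0):
    """Test whether the end of `seg_a` can be linked to the start of `seg_b`.

    Each segment is ((x0, y0), (x1, y1)). Thresholds are illustrative.
    """
    (ax0, ay0), (ax1, ay1) = seg_a
    (bx0, by0), (bx1, by1) = seg_b
    # Proximity: distance between the end of A and the start of B.
    gap = math.hypot(bx0 - ax1, by0 - ay1)
    # Alignment: smallest absolute difference between the two directions.
    ang_a = math.atan2(ay1 - ay0, ax1 - ax0)
    ang_b = math.atan2(by1 - by0, bx1 - bx0)
    diff = abs(ang_a - ang_b) % (2 * math.pi)
    diff = min(diff, 2 * math.pi - diff)
    return gap <= max_gap and math.degrees(diff) <= max_angle_deg
```

For example, `can_bind(((0, 0), (10, 0)), ((12, 0), (20, 1)))` passes both tests, whereas a perpendicular or distant second fragment fails one of them.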
Evaluation on a challenging benchmark is performed for both k nearest-neighbour
and model-segment validation methods. Recall rates of the proposed
method are compared to the results of recent state-of-the-art algorithms
at 0.3 and 0.4 false positive detections per image.
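
The evaluation measure mentioned above, recall at a fixed number of false positives per image (FPPI), can be computed along these lines. This is a generic sketch, not the benchmark's official protocol: the function name and input layout are assumptions, and matching detections to ground truth is taken as already done.

```python
def recall_at_fppi(detections, n_objects, n_images, target_fppi):
    """Best recall reachable while staying at or below `target_fppi`.

    `detections` is a list of (score, is_true_positive) pairs pooled over
    all test images; `n_objects` is the number of ground-truth objects.
    """
    # Lowering the score threshold corresponds to walking down the ranking.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    best = 0.0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        if fp / n_images <= target_fppi:
            best = max(best, tp / n_objects)
    return best
```

Calling the function with `target_fppi=0.3` and `target_fppi=0.4` yields the two operating points at which the abstract reports its comparison.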
Description
Master's dissertation, Informatics Engineering (Engenharia Informática), Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2014
Keywords
Object detection; Edge fragments; Shape context; Computer vision; Human vision