Search Results

Now showing 1 - 2 of 2
  • Autonomous temporal pseudo-labeling for fish detection
    Publication . Veiga, Ricardo; Exposito Ochoa, Iñigo; Belackova, Adela; Bentes, Luis; Parente Silva, João; Semiao, J.; Rodrigues, João
    The first major step in training an object detection model for new classes is gathering meaningful, properly annotated data. This recurring task determines the length of any project and, more importantly, the quality of the resulting models. The obstacle is amplified when the data available for the new classes are scarce or incompatible, as in the case of fish detection in the open sea. This issue was tackled with a mixed, reversed approach: a network is initialized with a noisy dataset of the same kind as our target classes (fish), though from different scenarios and conditions (Australian marine fauna), while the target footage for the application (Portuguese marine fauna; Atlantic Ocean) was gathered without annotations. Using the temporal information of the detected objects and augmentation techniques during later training, it was possible to generate highly accurate labels from the target footage. Furthermore, the data selection method retained samples of each unique situation while filtering repetitive data that would bias the training process. The obtained results validate the proposed method of automating the labeling process by drawing training data directly from the final application. The presented method achieved a mean average precision of 93.11% on our own data and 73.61% on unseen data, increases of 24.65% and 25.53%, respectively, over the baseline trained on the noisy dataset.
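    The core idea of the abstract, promoting a detection to a pseudo-label only when it persists across neighbouring frames, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the IoU matching criterion, and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def temporal_pseudo_labels(frames, iou_thr=0.5):
    """frames: list of per-frame detection box lists from a noisy model.
    A box becomes a pseudo-label only if a sufficiently overlapping box
    (IoU >= iou_thr) also appears in the previous or next frame, which
    suppresses one-frame false positives."""
    labels = []
    for t, boxes in enumerate(frames):
        neighbours = []
        if t > 0:
            neighbours += frames[t - 1]
        if t + 1 < len(frames):
            neighbours += frames[t + 1]
        kept = [box for box in boxes
                if any(iou(box, nb) >= iou_thr for nb in neighbours)]
        labels.append(kept)
    return labels
```

    A spurious box appearing in a single frame has no temporal match and is filtered out, while a fish tracked over consecutive frames survives into the pseudo-labeled training set.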
  • Fine-grained fish classification from small to large datasets with vision transformers
    Publication . Veiga, Ricardo; Rodrigues, João
    Fine-Grained Visual Classification (FGVC) of fish species is important for ecological research, environmental management, and biodiversity monitoring: accurate species identification is crucial for assessing the health of marine ecosystems, tracking changes in biodiversity, and turning conservation plans into action. Although Convolutional Neural Networks (CNNs) have been the conventional approach to FGVC, their effectiveness at differentiating visually similar species is not always satisfactory. The advent of Vision Transformers (ViTs), in particular the Shifted-window (Swin) Transformer, has shown potential to address these issues through sophisticated self-attention and feature-extraction mechanisms. This paper proposes a method that combines the FGVC Plug-in Module (FGVC-PIM) with the Swin Transformer: the FGVC-PIM improves classification by concentrating on the most discriminative image regions, while the Swin Transformer serves as the backbone and provides strong hierarchical feature extraction. The method was assessed on 14 datasets comprising 19 distinct subsets with varying environmental conditions and image quality. It achieved state-of-the-art results on 13 of these subsets, exhibiting better accuracy and robustness than previous methods; on 2 subsets not yet explored by other authors it establishes new baseline results; and on the remaining 4 it always achieved accuracy above 83%.
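    The plug-in-module idea, ranking intermediate feature regions by an importance score, keeping the most discriminative ones, and fusing them with the backbone's global prediction, can be sketched as below. This is a hedged illustration of the general concept, not the FGVC-PIM or Swin code: the function names, the top-k selection rule, and the fusion weight are assumptions.

```python
def select_discriminative(patch_scores, top_k=3):
    """Return indices of the top_k highest-scoring patches, i.e. the
    image regions the classifier should concentrate on."""
    order = sorted(range(len(patch_scores)),
                   key=lambda i: patch_scores[i], reverse=True)
    return sorted(order[:top_k])

def fused_logit(global_logit, patch_logits, selected, weight=0.5):
    """Blend the backbone's global prediction with the mean prediction
    of the selected patches, a stand-in for the module's fusion step."""
    local = sum(patch_logits[i] for i in selected) / len(selected)
    return (1 - weight) * global_logit + weight * local
```

    For visually similar species, the selected high-scoring patches would correspond to small distinguishing regions (fin shape, markings), which is why weighting them alongside the global feature can improve fine-grained accuracy.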