Object detection through search with a foveated visual system
Emre Akbas and Miguel P Eckstein
PLOS Computational Biology, 2017, vol. 13, issue 10, 1-28
Abstract:
Humans and many other species sense visual information with varying spatial resolution across the visual field (foveated vision) and deploy eye movements to actively sample regions of interest in scenes. The advantage of such a varying-resolution architecture is a reduced computational, and hence metabolic, cost. But what are the performance costs of such a processing strategy relative to a scheme that processes the visual field at high spatial resolution? Here we first focus on visual search and combine object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We develop a foveated object detector that processes the entire scene with varying resolution, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image, and integrates observations across multiple fixations. We compared the foveated object detector against a non-foveated version of the same object detector, which processes the entire image at homogeneous high spatial resolution. We evaluated the accuracy of the foveated and non-foveated object detectors in identifying 20 different object classes in scenes from a standard computer vision data set (the PASCAL VOC 2007 dataset). We show that the foveated object detector can approximate the performance of the object detector with homogeneous high spatial resolution processing while bringing significant computational cost savings. Additionally, we assessed the impact of foveation on the computation of bottom-up saliency. An implementation of a simple foveated bottom-up saliency model with eye movements showed agreement in the selection of top salient regions of scenes with those selected by a non-foveated high resolution saliency model.
Together, our results might help explain the evolution of foveated visual systems with eye movements as a solution that preserves perceptual performance in visual search while resulting in computational and metabolic savings to the brain.
Author summary:
A large number of species, from primates to shrimps, do not see the visual world with uniform spatial detail. An area with heightened sensitivity to spatial detail, known as the fovea in mammals, is oriented through eye and head movements to scrutinize regions of interest in the visual environment. But why did many species evolve such a foveated architecture for vision? Seeing with high spatial detail everywhere requires greater neuronal machinery and energy consumption; thus the advantage of a foveated visual system is to reduce metabolic costs. But does having a foveated visual system incur a price to the performance of the organism in visual tasks? Here we show, using a computer vision object detection model, that the foveated version of the model can attain search performance similar to that of its non-foveated version, which processes the entire visual field with high spatial detail. The results might help explain the evolution of foveated visual systems with eye movements as a solution that preserves perceptual performance while resulting in computational and metabolic savings to the brain.
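Illustrative sketch (not part of the article): the loop outlined in the abstract, in which the model fixates, observes the scene at eccentricity-dependent resolution, integrates evidence across fixations, and moves its fovea to the most promising location, can be shown with a toy example. This is not the authors' detector. It is a one-dimensional toy that assumes pooling-window width grows linearly with distance from fixation and that evidence is integrated by simple summation. All function names and parameter values below are hypothetical.

```python
def pooling_width(eccentricity, fovea_radius=2.0, slope=0.5, min_width=1.0):
    """Pooling-region width at a given distance from fixation.

    Assumption: constant width inside the fovea, linear growth outside it.
    """
    if eccentricity <= fovea_radius:
        return min_width
    return min_width + slope * (eccentricity - fovea_radius)


def foveated_view(signal, fixation):
    """Average each sample over a window that widens away from the fixation."""
    n = len(signal)
    view = []
    for i in range(n):
        half = int(pooling_width(abs(i - fixation)) // 2)
        lo, hi = max(0, i - half), min(n, i + half + 1)
        window = signal[lo:hi]
        view.append(sum(window) / len(window))
    return view


def search(signal, start, n_fixations=3):
    """Greedy search: observe, accumulate evidence, fixate the current maximum."""
    evidence = [0.0] * len(signal)
    fixation = start
    path = [fixation]
    for _ in range(n_fixations):
        view = foveated_view(signal, fixation)
        # Toy integration rule: sum the blurred observations across fixations.
        evidence = [e + v for e, v in zip(evidence, view)]
        fixation = max(range(len(signal)), key=evidence.__getitem__)
        path.append(fixation)
    return fixation, path


# Toy scene: a single target at index 30 in an otherwise empty 61-sample line.
scene = [0.0] * 61
scene[30] = 1.0
final_fixation, path = search(scene, start=5)
print(final_fixation, path)
```

In this toy scene the first, peripheral view only localizes the target coarsely. Each later fixation lands closer, and the target is resolved sharply once it falls inside the fovea.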
Date: 2017
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005743 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 05743&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1005743
DOI: 10.1371/journal.pcbi.1005743