Human visual grouping based on within- and cross-area temporal correlations
Yen-Ju Chen, Zitang Sun and Shin’ya Nishida
PLOS Computational Biology, 2025, vol. 21, issue 9, 1-26
Abstract:
Perceptual organization in the human visual system involves neural mechanisms that spatially group and segment image areas based on local feature similarities, such as the temporal correlation of luminance changes. Successful segmentation models in computer vision, including graph-based algorithms and vision transformers, leverage similarity computations across all elements in an image, suggesting that effective similarity-based grouping should rely on a global computational process. However, whether human vision employs a similarly global computation remains unclear, owing to the absence of appropriate methods for manipulating similarity matrices across multiple elements within a stimulus. To investigate how “temporal similarity structures” influence human visual segmentation, we developed a stimulus generation algorithm based on the Vision Transformer. This algorithm independently controls within-area and cross-area similarities by adjusting the temporal correlation of luminance, color, and spatial phase attributes. To assess human segmentation performance with these generated texture stimuli, participants completed a temporal two-alternative forced-choice task, identifying which of two intervals contained a segmentable texture. The results showed that segmentation performance is significantly influenced by the configuration of both within-area and cross-area correlations across the elements, regardless of attribute type. Furthermore, human performance closely aligns with predictions from a graph-based computational model, suggesting that human texture segmentation can be approximated by a global computational process that optimally integrates pairwise similarities across multiple elements.

Author Summary: How does the human visual system use temporal information to segment objects in a dynamic scene? When observing ever-changing environments, our brains must determine which regions belong to the same object and which are distinct. However, the mechanisms underlying this process remain poorly understood. In this study, we investigate how “temporal similarity structures” (patterns of correlation over time) affect visual segmentation. We developed a novel method for generating dynamic stimuli with precisely controlled temporal similarity and systematically tested how within-area and cross-area temporal correlations influence segmentation. Participants performed a task in which they identified segmentable textures, and the results showed that segmentation performance improves when regions exhibit strong internal consistency but lower similarity with adjacent regions. Our findings reveal that human visual segmentation relies on a global computational mechanism that integrates temporal similarity cues to distinguish visual structures. Additionally, our stimulus generation framework provides a powerful tool for future research on perceptual organization and mid-level vision.
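The abstract states that the stimulus generator independently controls within-area and cross-area similarities via the temporal correlation of element attributes. The sketch below (Python/NumPy) is not the authors' Vision-Transformer-based algorithm; it only illustrates the underlying idea with a block correlation matrix, in which elements of the same region share one pairwise correlation and elements of different regions another. The function name, element counts, frame count, and correlation values are all hypothetical.

```python
# Minimal sketch (not the paper's actual generator): sample element luminance
# time series from a multivariate Gaussian whose block correlation matrix
# fixes the within-region and cross-region pairwise correlations.
import numpy as np

def correlated_luminance(n_per_region=8, n_frames=120,
                         r_within=0.8, r_cross=0.2, seed=0):
    """Return an (2 * n_per_region, n_frames) array of luminance signals;
    same-region pairs correlate at ~r_within, cross-region pairs at ~r_cross
    (hypothetical parameter values)."""
    n = 2 * n_per_region
    corr = np.full((n, n), r_cross)                  # cross-area correlation
    corr[:n_per_region, :n_per_region] = r_within    # region 1 within-area
    corr[n_per_region:, n_per_region:] = r_within    # region 2 within-area
    np.fill_diagonal(corr, 1.0)
    rng = np.random.default_rng(seed)
    # Each frame is an independent draw with the desired correlation
    # structure across elements (imposed via the Cholesky factor).
    L = np.linalg.cholesky(corr)
    return L @ rng.standard_normal((n, n_frames))

signals = correlated_luminance()
print(np.corrcoef(signals)[0, 1], np.corrcoef(signals)[0, -1])  # ~0.8, ~0.2
```

The comparison model is described only as a graph-based computation that integrates pairwise similarities across all elements. As a generic illustration of that class of models, and not necessarily the specific model used in the paper, the following sketch partitions elements with a normalized-cut-style spectral two-way cut of the pairwise temporal-correlation matrix; the function name and toy data are assumptions.

```python
# Generic graph-based grouping sketch: build the full pairwise similarity
# (temporal correlation) matrix and split it with the Fiedler vector of the
# normalized graph Laplacian.
import numpy as np

def spectral_two_way_cut(signals):
    """Partition rows of `signals` (elements x frames) into two groups using
    the sign of the Fiedler vector of the normalized graph Laplacian built
    from pairwise temporal correlations."""
    W = np.corrcoef(signals)               # all pairwise similarities at once
    W = np.clip(W, 0.0, None)              # keep non-negative affinities
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
    L_sym = np.eye(len(W)) - d_inv_sqrt @ W @ d_inv_sqrt
    _, eigvecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    return (eigvecs[:, 1] > 0).astype(int) # two-way labels from Fiedler vector

# Toy usage: two regions whose elements are noisy copies of two base signals.
rng = np.random.default_rng(1)
base = rng.standard_normal((2, 120))
signals = np.vstack([base[0] + 0.5 * rng.standard_normal((8, 120)),
                     base[1] + 0.5 * rng.standard_normal((8, 120))])
print(spectral_two_way_cut(signals))  # first 8 and last 8 should share labels
```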
Date: 2025
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013001 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13001&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013001
DOI: 10.1371/journal.pcbi.1013001