Remote Sensing in the Era of Artificial Intelligence of Everything Through Visual Question Answering

Saha, Anirban; Maji, Suman Kumar

Anirban Saha and Suman Kumar Maji
Additional contact information
Anirban Saha: Indian Institute of Technology Patna
Suman Kumar Maji: Indian Institute of Technology Patna

A chapter in Artificial Intelligence of Everything and Sustainable Development, 2025, pp 123-148 from Springer

Abstract: In the realm of Artificial Intelligence of Everything (AIoE), remote sensing relies on observations and data, primarily in the form of images, collected from a distance, often via satellites or other aerial platforms. Processing these remote sensing images (RSIs) and extracting meaningful insights from them presents notable challenges, owing to the vast amount of data and the presence of noise and artifacts. To tackle this, there is a pressing need for AI models adept at interpreting RSIs and extracting relevant information, catering to both novice users and expert scientists in the field. This chapter delves into the domain of Remote Sensing Visual Question Answering (RSVQA), a subset of AI models rooted in computer vision and tailored to respond to natural language queries about RSIs. While still nascent, RSVQA models, first introduced around 2020, hold substantial potential for the development of RSI-centric AI search engines. The chapter offers a comprehensive examination of existing RSVQA techniques, exploring their underlying principles, practical applications, and inherent constraints, while also introducing an innovative RSVQA model. The proposed model comprises five interconnected modules: Input, Pre-processing, Representation, Fusion, and Answering. The Input module receives RSIs captured by various sensors alongside relevant queries in the form of natural language questions. The Pre-processing module incorporates a Denoising sub-module along with a natural language pre-processing sub-module, aimed at refining the input data and preparing it for subsequent processing. The Representation module extracts distinctive features from the inputs via visual and question representation sub-modules, which are then consolidated into a vectorized encoding through the Fusion module. Finally, the Answering module employs a multilayer perceptron (MLP) classifier to produce the desired output.
A comparative analysis enriches the chapter, offering valuable insights for practitioners involved in RSVQA research and development. The chapter culminates by contemplating the future trajectory of RSVQA, taking into account emerging trends, ethical considerations, and potential industry implications, thereby providing a forward-looking perspective on this evolving facet of AIoE.
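
The five modules named in the abstract can be pictured as a chain of functions. The following is a minimal sketch under stated assumptions, not the chapter's implementation: the 3x3 mean filter, the grid-mean visual features, the hashed bag-of-words question features, the element-wise product used for fusion, the untrained random MLP weights, the answer vocabulary, and every function name are hypothetical stand-ins chosen only to make the data flow concrete.

```python
import math
import random
import re

ANSWERS = ["yes", "no", "water", "urban"]  # hypothetical answer vocabulary


def denoise(image):
    """Pre-processing (image): 3x3 mean filter, a stand-in for the Denoising sub-module."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [image[a][b]
                      for a in range(max(0, i - 1), min(h, i + 2))
                      for b in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = sum(window) / len(window)
    return out


def tokenize(question):
    """Pre-processing (text): lowercase the question and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", question.lower())


def visual_features(image, grid=2):
    """Representation (visual): mean intensity of each cell in a grid x grid partition."""
    h, w = len(image), len(image[0])
    feats = []
    for gi in range(grid):
        for gj in range(grid):
            rows = range(gi * h // grid, (gi + 1) * h // grid)
            cols = range(gj * w // grid, (gj + 1) * w // grid)
            cells = [image[a][b] for a in rows for b in cols]
            feats.append(sum(cells) / len(cells))
    return feats


def question_features(tokens, dim=4):
    """Representation (question): hashed bag-of-words counts of length dim."""
    feats = [0.0] * dim
    for tok in tokens:
        feats[sum(map(ord, tok)) % dim] += 1.0
    return feats


def fuse(v, q):
    """Fusion: element-wise product of equal-length vectors (an assumed fusion rule)."""
    return [a * b for a, b in zip(v, q)]


def make_mlp(n_in, n_hidden, n_out, seed=0):
    """Untrained random weights for a one-hidden-layer MLP (illustration only)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2


def mlp_answer(x, weights):
    """Answering: ReLU hidden layer followed by a softmax over the answer vocabulary."""
    w1, w2 = weights
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    logits = [sum(w * hi for w, hi in zip(row, hidden)) for row in w2]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer_question(image, question, weights):
    """Input -> Pre-processing -> Representation -> Fusion -> Answering."""
    clean = denoise(image)
    tokens = tokenize(question)
    fused = fuse(visual_features(clean), question_features(tokens))
    probs = mlp_answer(fused, weights)
    return ANSWERS[probs.index(max(probs))], probs


# Synthetic 8x8 "image" and an example question; the weights are untrained,
# so the predicted label carries no meaning beyond showing the data flow.
image = [[(i * j) % 7 / 6 for j in range(8)] for i in range(8)]
weights = make_mlp(4, 8, len(ANSWERS))
label, probs = answer_question(image, "Is there water in the image?", weights)
print(label)
```

In a real system each stand-in would be replaced by a learned component, and the classifier would be trained on question-answer pairs; the sketch only fixes the order in which the modules hand data to one another.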

Keywords: Remote sensing; Visual question answering; Artificial Intelligence of Everything (AIoE); Visual representation; Question representation; Multi-modal data fusion; Classification; Cross-domain adaptation; RSVQA applications
Date: 2025

Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-981-96-7202-8_8

Ordering information: This item can be ordered from
http://www.springer.com/9789819672028

DOI: 10.1007/978-981-96-7202-8_8

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.