Speech-Based Real-World Scene Understanding for Assistive Care of the Visually Impaired

Tarun Sunil, K. Vinod, M. Madhav, Joshua Abraham and G. Jyothish Lal
Additional contact information
Tarun Sunil: Amrita Vishwa Vidyapeetham
K. Vinod: Amrita Vishwa Vidyapeetham
M. Madhav: Amrita Vishwa Vidyapeetham
Joshua Abraham: Amrita Vishwa Vidyapeetham
G. Jyothish Lal: Amrita Vishwa Vidyapeetham

A chapter in Machine Learning and Deep Learning Modeling and Algorithms with Applications in Medical and Health Care, 2025, pp 23-37 from Springer

Abstract: This research introduces a new assistive technology that combines real-time image captioning, speech synthesis, and model-based keyword spotting to help people with visual impairments. A lightweight machine learning framework detects predefined voice commands, which act as the trigger that starts a camera to record the user's environment. CLIP, a vision-language model, interprets the visual input and provides contextual textual descriptions of the surroundings. Tacotron 2, a neural text-to-speech model, converts these captions into natural-sounding speech so that users can hear a description of their environment. The end-to-end pipeline prioritizes usability and low latency, showing that speech-driven activation, image interpretation, and high-quality audio synthesis can be combined into an easy-to-use assistive tool for practical use.
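
The control flow described in the abstract (keyword trigger, camera capture, captioning, speech synthesis) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `AssistivePipeline`, its field names, and the helper `build_stub_pipeline` are hypothetical, and each stage is a stand-in callable rather than a real keyword spotter, camera, CLIP, or Tacotron 2 model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AssistivePipeline:
    """Hypothetical wiring of the four stages named in the abstract."""
    spot_keyword: Callable[[bytes], bool]    # stand-in for the keyword spotter
    capture_frame: Callable[[], bytes]       # stand-in for the camera
    caption_image: Callable[[bytes], str]    # stand-in for CLIP-based captioning
    synthesize: Callable[[str], bytes]       # stand-in for Tacotron 2 synthesis

    def step(self, audio_chunk: bytes) -> Optional[bytes]:
        """Process one audio chunk; return speech audio only if triggered."""
        if not self.spot_keyword(audio_chunk):
            return None  # no trigger word heard, so the camera stays off
        frame = self.capture_frame()
        caption = self.caption_image(frame)
        return self.synthesize(caption)


def build_stub_pipeline() -> AssistivePipeline:
    """Build a pipeline from trivial stubs (assumptions, for illustration)."""
    return AssistivePipeline(
        spot_keyword=lambda audio: audio == b"describe",
        capture_frame=lambda: b"<frame>",
        caption_image=lambda frame: "a person standing near a door",
        synthesize=lambda text: text.encode("utf-8"),
    )


if __name__ == "__main__":
    pipeline = build_stub_pipeline()
    print(pipeline.step(b"background noise"))
    print(pipeline.step(b"describe"))
```

Passing the stages in as callables keeps the trigger logic separate from the models, so each stub could in principle be swapped for a real component without changing `step`.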

Keywords: Assistive technologies; Contrastive language image pre-training; Speech-to-text; Text-to-speech (TTS); Automatic speech recognition (ASR)
Date: 2025

Persistent link: https://EconPapers.repec.org/RePEc:spr:ssrchp:978-3-031-98728-1_2

Ordering information: This item can be ordered from
http://www.springer.com/9783031987281

DOI: 10.1007/978-3-031-98728-1_2

More chapters in Springer Series in Reliability Engineering from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-10-02
Handle: RePEc:spr:ssrchp:978-3-031-98728-1_2