Vision-Language Pre-training from Synthetic Data

Che Liu (Imperial College London)

Chapter 6 in Generative Machine Learning Models in Medical Image Computing, 2025, pp 111-128, from Springer

Abstract: Recent advances in Medical Vision-Language Pre-training (MedVLP) show significant potential: by leveraging large datasets of medical images and their accompanying reports, MedVLP models perform well across a wide range of downstream tasks, both vision-only tasks and tasks that integrate vision and language. However, MedVLP requires large datasets of matched image-text pairs, which are labor-intensive and costly to collect. Real-world datasets also frequently suffer from imbalanced concepts, unpaired image-text samples, and corrupted images. Deep generative models have progressed significantly, from Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to Stable Diffusion (SD)-based models. SD-based models in particular excel at conditional generation, a crucial capability for synthesizing high-fidelity medical images. Medical report generation can likewise be improved with language models, especially large language models (LLMs) such as Llama, through conditional generation driven by extensive medical concept definitions sourced from peer-reviewed clinical databases. This article introduces the principal MedVLP methodologies, the role of generative models, and the techniques of conditional generation, and surveys the downstream tasks used to assess the effectiveness of MedVLP.

Date: 2025


Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-031-80965-1_6

Ordering information: This item can be ordered from
http://www.springer.com/9783031809651

DOI: 10.1007/978-3-031-80965-1_6

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2026-05-21
Handle: RePEc:spr:sprchp:978-3-031-80965-1_6