One Model for Many Fakes: Detecting GAN and Diffusion-Generated Forgeries in Faces, Invoices, and Medical Heterogeneous Data

Mahdi, Mohammed A.; Arshed, Muhammad Asad; Muneer, Amgad

Mohammed A. Mahdi, Muhammad Asad Arshed and Amgad Muneer
Additional contact information
Mohammed A. Mahdi: Information and Computer Science Department, College of Computer Science and Engineering, University of Ha’il, Ha’il 55476, Saudi Arabia
Muhammad Asad Arshed: School of Systems and Technology, University of Management and Technology, Lahore 54770, Pakistan
Amgad Muneer: Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia

Mathematics, 2025, vol. 13, issue 19, 1-22

Abstract: The rapid advancement of generative models, such as GANs and diffusion architectures, has enabled the creation of highly realistic forged images, raising critical challenges in key domains. Detecting such forgeries is essential to prevent misuse in sensitive areas, including healthcare, financial documentation, and identity verification. This study addresses the problem with a vision transformer (ViT)-based multiclass classification framework that identifies image forgeries across three distinct domains: invoices, human faces, and medical images. The dataset comprises both authentic and AI-generated samples in each domain, giving six classification categories in total. To ensure uniform feature representation across heterogeneous data and to make effective use of pretrained weights, all images were resized to 224 × 224 pixels and converted to three channels. The model was trained using stratified K-fold cross-validation to maintain a balanced class distribution in each fold. Experimental results show consistently high performance across three folds, with an average training accuracy of 0.9983 (99.83%), validation accuracy of 0.9620 (96.20%), and test accuracy of 0.9608 (96.08%), along with a weighted F1 score of 0.9608 and per-class scores exceeding 0.96 (96%) for all classes. These findings highlight the effectiveness of ViT architectures for cross-domain forgery detection and emphasize the importance of standardized preprocessing when working with mixed datasets.
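The evaluation protocol in the abstract (six classes, stratified three-fold splitting, weighted F1) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the class names, the 30-samples-per-class toy labels, and the helper functions `stratified_kfold` and `weighted_f1` are hypothetical, and a real pipeline would more likely rely on an established machine-learning library.

```python
import random
from collections import Counter, defaultdict

# Hypothetical class names: three domains x {real, fake}, six classes in total.
CLASSES = [
    "face_real", "face_fake",
    "invoice_real", "invoice_fake",
    "medical_real", "medical_fake",
]


def stratified_kfold(labels, k=3, seed=0):
    """Split sample indices into k folds, keeping class proportions similar.

    Per-class counts differ by at most one between any two folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin over the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's true support."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Toy label list (assumed): 30 samples per class, 180 in total.
labels = [c for c in CLASSES for _ in range(30)]
folds = stratified_kfold(labels, k=3)

for i, fold in enumerate(folds):
    counts = Counter(labels[j] for j in fold)
    print(f"fold {i}: {len(fold)} samples, per-class counts {dict(counts)}")
```

Each fold would serve once as the held-out split while the remaining folds are used for training; `weighted_f1` can then be applied to the held-out predictions of each fold and averaged.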

Keywords: cross-domain forgery; GAN and diffusion models; multiclass; vision transformer; pretrained models; stratified K-fold; deep learning
JEL-codes: C
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/19/3093/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/19/3093/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:19:p:3093-:d:1758734

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.