Visual Lip Reading Dataset in Turkish

Ali Berkol, Talya Tümer-Sivri, Nergis Pervan-Akman, Melike Çolak and Hamit Erdem
Additional contact information
Ali Berkol: Defense and Information Systems, BITES, Neighbourhood of Mustafa Kemal, Dumlupınar Avenue, METU Technopolis, Ankara 06530, Turkey
Talya Tümer-Sivri: Defense and Information Systems, BITES, Neighbourhood of Mustafa Kemal, Dumlupınar Avenue, METU Technopolis, Ankara 06530, Turkey
Nergis Pervan-Akman: Defense and Information Systems, BITES, Neighbourhood of Mustafa Kemal, Dumlupınar Avenue, METU Technopolis, Ankara 06530, Turkey
Melike Çolak: Defense and Information Systems, BITES, Neighbourhood of Mustafa Kemal, Dumlupınar Avenue, METU Technopolis, Ankara 06530, Turkey
Hamit Erdem: Electrics and Electronics Department, Başkent University, Baglica Campus, Fatih Sultan District, Ankara 06790, Turkey

Data, 2023, vol. 8, issue 1, 1-8

Abstract: The presented dataset was compiled from everyday Turkish words and phrases spoken by various people in videos posted on YouTube. It is intended to support detection of the spoken word by recognizing patterns in, or classifying, lip movements with supervised, unsupervised, and semi-supervised machine learning algorithms. Most lip reading datasets consist of people recorded on camera against fixed backgrounds under identical conditions; the dataset presented here instead consists of images suited to machine learning models developed for real-life challenges. It contains a total of 2335 instances taken from TV series, movies, vlogs, and song clips on YouTube. The images vary with factors such as pronunciation, accent, speaking rate, gender, and age. Furthermore, the instances come from videos with different angles, shadows, resolutions, and brightness levels that were not created manually. The main contribution of this lip reading dataset is to add to the pool of non-synthetic Turkish datasets, which currently offers little variety. With this dataset, machine learning studies can be carried out in many areas, such as education, security, and social life.

Keywords: lip reading; visual speech recognition; Turkish dataset; face parts detection
JEL-codes: C8 C80 C81 C82 C83
Date: 2023

Downloads: (external link)
https://www.mdpi.com/2306-5729/8/1/15/pdf (application/pdf)
https://www.mdpi.com/2306-5729/8/1/15/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:8:y:2023:i:1:p:15-:d:1026451

Data is currently edited by Ms. Cecilia Yang

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jdataj:v:8:y:2023:i:1:p:15-:d:1026451