Two-Stage Machine Learning-Based GWAS for Wool Traits in Central Anatolian Merino Sheep

Yunus Arzık, Mehmet Kizilaslan, Sedat Behrem, Simge Tütenk and Mehmet Ulaş Çınar
Additional contact information
Yunus Arzık: Department of Animal Science, Faculty of Veterinary Medicine, Aksaray University, Aksaray 68000, Türkiye
Mehmet Kizilaslan: Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
Sedat Behrem: Department of Animal Science, Faculty of Veterinary Medicine, Aksaray University, Aksaray 68000, Türkiye
Simge Tütenk: Department of Animal Science, Faculty of Agriculture, Ankara University, Ankara 06560, Türkiye
Mehmet Ulaş Çınar: Department of Animal Science, Faculty of Agriculture, Erciyes University, Kayseri 38039, Türkiye

Agriculture, 2025, vol. 15, issue 21, 1-16

Abstract: Wool traits such as fiber diameter, fiber length, and greasy fleece yield are economically significant characteristics in sheep breeding programs. Traditional genome-wide association studies (GWAS) have identified relevant genomic regions but often fail to capture the non-linear and polygenic architecture underlying these traits. In this study, we implemented a two-stage machine learning (ML)-based GWAS framework to dissect the genetic basis of wool traits in Central Anatolian Merino sheep. Phenotypic records were collected from 228 animals, genotyped with the Illumina OvineSNP50 BeadChip. In the first stage, feature selection was conducted using LASSO, Ridge Regression, and Elastic Net, generating a consensus SNP panel per trait. In the second stage, association modeling with Random Forest and Support Vector Regression (SVR) identified the most predictive models (R² up to 0.86). Candidate gene annotation highlighted biologically relevant loci: MTHFD2L and EPGN (folate metabolism and keratinocyte proliferation) for fiber diameter; COL5A2, COL3A1, ITFG1, and ELMO1 (extracellular matrix integrity and actin remodeling) for staple length; and FAP, DPP4, PLCH1, and NPTX1 (extracellular matrix remodeling, proteolysis, and sebaceous gland function) for greasy fleece yield. These findings demonstrate the utility of ML-enhanced GWAS pipelines in identifying biologically meaningful markers and propose novel targets for genomic selection strategies to improve wool quality and yield in indigenous sheep populations.
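
Illustrative sketch (not the authors' code): the two-stage design described in the abstract could look roughly like the following with scikit-learn. Everything specific here is an assumption made for illustration: the synthetic genotypes, the penalty strengths, the top-k rule used to turn Ridge coefficients into a selection, and the "at least two of three selectors" consensus rule. The abstract does not state how the consensus panel was defined or how the models were tuned; only the sample size of 228 is taken from it.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def simulate(n_animals=228, n_snps=300, n_causal=5, seed=0):
    """Synthetic 0/1/2 genotypes and a phenotype driven by a few causal SNPs."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.2, 0.5, size=n_snps)  # hypothetical allele frequencies
    X = rng.binomial(2, maf, size=(n_animals, n_snps)).astype(float)
    causal = rng.choice(n_snps, size=n_causal, replace=False)
    y = X[:, causal].sum(axis=1) + rng.normal(0.0, 0.5, size=n_animals)
    return X, y, causal


def consensus_panel(X, y, top_k=20, min_votes=2):
    """Stage 1: keep SNPs chosen by at least `min_votes` of three selectors."""
    Xs = StandardScaler().fit_transform(X)
    yc = y - y.mean()
    votes = np.zeros(X.shape[1], dtype=int)

    # LASSO and Elastic Net select a SNP when its coefficient is non-zero.
    lasso = Lasso(alpha=0.1, max_iter=10000).fit(Xs, yc)
    votes += (lasso.coef_ != 0)
    enet = ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000).fit(Xs, yc)
    votes += (enet.coef_ != 0)

    # Ridge never zeroes coefficients, so (as an assumption) take the
    # top_k SNPs ranked by absolute coefficient.
    ridge = Ridge(alpha=50.0).fit(Xs, yc)
    votes[np.argsort(np.abs(ridge.coef_))[-top_k:]] += 1

    return np.flatnonzero(votes >= min_votes)


def stage_two(X, y, panel, seed=0):
    """Stage 2: mean 5-fold cross-validated R^2 for RF and SVR on the panel."""
    Xp = StandardScaler().fit_transform(X[:, panel])
    models = {
        "RandomForest": RandomForestRegressor(n_estimators=200, random_state=seed),
        "SVR": SVR(kernel="rbf", C=10.0),
    }
    return {
        name: float(cross_val_score(model, Xp, y, cv=5, scoring="r2").mean())
        for name, model in models.items()
    }


X, y, causal = simulate()
panel = consensus_panel(X, y)
scores = stage_two(X, y, panel)
print("panel size:", len(panel))
print("cross-validated R^2:", scores)
```

In this sketch the panel is selected on the full data set before cross-validation, so the reported R² is optimistic; nesting the selection step inside each fold would give a less biased estimate.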

Keywords: machine learning; genome-wide association study (GWAS); Central Anatolian Merino sheep; wool traits; Support Vector Regression (SVR); Random Forest (RF); LASSO; candidate genes
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2077-0472/15/21/2287/pdf (application/pdf)
https://www.mdpi.com/2077-0472/15/21/2287/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:15:y:2025:i:21:p:2287-:d:1786391

Agriculture is currently edited by Ms. Leda Xuan

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-11-04