GASVeM: A New Machine Learning Methodology for Multi-SNP Analysis of GWAS Data Based on Genetic Algorithms and Support Vector Machines
Fidel Díez Díaz,
Fernando Sánchez Lasheras,
Víctor Moreno,
Ferran Moratalla-Navarro,
Antonio José Molina de la Torre and
Vicente Martín Sánchez
Additional contact information
Fidel Díez Díaz: CTIC Technological Centre, W3C Spain Office Host, Ada Byron 39, 33203 Gijón, Spain
Fernando Sánchez Lasheras: Department of Mathematics, Faculty of Sciences, Universidad de Oviedo, 33007 Oviedo, Spain
Víctor Moreno: Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, 08908 Barcelona, Spain
Ferran Moratalla-Navarro: Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, 08908 Barcelona, Spain
Antonio José Molina de la Torre: IBIOMED, University of Leon, Vegazana Campus, 24400 León, Spain
Vicente Martín Sánchez: CIBERESP, University of Leon, Vegazana Campus, 24400 León, Spain
Mathematics, 2021, vol. 9, issue 6, 1-19
Abstract:
Genome-wide association studies (GWAS) are observational studies that examine a large set of genetic variants across a sample of individuals to determine whether any of these variants is associated with a particular trait. Over the last two decades, GWAS have contributed to several new discoveries in the field of genetics. This research presents a novel methodology for the analysis of GWAS data. It is based mainly on two machine learning techniques: genetic algorithms and support vector machines. The database employed for the study contained information on 370,750 single-nucleotide polymorphisms (SNPs) from 1076 cases of colorectal cancer and 973 controls. Ten pathways with different degrees of relationship to the trait under study were tested. The results show that the proposed methodology is able to detect pathways relevant to a given trait, in this case colorectal cancer.
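The abstract does not describe how the genetic algorithm and the support vector machine are combined, so the following is only a generic illustration of one common pairing and not the authors' GASVeM procedure. It assumes a wrapper-style search in which each chromosome is a bit mask over candidate SNPs and the fitness is the hold-out accuracy of a linear SVM trained on the selected SNPs. All names (`make_toy_data`, `train_linear_svm`, `fitness`, `ga_select`), the toy genotype coding (0/1/2), the hinge-loss gradient trainer, and every parameter value are assumptions made for the sketch.

```python
import random

def make_toy_data(n_samples=80, n_snps=8, seed=1):
    """Hypothetical toy data: genotypes coded 0/1/2, label driven by SNPs 0 and 1."""
    rng = random.Random(seed)
    X = [[rng.choice([0, 1, 2]) for _ in range(n_snps)] for _ in range(n_samples)]
    y = [1 if row[0] + row[1] >= 2 else -1 for row in X]
    return X, y

def train_linear_svm(X, y, epochs=20, lr=0.01, lam=0.01, seed=0):
    """Linear SVM fitted by stochastic subgradient descent on the regularised hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]      # L2 shrinkage
            if margin < 1:                                # sample violates the margin
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def fitness(mask, X_tr, y_tr, X_te, y_te):
    """Hold-out accuracy of an SVM trained only on the SNPs switched on in mask."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    tr = [[row[j] for j in cols] for row in X_tr]
    te = [[row[j] for j in cols] for row in X_te]
    w, b = train_linear_svm(tr, y_tr)
    preds = [1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1 for row in te]
    return sum(p == t for p, t in zip(preds, y_te)) / len(y_te)

def ga_select(X_tr, y_tr, X_te, y_te, pop_size=12, generations=8, p_mut=0.1, seed=0):
    """Genetic search over SNP bit masks (needs at least two SNPs)."""
    rng = random.Random(seed)
    n = len(X_tr[0])
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:                              # avoid retraining for repeated masks
            cache[key] = fitness(mask, X_tr, y_tr, X_te, y_te)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        nxt = [m[:] for m in ranked[:2]]                  # elitism: keep the two best masks
        while len(nxt) < pop_size:
            a = max(rng.sample(ranked, 3), key=score)     # tournament selection
            c = max(rng.sample(ranked, 3), key=score)
            cut = rng.randrange(1, n)                     # one-point crossover
            child = a[:cut] + c[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)
```

On the toy data, calling `X, y = make_toy_data()` and then `ga_select(X[:60], y[:60], X[60:], y[60:])` returns a bit mask over the eight simulated SNPs together with its hold-out accuracy. A real GWAS analysis would need cross-validation, a tuned SVM, and far more careful handling of the number of SNPs than this sketch attempts.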
Keywords: machine learning; support vector machines; genetic algorithms; genome-wide association studies; single nucleotide polymorphism; pathways analysis
JEL-codes: C
Date: 2021
Downloads:
https://www.mdpi.com/2227-7390/9/6/654/pdf (application/pdf)
https://www.mdpi.com/2227-7390/9/6/654/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:9:y:2021:i:6:p:654-:d:519797
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.