Well Begun Is Half Done: The Impact of Pre-Processing in MALDI Mass Spectrometry Imaging Analysis Applied to a Case Study of Thyroid Nodules
Giulia Capitoli,
Kirsten C. J. van Abeelen,
Isabella Piga,
Vincenzo L’Imperio,
Marco S. Nobile,
Daniela Besozzi and
Stefania Galimberti
Additional contact information
Giulia Capitoli: Bicocca Bioinformatics Biostatistics and Bioimaging B4 Center, Department of Medicine and Surgery, University of Milano-Bicocca, 20900 Monza, Italy
Kirsten C. J. van Abeelen: Radboud University Medical Center, Department of Internal Medicine, 6525 AJ Nijmegen, The Netherlands
Isabella Piga: Proteomics and Metabolomics Unit, Department of Medicine and Surgery, University of Milano-Bicocca, 20900 Monza, Italy
Vincenzo L’Imperio: Pathology Unit, Fondazione IRCCS San Gerardo dei Tintori, Department of Medicine and Surgery, University of Milano-Bicocca, 20900 Monza, Italy
Marco S. Nobile: Department of Environmental Sciences, Informatics and Statistics, Ca’ Foscari University of Venice, 30100 Venice, Italy
Daniela Besozzi: Department of Informatics, Systems, and Communication, University of Milano-Bicocca, 20126 Milan, Italy
Stefania Galimberti: Bicocca Bioinformatics Biostatistics and Bioimaging B4 Center, Department of Medicine and Surgery, University of Milano-Bicocca, 20900 Monza, Italy
Stats, 2025, vol. 8, issue 3, 1-14
Abstract:
The discovery of proteomic biomarkers in cancer research can be effectively performed in situ by exploiting Matrix-Assisted Laser Desorption Ionization (MALDI) Mass Spectrometry Imaging (MSI). However, due to experimental limitations, the spectra extracted by MALDI-MSI can be noisy, so pre-processing steps are generally needed to reduce instrumental and analytical variability. Thus far, the importance and effect of standard pre-processing methods, as well as their combinations and parameter settings, have not been extensively investigated in proteomics applications. In this work, we present a systematic study of 15 combinations of pre-processing steps (including baseline correction, smoothing, normalization, and peak alignment) for a real-data classification task on MALDI-MSI data measured from fine-needle aspiration biopsies of thyroid nodules. The influence of each combination was assessed by analyzing the feature extraction, pixel-by-pixel classification probabilities, and LASSO classification performance. Our results highlight the necessity of fine-tuning a pre-processing pipeline, especially for the reliable transfer of molecular diagnostic signatures to clinical practice. We outline some recommendations on the selection of pre-processing steps, together with filter levels and alignment methods, according to the mass-to-charge range and heterogeneity of the data.
Keywords: pre-processing; MALDI; mass spectrometry; machine learning; feature design; classification performance; thyroid nodules
JEL-codes: C1 C10 C11 C14 C15 C16
Date: 2025
Downloads:
https://www.mdpi.com/2571-905X/8/3/57/pdf (application/pdf)
https://www.mdpi.com/2571-905X/8/3/57/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jstats:v:8:y:2025:i:3:p:57-:d:1699184
Stats is currently edited by Mrs. Minnie Li
Bibliographic data for series maintained by MDPI Indexing Manager.