Cumulative learning enables convolutional neural network representations for small mass spectrometry data classification

Seddiki, Khawla; Saudemont, Philippe; Precioso, Frédéric; Ogrinc, Nina; Wisztorski, Maxence; Salzet, Michel; Fournier, Isabelle; Droit, Arnaud

Khawla Seddiki, Philippe Saudemont, Frédéric Precioso, Nina Ogrinc, Maxence Wisztorski, Michel Salzet, Isabelle Fournier and Arnaud Droit
Additional contact information
Khawla Seddiki: Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, Québec, Canada.
Philippe Saudemont: U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM
Frédéric Precioso: INRIA, I3S
Nina Ogrinc: U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM
Maxence Wisztorski: U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM
Michel Salzet: U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM
Isabelle Fournier: U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM
Arnaud Droit: Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, Québec, Canada.

Nature Communications, 2020, vol. 11, issue 1, 1-11

Abstract: Rapid and accurate clinical diagnosis remains challenging. One component of diagnostic tool development is the design of effective classification models for mass spectrometry (MS) data. Some machine learning approaches have been investigated, but these models require time-consuming preprocessing steps to remove artifacts, making them unsuitable for rapid analysis. Convolutional neural networks (CNNs) have been found to perform well under such circumstances since they can learn representations from raw data. However, their effectiveness decreases when the number of available training samples is small, which is a common situation in medicine. In this work, we investigate transfer learning on 1D-CNNs, then develop a cumulative learning method for cases where transfer learning is not powerful enough. We propose to train the same model through several classification tasks over various small datasets to accumulate knowledge in the resulting representation. Using rat brain as the initial training dataset, the cumulative learning approach can reach a classification accuracy exceeding 98% for 1D clinical MS data. We demonstrate cumulative learning on datasets generated in different biological contexts, on different organisms, and acquired by different instruments. This is a promising strategy for improving MS data classification accuracy when only small numbers of samples are available.
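
The idea of training one model through several small classification tasks can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' architecture or data: the spectra are synthetic toy signals, the network is a tiny 1D CNN (convolution, ReLU, global average pooling, linear head), and finite-difference gradients stand in for backpropagation so that the example needs only the Python standard library. The names `toy_task`, `train`, and `cumulative_learning` are hypothetical.

```python
import math
import random

def forward(kernels, head, spectrum):
    """Tiny 1D CNN: conv + ReLU + global average pooling, then a linear head."""
    feats = []
    for k in kernels:
        acts = [max(0.0, sum(spectrum[i + j] * k[j] for j in range(len(k))))
                for i in range(len(spectrum) - len(k) + 1)]
        feats.append(sum(acts) / len(acts))
    # Each head row holds one weight per feature plus a trailing bias.
    return [sum(w * f for w, f in zip(row, feats)) + row[-1] for row in head]

def loss(kernels, head, data):
    """Mean softmax cross-entropy over (spectrum, label) pairs."""
    total = 0.0
    for spectrum, label in data:
        logits = forward(kernels, head, spectrum)
        m = max(logits)
        log_z = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_z - logits[label]
    return total / len(data)

def train(kernels, head, data, steps=20, lr=0.2, eps=1e-4):
    """Gradient descent using central finite differences; updates rows in place."""
    params = kernels + head  # same row objects, so updates reach both lists
    for _ in range(steps):
        grads = []
        for row in params:
            g_row = []
            for i in range(len(row)):
                old = row[i]
                row[i] = old + eps
                up = loss(kernels, head, data)
                row[i] = old - eps
                down = loss(kernels, head, data)
                row[i] = old
                g_row.append((up - down) / (2 * eps))
            grads.append(g_row)
        for row, g_row in zip(params, grads):
            for i, g in enumerate(g_row):
                row[i] -= lr * g

def cumulative_learning(tasks, n_kernels=2, kernel_size=3, seed=0):
    """Train the same convolutional kernels through a sequence of tasks."""
    rng = random.Random(seed)
    kernels = [[rng.uniform(-0.5, 0.5) for _ in range(kernel_size)]
               for _ in range(n_kernels)]
    head = None
    for data, n_classes in tasks:
        # Fresh classification head per task; the kernels carry over.
        head = [[rng.uniform(-0.1, 0.1) for _ in range(n_kernels + 1)]
                for _ in range(n_classes)]
        train(kernels, head, data)
    return kernels, head

def toy_task(rng, n_classes, n_per_class=4, length=20):
    """Synthetic stand-in for a small MS dataset: class c has peak height c + 1."""
    data = []
    for label in range(n_classes):
        for _ in range(n_per_class):
            spectrum = [rng.random() * 0.1 for _ in range(length)]
            spectrum[rng.randrange(length)] += label + 1.0
            data.append((spectrum, label))
    return data

rng = random.Random(1)
tasks = [(toy_task(rng, 2), 2), (toy_task(rng, 3), 3), (toy_task(rng, 2), 2)]
kernels, head = cumulative_learning(tasks)
final_loss = loss(kernels, head, tasks[-1][0])
print(len(kernels), len(head), round(final_loss, 3))
```

In this sketch the only thing that distinguishes cumulative learning from training each task from scratch is that `kernels` is created once and reused, while the head is re-initialised for every task because the number of classes may differ.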

Date: 2020

Downloads: (external link)
https://www.nature.com/articles/s41467-020-19354-z Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-19354-z

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-020-19354-z

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.