Review of spectral clustering algorithms used in proteomics
Shraddha Kumar, Anuradha Purohit and Sunita Varma
International Journal of Data Science, 2023, vol. 8, issue 1, 16-38
Abstract:
Tandem mass spectrometry (MS/MS) generates large numbers of spectra, each recording the signal intensity of detected ions as a function of mass-to-charge ratio. Spectral clustering is a powerful but under-utilised technique in proteomics. Using the similarity between spectra, spectral clustering algorithms systematically group large numbers of spectra so that, ideally, all spectra in a given cluster originate from the same peptide. In the spectral clustering approach, data points are grouped by connectivity, so clusters are not required to have convex boundaries. By grouping redundant spectra, spectral clustering reduces the running time and computational requirements of spectral library and database searches. It improves the peptide identification process and has recently fuelled the development of many new proteomics algorithms. The goal of this review is to provide a clear overview of the most popular spectral clustering algorithms used in proteomics. It presents a systematic analysis of these algorithms, evaluating the benefits and limitations of each approach.
Keywords: proteomics; tandem mass spectrometry; spectral clustering; consensus spectrum; scoring function; mass spectra; data points; spectral similarity; cluster purity; spectral library; normalised dot product.
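The abstract describes grouping spectra by similarity, and the keywords name the normalised dot product as a similarity measure. The following is only a minimal sketch of that idea, not a method taken from the review: the function names, the 1.0 m/z bin width, the 0.7 threshold, and the greedy assignment to the first matching cluster are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin_width is an assumed value)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return dict(binned)


def normalised_dot_product(a, b):
    """Cosine similarity of two binned spectra; 0.0 if either spectrum is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(spectra, threshold=0.7):
    """Put each spectrum into the first cluster whose representative is similar enough.

    Assumption: the representative is simply the first member of the cluster;
    a consensus spectrum could be used in its place.
    """
    clusters = []  # each entry: (representative binned spectrum, member indices)
    for index, peaks in enumerate(spectra):
        binned = bin_spectrum(peaks)
        for representative, members in clusters:
            if normalised_dot_product(representative, binned) >= threshold:
                members.append(index)
                break
        else:
            # No existing cluster matched, so this spectrum starts a new one.
            clusters.append((binned, [index]))
    return [members for _, members in clusters]


# Hypothetical toy data: (m/z, intensity) pairs for three spectra.
spectra = [
    [(100.2, 10.0), (200.5, 50.0), (300.1, 20.0)],
    [(100.4, 12.0), (200.6, 45.0), (300.3, 25.0)],
    [(150.0, 30.0), (250.0, 30.0)],
]
clusters = greedy_cluster(spectra)
```

With this toy input, the first two spectra share the same bins with similar intensities and end up in one cluster, while the third shares no bins with them and forms its own. Searching one representative per cluster instead of every spectrum is the kind of saving the abstract attributes to clustering.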
Date: 2023
Downloads: (external link)
http://www.inderscience.com/link.php?id=129449 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdsci:v:8:y:2023:i:1:p:16-38
More articles in International Journal of Data Science from Inderscience Enterprises Ltd