Geometry-based distance for clustering amino acids

Samira F. Abushilah, Charles C. Taylor and Arief Gusnanto

Journal of Applied Statistics, 2020, vol. 47, issue 7, 1235-1250

Abstract: Clustering amino acids is one of the most challenging problems in the functional and structural prediction of proteins. Previous studies have proposed clusters based on measurements of physical and biochemical characteristics of the amino acids, such as volume, area, hydrophilicity, polarity, hydrogen bonding, shape, and charge. These characteristics, although important, are less directly related to protein structure than geometrical characteristics such as the dihedral angles between amino acids. We propose using the p-value from a test of equality of dihedral-angle distributions as the basis of a distance measure for the clustering. In this novel approach, an energy test is modified to deal with bivariate angular data, and the p-value is obtained via a permutation method. The results indicate that the clusters of amino acids have a sensible interpretation, with Glycine, Proline, and Asparagine each forming a distinct cluster. A simulation study suggests that this approach has good operating characteristics for clustering amino acids.
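
The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a wrapped (toroidal) Euclidean distance between pairs of dihedral angles, a standard two-sample energy statistic computed from that distance, and a dissimilarity defined as 1 minus the permutation p-value. The abstract does not specify any of these choices, and all function names and the simulated von Mises samples below are hypothetical.

```python
import numpy as np

def torus_dist(a, b):
    """Pairwise distances between rows of a (n, 2) and b (m, 2), in radians.
    Assumption: each angular difference is wrapped to [0, pi] and the two
    coordinates are combined as in a Euclidean norm."""
    d = np.abs(a[:, None, :] - b[None, :, :]) % (2 * np.pi)
    d = np.minimum(d, 2 * np.pi - d)
    return np.sqrt((d ** 2).sum(axis=-1))

def energy_stat(D, idx_x, idx_y):
    """Two-sample energy statistic from a pooled distance matrix D:
    2 * mean(d(X, Y)) - mean(d(X, X')) - mean(d(Y, Y'))."""
    dxy = D[np.ix_(idx_x, idx_y)].mean()
    dxx = D[np.ix_(idx_x, idx_x)].mean()
    dyy = D[np.ix_(idx_y, idx_y)].mean()
    return 2 * dxy - dxx - dyy

def permutation_pvalue(x, y, n_perm=199, rng=None):
    """Permutation p-value for equality of two bivariate angular samples."""
    rng = np.random.default_rng(rng)
    pooled = np.vstack([x, y])
    D = torus_dist(pooled, pooled)  # computed once, reused for every permutation
    n = len(x)
    idx = np.arange(len(pooled))
    observed = energy_stat(D, idx[:n], idx[n:])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(idx)  # reshuffle the group labels
        if energy_stat(D, perm[:n], perm[n:]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

def pvalue_distance_matrix(samples, n_perm=199, rng=None):
    """Assumed dissimilarity: 1 - p-value, so similar groups are close."""
    rng = np.random.default_rng(rng)
    k = len(samples)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            p = permutation_pvalue(samples[i], samples[j], n_perm, rng)
            dist[i, j] = dist[j, i] = 1.0 - p
    return dist

# Simulated stand-ins for dihedral-angle pairs (not real protein data).
rng = np.random.default_rng(0)
group_a = rng.vonmises(0.0, 8.0, size=(60, 2))
group_b = rng.vonmises(0.0, 8.0, size=(60, 2))  # same distribution as group_a
group_c = rng.vonmises(2.0, 8.0, size=(60, 2))  # clearly shifted distribution
dist = pvalue_distance_matrix([group_a, group_b, group_c], rng=rng)
print(np.round(dist, 3))
```

A matrix of this kind could then be passed to any clustering routine that accepts precomputed dissimilarities, for example agglomerative hierarchical clustering. Whether 1 minus the p-value matches the distance used in the article is not stated in the abstract.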

Date: 2020

Downloads: (external link)
http://hdl.handle.net/10.1080/02664763.2019.1673324 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:taf:japsta:v:47:y:2020:i:7:p:1235-1250

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/CJAS20

DOI: 10.1080/02664763.2019.1673324

Journal of Applied Statistics is currently edited by Robert Aykroyd

Bibliographic data for series maintained by Chris Longhurst.

 
Page updated 2025-03-20
Handle: RePEc:taf:japsta:v:47:y:2020:i:7:p:1235-1250