Reliability of Sequence-Alignment Analysis of Social Processes: Monte Carlo Tests of ClustalG Software
Clarke Wilson
Additional contact information
Clarke Wilson: Department of Economics, St Mary's University, Halifax, Canada
Environment and Planning A, 2006, vol. 38, issue 1, 187-204
Abstract:
Sequences of characters are used in many fields to record the events that characterize social processes. Until recently, however, very few methods have been available for the analysis of character-sequence data. Alignment algorithms measure the similarity between a pair of sequences by inserting gaps into one or the other to create the best possible matching pattern. Alignment methods were developed in computational biology, but are being considered for applications in other fields such as sociology, geography, and transportation planning. In this paper the reliability of alignments in the classification of sequential data is examined. The ClustalG multiple alignment package is used to examine a set of synthetic sequences generated by eight separate generation rules. Because the software is applied to sequential data with a known number of subgroups and known patterns in the sequences, alternative strategies for conducting the analysis can be compared and evaluated. When the underlying processes that generate the event sequences are not known, the most effective strategy for analysing sequential data is to use low gap penalties that permit the maximum number of matches.
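The role of the gap penalty described in the abstract can be illustrated with a minimal sketch of pairwise global alignment scoring. This is a generic Needleman-Wunsch recurrence with a linear gap penalty, not the ClustalG implementation; the function name `align_score`, the scoring values, and the example sequences are all assumptions chosen for illustration.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score for sequences a and b.

    Generic Needleman-Wunsch dynamic programme with a linear gap
    penalty. The scoring values are illustrative assumptions and are
    not taken from ClustalG.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap inserted in b
                score[i][j - 1] + gap,       # gap inserted in a
            )
    return score[-1][-1]


# Hypothetical event sequences: the same three events, shifted by one position.
low = align_score("ABCD", "BCDA", gap=-1)   # gaps are cheap
high = align_score("ABCD", "BCDA", gap=-4)  # gaps are expensive
print(low, high)
```

With the low penalty the best alignment inserts two gaps to recover the three shared events (score 1). With the high penalty the best alignment avoids gaps entirely and reports four mismatches (score -4), so the shared pattern goes undetected.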
Date: 2006
Downloads: (external link)
https://journals.sagepub.com/doi/10.1068/a3722 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:sae:envira:v:38:y:2006:i:1:p:187-204
DOI: 10.1068/a3722