Similarity Queries for Temporal Toxicogenomic Expression Profiles

Adam A Smith, Aaron Vollrath, Christopher A Bradfield and Mark Craven

PLOS Computational Biology, 2008, vol. 4, issue 7, 1-13

Abstract: We present an approach for answering similarity queries about gene expression time series that is motivated by the task of characterizing the potential toxicity of various chemicals. Our approach involves two key aspects. First, our method employs a novel alignment algorithm based on time warping. Our time warping algorithm has several advantages over previous approaches. It allows the user to impose fairly strong biases on the form that the alignments can take, and it permits a type of local alignment in which the entirety of only one series has to be aligned. Second, our method employs a relaxed spline interpolation to predict expression responses for unmeasured time points, such that the spline does not necessarily exactly fit every observed point. We evaluate our approach using expression time series from the Edge toxicology database. Our experiments show the value of using spline representations for sparse time series. More significantly, they show that our time warping method provides more accurate alignments and classifications than previous standard alignment methods for time series.

Author Summary: We are developing an approach to characterize chemicals and environmental conditions by comparing their effects on gene expression with those of well characterized treatments. We evaluate our approach in the context of the Edge (Environment, Drugs, and Gene Expression) database, which contains microarray observations collected from mouse liver tissue over the days following exposure to a variety of treatments. Our approach takes as input an unknown query series, consisting of several gene-expression measurements over time. It then picks out treatments from a database of known treatments that exhibit the most similar expression responses. This task is difficult because the data tends to be noisy, sparse in time, and measured at irregular intervals. We start by reconstructing the unobserved parts of the series using splines. We then align the given query to each database series so that the similarities in their expression responses are maximized. Our approach uses dynamic programming to find the best alignment of each pair of series. Unlike other methods, our approach allows alignments in which the end of one of the two series remains unaligned, if it appears that one series shows more of the expression response than the other. We finally return the best match(es) and alignment(s), in the hope that they will help with the query's eventual characterization and addition to the database.
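
The alignment idea in the summary (dynamic programming over pairs of time points, with the end of one series allowed to stay unaligned) can be illustrated with a textbook open-end variant of dynamic time warping. The sketch below is not the authors' algorithm: it handles a single series of scalar values instead of spline-reconstructed multi-gene profiles, it uses absolute difference as the local cost, and it imposes no alignment biases. The function name `open_end_dtw` and its return format are assumptions made for illustration only.

```python
from math import inf


def open_end_dtw(query, reference):
    """Align all of `query` to a prefix of `reference` (hypothetical helper).

    Returns (cost, end, path): the alignment cost, the index of the last
    reference point used, and the list of aligned (query, reference) indices.
    """
    n, m = len(query), len(reference)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")

    # cost[i][j]: cheapest alignment of query[:i+1] with reference[:j+1]
    cost = [[inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]

    for i in range(n):
        for j in range(m):
            # Assumed local cost: absolute difference of the two values.
            d = abs(query[i] - reference[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            candidates = []
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))
            best, prev = min(candidates)
            cost[i][j] = best + d
            back[i][j] = prev

    # Open end: the query must be fully consumed, but the alignment may
    # stop at any reference index, leaving the reference tail unaligned.
    end = min(range(m), key=lambda j: cost[n - 1][j])

    # Trace the chosen alignment back to the first pair of points.
    path = []
    cell = (n - 1, end)
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    path.reverse()
    return cost[n - 1][end], end, path
```

Calling `open_end_dtw(query, reference)` for each database series and ranking by the returned cost would give a crude version of the similarity query described above. A fuller treatment would compare costs across alignments of different lengths more carefully, which this sketch does not attempt.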

Date: 2008

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000116 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 00116&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1000116

DOI: 10.1371/journal.pcbi.1000116

More articles in PLOS Computational Biology from Public Library of Science

 
Handle: RePEc:plo:pcbi00:1000116