A Statistical Method for Finding Transcription Factor Binding Sites

Saurabh Sinha and Martin Tompa, University of Washington

Understanding the mechanisms that determine the regulation of gene expression is an important and challenging problem. A fundamental subproblem is to identify DNA-binding sites for unknown regulatory factors, given a collection of genes believed to be coregulated and the noncoding DNA sequences near those genes. We present an enumerative statistical method for identifying good candidates for such transcription factor binding sites. Unlike local search techniques such as Expectation Maximization and Gibbs samplers, which may not reach a global optimum, the method proposed here is guaranteed to produce the motifs with the greatest z-scores. We discuss the results of experiments in which this algorithm was used to locate candidate binding sites in several well-studied pathways of S. cerevisiae, as well as in gene clusters from some hybridization microarray experiments.
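
The idea of enumerating candidate motifs and ranking them by z-score can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the background model or how the z-score is computed, so the code below assumes an independent (i.i.d.) base-frequency background estimated from the input, exact k-mer matches only, and a binomial approximation to the variance that ignores overlapping occurrences. The function names `kmer_z_scores` and `top_motifs` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_z_scores(sequences, k):
    """Return a z-score for every DNA k-mer (simplified i.i.d. background)."""
    # Assumption: background base frequencies are estimated from the input itself.
    base_counts = Counter(b for s in sequences for b in s)
    total = sum(base_counts.values())
    freq = {b: base_counts[b] / total for b in "ACGT"}

    # Number of positions at which a k-mer can start across all sequences.
    positions = sum(max(len(s) - k + 1, 0) for s in sequences)

    # Observed number of occurrences of each k-mer.
    observed = Counter(
        s[i:i + k] for s in sequences for i in range(len(s) - k + 1)
    )

    scores = {}
    # Exhaustive enumeration: every one of the 4**k candidates is scored,
    # so the highest-scoring k-mer under this model cannot be missed.
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        p = 1.0
        for b in kmer:
            p *= freq[b]
        mean = positions * p
        # Assumption: binomial variance, ignoring self-overlap of the k-mer.
        sd = sqrt(positions * p * (1.0 - p))
        if sd > 0:
            scores[kmer] = (observed[kmer] - mean) / sd
    return scores


def top_motifs(sequences, k, n=5):
    """Return the n k-mers with the greatest z-scores."""
    scores = kmer_z_scores(sequences, k)
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

For example, `top_motifs(["GATC", "GATC", "GATC", "ACGT"], 4, n=1)` returns `["GATC"]`, the k-mer that occurs far more often than the background predicts. Because every candidate is scored, the ranking is a global optimum for this simplified score, in contrast to a local search that may stop at a suboptimal motif. The cost is exponential in k, so the sketch is practical only for short motifs.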

This page is copyrighted by AAAI. All rights reserved.