RNA Modeling Using Gibbs Sampling and Stochastic Context Free Grammars

Leslie Grate, Mark Herbster, Richard Hughey, David Haussler, I. Saira Mian, and Harry Noller

A new method of discovering the common secondary structure of a family of homologous RNA sequences using Gibbs sampling and stochastic context-free grammars is proposed. Given an unaligned set of sequences, a Gibbs sampling step simultaneously estimates the secondary structure of each sequence and a set of statistical parameters describing the common secondary structure of the set as a whole. These parameters describe a statistical model of the family. After the Gibbs sampling has produced a crude statistical model for the family, this model is translated into a stochastic context-free grammar, which is then refined by an Expectation Maximization (EM) procedure to produce a more complete model. A prototype implementation of the method is tested on tRNA, pieces of 16S rRNA and on U5 snRNA with good results.

This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.