Viatcheslav R. Akmaev, Scott T. Kelley, and Gary D. Stormo
Methods based on the Mutual Information statistic (MI methods) predict structure by looking for statistical correlations between sequence positions in a set of aligned sequences. Although MI methods are often quite effective, these methods ignore the underlying phylogenetic relationships of the sequences they analyze. Thus, they cannot distinguish between correlations due to structural interactions, and spurious correlations resulting from phylogenetic history. In this paper, we introduce a method analogous to MI that incorporates phylogenetic information. We show that this method accurately recovers the structures of well-known RNA molecules. We also demonstrate, with both real and simulated data, that this phylogenetically-based method outperforms standard MI methods, and improves the ability to distinguish interacting from non-interacting positions in RNA. This method is flexible, and may be applied to the prediction of protein structure given the appropriate evolutionary model. Because this method incorporates phylogenetic data, it also has the potential to be improved with the addition of more accurate phylogenetic information, although we show that even approximate phylogenies are helpful.