T. M. Klingler and D. Brutlag
Using a flexible representation of biological sequences, we have performed a comparative analysis of 1208 known tRNA sequences. We believe our technique is a more sensitive method for detecting structural and functional relationships in sets of aligned sequences because it combines a flexible sequence representation with a general statistical method that can detect a wide range of relationships between positions in a sequence. Our method uses functional classifications of the sequence building blocks (nucleotide bases and amino acids) based on physical or chemical properties. This flexibility in sequence representation improves the significance of finding sequence relationships mediated by the defining property. For example, using a purine/pyrimidine classification, we can detect base-stacking interactions in sets of nucleotide sequences that form base-paired helices. We use several statistical measures, including χ²-tests, Monte Carlo simulations and an information measure, to detect significant correlations in sequences. In this paper we illustrate our method by analyzing a set of tRNA sequences and showing that the correlations our program discovers correspond, in each case, to the known base-pairing and higher-order interactions observed in tRNA crystal structures. Furthermore, we show that novel and interesting features of tRNAs are detected when sequence correlations with the charged amino acid (and anticodon) are evaluated. This technique is a powerful method for predicting the structure of RNAs and for analyzing specific functional characteristics.
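To make the idea concrete, the following is a minimal sketch, not the authors' program, of how two columns of an alignment might be recoded by a purine/pyrimidine classification and then scored with a chi-squared statistic, an information measure (here assumed to be mutual information in bits), and a Monte Carlo shuffling test. All function names, the class mapping, and the toy data are illustrative assumptions.

```python
import math
import random
from collections import Counter

# Assumed classification: purines (R) versus pyrimidines (Y).
# Symbols outside this table (e.g. modified bases) raise KeyError.
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "U": "Y", "T": "Y"}


def recode(column, classes=PURINE_PYRIMIDINE):
    """Map each base in an alignment column to its class label."""
    return [classes[base] for base in column]


def chi_squared(col_a, col_b):
    """Pearson chi-squared statistic for the contingency table of two columns."""
    n = len(col_a)
    pair_counts = Counter(zip(col_a, col_b))
    a_counts, b_counts = Counter(col_a), Counter(col_b)
    stat = 0.0
    for a, count_a in a_counts.items():
        for b, count_b in b_counts.items():
            expected = count_a * count_b / n
            observed = pair_counts.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two columns."""
    n = len(col_a)
    pair_counts = Counter(zip(col_a, col_b))
    a_counts, b_counts = Counter(col_a), Counter(col_b)
    return sum(
        (c / n) * math.log2(c * n / (a_counts[a] * b_counts[b]))
        for (a, b), c in pair_counts.items()
    )


def monte_carlo_p_value(col_a, col_b, trials=1000, seed=0):
    """Estimate how often shuffling one column gives a statistic at least
    as large as the observed one (add-one corrected)."""
    rng = random.Random(seed)
    observed = chi_squared(col_a, col_b)
    shuffled = list(col_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if chi_squared(col_a, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)


# Toy data (invented): two columns that are always Watson-Crick paired.
pos_i = recode("GCAUGCAU")
pos_j = recode("CGUACGUA")
print(chi_squared(pos_i, pos_j), mutual_information(pos_i, pos_j))
print(monte_carlo_p_value(pos_i, pos_j))
```

Recoding before testing collapses the four-letter alphabet into two classes, so a relationship that depends only on the classifying property concentrates its counts in fewer contingency-table cells. Swapping in a different class table (for example, a hypothetical amino acid grouping) would change only the `classes` argument.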