Lawrence Hunter and Barry Zeeberg
Protein chimerism is the evolutionary combination of multiple ancestral sequences into a single multi-domain protein. We propose a novel method for detecting chimeric proteins by analyzing their nucleotide sequences. The method tests for differences in the distributions of synonymous (isoaccepting) codons in different regions of the protein's coding sequence. The test compares how well hidden Markov models (HMMs) of codon usage of varying size fit the natural sequence, relative to a set of randomized controls. We demonstrate the method on the families of yeast nuclear and mitochondrial aminoacyl-tRNA synthetases. The method is potentially useful for the automated screening of entire genomes or large databases.
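The core idea, comparing synonymous codon usage between regions against randomized controls, can be illustrated with a much simpler stand-in for the HMM comparison. The sketch below is not the HMM-based test described above: it assumes a single, user-supplied split point, the standard genetic code (which would not be appropriate for yeast mitochondrial genes), and a total-variation statistic chosen only for illustration. The function names `usage_difference`, `shuffle_synonymous`, and `permutation_p_value` are hypothetical. Controls are generated by shuffling codons among positions that encode the same amino acid, so the protein sequence and the overall codon counts are preserved.

```python
import random
from collections import Counter, defaultdict

# Standard genetic code (an assumption; mitochondrial codes differ).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def usage_difference(codons, split):
    """Sum, over amino acids, of the total-variation distance between the
    synonymous codon distributions left and right of `split`."""
    left, right = Counter(codons[:split]), Counter(codons[split:])
    by_aa = defaultdict(list)
    for codon in set(codons):
        by_aa[CODON_TABLE[codon]].append(codon)
    total = 0.0
    for syn in by_aa.values():
        n_left = sum(left[c] for c in syn)
        n_right = sum(right[c] for c in syn)
        if n_left == 0 or n_right == 0:
            continue  # amino acid absent from one region: no comparison
        total += 0.5 * sum(abs(left[c] / n_left - right[c] / n_right) for c in syn)
    return total


def shuffle_synonymous(codons, rng):
    """Randomized control: permute codons among positions that encode the
    same amino acid, preserving the protein and overall codon counts."""
    positions = defaultdict(list)
    for i, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(i)
    shuffled = list(codons)
    for idx in positions.values():
        pool = [codons[i] for i in idx]
        rng.shuffle(pool)
        for i, codon in zip(idx, pool):
            shuffled[i] = codon
    return shuffled


def permutation_p_value(seq, split, n_controls=200, seed=0):
    """Fraction of controls whose statistic is at least the observed one.
    `split` is a codon index; `seq` must contain only A, C, G, T."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    rng = random.Random(seed)
    observed = usage_difference(codons, split)
    hits = sum(
        usage_difference(shuffle_synonymous(codons, rng), split) >= observed
        for _ in range(n_controls)
    )
    return (hits + 1) / (n_controls + 1)
```

As a toy usage example, `permutation_p_value("GCT" * 20 + "GCC" * 20, split=20)` examines a sequence whose two halves use different alanine codons exclusively; essentially no shuffled control separates the codons as cleanly, so the returned value is small. A real screen would also need to search over region boundaries rather than assume one, which is part of what the HMMs of varying size address.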