Ying Xu and Edward C. Uberbacher
This paper presents an algorithm that combines pattern recognition-based exon prediction with database homology search for gene model construction. The goal is to use homologous genes or partial genes in the database as reference models when constructing (possibly multiple) gene models from exon candidates predicted by pattern recognition methods. A unified gene modeling framework handles cases ranging from strong homology to no homology in the database. To make maximal use of the available homology information, the algorithm applies homology at three levels: (1) exon candidate evaluation, (2) gene-segment construction with a reference model, and (3) (complete) gene modeling. Preliminary test results show that (a) perfect gene modeling can be expected when the initial exon predictions are reasonably good and strong homology exists in the database; (b) homology, even when not strong, generally helps improve the accuracy of gene modeling; and (c) multiple gene modeling becomes feasible when homology exists in the database for the genes involved.
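As a rough illustration of how homology evidence at the exon-candidate level can change the resulting gene model, the following minimal sketch selects the best-scoring set of non-overlapping exon candidates by dynamic programming. It is not the algorithm of this paper: the names `ExonCandidate`, `combined_score`, `best_gene_model`, and `homology_weight`, the additive scoring rule, and the example scores are all assumptions made for illustration, and reading-frame compatibility, splice-site constraints, and reference-model alignment are ignored.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class ExonCandidate:
    """Hypothetical exon candidate; coordinates are inclusive."""
    start: int
    end: int
    pattern_score: float         # assumed score from a pattern recognition method
    homology_score: float = 0.0  # assumed score from a database match (0 if none)


def combined_score(exon, homology_weight=1.0):
    # Assumed additive combination; with no homology this reduces
    # to the pattern recognition score alone.
    return exon.pattern_score + homology_weight * exon.homology_score


def best_gene_model(candidates, homology_weight=1.0):
    """Return (exons, total_score) for the best set of non-overlapping candidates."""
    exons = sorted(candidates, key=lambda e: e.end)
    ends = [e.end for e in exons]
    n = len(exons)
    best = [0.0] * (n + 1)   # best[i]: best score using the first i exons
    take = [False] * (n + 1)  # take[i]: whether exon i-1 is used in best[i]
    prev = [0] * (n + 1)      # prev[i]: prefix length to continue from if taken

    for i, exon in enumerate(exons, start=1):
        # p = number of exons that end strictly before this exon starts.
        p = bisect_left(ends, exon.start)
        with_exon = best[p] + combined_score(exon, homology_weight)
        if with_exon > best[i - 1]:
            best[i], take[i], prev[i] = with_exon, True, p
        else:
            best[i] = best[i - 1]

    # Trace back the chosen exons.
    chosen = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(exons[i - 1])
            i = prev[i]
        else:
            i -= 1
    chosen.reverse()
    return chosen, best[n]


if __name__ == "__main__":
    # Two overlapping candidates (a, b) followed by a third (c).
    a = ExonCandidate(100, 200, pattern_score=0.6)
    b = ExonCandidate(150, 260, pattern_score=0.7)
    c = ExonCandidate(300, 400, pattern_score=0.5)
    print(best_gene_model([a, b, c]))

    # Same candidates, but the first now has (assumed) homology support.
    a_hom = ExonCandidate(100, 200, pattern_score=0.6, homology_score=0.5)
    print(best_gene_model([a_hom, b, c]))
```

In this toy input, the two overlapping candidates compete for the same region. Without homology the candidate with the higher pattern score is chosen; once the other candidate carries homology support, it is selected instead, while the downstream exon is kept in both cases.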