AAAI Publications, Twenty-Third International FLAIRS Conference

Structured Motifs Identification in DNA Sequences
Yuridia P. Mejia, Ivan Olmos, Jesus A. Gonzalez

Last modified: 2010-05-06

Abstract


In this paper, we present an algorithm that finds structured motifs in a DNA sequence. A structured motif consists of a central motif and one or two satellite motifs, which may be located to the left and/or right of the central motif. The search is performed in two stages. In the first stage, the central motifs are located through an exact set matching process implemented with a deterministic finite automaton. In the second stage, the satellite motifs are located relative to the positions of the central motifs, at a distance given as input. This second stage requires two steps: first, a matrix is computed by dynamic programming using the Levenshtein algorithm; then the satellite motifs are identified from that matrix. In our results, the search for central motifs is fast (linear time), while the second stage is the most expensive, because it is necessary to identify all possible alignments and then align each of them with its respective satellite.
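The two-stage search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an Aho-Corasick-style automaton with failure links for the exact set matching, a single left satellite at a fixed gap from the central motif, and a plain Levenshtein distance threshold for accepting a satellite. The names `find_central`, `find_structured`, `gap`, and `max_errors` are hypothetical and chosen only for this illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build a trie with failure links (Aho-Corasick style; an assumption here)."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: shallower states are finished before deeper ones.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_central(sequence, patterns):
    """Stage 1: return (start, pattern) for every exact occurrence, in one pass."""
    goto, fail, out = build_automaton(patterns)
    state, hits = 0, []
    for i, ch in enumerate(sequence):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


def levenshtein(a, b):
    """Edit distance computed with the standard dynamic programming matrix."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[-1][-1]


def find_structured(sequence, centrals, left_sat, gap, max_errors):
    """Stage 2 (assumed form): check the window `gap` bases left of each central hit."""
    results = []
    for pos, central in find_central(sequence, centrals):
        end = pos - gap
        start = end - len(left_sat)
        if start < 0:
            continue  # not enough sequence to the left of this central motif
        window = sequence[start:end]
        dist = levenshtein(left_sat, window)
        if dist <= max_errors:
            results.append((start, window, pos, central, dist))
    return results


seq = "GGTTGTCACCCTATAATGG"
print(find_structured(seq, ["TATAAT"], "TTGACA", gap=3, max_errors=1))
```

In this toy sequence the central motif `TATAAT` is found at position 11, and the window three bases to its left, `TTGTCA`, differs from the satellite `TTGACA` by one substitution, so it is accepted with `max_errors=1`. The sketch compares a single fixed-length window per central hit; the method in the paper enumerates all possible alignments, which is why its second stage is reported as the more expensive one.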
