Integration of Competing Ancillary Assertions in Genome Assembly

Christian Burks, Rebecca J. Parsons, and Michael L. Engle

Assembly of genomic sequences and maps relies on a primary set of experimental data (e.g., the sequences of individual DNA fragments, or hybridization fingerprints of individual clone inserts). For completeness and accuracy of the final construction, it almost always also relies on several streams of related but distinct kinds of data. These secondary data sets, which we term ancillary information, usually contain errors; because the primary data sets do as well, the data sets can conflict with one another. Ancillary data also often arise from different experimental protocols, correspond to different scales of measurement, and occasionally include non-quantitative statements about the data. We present an approach for integrating ancillary assertions into the optimization of genome assembly, based on simultaneous balancing among the primary and secondary data sets, and include specific examples in the context of assembling DNA sequencing fragments to reconstruct a parent sequence.
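
The idea of balancing primary data against ancillary assertions can be illustrated with a toy objective function. The sketch below is not the authors' method; it is a minimal illustration under stated assumptions. The primary data are taken to be pairwise overlap scores between fragments, each ancillary assertion is taken to be a weighted statement of the form "fragment a precedes fragment b", and the two are combined through a single hypothetical `balance` weight. All function names, the scoring scheme, and the toy data are assumptions introduced for illustration, and the exhaustive search is practical only for a handful of fragments.

```python
from itertools import permutations

def primary_score(order, overlap):
    """Sum of overlap scores between adjacent fragments in the layout."""
    return sum(overlap.get((a, b), 0.0) for a, b in zip(order, order[1:]))

def ancillary_penalty(order, assertions):
    """Total weight of violated 'a precedes b' assertions (hypothetical form)."""
    position = {frag: i for i, frag in enumerate(order)}
    return sum(w for a, b, w in assertions if position[a] > position[b])

def combined_score(order, overlap, assertions, balance=1.0):
    """Primary evidence minus the weighted ancillary penalty."""
    return primary_score(order, overlap) - balance * ancillary_penalty(order, assertions)

def best_order(fragments, overlap, assertions, balance=1.0):
    """Exhaustive search over layouts; feasible only for tiny inputs."""
    return max(
        permutations(fragments),
        key=lambda order: combined_score(order, overlap, assertions, balance),
    )

# Toy data (invented): directed overlap scores and one ancillary assertion
# stating that fragment A lies before fragment C, with weight 3.0.
overlap = {("A", "B"): 5.0, ("B", "C"): 4.0, ("C", "A"): 6.0, ("A", "C"): 1.0}
assertions = [("A", "C", 3.0)]

primary_only = best_order(["A", "B", "C"], overlap, assertions, balance=0.0)
balanced = best_order(["A", "B", "C"], overlap, assertions, balance=1.0)
print(primary_only, balanced)
```

In this toy case the primary data alone favor a layout that contradicts the ancillary assertion, while a nonzero `balance` shifts the optimum to a layout consistent with it. Because the assertion only adds a penalty rather than acting as a hard constraint, sufficiently strong primary evidence could still override it, which is one way to accommodate errors in either data set.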


This page is copyrighted by AAAI. All rights reserved.