Active conservation of noncoding sequences revealed
by cross-species comparisons
Inna Dubchak, Chris Mayor, Michael Brudno, Lior Pachter, Edward M. Rubin,
Kelly A. Frazer
Human and mouse genomic sequence comparisons are increasingly used to search for evolutionarily conserved gene regulatory elements. Large-scale human/mouse DNA comparison studies have discovered numerous conserved noncoding sequences, of which only a fraction has been functionally investigated. A question therefore remains as to whether most of these noncoding sequences are conserved due to functional constraints or are the result of a lack of divergence time.
Based on the supposition that actively conserved human/mouse noncoding sequences will be present in a third mammal, while noncoding regions that are similar because of an insufficient accumulation of random mutations will be absent, we sequenced ~200 kb of orthologous human (5q31), mouse (chromosome 11) and dog (chromosome 4) DNA. The functions of conserved noncoding sequences (syntenous gene regulatory elements) are unaffected by relatively small random insertions or deletions of base pairs; standard local alignment algorithms, which identify ungapped conserved regions, are therefore not ideally suited for their discovery. For this reason, we performed the comparative analysis by generating two-way global alignments [human/dog (H/D), human/mouse (H/M) and mouse/dog (M/D)], and we developed an algorithm to search for blocks of similarity in these alignments. To view the conserved regions in the three two-way sequence alignments simultaneously, we developed a new visualization tool, VISTA (Visualization Tool for Alignment).
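The idea of searching a two-way global alignment for blocks of similarity can be illustrated with a minimal sliding-window sketch. This is not the authors' algorithm, whose details are not given here. The function name `conserved_blocks`, the 100-column window and the 75% identity threshold are all assumptions chosen for illustration. Because gap columns only lower the window score rather than ending a block, small insertions or deletions are tolerated.

```python
def conserved_blocks(aln_a, aln_b, window=100, min_identity=0.75):
    """Return (start, end) column ranges of a pairwise global alignment
    in which every sliding window has identity >= min_identity.

    aln_a, aln_b: aligned sequences of equal length, with '-' for gaps.
    window, min_identity: illustrative defaults, not values from the paper.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    if n < window:
        return []

    # 1 where both sequences carry the same base; a gap never counts as a match.
    match = [
        1 if a == b and a != "-" else 0
        for a, b in zip(aln_a.upper(), aln_b.upper())
    ]

    blocks = []
    count = sum(match[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one column: add the entering column, drop the leaving one.
            count += match[start + window - 1] - match[start - 1]
        if count / window >= min_identity:
            end = start + window
            if blocks and start <= blocks[-1][1]:
                # Overlaps or touches the previous block: extend it.
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks
```

The returned ranges are in alignment-column coordinates (half-open), so they would still need to be mapped back to positions in each genome. Running the same function on the H/M, H/D and M/D alignments and comparing the resulting blocks is one way to ask which human/mouse elements are also present in dog.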
Comparative analysis revealed that the vast majority of the human/mouse
conserved noncoding sequences identified in the 200 kb region examined
are also present in dog. This is an important finding as it suggests
that a large fraction of the high-percent-identity noncoding elements identified
through human/mouse DNA comparison studies are conserved due to functional