Journal article

Large gene overlaps in prokaryotic genomes: result of functional constraints or mispredictions?

From

Biochemistry and Biotechnology Department

European Molecular Biological Laboratory, Meyerhofstrasse 1, 69012 Heidelberg, Germany

Max Delbrück Centre for Molecular Medicine, Berlin-Buch, Robert-Rössle-Strasse 10, D-13092 Berlin, Germany

Across the fully sequenced microbial genomes there are thousands of examples of overlapping genes. Many of these overlaps are only a few nucleotides long and are thought to permit the coordinated regulation of gene expression. However, there should also be selective pressure against long overlaps, as overlapping reading frames increase the risk of deleterious mutations.

Here we examine the longest overlaps and assess whether they are the product of special functional constraints or of erroneous annotation. We analysed the genes that overlap by 60 bp or more among 338 fully sequenced prokaryotic genomes. The likely functional significance of an overlap was determined by comparing each of the overlapping genes to its respective orthologs.
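The screening step described above can be illustrated with a minimal sketch. It assumes genes are given as 1-based inclusive coordinates with a strand; the `Gene` record and the function names are hypothetical and are not taken from the article. Orientation follows the usual convention: same strand is co-directional, facing 3'-ends are convergent, facing 5'-ends are divergent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record: 1-based inclusive coordinates on one replicon."""
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def overlap_length(a: Gene, b: Gene) -> int:
    """Number of base pairs shared by the two gene intervals (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def orientation(a: Gene, b: Gene) -> str:
    """Relative orientation of two genes, judged from the leftmost gene."""
    left, right = sorted((a, b), key=lambda g: (g.start, g.end))
    if left.strand == right.strand:
        return "co-directional"
    # Left gene on "+" and right gene on "-": the 3'-ends face each other.
    return "convergent" if left.strand == "+" else "divergent"


def long_overlaps(genes, min_bp=60):
    """Return (name, name, overlap, orientation) for pairs overlapping by >= min_bp."""
    ordered = sorted(genes, key=lambda g: (g.start, g.end))
    hits = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start > a.end:
                break  # later genes start even further right
            n = overlap_length(a, b)
            if n >= min_bp:
                hits.append((a.name, b.name, n, orientation(a, b)))
    return hits


# Illustrative coordinates only (not data from the study).
example = [
    Gene("A", 1, 300, "+"),
    Gene("B", 241, 600, "+"),   # shares positions 241..300 with A
    Gene("C", 590, 900, "-"),   # shares positions 590..600 with B
]
hits = long_overlaps(example)
```

The 60 bp default mirrors the cut-off stated in the abstract; everything else is a simplification, for instance it ignores circular replicons.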

If a gene differed significantly in length from its orthologs, the overlap was considered unlikely to be functional and therefore the result of an error in either sequencing or gene prediction. Focusing on 715 co-directional overlaps longer than 60 bp, we classified the erroneous ones into five categories:

i) 5'-end extension of the downstream gene, due to either a mispredicted start codon or a frameshift at the 5'-end of the gene (409 overlaps);
ii) fragmentation of a gene caused by a frameshift (163);
iii) 3'-end extension of the upstream gene, due to either a frameshift at the 3'-end of the gene or a point mutation at the stop codon (68);
iv) redundant gene predictions (4);
v) combined 5'- and 3'-end extension, which is a combination of i) and iii) (71).
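A rough sketch of how a length comparison against orthologs might sort a co-directional overlap into the extension categories follows. The abstract does not state what counts as "significantly different", so the 20% tolerance `tol`, the use of the median ortholog length, and both function names are assumptions made purely for illustration. Fragmentation (ii) and redundant predictions (iv) cannot be decided from length excess alone and are left unresolved here.

```python
from statistics import median


def length_deviation(gene_len, ortholog_lens):
    """Relative excess of a gene's length over the median ortholog length."""
    ref = median(ortholog_lens)
    return (gene_len - ref) / ref


def classify_codirectional(up_len, up_orth, down_len, down_orth, tol=0.2):
    """Assign an extension category to a co-directional overlap.

    tol is a hypothetical cut-off, not a value taken from the article.
    """
    up_long = length_deviation(up_len, up_orth) > tol
    down_long = length_deviation(down_len, down_orth) > tol
    if up_long and down_long:
        return "v) 5'- and 3'-end extension"
    if down_long:
        return "i) 5'-end extension of the downstream gene"
    if up_long:
        return "iii) 3'-end extension of the upstream gene"
    return "unresolved"  # needs further evidence, e.g. categories ii) or iv)


# Illustrative lengths in bp (not data from the study): the downstream gene
# is much longer than its orthologs, the upstream gene is not.
label = classify_codirectional(
    up_len=900, up_orth=[897, 900, 903],
    down_len=1200, down_orth=[900, 905, 910],
)
```

Using the median rather than the mean keeps one aberrant ortholog from shifting the reference length, which is a design choice of this sketch rather than something the abstract specifies.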

We also studied 75 divergent overlaps, all of which could be classified as misannotations of group i). Nevertheless, we found some long convergent overlaps (54) that might be true overlaps, although a substantial share of the convergent overlaps could be classified as group iii) (124). Among the 968 overlaps longer than 60 bp that we analysed, we did not find a single real one in the co-directional or divergent orientations, and we concluded that there had been an excessive number of misannotations.
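As a consistency check, the counts reported in the abstract can be tallied to confirm that they add up to the stated totals of 715 co-directional overlaps and 968 overlaps overall. The variable names are illustrative only.

```python
# Counts as reported in the abstract.
codirectional = {"i": 409, "ii": 163, "iii": 68, "iv": 4, "v": 71}
divergent_group_i = 75
convergent = {"possibly_true": 54, "group_iii": 124}

total_codirectional = sum(codirectional.values())
total_convergent = sum(convergent.values())
total_overlaps = total_codirectional + divergent_group_i + total_convergent

# Both reported totals are reproduced by the category counts.
assert total_codirectional == 715
assert total_overlaps == 968
```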

Only the convergent orientation seems to permit some long overlaps, although convergent overlaps are also affected by misannotations. We propose a simple rule to flag these erroneous gene length predictions to facilitate automatic annotation.

Language: Undetermined
Publisher: BioMed Central
Year: 2008
Pages: 335-335
ISSN: 1471-2164
Types: Journal article
DOI: 10.1186/1471-2164-9-335
