Journal article

On the total number of genes and their length distribution in complete microbial genomes

From

Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark

Department of Systems Biology, Technical University of Denmark

In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes.

Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only about 3800 genes, and that a similar discrepancy exists for almost all published genomes.
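The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of one way a length-distribution comparison could be turned into a gene-count estimate. It rests on two assumptions that are not taken from the article: every annotated gene at or above a length cutoff is real, and genes matching a known protein have the same length distribution as real genes overall. The function name, the cutoff value, and the toy data are all hypothetical.

```python
def estimate_true_gene_count(annotated_lengths, matched_lengths, long_cutoff=300):
    """Estimate the number of real genes from two lists of gene lengths.

    Assumptions (illustrative, not from the article):
      * every annotated gene with length >= long_cutoff is a real gene;
      * genes matching a known protein have the same length
        distribution as real genes in general.
    """
    long_annotated = sum(1 for n in annotated_lengths if n >= long_cutoff)
    long_matched = sum(1 for n in matched_lengths if n >= long_cutoff)
    if long_matched == 0:
        raise ValueError("no matched genes at or above the cutoff")
    # If a fraction f = long_matched / len(matched_lengths) of real genes
    # is long, and all long annotated genes are real, the total number of
    # real genes is long_annotated / f.
    return long_annotated * len(matched_lengths) / long_matched


# Toy data (made up): 100 annotated genes, 50 of which match a known protein.
annotated = [100] * 40 + [400] * 60
matched = [100] * 10 + [400] * 40

estimate = estimate_true_gene_count(annotated, matched)
print(f"annotated: {len(annotated)}, estimated real genes: {round(estimate)}")
```

In this toy case the annotation contains more short genes than the matched set would predict, so the estimate falls below the annotated count, which is the kind of discrepancy the abstract describes.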

Language: English
Year: 2001
Pages: 425-428
ISSN: 01689525, 13624555 and 01689479
Types: Journal article
DOI: 10.1016/S0168-9525(01)02372-1
ORCIDs: 0000-0001-7885-715X, 0000-0003-0316-5866 and 0000-0002-5147-6282
Keywords

cbs
