

Journal article

MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

Edited by An, Lingling

From

1. Department of Bio and Health Informatics, Technical University of Denmark

2. Metagenomics, Department of Bio and Health Informatics, Technical University of Denmark

3. National Food Institute, Technical University of Denmark

4. Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark

5. Genomic Epidemiology, Department of Bio and Health Informatics, Technical University of Denmark

6. Department of Systems Biology, Technical University of Denmark

7. Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark

A growing number of species and gene identification studies rely on next generation sequence analysis of either single isolate or metagenomics samples. Several methods are available for taxonomic annotation, and a previous metagenomics benchmark study showed that large numbers of false positive species annotations are a problem unless thresholds or post-processing are applied to separate correct from false annotations.

MGmapper is a package that processes raw next generation sequence data and performs reference based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in vitro bacterial mock community sample comprising 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods.
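The idea of a post-processing filter over reference-based assignments can be illustrated with a minimal Python sketch. This is not MGmapper's actual filter: the `Annotation` fields, the function name `post_filter`, and both thresholds are hypothetical and chosen only to show how annotations might be split into accepted and rejected sets.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One candidate taxonomy annotation (hypothetical record layout)."""
    taxon: str
    read_count: int       # reads assigned to this reference
    mismatch_rate: float  # mean mismatches per aligned base


def post_filter(annotations, min_reads=10, max_mismatch_rate=0.05):
    """Split annotations into (accepted, rejected).

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    accepted, rejected = [], []
    for ann in annotations:
        passes = (ann.read_count >= min_reads
                  and ann.mismatch_rate <= max_mismatch_rate)
        (accepted if passes else rejected).append(ann)
    return accepted, rejected


# Made-up example input: one well-supported hit and one supported by 3 reads.
hits = [
    Annotation("Escherichia coli", 5000, 0.01),
    Annotation("Shigella flexneri", 3, 0.02),
]
accepted, rejected = post_filter(hits)
```

Keeping the rejected annotations rather than discarding them mirrors the abstract's statement that read count statistics are reported for both rejected and accepted annotations.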

After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain level annotations. In a comparison at species level, MGmapper assigned taxonomy using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives.
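For readers unfamiliar with the two strain level metrics, the following Python sketch shows how sensitivity and precision are computed from sets of expected and predicted labels. The strain names and the split into 9 correct, 3 missed and 3 spurious strains are illustrative assumptions; the abstract reports only the 75% figures and the 12 strains in the mock community, not the underlying counts.

```python
def precision(expected, predicted):
    """Fraction of predicted labels that are correct: TP / (TP + FP)."""
    true_positives = len(expected & predicted)
    return true_positives / len(predicted)


def sensitivity(expected, predicted):
    """Fraction of expected labels that were found: TP / (TP + FN)."""
    true_positives = len(expected & predicted)
    return true_positives / len(expected)


# Hypothetical labels: 12 strains in the mock community.
expected = {f"strain_{i:02d}" for i in range(1, 13)}

# Assumed outcome: 9 strains found, 3 missed, 3 wrong strains reported.
predicted = {f"strain_{i:02d}" for i in range(1, 10)} | {
    "other_a", "other_b", "other_c"
}

strain_precision = precision(expected, predicted)
strain_sensitivity = sensitivity(expected, predicted)
print(strain_precision, strain_sensitivity)  # 0.75 0.75
```

Other combinations of counts would give the same percentages, so the example should be read as one possible case and not as the paper's result table.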

Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. Custom databases can be used with the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper).

A web version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small FASTQ datasets.

Language: English
Publisher: Public Library of Science
Year: 2017
Pages: e0176469
ISSN: 1932-6203
Types: Journal article
DOI: 10.1371/journal.pone.0176469
ORCIDs: Petersen, Thomas Nordahl; Lukjancenko, Oksana; Thomsen, Martin Christen Frølund; Lund, Ole; Aarestrup, Frank Møller
