DTU Findit

Journal article

Application of Whole-Genome Sequences and Machine Learning in Source Attribution of Salmonella Typhimurium

From

Group for Epidemiological Risk Assessment, National Food Institute, Technical University of Denmark

National Food Institute, Technical University of Denmark

Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark

Statens Serum Institut

Preventing the emergence and spread of foodborne diseases is an important prerequisite for improving public health. Source attribution models link sporadic human cases of a specific illness to food sources and animal reservoirs. Next-generation sequencing technology makes it possible to develop novel source attribution models.

We investigated the potential of machine learning to predict the animal reservoir from which a bacterial strain isolated from a human salmonellosis case originated based on whole-genome sequencing. Machine learning methods recognize patterns in large and complex data sets and use this knowledge to build models.

The model learns patterns associated with genetic variations in bacteria isolated from the different animal reservoirs. We selected different machine learning algorithms to predict sources of human salmonellosis cases and trained the model with Danish Salmonella Typhimurium isolates sampled from broilers (n = 34), cattle (n = 2), ducks (n = 11), layers (n = 4), and pigs (n = 159).
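The training-set composition above is strongly imbalanced, which matters when reading overall accuracy figures. A minimal sketch in Python, using only the isolate counts reported in the abstract, tabulates each reservoir's share and the accuracy of a trivial majority-class predictor. The variable and function names are illustrative and are not taken from the paper.

```python
# Isolate counts per animal reservoir, as reported in the abstract.
training_counts = {
    "broilers": 34,
    "cattle": 2,
    "ducks": 11,
    "layers": 4,
    "pigs": 159,
}


def class_shares(counts):
    """Return each class's fraction of the total number of isolates."""
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()}


def majority_baseline(counts):
    """Return (class, accuracy) for a predictor that always picks the largest class."""
    source = max(counts, key=counts.get)
    return source, counts[source] / sum(counts.values())


if __name__ == "__main__":
    for source, share in class_shares(training_counts).items():
        print(f"{source:9s} {share:.3f}")
    print(majority_baseline(training_counts))
```

On these counts, pigs make up roughly three quarters of the 210 training isolates, so per-class performance is worth checking alongside overall accuracy.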

Using cgMLST as input features, the model yielded an average accuracy of 0.783 (95% CI: 0.77-0.80) in the source prediction for the random forest algorithm and 0.933 (95% CI: 0.92-0.94) for the logit boost algorithm. The logit boost algorithm was the most accurate (validation accuracy: 92%, CI: 0.8706-0.9579) and predicted the origin of 81% of the domestic sporadic human salmonellosis cases.
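Accuracy figures such as those above are proportions of correctly classified isolates, so a confidence interval can be derived from binomial reasoning. The abstract does not state which interval method or validation-set size was used. The sketch below assumes a Wilson score interval and a purely hypothetical count of 92 correct predictions out of 100, chosen only to illustrate the calculation.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Hypothetical counts for illustration only, not taken from the paper.
low, high = wilson_interval(correct=92, total=100)
print(f"accuracy 0.92, approx. 95% CI: {low:.3f}-{high:.3f}")
```

The interval narrows as the number of validated isolates grows, which is why the size of the validation set matters when comparing algorithms.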

The most important source was Danish-produced pigs (53%), followed by imported pigs (16%), imported broilers (6%), imported ducks (2%), Danish-produced layers (2%), Danish-produced cattle and imported cattle (

Language: English
Publisher: John Wiley and Sons Inc.
Year: 2020
Pages: 1693-1705
ISSN: 1539-6924 and 0272-4332
Types: Journal article
DOI: 10.1111/risa.13510
ORCIDs: Munck, Nanna Sophia Mucha; Njage, Patrick Murigu Kamau; Leekitcharoenphon, Pimlapas; 0000-0002-5463-7619; and Hald, Tine
