PhD Thesis
Sparse Classification - Methods & Applications
1. Department of Applied Mathematics and Computer Science, Technical University of Denmark
2. Visual Computing, Department of Applied Mathematics and Computer Science, Technical University of Denmark
3. Statistics and Data Analysis, Department of Applied Mathematics and Computer Science, Technical University of Denmark
With an increasing number of sophisticated tools for acquiring data, we face the important question of what matters in the sea of information at hand. This challenge is becoming more prevalent across virtually all scientific disciplines. Improvements over state-of-the-art methods for analysing such data carry the potential to revolutionize tasks such as medical diagnostics, where decisions often need to be based on only a few high-dimensional observations.
This explosion in data dimensionality has sparked the development of novel statistical methods. In contrast, classical statistics builds on the assumption that we have more samples than variables, and the main asymptotic results, such as the central limit theorem, reflect that. As the assumption of having many samples does not hold for modern datasets, we need new tools and methods to find the signal within a dataset that is predictive of the relevant response variable.
The focus of this thesis is on sparse methods, where sparse means that the method selects only a few variables. Different types of data call for different methods. The sparse methods studied in this thesis concern settings where the response variable is ordinal. Such ordinal labeling is common in many fields; for example, medical doctors often summarize their observations into a single class of disease severity, which is known as a medical rating score.
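To illustrate what "selects only a few variables" can mean in practice, the following is a minimal sketch of soft-thresholding, the shrinkage step associated with an L1 (lasso-type) penalty. This is one common route to sparsity and is an assumption made for illustration; the abstract does not state which penalty or optimizer the thesis uses. The function name `soft_threshold`, the coefficient values, and the threshold `lam` are all hypothetical.

```python
def soft_threshold(coefs, lam):
    """Shrink each coefficient toward zero by lam.

    Coefficients whose magnitude is at most lam become exactly 0,
    which is what makes the resulting model sparse.
    """
    out = []
    for c in coefs:
        if c > lam:
            out.append(c - lam)
        elif c < -lam:
            out.append(c + lam)
        else:
            out.append(0.0)
    return out


# Hypothetical coefficients for five variables (illustrative values only).
coefs = [2.5, -0.3, 0.1, -1.8, 0.05]
sparse = soft_threshold(coefs, lam=0.5)

# Indices of the variables that survive the shrinkage, i.e. are "selected".
selected = [i for i, c in enumerate(sparse) if c != 0.0]
```

In this sketch only the variables with large coefficients remain in `selected`; the others are set exactly to zero and thus dropped from the model. A larger `lam` would select fewer variables.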
Automation offers the potential to improve both the reliability and objectivity of such tasks. To demonstrate the effectiveness of the sparse methods developed in this thesis, they were applied to challenging and diverse real-world problems: predicting the severity of motion disorders in Parkinson’s patients, generating short summaries of content from hundreds of online user reviews, and detecting foreign objects in multispectral X-ray scans.
Achieving these results required implementing novel optimization approaches and open-source software.
| Language: | English |
|---|---|
| Publisher: | DTU Compute |
| Year: | 2018 |
| Series: | DTU Compute PHD-2018 |
| Types: | PhD Thesis |
| ORCIDs: | Einarsson, Gudmundur |