DTU Findit

PhD Thesis

Data analysis in high-dimensional sparse spaces: Large p, small n problems

From

DTU Data Analysis, Department of Informatics and Mathematical Modeling, Technical University of Denmark

Department of Informatics and Mathematical Modeling, Technical University of Denmark

The present thesis considers data analysis of problems with many features relative to the number of observations (large p, small n problems). The theoretical considerations for such problems are outlined, including the curses and blessings of dimensionality and the importance of dimension reduction.

In this context, the trade-off between a rich solution which answers the questions at hand and a simple solution which generalizes to unseen data is described. For all of the given data examples, labelled outputs exist, and the analyses are therefore limited to supervised settings. Three novel classification techniques for high-dimensional problems are presented: sparse discriminant analysis, sparse mixture discriminant analysis, and orthogonality constrained support vector machines.

The first two introduce sparseness to the well-known linear and mixture discriminant analysis and thereby provide low-dimensional projections of data with few non-zero loadings, which give improvements in classification. The latter adds a priori information on the pairing between observations to the support vector machine and thereby gives solutions with less variation and slight improvements in classification.

The classification methods are applied to the classification of fish species, ear canal impressions used in the hearing aid industry, microbiological fungi species, and various cancerous and healthy tissues. In addition, novel applications of sparse regression (also called the elastic net) to the medical, concrete, and food industries via multi-spectral images for objective and automated systems are presented.
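
To illustrate how an elastic net penalty yields coefficients that are exactly zero when p is larger than n, the following is a minimal sketch in plain Python. It is not taken from the thesis: the coordinate-descent solver, the penalty parameterization (squared loss divided by 2n, plus l1 times the L1 norm, plus l2/2 times the squared L2 norm), the function names, and the toy data are all assumptions made for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within +/- threshold becomes exactly 0.0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, l1=0.1, l2=0.1, n_sweeps=200):
    """Cyclic coordinate descent for an elastic net regression (illustrative only).

    Assumed objective: (1/(2n)) * ||y - X b||^2 + l1 * ||b||_1 + (l2/2) * ||b||_2^2
    X is a list of n rows, each a list of p floats; y is a list of n floats.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b, since b starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            # The L1 part zeroes small effects; the L2 part shrinks the rest.
            new_bj = soft_threshold(rho, l1) / (col_sq[j] + l2)
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical large p, small n toy data: 20 observations, 100 features,
# with only the first two features related to the response.
rng = random.Random(0)
n_obs, n_feat = 20, 100
X_toy = [[rng.gauss(0, 1) for _ in range(n_feat)] for _ in range(n_obs)]
y_toy = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.1) for row in X_toy]

beta_toy = elastic_net(X_toy, y_toy, l1=0.3, l2=0.1)
nonzero = [j for j, b in enumerate(beta_toy) if b != 0.0]
print(len(nonzero), "of", n_feat, "coefficients are non-zero")
```

The soft-thresholding step is what sets a coefficient to exactly zero whenever a feature's covariance with the current residual does not exceed l1, while the l2 term only shrinks coefficients and never removes them. The penalty values used here are arbitrary; in practice they would be chosen by cross-validation.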

Language: English
Publisher: Technical University of Denmark
Year: 2010
Series: Imm-phd-2009-228
Types: PhD Thesis
ORCIDs: Clemmensen, Line Katrine Harder
