Conference paper
On Comparison of Adaptive Regularization Methods
Modeling with flexible models, such as neural networks, requires careful control of model complexity and of the generalization ability of the resulting model, a requirement expressed in the ubiquitous bias-variance dilemma. Regularization is a tool for optimizing the model structure: it reduces variance at the expense of introducing extra bias.
The overall objective of adaptive regularization is to tune the amount of regularization so that the generalization error is minimized. Regularization supplements direct model selection techniques such as step-wise selection, and one would generally prefer a hybrid scheme; however, a very flexible regularization scheme may remove the need for selection procedures.
This paper investigates recently suggested adaptive regularization schemes. Some methods focus directly on minimizing an estimate of the generalization error (either algebraic or empirical), whereas others start from different criteria, e.g., the Bayesian evidence. The evidence essentially expresses the probability of the model, which is conceptually different from the generalization error; asymptotically, however, for large training data sets the two will converge.
First, the basic model definition, training, and generalization are presented. Next, different adaptive regularization schemes are reviewed and extended. Finally, the experimental section presents a comparative study concerning linear models for regression/time series problems.
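As an illustration of the idea summarized above, the following minimal sketch tunes a single weight-decay parameter of a linear regression model by minimizing an empirical (hold-out validation) estimate of the generalization error. It is not taken from the paper: the synthetic data, the parameter grid, and the names `fit_ridge`, `mse`, `tune_weight_decay`, and `kappa` are all assumptions made for this example, and the paper's own schemes may differ substantially.

```python
import numpy as np


def fit_ridge(X, y, kappa):
    """Closed-form minimizer of ||y - Xw||^2 + kappa * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + kappa * np.eye(d), X.T @ y)


def mse(X, y, w):
    """Mean squared prediction error of weights w on (X, y)."""
    return float(np.mean((y - X @ w) ** 2))


def tune_weight_decay(X_train, y_train, X_val, y_val, kappas):
    """Pick the kappa with the lowest validation error (hypothetical helper)."""
    val_errors = [
        mse(X_val, y_val, fit_ridge(X_train, y_train, k)) for k in kappas
    ]
    best = int(np.argmin(val_errors))
    return float(kappas[best]), val_errors


# Synthetic regression problem (an assumption for illustration only):
# 20 inputs, of which only the first 3 carry signal.
rng = np.random.default_rng(0)
n, d = 60, 20
w_true = np.zeros(d)
w_true[:3] = [1.0, -2.0, 0.5]
X = rng.normal(size=(n, d))
y = X @ w_true + rng.normal(scale=1.0, size=n)

# Hold out part of the data as an empirical generalization estimate.
X_train, y_train = X[:40], y[:40]
X_val, y_val = X[40:], y[40:]

kappas = np.logspace(-3, 3, 13)
best_kappa, val_errors = tune_weight_decay(
    X_train, y_train, X_val, y_val, kappas
)

# Training error for each kappa, shown for comparison with validation error.
train_errors = [
    mse(X_train, y_train, fit_ridge(X_train, y_train, k)) for k in kappas
]

print("selected kappa:", best_kappa)
print("validation MSE at selected kappa:", min(val_errors))
```

The training error can only grow as `kappa` increases, whereas the validation error typically has an interior minimum; choosing `kappa` at that minimum is one simple way of trading variance for bias. A grid search over one parameter is the crudest such scheme, and it is used here only to keep the sketch short.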
Language: | English |
---|---|
Publisher: | IEEE |
Year: | 2000 |
Pages: | 221-230 |
Proceedings: | Neural Networks for Signal Processing X |
ISBN: | 0780362780 and 9780780362789 |
ISSN: | 2379-2329 and 1089-3555 |
Types: | Conference paper |
DOI: | 10.1109/NNSP.2000.889413 |
ORCIDs: | Larsen, Jan and Hansen, Lars Kai |
Keywords: Bayesian evidence; Bayesian methods; Cost function; Electronic mail; Loss measurement; Mathematical model; Neural networks; Predictive models; Signal processing; Training data; Vectors; adaptive regularization methods; bias-variance dilemma; experiment; generalisation (artificial intelligence); generalization; large training data sets; model complexity; model selection; neural nets; neural networks; probability; regression; statistical analysis; step-wise selection; time series