
DTU Findit

Journal article

Prediction of rare feature combinations in population synthesis: Application of deep generative modelling

From

Transport, Department of Technology, Management and Economics, Technical University of Denmark

Transport Demand, Transport, Department of Technology, Management and Economics, Technical University of Denmark

Department of Technology, Management and Economics, Technical University of Denmark

Machine Learning, Transport, Department of Technology, Management and Economics, Technical University of Denmark

Population synthesis is concerned with the generation of agents for agent-based modelling in many fields, such as economics, transportation, ecology and epidemiology. When the number of attributes describing the agents and/or their level of detail becomes large, survey data cannot densely support the joint distribution of the attributes in the population due to the curse of dimensionality.

This leads to a situation where many attribute combinations are missing from the sample data even though they exist in the real population. In this case, it becomes essential to consider methods that can impute such missing information effectively. In this paper, we propose to use deep generative latent models.

These models learn a compressed representation of the data space which, when projected back to the original space, provides an effective way of imputing information in the observed data space. Specifically, we employ the Wasserstein Generative Adversarial Network (WGAN) and the Variational Autoencoder (VAE) for a large-scale population synthesis application.

The models are applied to a Danish travel survey with a feature space of more than 60 variables, and are trained and tested using cross-validation. We propose a new metric for evaluating generative models in an unsupervised setting. It measures the ability to generate diverse yet valid synthetic attribute combinations by assessing whether the models can recover missing combinations (sampling zeros) while keeping truly impossible combinations (structural zeros) at a minimum.
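The distinction between sampling zeros and structural zeros can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `zero_counts`, the toy attribute tuples and the use of a `reference` set as a stand-in for the real population are all assumptions made for illustration. In practice the real population is not observed, so structural zeros can only be approximated against a larger held-out sample.

```python
from typing import Hashable, Iterable, Tuple

Combo = Tuple[Hashable, ...]  # one agent = one tuple of attribute values


def zero_counts(
    generated: Iterable[Combo],
    train: Iterable[Combo],
    reference: Iterable[Combo],
) -> Tuple[int, int]:
    """Count unique generated combinations that are absent from the training data.

    Returns (sampling_zeros, structural_zeros), where
    - sampling zeros are novel combinations that do occur in `reference`,
    - structural zeros are novel combinations that do not occur in `reference`.
    `reference` is a hypothetical stand-in for the real population.
    """
    train_set = set(train)
    reference_set = set(reference)
    novel = set(generated) - train_set        # combinations the model did not simply copy
    sampling = novel & reference_set          # valid combinations recovered by the model
    structural = novel - reference_set        # combinations treated as impossible
    return len(sampling), len(structural)


# Toy data (hypothetical): age group and driving licence status.
train = [("child", "no_licence"), ("adult", "licence")]
reference = train + [("adult", "no_licence")]
generated = [
    ("adult", "licence"),      # already in the training data, ignored
    ("adult", "no_licence"),   # missing from training, present in reference
    ("child", "licence"),      # present in neither
    ("child", "licence"),      # duplicate, counted once
]

sampling_zeros, structural_zeros = zero_counts(generated, train, reference)
print(sampling_zeros, structural_zeros)
```

Counting unique combinations rather than generated agents is a design choice of this sketch; a frequency-weighted variant would count every generated agent instead.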

For a low-dimensional experiment, the VAE, the marginal sampler and the fully random sampler generate 5%, 21% and 26% more structural zeros per sampling zero when compared to the WGAN. For a high-dimensional case, these figures increase to 44%, 2217% and 170440% respectively. This research directly supports the development of agent-based systems, in particular in cases where detailed socio-economic or geographical representations are required.
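One plausible reading of "more structural zeros per sampling zero" is a relative increase in the ratio of structural to sampling zeros against the WGAN baseline. The sketch below shows that arithmetic under this assumption. The function names and the counts are hypothetical and are not taken from the paper.

```python
def structural_per_sampling(structural_zeros: int, sampling_zeros: int) -> float:
    """Structural zeros generated for each sampling zero recovered (lower is better)."""
    if sampling_zeros == 0:
        raise ValueError("ratio is undefined when no sampling zeros are recovered")
    return structural_zeros / sampling_zeros


def percent_more_than_baseline(model_ratio: float, baseline_ratio: float) -> float:
    """Relative increase of a model's ratio over the baseline ratio, in percent."""
    return (model_ratio / baseline_ratio - 1.0) * 100.0


# Hypothetical counts, chosen only to illustrate the calculation.
baseline_ratio = structural_per_sampling(structural_zeros=200, sampling_zeros=100)
model_ratio = structural_per_sampling(structural_zeros=210, sampling_zeros=100)

increase = percent_more_than_baseline(model_ratio, baseline_ratio)
print(f"{increase:.1f}% more structural zeros per sampling zero than the baseline")
```

With these made-up counts the model produces roughly 5% more structural zeros per sampling zero than the baseline, which mirrors the form of the figures quoted above without reproducing the underlying data.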

Language: English
Year: 2020
Pages: 102787
ISSN: 1879-2359 and 0968-090X
Types: Journal article
DOI: 10.1016/j.trc.2020.102787
ORCIDs: Garrido, Sergio; Borysov, Stanislav S.; Pereira, Francisco Camara; Rich, Jeppe
