Journal article

A massively parallel GPU-accelerated model for analysis of fully nonlinear free surface waves

From

Scientific Computing, Department of Informatics and Mathematical Modeling, Technical University of Denmark (1)

Department of Informatics and Mathematical Modeling, Technical University of Denmark (2)

Technical University of Denmark (3)

We implement and evaluate a massively parallel and scalable algorithm based on a multigrid preconditioned Defect Correction method for the simulation of fully nonlinear free surface flows. The simulations are based on a potential model that describes wave propagation over uneven bottoms in three space dimensions and is useful for fast analysis and prediction purposes in coastal and offshore engineering.
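
As a rough illustration of the defect correction idea named above, the sketch below repeatedly computes the defect r = b - A x and adds a preconditioned correction to the current iterate. It is a minimal sketch under stated assumptions, not the authors' implementation: the 4x4 dense system, the names `defect`, `apply_preconditioner` and `defect_correction`, and the tolerance handling are all hypothetical, and a simple diagonal (Jacobi) scaling stands in for the multigrid preconditioner used in the paper.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

constexpr std::size_t N = 4;  // tiny illustrative system size (assumption)
using Vec = std::array<double, N>;
using Mat = std::array<std::array<double, N>, N>;

// Defect (residual) of the current iterate: r = b - A x.
Vec defect(const Mat& A, const Vec& x, const Vec& b) {
    Vec r{};
    for (std::size_t i = 0; i < N; ++i) {
        double s = b[i];
        for (std::size_t j = 0; j < N; ++j) s -= A[i][j] * x[j];
        r[i] = s;
    }
    return r;
}

// Euclidean norm, used as the stopping measure.
double norm2(const Vec& v) {
    double s = 0.0;
    for (double vi : v) s += vi * vi;
    return std::sqrt(s);
}

// ASSUMPTION: diagonal (Jacobi) scaling is a stand-in for the multigrid
// preconditioner; it approximately solves M z = r with M = diag(A).
Vec apply_preconditioner(const Mat& A, const Vec& r) {
    Vec z{};
    for (std::size_t i = 0; i < N; ++i) {
        assert(A[i][i] != 0.0);
        z[i] = r[i] / A[i][i];
    }
    return z;
}

// Defect correction: x <- x + M^{-1} (b - A x) until the defect is small.
// Returns the number of correction steps taken (max_iter if not converged).
int defect_correction(const Mat& A, const Vec& b, Vec& x,
                      double tol, int max_iter) {
    for (int k = 0; k < max_iter; ++k) {
        const Vec r = defect(A, x, b);
        if (norm2(r) <= tol) return k;
        const Vec z = apply_preconditioner(A, r);
        for (std::size_t i = 0; i < N; ++i) x[i] += z[i];
    }
    return max_iter;
}
```

Calling `defect_correction(A, b, x, 1e-12, 200)` with a diagonally dominant `A` and a zero initial `x` drives `x` toward the solution of A x = b. The quality of the preconditioner controls how many correction steps are needed, which is where a multigrid method would replace the diagonal scaling used here.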

A dedicated numerical model based on the proposed algorithm is executed in parallel on an affordable, modern, special-purpose graphics processing unit (GPU). The model is based on a low-storage, flexible-order accurate finite difference method that is known to be efficient and scalable on a CPU core (single thread).

To achieve parallel performance of the relatively complex numerical model, we investigate a new trend in high-performance computing where many-core GPUs are utilized as high-throughput co-processors to the CPU. We describe and demonstrate how this approach makes it possible to do fast desktop computations for large nonlinear wave problems in numerical wave tanks (NWTs) with close to 50 million total grid points in double precision, or close to 100 million in single precision, with 4 GB of global device memory available.
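
The grid sizes quoted above imply a per-point memory budget that can be checked with simple arithmetic. The sketch below only divides the stated device memory by the stated grid sizes; it says nothing about how the authors' code actually lays out its data. The names are hypothetical, and reading "4 GB" as 4 GiB (4 x 1024^3 bytes) is an assumption.

```cpp
#include <cstdint>

// ASSUMPTION: "4 GB" is read as 4 GiB = 4 * 1024^3 bytes.
constexpr std::uint64_t kDeviceBytes = 4ULL * 1024ULL * 1024ULL * 1024ULL;

// Device memory available per grid point, in bytes.
constexpr double bytes_per_point(std::uint64_t device_bytes,
                                 std::uint64_t grid_points) {
    return static_cast<double>(device_bytes) / static_cast<double>(grid_points);
}

// Number of floating-point values that fit per grid point.
constexpr double values_per_point(std::uint64_t device_bytes,
                                  std::uint64_t grid_points,
                                  std::uint64_t bytes_per_value) {
    return bytes_per_point(device_bytes, grid_points) /
           static_cast<double>(bytes_per_value);
}

// Budgets for the two cases quoted in the abstract.
constexpr double kDoubleBudget =
    values_per_point(kDeviceBytes, 50000000ULL, sizeof(double));
constexpr double kSingleBudget =
    values_per_point(kDeviceBytes, 100000000ULL, sizeof(float));
```

Under these assumptions both cases work out to the same budget of between ten and eleven values per grid point, which is consistent with the abstract's description of the method as low-storage. Halving the precision doubles the number of grid points that fit in the same memory.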

A new code base has been developed in C++ and CUDA (Compute Unified Device Architecture) C and is found to improve the runtime by more than an order of magnitude in double precision arithmetic, for the same accuracy, over an existing CPU (single thread) Fortran 90 code when executed on a single modern GPU. These significant improvements are achieved by carefully implementing the algorithm to minimize data transfer and take advantage of the massive multi-threading capability of the GPU device.

Language: English
Publisher: John Wiley & Sons, Ltd
Year: 2011
Pages: 20-36
ISSN: 1097-0363 and 0271-2091
Types: Journal article
DOI: 10.1002/fld.2675
ORCIDs: Engsig-Karup, Allan Peter
