DTU Findit

PhD Thesis

Spectrogram inversion and potential applications for hearing research

From

Department of Electrical Engineering, Technical University of Denmark

Hearing Systems, Department of Electrical Engineering, Technical University of Denmark

A common way of analyzing signals in a joint time-frequency domain is the spectrogram, which can be interpreted as a multi-channel envelope representation of the signal. The envelope alone cannot fully represent a signal because it reflects only slow changes in the signal's amplitude and lacks information about its fast variations, the temporal fine structure (TFS).

However, the main hypothesis explored in this thesis is that a spectrogram could be a faithful representation of a signal, that is, TFS information could be recovered by across-channel comparison of envelopes. Based on this consideration, an approach for spectrogram inversion was proposed: time-domain signals were recovered from spectrograms computed using both inner hair-cell envelope (i.e., traditional half-wave rectification followed by low-pass filtering) and Hilbert envelope definitions.
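As an illustration of the inner hair-cell envelope definition mentioned above (half-wave rectification followed by low-pass filtering), here is a minimal sketch for a single channel. The one-pole filter, the 1000 Hz cutoff, the function names, and the amplitude-modulated test tone are all assumptions made for this example; the abstract does not state the filter order or cutoff used in the thesis.

```python
import math

def half_wave_rectify(x):
    # Keep positive half-cycles, set negative samples to zero.
    return [max(s, 0.0) for s in x]

def one_pole_lowpass(x, cutoff_hz, fs):
    # First-order IIR low-pass: y[n] = (1 - a) * x[n] + a * y[n-1].
    # The filter order is an assumption made for this sketch.
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, prev = [], 0.0
    for s in x:
        prev = (1.0 - a) * s + a * prev
        y.append(prev)
    return y

def ihc_envelope(x, fs, cutoff_hz=1000.0):
    # Half-wave rectification followed by low-pass filtering.
    # cutoff_hz is a hypothetical default, not a value from the thesis.
    return one_pole_lowpass(half_wave_rectify(x), cutoff_hz, fs)

# Hypothetical test signal: 4 kHz carrier with 50 Hz amplitude modulation.
fs = 16000
x = [(1.0 + 0.8 * math.sin(2 * math.pi * 50 * t / fs))
     * math.sin(2 * math.pi * 4000 * t / fs) for t in range(1600)]
env = ihc_envelope(x, fs)
```

The resulting envelope is non-negative and follows the slow 50 Hz modulation, while most of the 4 kHz carrier detail in this one channel is smoothed away by the low-pass stage.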

The high accuracy of the inversion scheme (as measured by root mean square error and spectral convergence) implies that the main hypothesis holds true for the designs chosen. Two practical applications of this result were then presented.

(1) Spectrograms that are computed using the inner hair-cell (IHC) envelope definition are a reasonable model of the signal processing performed by the human cochlea. The robustness of the reconstruction from such spectrograms with regard to the properties of the cochlear model showed that, for previously documented IHC models as well as for more restrictive conditions, the TFS-related information is retained by the (modeled) cochlear processing even at high audio frequencies.

(2) Using the inversion framework, it is possible to manipulate signals in the modulation domain while preserving their long-term power spectra. This enabled the creation of mixtures of speech and noise where the signal-to-noise ratio in the envelope domain (SNRenv) was directly controlled. Behavioral measures of the intelligibility for such mixtures were compared to predictions from a model of speech intelligibility. Conditions where noise was processed led to modest intelligibility improvements for increased SNRenv, providing v
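The two accuracy measures named in the abstract, root mean square error and spectral convergence, can be sketched as follows. The formulas are the commonly used definitions (spectral convergence as the norm of the spectrogram difference divided by the norm of the target spectrogram) and are an assumption here; the thesis may normalize or scale them differently. All function and variable names are hypothetical.

```python
import math

def rmse(reference, estimate):
    # Root mean square error between two equal-length sequences.
    n = len(reference)
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)

def spectral_convergence(target_spec, estimate_spec):
    # Frobenius norm of the difference between two spectrograms,
    # normalized by the norm of the target (assumed definition).
    num = sum((t - e) ** 2
              for t_row, e_row in zip(target_spec, estimate_spec)
              for t, e in zip(t_row, e_row))
    den = sum(t ** 2 for t_row in target_spec for t in t_row)
    return math.sqrt(num / den)

# Toy values, for illustration only.
target = [[3.0, 4.0], [0.0, 0.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
sc_perfect = spectral_convergence(target, target)  # identical spectrograms
sc_empty = spectral_convergence(target, zeros)     # all-zero estimate
err = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Under this definition, a value of 0 means the spectrogram of the reconstructed signal matches the target exactly, and an all-zero estimate gives 1.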

Language: English
Publisher: Technical University of Denmark, Department of Electrical Engineering
Year: 2013
Types: PhD Thesis
