Conference paper

Combining Semantic and Acoustic Features for Valence and Arousal Recognition in Speech

In 2012 3rd International Workshop on Cognitive Information Processing (CIP), 2012, pp. 1-6

By Karadogan, Seliz¹,²; Larsen, Jan¹,²

From

Department of Informatics and Mathematical Modeling, Technical University of Denmark¹

Cognitive Systems, Department of Informatics and Mathematical Modeling, Technical University of Denmark²

Abstract

The recognition of affect in speech has recently attracted a lot of interest, especially in the cognitive and computer sciences. Most previous studies focused on the recognition of basic emotions (such as happiness, sadness and anger) using a categorical approach. Recently, the focus has been shifting towards dimensional affect recognition, based on the idea that emotional states are not independent of one another but related in a systematic manner.

In this paper, we design a continuous dimensional speech affect recognition model that combines acoustic and semantic features. We design our own corpus, which consists of 59 short movie clips with audio and text in subtitle format, rated by human subjects in the arousal and valence (A-V) dimensions. For the acoustic part, we combine many features, use correlation-based feature selection, and apply support vector regression.

For the semantic part, we use the affective norms for English words (ANEW), which are also rated in the A-V dimensions, as keywords, and apply latent semantics analysis (LSA) to those words and the words in the clips to estimate A-V values for the clips. Finally, the results of the acoustic and semantic parts are combined.

We show that combining semantic and acoustic information for dimensional speech recognition improves the results. Moreover, we show that valence is better estimated using semantic features while arousal is better estimated using acoustic features.
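The abstract names correlation-based feature selection for the acoustic part and a final combination of the acoustic and semantic results, but this record does not give the details. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the selection step resembles the standard CFS merit heuristic with a greedy forward search, and that the combination is a simple weighted average. The feature names, the toy ratings, and the `weight` parameter are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cfs_merit(subset, features, target):
    """CFS merit: rewards correlation with the target, penalizes redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(features, target):
    """Greedy forward search: add the best feature until the merit stops improving."""
    selected, best = [], 0.0
    remaining = sorted(features)
    while remaining:
        merit, name = max(
            (cfs_merit(selected + [f], features, target), f) for f in remaining
        )
        if merit <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected


def fuse(acoustic, semantic, weight):
    """Assumed late fusion: weighted average of two per-clip predictions."""
    return [weight * a + (1 - weight) * s for a, s in zip(acoustic, semantic)]


# Hypothetical toy data: three acoustic features for five clips, plus arousal ratings.
features = {
    "pitch": [1, 2, 3, 4, 5],
    "energy": [1, 2, 3, 4, 6],   # nearly redundant with "pitch"
    "noise": [3, 1, 4, 1, 5],    # weakly related to the ratings
}
arousal = [1.1, 1.9, 3.2, 3.9, 5.0]

selected = greedy_cfs(features, arousal)
combined = fuse([0.0, 1.0], [1.0, 0.0], weight=0.25)
```

In a fuller pipeline, the selected features would feed a regressor such as support vector regression, and the fusion weight could be chosen separately for valence and arousal, in line with the finding that each dimension is better served by a different feature type.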

Language:	English
Publisher:	IEEE
Year:	2012
Pages:	1-6
Proceedings:	3rd International Workshop on Cognitive Information Processing (CIP)
ISBN:	1467318760, 1467318779, 1467318787, 9781467318761, 9781467318778 and 9781467318785
Types:	Conference paper
DOI:	10.1109/CIP.2012.6232924
ORCIDs:	Larsen, Jan

Keywords

A-V dimensions; ANEW; Acoustics; Databases; Emotion recognition; Feature extraction; LSA; Semantics; Speech; Speech recognition; acoustic features; acoustic signal processing; affective norms for English words; arousal recognition; arousal-valence dimensions; cognitive sciences; computer sciences; continuous dimensional speech affect recognition model; correlation theory; correlation-based feature selection; emotion recognition; emotions recognition; feature extraction; human subjects; latent semantics analysis; natural languages; regression analysis; semantic features; short movie clips; speech recognition; subtitle format; support vector machines; support vector regression; valence recognition
