Ensemble methods for spoken emotion recognition in call-centres

Donn Morrison; Ruili Wang; Liyanage C. De Silva

doi:10.1016/j.specom.2006.11.004

Research output: Journal Publication › Article › peer-review

290 Citations (Scopus)

Abstract

Machine-based emotional intelligence is a requirement for more natural interaction between humans and computer interfaces and a basic level of accurate emotion perception is needed for computer systems to respond adequately to human emotion. Humans convey emotional information both intentionally and unintentionally via speech patterns. These vocal patterns are perceived and understood by listeners during conversation. This research aims to improve the automatic perception of vocal emotion in two ways. First, we compare two emotional speech data sources: natural, spontaneous emotional speech and acted or portrayed emotional speech. This comparison demonstrates the advantages and disadvantages of both acquisition methods and how these methods affect the end application of vocal emotion recognition. Second, we look at two classification methods which have not been applied in this field: stacked generalisation and unweighted vote. We show how these techniques can yield an improvement over traditional classification methods.
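The two ensemble methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the emotion labels, the function names, and the lookup-table meta-learner are all assumptions chosen for brevity. An unweighted vote returns the label most base classifiers agree on, while stacked generalisation trains a second-level learner on the base classifiers' held-out predictions.

```python
from collections import Counter, defaultdict


def unweighted_vote(predictions):
    """Return the label chosen by the most base classifiers.

    Ties go to the tied label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def fit_stacker(held_out_predictions, true_labels):
    """Fit a toy level-1 learner (hypothetical): a lookup table that maps each
    pattern of base-classifier outputs to its most frequent true label."""
    table = defaultdict(Counter)
    for preds, truth in zip(held_out_predictions, true_labels):
        table[tuple(preds)][truth] += 1
    return {pattern: c.most_common(1)[0][0] for pattern, c in table.items()}


def stacked_predict(stacker, predictions):
    """Predict with the level-1 table; fall back to an unweighted vote
    for patterns that were never seen during fitting."""
    return stacker.get(tuple(predictions), unweighted_vote(predictions))


# Hypothetical outputs of three base classifiers on held-out utterances.
held_out = [
    ("anger", "neutral", "neutral"),
    ("anger", "neutral", "neutral"),
    ("neutral", "neutral", "neutral"),
]
truth = ["anger", "anger", "neutral"]
stacker = fit_stacker(held_out, truth)
```

In this toy data the first classifier is the only reliable detector of anger, so the stacker learns to trust it even when it is outvoted, which a plain vote cannot do.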

Original language	English
Pages (from-to)	98-112
Number of pages	15
Journal	Speech Communication
Volume	49
Issue number	2
DOIs	https://doi.org/10.1016/j.specom.2006.11.004
Publication status	Published - Feb 2007
Externally published	Yes

Keywords

Affect recognition
Emotion recognition
Ensemble methods
Speech databases
Speech processing

ASJC Scopus subject areas

Software
Modelling and Simulation
Communication
Language and Linguistics
Linguistics and Language
Computer Vision and Pattern Recognition
Computer Science Applications

Access to Document

10.1016/j.specom.2006.11.004

Cite this

@article{61537cc6383f4e18a7b3e54143e5cb0b,
  title = "Ensemble methods for spoken emotion recognition in call-centres",
  abstract = "Machine-based emotional intelligence is a requirement for more natural interaction between humans and computer interfaces and a basic level of accurate emotion perception is needed for computer systems to respond adequately to human emotion. Humans convey emotional information both intentionally and unintentionally via speech patterns. These vocal patterns are perceived and understood by listeners during conversation. This research aims to improve the automatic perception of vocal emotion in two ways. First, we compare two emotional speech data sources: natural, spontaneous emotional speech and acted or portrayed emotional speech. This comparison demonstrates the advantages and disadvantages of both acquisition methods and how these methods affect the end application of vocal emotion recognition. Second, we look at two classification methods which have not been applied in this field: stacked generalisation and unweighted vote. We show how these techniques can yield an improvement over traditional classification methods.",
  keywords = "Affect recognition, Emotion recognition, Ensemble methods, Speech databases, Speech processing",
  author = "Donn Morrison and Ruili Wang and {De Silva}, {Liyanage C.}",
  year = "2007",
  month = feb,
  doi = "10.1016/j.specom.2006.11.004",
  language = "English",
  volume = "49",
  pages = "98--112",
  journal = "Speech Communication",
  issn = "0167-6393",
  publisher = "Elsevier B.V.",
  number = "2",
}

TY - JOUR
T1 - Ensemble methods for spoken emotion recognition in call-centres
AU - Morrison, Donn
AU - Wang, Ruili
AU - De Silva, Liyanage C.
PY - 2007/2
Y1 - 2007/2
N2 - Machine-based emotional intelligence is a requirement for more natural interaction between humans and computer interfaces and a basic level of accurate emotion perception is needed for computer systems to respond adequately to human emotion. Humans convey emotional information both intentionally and unintentionally via speech patterns. These vocal patterns are perceived and understood by listeners during conversation. This research aims to improve the automatic perception of vocal emotion in two ways. First, we compare two emotional speech data sources: natural, spontaneous emotional speech and acted or portrayed emotional speech. This comparison demonstrates the advantages and disadvantages of both acquisition methods and how these methods affect the end application of vocal emotion recognition. Second, we look at two classification methods which have not been applied in this field: stacked generalisation and unweighted vote. We show how these techniques can yield an improvement over traditional classification methods.
AB - Machine-based emotional intelligence is a requirement for more natural interaction between humans and computer interfaces and a basic level of accurate emotion perception is needed for computer systems to respond adequately to human emotion. Humans convey emotional information both intentionally and unintentionally via speech patterns. These vocal patterns are perceived and understood by listeners during conversation. This research aims to improve the automatic perception of vocal emotion in two ways. First, we compare two emotional speech data sources: natural, spontaneous emotional speech and acted or portrayed emotional speech. This comparison demonstrates the advantages and disadvantages of both acquisition methods and how these methods affect the end application of vocal emotion recognition. Second, we look at two classification methods which have not been applied in this field: stacked generalisation and unweighted vote. We show how these techniques can yield an improvement over traditional classification methods.
KW - Affect recognition
KW - Emotion recognition
KW - Ensemble methods
KW - Speech databases
KW - Speech processing
UR - http://www.scopus.com/inward/record.url?scp=33846952503&partnerID=8YFLogxK
U2 - 10.1016/j.specom.2006.11.004
DO - 10.1016/j.specom.2006.11.004
M3 - Article
AN - SCOPUS:33846952503
SN - 0167-6393
VL - 49
SP - 98
EP - 112
JO - Speech Communication
JF - Speech Communication
IS - 2
ER -
