Naturalistic emotional speech collection paradigm with online game and its psychological and acoustical assessment

Yoshiko Arimoto, Hiromi Kawatsuz, Sumio Ohno, Hitoshi Iida

Research output: Contribution to journal, Article

14 Citations (Scopus)

Abstract

To construct a naturalistic emotional speech database, a novel paradigm for collecting naturalistic emotional speech during spontaneous Japanese dialog was proposed. The paradigm was assessed by investigating, psychologically and acoustically, whether the collected speech contains and conveys rich emotions. To encourage speakers to experience and express natural and vivid emotions, a Massively Multiplayer Online Role-Playing Game (MMORPG) was adopted as the task. Speakers were asked to play the MMORPG together while discussing strategies for achieving their tasks through a voice chat system. Each speaker was recorded for one hour, and the total recording time was approximately 14 hours. The results of emotional labeling of the collected speech supported the validity of the paradigm, showing interlabeler agreement higher than chance levels. In addition, the paradigm proved superior to other paradigms in the quantity of emotional speech, with a significantly higher rate of labeling instances for our speech material (73%, χ² = 27659.87, p < 0.001) than for other speech materials. Finally, an acoustical analysis supported the validity of the paradigm, showing a significant difference between nonemotional and emotional utterances (p < 0.05).
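
The rate comparison reported in the abstract rests on a chi-square test. The abstract does not describe the contingency table, so the following is only a minimal sketch of a Pearson chi-square test on a 2x2 table, assuming two speech materials each labeled as emotional or nonemotional. The function name `chi_square_2x2` and all counts are hypothetical and are not taken from the paper.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table.

    Rows are speech materials, columns are labeling outcomes:
        [[a, b],   # material 1: emotional, nonemotional
         [c, d]]   # material 2: emotional, nonemotional
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one count")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts for illustration only (NOT the paper's data):
# material 1 has 60 of 100 utterances labeled emotional, material 2 has 30 of 100.
stat = chi_square_2x2(60, 40, 30, 70)

# Standard critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.001.
CRITICAL_0_001 = 10.828
print(f"chi-square = {stat:.2f}, significant at 0.001: {stat > CRITICAL_0_001}")
```

With these made-up counts the statistic is about 18.18, which exceeds the 0.001 critical value for one degree of freedom. The value reported in the abstract is far larger, which is consistent with a much larger number of labeling instances.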

Original language: English
Pages (from-to): 359-369
Number of pages: 11
Journal: Acoustical Science and Technology
Volume: 33
Issue number: 6
DOIs: 10.1250/ast.33.359
Publication status: Published - 2012
Externally published: Yes

Keywords

  • Acoustic analysis
  • Emotional speech
  • Online game
  • Spoken dialog
  • Voice chat

ASJC Scopus subject areas

  • Acoustics and Ultrasonics

Cite this

Naturalistic emotional speech collection paradigm with online game and its psychological and acoustical assessment. / Arimoto, Yoshiko; Kawatsuz, Hiromi; Ohno, Sumio; Iida, Hitoshi.

In: Acoustical Science and Technology, Vol. 33, No. 6, 2012, p. 359-369.

@article{b7c003ad25ad404eae24aa6ae26dd12a,
title = "Naturalistic emotional speech collection paradigm with online game and its psychological and acoustical assessment",
abstract = "To construct a naturalistic emotional speech database, a novel paradigm for collecting naturalistic emotional speech during spontaneous Japanese dialog was proposed. The paradigm was assessed by investigating, psychologically and acoustically, whether the collected speech contains and conveys rich emotions. To encourage speakers to experience and express natural and vivid emotions, a Massively Multiplayer Online Role-Playing Game (MMORPG) was adopted as the task. Speakers were asked to play the MMORPG together while discussing strategies for achieving their tasks through a voice chat system. Each speaker was recorded for one hour, and the total recording time was approximately 14 hours. The results of emotional labeling of the collected speech supported the validity of the paradigm, showing interlabeler agreement higher than chance levels. In addition, the paradigm proved superior to other paradigms in the quantity of emotional speech, with a significantly higher rate of labeling instances for our speech material (73{\%}, $\chi^{2} = 27659.87$, $p < 0.001$) than for other speech materials. Finally, an acoustical analysis supported the validity of the paradigm, showing a significant difference between nonemotional and emotional utterances ($p < 0.05$).",
keywords = "Acoustic analysis, Emotional speech, Online game, Spoken dialog, Voice chat",
author = "Yoshiko Arimoto and Hiromi Kawatsuz and Sumio Ohno and Hitoshi Iida",
year = "2012",
doi = "10.1250/ast.33.359",
language = "English",
volume = "33",
pages = "359--369",
journal = "Acoustical Science and Technology",
issn = "1346-3969",
publisher = "Acoustical Society of Japan",
number = "6",

}

TY  - JOUR
T1  - Naturalistic emotional speech collection paradigm with online game and its psychological and acoustical assessment
AU  - Arimoto, Yoshiko
AU  - Kawatsuz, Hiromi
AU  - Ohno, Sumio
AU  - Iida, Hitoshi
PY  - 2012
Y1  - 2012
AB  - To construct a naturalistic emotional speech database, a novel paradigm for collecting naturalistic emotional speech during spontaneous Japanese dialog was proposed. The paradigm was assessed by investigating, psychologically and acoustically, whether the collected speech contains and conveys rich emotions. To encourage speakers to experience and express natural and vivid emotions, a Massively Multiplayer Online Role-Playing Game (MMORPG) was adopted as the task. Speakers were asked to play the MMORPG together while discussing strategies for achieving their tasks through a voice chat system. Each speaker was recorded for one hour, and the total recording time was approximately 14 hours. The results of emotional labeling of the collected speech supported the validity of the paradigm, showing interlabeler agreement higher than chance levels. In addition, the paradigm proved superior to other paradigms in the quantity of emotional speech, with a significantly higher rate of labeling instances for our speech material (73%, χ² = 27659.87, p < 0.001) than for other speech materials. Finally, an acoustical analysis supported the validity of the paradigm, showing a significant difference between nonemotional and emotional utterances (p < 0.05).
KW  - Acoustic analysis
KW  - Emotional speech
KW  - Online game
KW  - Spoken dialog
KW  - Voice chat
UR  - http://www.scopus.com/inward/record.url?scp=84868370645&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84868370645&partnerID=8YFLogxK
U2  - 10.1250/ast.33.359
DO  - 10.1250/ast.33.359
M3  - Article
AN  - SCOPUS:84868370645
VL  - 33
SP  - 359
EP  - 369
JO  - Acoustical Science and Technology
JF  - Acoustical Science and Technology
SN  - 1346-3969
IS  - 6
ER  -