Using Multiple Level Fusion for Improving Performance of Speaker Recognition

Liu Di; Cho Siu Yeung; Sun Dongmei; Qiu Zhengding

doi:10.1080/1023697X.2011.10668243

Department of Electrical and Electronic Engineering

Research output: Journal Publication › Article › peer-review

Abstract

This paper presents a multiple level fusion framework for improving the performance of automatic speaker recognition systems. Based on the framework, several multiple level fusion methods are defined: one strong multiple level fusion and three weak multiple level fusions. To examine the effectiveness of the proposed framework, a two-feature combination scheme is considered. After investigating the applicability of the strong and weak multiple level fusions to this scheme, the framework adopts a weak multiple level fusion method that combines two fusion levels, i.e. matching-score fusion and decision-making fusion. At the matching-score level, a commonly used method called score vector fusion is adopted. At the decision-making level, kernel combination, also known as Multiple Kernel Learning, is chosen. Both techniques can be embedded into many automatic speaker recognition systems. In an evaluation on the NIST 2001 corpus, two sets of experiments show that the two-feature combination scheme with multiple level fusion performs better than traditional matching-score level fusion and unimodal methods. The results demonstrate that the multiple level fusion framework is an effective way to fuse features for speaker recognition applications.
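The two fusion levels named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the two features, the base kernels, or how the kernel weights are learned, so the scores, the linear and RBF base kernels, and the fixed weights below are all assumptions chosen for illustration. Multiple Kernel Learning would normally learn the weights from data; here they are simply fixed.

```python
import math

def score_vector(score_a, score_b):
    """Matching-score level: stack the scores of two subsystems into one vector."""
    return (score_a, score_b)

def linear_kernel(x, y):
    """Base kernel 1 (assumed): plain dot product."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=1.0):
    """Base kernel 2 (assumed): Gaussian RBF on the squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def combined_kernel(x, y, weights=(0.5, 0.5)):
    """Decision-making level: weighted sum of base kernels.

    MKL would learn these weights; they are fixed here (assumption).
    """
    w_lin, w_rbf = weights
    return w_lin * linear_kernel(x, y) + w_rbf * rbf_kernel(x, y)

def gram_matrix(vectors, weights=(0.5, 0.5)):
    """Kernel matrix that a kernel classifier (e.g. an SVM) could consume."""
    return [[combined_kernel(u, v, weights) for v in vectors] for u in vectors]

# Hypothetical match scores from two feature-specific subsystems, one pair per trial.
trials = [score_vector(0.9, 0.8), score_vector(0.2, 0.1), score_vector(0.7, 0.3)]
K = gram_matrix(trials)
```

The resulting matrix `K` is symmetric and could be passed to any classifier that accepts a precomputed kernel. Choosing the weights is the part that an actual Multiple Kernel Learning method would optimise.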

Original language	English
Pages (from-to)	39-48
Number of pages	10
Journal	Transactions Hong Kong Institution of Engineers
Volume	18
Issue number	4
DOIs	https://doi.org/10.1080/1023697X.2011.10668243
Publication status	Published - 2011

Keywords

Decision-making Level Fusion
Feature Level Fusion
Fusion Techniques
Matching-score Level Fusion
Multiple Kernel Learning
Multiple Level Fusion
Speaker Recognition

ASJC Scopus subject areas

General Engineering

Access to Document

10.1080/1023697X.2011.10668243

Cite this

@article{3b5c2bcb37ac4ee1a8ec2cf538f8e103,
  title = "Using Multiple Level Fusion for Improving Performance of Speaker Recognition",
  abstract = "This paper presents a multiple level fusion framework for improving the performance of automatic speaker recognition systems. Based on the framework, several multiple level fusion methods are defined: one strong multiple level fusion and three weak multiple level fusions. To examine the effectiveness of the proposed framework, a two-feature combination scheme is considered. After investigating the applicability of the strong and weak multiple level fusions to this scheme, the framework adopts a weak multiple level fusion method that combines two fusion levels, i.e. matching-score fusion and decision-making fusion. At the matching-score level, a commonly used method called score vector fusion is adopted. At the decision-making level, kernel combination, also known as Multiple Kernel Learning, is chosen. Both techniques can be embedded into many automatic speaker recognition systems. In an evaluation on the NIST 2001 corpus, two sets of experiments show that the two-feature combination scheme with multiple level fusion performs better than traditional matching-score level fusion and unimodal methods. The results demonstrate that the multiple level fusion framework is an effective way to fuse features for speaker recognition applications.",
  keywords = "Decision-making Level Fusion, Feature Level Fusion, Fusion Techniques, Matching-score Level Fusion, Multiple Kernel Learning, Multiple Level Fusion, Speaker Recognition",
  author = "Liu, Di and Cho, Siu Yeung and Sun, Dongmei and Qiu, Zhengding",
  note = "Funding Information: This work is partially funded by Grant No 60773015 of National Science Foundation of China, No 4102051 of Beijing Natural Science Foundation, and No 2009JBZ006 of the Fundamental Research Funds for the Central Universities.",
  year = "2011",
  doi = "10.1080/1023697X.2011.10668243",
  language = "English",
  volume = "18",
  pages = "39--48",
  journal = "Transactions Hong Kong Institution of Engineers",
  issn = "1023-697X",
  publisher = "Hong Kong Institution of Engineers",
  number = "4",
}

TY  - JOUR
T1  - Using Multiple Level Fusion for Improving Performance of Speaker Recognition
AU  - Liu, Di
AU  - Cho, Siu Yeung
AU  - Sun, Dongmei
AU  - Qiu, Zhengding
N1  - Funding Information: This work is partially funded by Grant No 60773015 of National Science Foundation of China, No 4102051 of Beijing Natural Science Foundation, and No 2009JBZ006 of the Fundamental Research Funds for the Central Universities.
PY  - 2011
Y1  - 2011
N2  - This paper presents a multiple level fusion framework for improving the performance of automatic speaker recognition systems. Based on the framework, several multiple level fusion methods are defined: one strong multiple level fusion and three weak multiple level fusions. To examine the effectiveness of the proposed framework, a two-feature combination scheme is considered. After investigating the applicability of the strong and weak multiple level fusions to this scheme, the framework adopts a weak multiple level fusion method that combines two fusion levels, i.e. matching-score fusion and decision-making fusion. At the matching-score level, a commonly used method called score vector fusion is adopted. At the decision-making level, kernel combination, also known as Multiple Kernel Learning, is chosen. Both techniques can be embedded into many automatic speaker recognition systems. In an evaluation on the NIST 2001 corpus, two sets of experiments show that the two-feature combination scheme with multiple level fusion performs better than traditional matching-score level fusion and unimodal methods. The results demonstrate that the multiple level fusion framework is an effective way to fuse features for speaker recognition applications.
KW  - Decision-making Level Fusion
KW  - Feature Level Fusion
KW  - Fusion Techniques
KW  - Matching-score Level Fusion
KW  - Multiple Kernel Learning
KW  - Multiple Level Fusion
KW  - Speaker Recognition
UR  - http://www.scopus.com/inward/record.url?scp=84855275973&partnerID=8YFLogxK
U2  - 10.1080/1023697X.2011.10668243
DO  - 10.1080/1023697X.2011.10668243
M3  - Article
AN  - SCOPUS:84855275973
SN  - 1023-697X
VL  - 18
SP  - 39
EP  - 48
JO  - Transactions Hong Kong Institution of Engineers
JF  - Transactions Hong Kong Institution of Engineers
IS  - 4
ER  - 
