TY - JOUR
T1 - Using Multiple Level Fusion for Improving Performance of Speaker Recognition
AU - Di, Liu
AU - Yeung, Cho Siu
AU - Dongmei, Sun
AU - Zhengding, Qiu
N1 - Funding Information:
This work is partially funded by Grant No. 60773015 of the National Science Foundation of China, Grant No. 4102051 of the Beijing Natural Science Foundation, and Grant No. 2009JBZ006 of the Fundamental Research Funds for the Central Universities.
PY - 2011
Y1 - 2011
AB - This paper presents a multiple level fusion framework for improving the performance of automatic speaker recognition systems. Within the framework, several multiple level fusion methods are defined: one strong multiple level fusion and three weak multiple level fusions. To examine the effectiveness of the framework, a two-feature combination scheme is considered. After investigating the applicability of strong and weak multiple level fusion to this scheme, the framework adopts a weak multiple level fusion method that combines fusion at two levels, i.e. matching-score fusion and decision-making fusion. At the matching-score level, a commonly used method called score vector fusion is adopted. At the decision-making level, kernel combination, also known as Multiple Kernel Learning, is chosen. Both techniques can be embedded into many automatic speaker recognition systems. In two sets of experiments on the NIST 2001 corpus, the two-feature combination scheme with multiple level fusion performed better than traditional matching-score level fusion and unimodal methods. The results demonstrate that the multiple level fusion framework is an effective way to fuse features for speaker recognition applications.
KW - Decision-making Level Fusion
KW - Feature Level Fusion
KW - Fusion Techniques
KW - Matching-score Level Fusion
KW - Multiple Kernel Learning
KW - Multiple Level Fusion
KW - Speaker Recognition
UR - http://www.scopus.com/inward/record.url?scp=84855275973&partnerID=8YFLogxK
U2 - 10.1080/1023697X.2011.10668243
DO - 10.1080/1023697X.2011.10668243
M3 - Article
AN - SCOPUS:84855275973
SN - 1023-697X
VL - 18
SP - 39
EP - 48
JO - Transactions Hong Kong Institution of Engineers
JF - Transactions Hong Kong Institution of Engineers
IS - 4
ER -