Speaker Recognition Using Neural Networks And Conventional Classifiers

Kevin R. Farrell, Richard Mammone, Khaled T. Assaleh

Research output: Contribution to journal (Letter)

159 Citations (Scopus)

Abstract

An evaluation of various classifiers for text-independent speaker recognition is presented. In addition, a new classifier is examined for this application. The new classifier is called the modified neural tree network (MNTN). The MNTN is a hierarchical classifier that combines the properties of decision trees and feedforward neural networks. The MNTN differs from the standard NTN in both the new learning rule used and the pruning criteria. The MNTN is evaluated in several speaker recognition experiments. These include closed- and open-set speaker identification and speaker verification. The database used is a subset of the TIMIT database consisting of 38 speakers from the same dialect region. The MNTN is compared with nearest neighbor classifiers, full-search and tree-structured vector quantization (VQ) classifiers, multilayer perceptrons (MLPs), and decision trees. For closed-set speaker identification experiments, the full-search VQ classifier and MNTN demonstrate comparable performance. Both methods perform significantly better than the other classifiers for this task. The MNTN and full-search VQ classifiers are also compared for several speaker verification and open-set speaker identification experiments. The MNTN is found to perform better than full-search VQ classifiers for both of these applications. In addition to matching or exceeding the performance of the VQ classifier for these applications, the MNTN also provides a logarithmic saving for retrieval.

Original language: English (US)
Pages (from-to): 194-205
Number of pages: 12
Journal: IEEE Transactions on Speech and Audio Processing
Volume: 2
Issue number: 1
DOIs: https://doi.org/10.1109/89.260362
State: Published - Jan 1 1994

Fingerprint

Classifiers
Neural networks
Vector quantization
Decision trees
Experiments
Feedforward neural networks
Learning
Set theory
Multilayers
Retrieval

All Science Journal Classification (ASJC) codes

  • Software
  • Electrical and Electronic Engineering
  • Computer Vision and Pattern Recognition
  • Acoustics and Ultrasonics

Cite this

@article{a725cc8c4bfe4ea589226d709230acfa,
title = "Speaker Recognition Using Neural Networks And Conventional Classifiers",
abstract = "An evaluation of various classifiers for text-independent speaker recognition is presented. In addition, a new classifier is examined for this application. The new classifier is called the modified neural tree network (MNTN). The MNTN is a hierarchical classifier that combines the properties of decision trees and feedforward neural networks. The MNTN differs from the standard NTN in both the new learning rule used and the pruning criteria. The MNTN is evaluated in several speaker recognition experiments. These include closed- and open-set speaker identification and speaker verification. The database used is a subset of the TIMIT database consisting of 38 speakers from the same dialect region. The MNTN is compared with nearest neighbor classifiers, full-search and tree-structured vector quantization (VQ) classifiers, multilayer perceptrons (MLPs), and decision trees. For closed-set speaker identification experiments, the full-search VQ classifier and MNTN demonstrate comparable performance. Both methods perform significantly better than the other classifiers for this task. The MNTN and full-search VQ classifiers are also compared for several speaker verification and open-set speaker identification experiments. The MNTN is found to perform better than full-search VQ classifiers for both of these applications. In addition to matching or exceeding the performance of the VQ classifier for these applications, the MNTN also provides a logarithmic saving for retrieval.",
author = "Farrell, {Kevin R.} and Mammone, Richard and Assaleh, {Khaled T.}",
year = "1994",
month = "1",
day = "1",
doi = "10.1109/89.260362",
language = "English (US)",
volume = "2",
pages = "194--205",
journal = "IEEE Transactions on Speech and Audio Processing",
issn = "1063-6676",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "1",

}

Speaker Recognition Using Neural Networks And Conventional Classifiers. / Farrell, Kevin R.; Mammone, Richard; Assaleh, Khaled T.

In: IEEE Transactions on Speech and Audio Processing, Vol. 2, No. 1, 01.01.1994, p. 194-205.

TY  - JOUR
T1  - Speaker Recognition Using Neural Networks And Conventional Classifiers
AU  - Farrell, Kevin R.
AU  - Mammone, Richard
AU  - Assaleh, Khaled T.
PY  - 1994/1/1
Y1  - 1994/1/1
N2  - An evaluation of various classifiers for text-independent speaker recognition is presented. In addition, a new classifier is examined for this application. The new classifier is called the modified neural tree network (MNTN). The MNTN is a hierarchical classifier that combines the properties of decision trees and feedforward neural networks. The MNTN differs from the standard NTN in both the new learning rule used and the pruning criteria. The MNTN is evaluated in several speaker recognition experiments. These include closed- and open-set speaker identification and speaker verification. The database used is a subset of the TIMIT database consisting of 38 speakers from the same dialect region. The MNTN is compared with nearest neighbor classifiers, full-search and tree-structured vector quantization (VQ) classifiers, multilayer perceptrons (MLPs), and decision trees. For closed-set speaker identification experiments, the full-search VQ classifier and MNTN demonstrate comparable performance. Both methods perform significantly better than the other classifiers for this task. The MNTN and full-search VQ classifiers are also compared for several speaker verification and open-set speaker identification experiments. The MNTN is found to perform better than full-search VQ classifiers for both of these applications. In addition to matching or exceeding the performance of the VQ classifier for these applications, the MNTN also provides a logarithmic saving for retrieval.
UR  - http://www.scopus.com/inward/record.url?scp=0028204659&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0028204659&partnerID=8YFLogxK
U2  - 10.1109/89.260362
DO  - 10.1109/89.260362
M3  - Letter
VL  - 2
SP  - 194
EP  - 205
JO  - IEEE Transactions on Speech and Audio Processing
JF  - IEEE Transactions on Speech and Audio Processing
SN  - 1063-6676
IS  - 1
ER  -