Keywords

speaker recognition
speaker verification
speaker selection
deep learning

Abstract

Speaker recognition is one of the most widely used topics in the field of speech technology. Many research works have been conducted, yet little progress was made over the past five to six years. With the advancement of deep learning techniques in most areas of machine learning, deep learning has replaced earlier methods in speaker recognition and verification, and it is now the most advanced solution for verifying and identifying a speaker's identity. The x-vector and i-vector algorithms are considered the baselines in modern work. The aim of this study is to review deep learning methods applied to speaker identification and verification tasks, from older solutions (Gaussian mixture model, Gaussian mixture supervector model and i-vector model) to new solutions using deep neural networks (deep belief network, deep corrective learning network). It also reviews the metrics used to verify the speaker (cosine distance, probabilistic linear discriminant analysis) and the databases used for neural network training (TIMIT, VCTK, VoxCeleb2, LibriSpeech).
https://doi.org/10.33899/edusj.2021.129802.1150