Keywords

Language Classification; Mel-frequency cepstral coefficients (MFCC); Convolutional Neural Networks; Deep Learning

Abstract

Language classification systems identify the spoken language from a given phoneme sample and are usually the first step of many spoken language processing tasks, such as automatic speech recognition (ASR). Without automatic language detection, speech cannot be properly analyzed and grammar rules cannot be applied, which causes subsequent speech recognition steps to fail. We propose a language classification system that treats the problem as an image classification task rather than an audio one. Low-level features are extracted with Mel-frequency cepstral coefficients (MFCC) from speech files in four languages (Arabic, English, French, Kurdish) taken from the M2L_Dataset database, the data source used in this research. A Convolutional Neural Network (CNN) then operates on spectrogram images of the available audio snippets. In extensive experiments, we show that the model is applicable to a range of noisy scenarios and can easily be extended to previously unknown languages while maintaining classification accuracy. We release our code and an extensive training package for language classification systems to the community. With the CNN classifier, accuracy between two languages reached 97% for one-second samples and 98% for two-second samples. For classification among three languages, accuracy reached 95% for one-second samples and 96% for two-second samples.
https://doi.org/10.33899/edusj.2022.132223.1200
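
The abstract compares one-second and two-second samples and relies on mel-scaled features. As a rough illustration only, the sketch below cuts a waveform into fixed-length segments and converts a frequency to the mel scale. The function names, the 16 kHz sampling rate, the non-overlapping segmentation, and the particular mel formula (the common 2595 * log10 variant) are assumptions made for this example; the abstract does not state the authors' actual preprocessing settings.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595 * log10 variant; an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def split_into_segments(samples, sample_rate: int, seconds: float):
    """Cut a waveform into non-overlapping segments of `seconds` length.

    A trailing remainder shorter than one full segment is dropped.
    """
    segment_len = int(sample_rate * seconds)
    if segment_len <= 0:
        raise ValueError("segment length must be at least one sample")
    last_start = len(samples) - segment_len
    return [samples[i:i + segment_len] for i in range(0, last_start + 1, segment_len)]


# Hypothetical usage: five seconds of silence at an assumed 16 kHz rate.
waveform = [0.0] * (16000 * 5)
one_second_clips = split_into_segments(waveform, 16000, 1.0)  # 5 clips
two_second_clips = split_into_segments(waveform, 16000, 2.0)  # 2 clips, last second dropped
```

Each resulting clip would then be turned into MFCC features or a spectrogram image before being passed to the classifier; that step is left out here because it depends on choices (frame size, number of coefficients, filter bank) that the abstract does not specify.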