Cross-lingual deep learning model for gender detection

Kavita Jhajharia, Ginika Mahajan, Dhondi Samrudh, Koustubh Patel

Abstract


Speech recognition is transforming the way humans interact with technology, and automatic gender recognition is an essential part of this evolution. This study develops a multilingual deep learning (DL) model for gender detection using three audio datasets: RAVDESS (English), Berlin EmoDB (German), and IITKGP-SEHSC (Hindi). These datasets provide the linguistic diversity needed to build a multilingual gender identification model. Mel-frequency cepstral coefficients (MFCC), VGGish embeddings, and other audio features were extracted to convert the raw audio into representations suitable for classification. Among the machine learning (ML) models, random forest (RF) and extreme gradient boosting performed well, reaching 98.26% accuracy in the monolingual Hindi setup and 96.90% in the cross-lingual setup. Among the DL models, the convolutional neural network (CNN) outperformed the others in both monolingual and cross-lingual scenarios, with 99.33% accuracy for Hindi and 98.11% accuracy in the cross-lingual setup. These findings show that DL is effective for gender detection in multilingual and emotionally complex settings and that it outperforms traditional ML models. The experiments demonstrate the potential of DL in speech-based artificial intelligence (AI) applications, where it can improve performance in real-life scenarios.

Keywords


Deep learning; Gender detection; Machine learning; Neural networks; Voice recognition

DOI: https://doi.org/10.11591/eei.v15i3.10985

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Bulletin of Electrical Engineering and Informatics (BEEI)
ISSN: 2089-3191, e-ISSN: 2302-9285
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).