Bianco, S., Celona, L., Khalifa, I., Napoletano, P., Petrovsky, A., Piccoli, F., et al. (2022). ArabCeleb: Speaker Recognition in Arabic. In AIxIA 2021 – Advances in Artificial Intelligence – 20th International Conference of the Italian Association for Artificial Intelligence, Virtual Event, December 1–3, 2021, Revised Selected Papers (pp. 338–347). Cham: Springer International [10.1007/978-3-031-08421-8_23].
ArabCeleb: Speaker Recognition in Arabic
Bianco, Simone; Celona, Luigi; Khalifa, Intissar; Napoletano, Paolo; Piccoli, Flavio; Schettini, Raimondo
2022
Abstract
Due to the growing interest in speech recognition technologies, several datasets of speech acquired under uncontrolled conditions have been proposed in recent years. The majority of the datasets available to the community are in English, which limits the development and evaluation of recognition technologies in other languages. In this paper we try to reduce this language-related gap by proposing a dataset for Arabic language speech recognition. The dataset is made available to the community and contains 100 speakers of both genders. Experiments with some of the latest speaker recognition approaches have been performed both with and without suitable training on the Arabic language. Results suggest that, to effectively develop recognition technologies in other languages, suitable data for that language are necessary to allow at least a transfer learning approach. In particular, such data are crucial when short utterances are considered.