Bicocca Open Archive

Searching and retrieving huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which the subsequent search is performed. In particular, when new entries are continuously added to the database, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio recordings describing popular traditions handed down from generation to generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated, that the audio recordings are acquired in unconstrained environments, and that it is difficult for non-expert users to create the ground-truth labels. In our experiments, half of the available audio files were randomly selected and used as the training set; the remaining files were used as the test set. The classifier was trained to distinguish among three classes: speech, music, and song. All the audio files in the dataset had previously been manually labeled by domain experts into these three classes. © 2013 SPIE and IS&T.
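The workflow summarized in the abstract (a random half/half split, a tree classifier that decides by threshold comparisons, and three classes) can be illustrated with a minimal sketch. It is not the authors' implementation: it is a small CART-style tree written from scratch with Gini impurity as the split criterion, and the two-dimensional features and the synthetic data are assumptions made purely for illustration, since the abstract does not state which audio features were used.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature, threshold) pair that most reduces weighted Gini, or None."""
    best, best_score = None, gini(y)
    n = len(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between two observed values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, depth=0, max_depth=4):
    """Grow a binary tree; leaves are class labels, inner nodes are tuples."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [lab for _, lab in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [lab for _, lab in right], depth + 1, max_depth))

def predict(tree, row):
    """Classify one feature vector with a sequence of threshold comparisons."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def make_synthetic(n_per_class=40, seed=0):
    """Hypothetical 2-D features per audio file; NOT the features used in the paper."""
    rng = random.Random(seed)
    centres = {"speech": (0.0, 0.0), "music": (5.0, 0.0), "song": (0.0, 5.0)}
    X, y = [], []
    for label, (cx, cy) in centres.items():
        for _ in range(n_per_class):
            X.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)))
            y.append(label)
    return X, y

def split_half(X, y, seed=0):
    """Randomly assign half of the files to training and the rest to testing."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    train, test = idx[:half], idx[half:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

X, y = make_synthetic()
X_tr, y_tr, X_te, y_te = split_half(X, y)
tree = build_tree(X_tr, y_tr)
accuracy = sum(predict(tree, r) == lab for r, lab in zip(X_te, y_te)) / len(y_te)
print(f"test accuracy on synthetic data: {accuracy:.2f}")
```

Once the tree is built, classifying a new entry only requires walking from the root to a leaf and comparing one feature against one threshold at each node, which is why this kind of model suits a database that is continuously updated.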

Artese, M., Bianco, S., Gagliardi, I., Gasparini, F. (2013). Audio stream classification for multimedia database search. In Multimedia Content and Mobile Devices. SPIE [10.1117/12.2006478].

Audio stream classification for multimedia database search

Artese, M.; Bianco, Simone; Gagliardi, I.; Gasparini, Francesca

2013



Contribution type: poster + paper
Keywords: Audio classification; Multimedia database; Applied Mathematics; Computer Science Applications; Computer Vision and Pattern Recognition; Electrical and Electronic Engineering; Electronic, Optical and Magnetic Materials; Condensed Matter Physics
Content language: English
Conference name: Multimedia Content and Mobile Devices
Conference year: 2013
Volume editors: Snoek, CGM
Proceedings title: Multimedia Content and Mobile Devices
Proceedings volume ISBN: 9780819494405
Series: Proceedings of SPIE, the International Society for Optical Engineering
Publication date: 2013
Volume number: 8667
Article number: 86670G
DOI: https://dx.doi.org/10.1117/12.2006478
Fulltext: none
Appears in categories: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/56717
