Bicocca Open Archive

In Search of Compositional Multi-Task Deep Architectures for Information Theoretic Field Exploration

Masiero E.; Bursic S.; Trianni V.; Vizzari G.; Ognibene D.

2024

Abstract

Active vision is critical for navigating complex, unstructured environments like agricultural fields, where occlusions, diverse scales, and unknown elements can obscure task-relevant information. This paper investigates the use of deep learning architectures to estimate information gain and expected loss in continuous, multidimensional observation spaces from sequential camera inputs of small environment segments. In such environments, local estimations can be composed on-the-fly to predict the contribution of successive viewpoints, guiding active exploration strategies to efficiently cover the entire area. We compared multi-task architectures with various prediction heads for state estimation, information gain, expected loss, and best view prediction from observation sequences. Our results show that entropy-minimizing and loss-maximizing strategies outperform random sampling, with accuracy improvements of up to 11.9%. However, training multiple model heads simultaneously presented challenges, with convergence issues and training instability depending on the optimization problem formulation. Future work will explore adaptive multi-task training strategies, the impact of dataset size, and whole environment mapping. Our findings demonstrate the potential of deep learning in optimal sampling for complex environments, highlighting the integration of uncertainty estimation models with active vision systems as a promising direction for enhancing decision-making processes in real-world applications.
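
As an illustration of the entropy-minimizing strategy mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a discrete belief over environment states and that some model (here replaced by hand-written, hypothetical numbers) predicts the posterior belief each candidate viewpoint would produce. The function names `entropy`, `information_gain`, and `select_next_view` are invented for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def information_gain(prior, predicted_posterior):
    """Predicted entropy reduction if the corresponding view is taken."""
    return entropy(prior) - entropy(predicted_posterior)


def select_next_view(prior, predicted_posteriors):
    """Pick the candidate view with the highest predicted information gain.

    Because the prior is the same for all candidates, this is equivalent to
    picking the view with the lowest predicted posterior entropy.
    """
    gains = [information_gain(prior, post) for post in predicted_posteriors]
    best = max(range(len(gains)), key=gains.__getitem__)
    return best, gains


# Hypothetical data: a belief over three states and three candidate views.
# In a learned system these posteriors would come from a prediction head;
# here they are made-up values used only to exercise the selection rule.
prior = [0.4, 0.35, 0.25]
candidates = [
    [0.4, 0.35, 0.25],  # view that leaves the belief unchanged
    [0.7, 0.2, 0.1],    # view that sharpens the belief somewhat
    [0.9, 0.05, 0.05],  # view that sharpens the belief the most
]
best_view, gains = select_next_view(prior, candidates)
```

In this toy setting the third candidate is selected because its predicted posterior is the most concentrated. A random-sampling baseline, by contrast, would choose among the three candidates uniformly regardless of the predicted gains.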

Contribution type: paper
Keywords: Deep learning; Multi-task learning
Content language: English
Conference name: 20th IEEE International Conference on Automation Science and Engineering, CASE 2024 - 28 August 2024 through 1 September 2024
Conference year: 2024
Proceedings title: 20th IEEE International Conference on Automation Science and Engineering, CASE 2024
Proceedings volume ISBN: 9798350358513
Series: IEEE INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING
Publication date: 2024
First page: 612
Last page: 617
DOI: https://dx.doi.org/10.1109/CASE59546.2024.10711675
Full text: none
Citation: Masiero, E., Bursic, S., Trianni, V., Vizzari, G., Ognibene, D. (2024). In Search of Compositional Multi-Task Deep Architectures for Information Theoretic Field Exploration. In 20th IEEE International Conference on Automation Science and Engineering, CASE 2024 (pp. 612-617). IEEE Computer Society [10.1109/CASE59546.2024.10711675].
Appears in types: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/529022
