Active vision is critical for navigating complex, unstructured environments like agricultural fields, where occlusions, diverse scales, and unknown elements can obscure task-relevant information. This paper investigates the use of deep learning architectures to estimate information gain and expected loss in continuous, multidimensional observation spaces from sequential camera inputs of small environment segments. In such environments, local estimations can be composed on-the-fly to predict the contribution of successive viewpoints, guiding active exploration strategies to efficiently cover the entire area. We compared multi-task architectures with various prediction heads for state estimation, information gain, expected loss, and best view prediction from observation sequences. Our results show that entropy-minimizing and loss-maximizing strategies outperform random sampling, with accuracy improvements of up to 11.9%. However, training multiple model heads simultaneously presented challenges, with convergence issues and training instability depending on the optimization problem formulation. Future work will explore adaptive multi-task training strategies, the impact of dataset size, and whole environment mapping. Our findings demonstrate the potential of deep learning in optimal sampling for complex environments, highlighting the integration of uncertainty estimation models with active vision systems as a promising direction for enhancing decision-making processes in real-world applications.

Masiero, E., Bursic, S., Trianni, V., Vizzari, G., Ognibene, D. (2024). In Search of Compositional Multi-Task Deep Architectures for Information Theoretic Field Exploration. In 20th IEEE International Conference on Automation Science and Engineering, CASE 2024 (pp. 612-617). IEEE Computer Society [10.1109/CASE59546.2024.10711675].

In Search of Compositional Multi-Task Deep Architectures for Information Theoretic Field Exploration

Masiero E.; Bursic S.; Trianni V.; Vizzari G.; Ognibene D.
2024

Type: paper
Keywords: Deep learning; Multi-task learning
Language: English
Conference: 20th IEEE International Conference on Automation Science and Engineering, CASE 2024, 28 August 2024 through 1 September 2024
Year: 2024
ISBN: 9798350358513
Pages: 612-617
none

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/529022