Bianco, S., Buzzelli, M., Schettini, R. (2019). A unifying representation for pixel-precise distance estimation. Multimedia Tools and Applications, 78(10), 13767-13786. doi:10.1007/s11042-018-6568-2
A unifying representation for pixel-precise distance estimation
Bianco, Simone; Buzzelli, Marco; Schettini, Raimondo
2019
Abstract
We propose a new representation of distance information that is independent of any specific acquisition device and is based on the size of the portrayed subjects. In this alternative description, each pixel of an image is associated with the real-life size of what it represents. With the proposed representation, datasets acquired with different devices can be easily combined to build more powerful models, and monocular distance estimation can be performed on images from devices that were never used during training. To assess these advantages, we used the representation to train a fully convolutional neural network that predicts, with pixel precision, the size of the different subjects depicted in the image, as a proxy for their distance. Experimental results show that the representation, by allowing heterogeneous training datasets to be combined, enables the trained network to achieve better results at test time.
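The abstract does not spell out how a per-pixel size is computed, so the following is only an illustrative sketch under an assumed ideal pinhole camera model, not the authors' formulation. Under that assumption, a pixel that sees a surface at depth `Z` (meters) through a camera with focal length `f` (pixels) covers roughly `Z / f` meters of the scene. The function names `depth_to_size_map` and `size_to_depth_map`, and the two example focal lengths, are hypothetical.

```python
import math


def depth_to_size_map(depth_m, focal_px):
    """Convert per-pixel depth (meters) to per-pixel footprint (meters per pixel).

    Assumes an ideal pinhole camera: footprint = depth / focal length in pixels.
    """
    if focal_px <= 0:
        raise ValueError("focal length must be positive")
    return [[z / focal_px for z in row] for row in depth_m]


def size_to_depth_map(size_m_per_px, focal_px):
    """Invert the mapping: recover depth once the target camera's focal length is known."""
    if focal_px <= 0:
        raise ValueError("focal length must be positive")
    return [[s * focal_px for s in row] for row in size_m_per_px]


# Two hypothetical cameras whose images look alike: camera B has twice the
# focal length of camera A, and its scene is twice as far away.
depth_a = [[2.0, 2.0], [3.0, 3.0]]   # meters, camera A
depth_b = [[4.0, 4.0], [6.0, 6.0]]   # meters, camera B
focal_a, focal_b = 500.0, 1000.0     # pixels (assumed values)

size_a = depth_to_size_map(depth_a, focal_a)
size_b = depth_to_size_map(depth_b, focal_b)

# The depth targets differ between devices, but the size targets coincide.
same_size = all(
    math.isclose(a, b) for ra, rb in zip(size_a, size_b) for a, b in zip(ra, rb)
)

# Depth for a specific device is recovered from size using that device's focal length.
recovered_b = size_to_depth_map(size_b, focal_b)
```

In this toy setting the two cameras produce different depth maps for visually similar content, yet identical size maps, which is one way to read the claim that a size-based target lets datasets from different devices be combined. Converting a predicted size map back to distance then needs only the focal length of the camera that took the test image.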