Enhancing Perceptual Quality in Video Super-Resolution Through Temporally-Consistent Detail Synthesis Using Diffusion Models

Rota, C; Buzzelli, M; van de Weijer, J

doi:10.1007/978-3-031-73254-6_3

In this paper, we address the problem of enhancing perceptual quality in video super-resolution (VSR) using Diffusion Models (DMs) while ensuring temporal consistency among frames. We present StableVSR, a VSR method based on DMs that can significantly enhance the perceptual quality of upscaled videos by synthesizing realistic and temporally-consistent details. We introduce the Temporal Conditioning Module (TCM) into a pre-trained DM for single image super-resolution to turn it into a VSR method. TCM uses the novel Temporal Texture Guidance, which provides it with spatially-aligned and detail-rich texture information synthesized in adjacent frames. This guides the generative process of the current frame toward high-quality and temporally-consistent results. In addition, we introduce the novel Frame-wise Bidirectional Sampling strategy to encourage the use of information from past to future and vice-versa. This strategy improves the perceptual quality of the results and the temporal consistency across frames. We demonstrate the effectiveness of StableVSR in enhancing the perceptual quality of upscaled videos while achieving better temporal consistency compared to existing state-of-the-art methods for VSR. The project page is available at https://github.com/claudiom4sir/StableVSR.
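The abstract describes Frame-wise Bidirectional Sampling only at a high level. The sketch below is one possible reading of it, not the authors' implementation. It assumes that the traversal direction over frames alternates at each sampling step, and that each frame is guided by the neighbour visited just before it. The function name `bidirectional_schedule` and the even/odd direction rule are hypothetical and introduced here purely for illustration.

```python
from typing import Iterator, Optional, Tuple


def bidirectional_schedule(
    num_frames: int, num_steps: int
) -> Iterator[Tuple[int, int, Optional[int]]]:
    """Yield (step, frame, guide) triples for an alternating-direction sweep.

    `guide` is the index of the neighbouring frame visited just before
    `frame` in the same sweep, or None for the first frame of the sweep.
    """
    for step in range(num_steps):
        # Assumption: even steps run past -> future, odd steps future -> past.
        forward = step % 2 == 0
        order = range(num_frames) if forward else range(num_frames - 1, -1, -1)
        guide: Optional[int] = None
        for frame in order:
            yield step, frame, guide
            guide = frame  # the next frame takes its guidance from this one


if __name__ == "__main__":
    for step, frame, guide in bidirectional_schedule(num_frames=3, num_steps=2):
        print(f"step={step} frame={frame} guided_by={guide}")
```

In a real pipeline each triple would correspond to one denoising update of `frame`, conditioned on texture information from `guide`. The sketch covers only the visiting order and leaves out the model and the alignment between frames.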

Rota, C., Buzzelli, M., van de Weijer, J. (2025). Enhancing Perceptual Quality in Video Super-Resolution Through Temporally-Consistent Detail Synthesis Using Diffusion Models. In Computer Vision – ECCV 2024, 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XII (pp. 36-53). Springer, Cham. doi:10.1007/978-3-031-73254-6_3.

Rota, Claudio; Buzzelli, Marco; van de Weijer, Joost

2025

Contribution type: poster + paper
Keywords: Video super-resolution, Perceptual quality, Temporal consistency, Diffusion models
Content language: English
Conference name: ECCV 2024 18th European Conference - September 29–October 4, 2024
Conference year: 2024
Volume editors: Leonardis, A; Ricci, E; Roth, S; Russakovsky, O; Sattler, T; Varol, G
Proceedings title: Computer Vision – ECCV 2024, 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XII (conference proceedings)
ISBN of the proceedings volume: 9783031732539
Series: Lecture Notes in Computer Science
Ahead-of-print or first online publication date: 28 Nov 2024
Publication date: 2025
Volume number: 15070 LNCS
First page: 36
Last page: 53
DOI of the contribution: https://dx.doi.org/10.1007/978-3-031-73254-6_3
Alternative URL: https://link.springer.com/chapter/10.1007/978-3-031-73254-6_3
Full text: reserved
Appears in categories: 02 - Conference contribution

Files in this item:

File	Attachment type	License	Size	Format	Access
Rota-2025-ECCV-VoR.pdf	Publisher's Version (Version of Record, VoR)	All rights reserved	9 MB	Adobe PDF	Archive managers only

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/527021

Bicocca Open Archive
