Enhancing Sketch Animation: Text-to-Video Diffusion Models with Temporal Consistency and Rigidity Constraints

Workflow

Abstract. Animating hand-drawn sketches with traditional tools is challenging and time-consuming. Sketches provide a visual basis for explanations, and animating them brings these explanations to life. We propose an approach for animating an input sketch based on a descriptive text prompt. Our method uses a parametric representation of the sketch's strokes. Unlike previous methods, which struggle to estimate smooth, accurate motion and often fail to preserve the sketch's topology, we leverage a pre-trained text-to-video diffusion model with a score distillation sampling (SDS) loss to guide the motion of the strokes. We introduce a length-area (LA) regularization to ensure temporal consistency by accurately estimating smooth displacements of control points across the frame sequence. Additionally, to preserve shape and avoid topology changes, we apply a shape-preserving As-Rigid-As-Possible (ARAP) loss to maintain the sketch's rigidity. Our method surpasses the state of the art in both quantitative and qualitative evaluations.
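To make the two regularizers concrete, here is a minimal NumPy sketch of what a length-area consistency term and a simplified ARAP rigidity term could look like for per-frame control points. All function names and formulations below are illustrative assumptions, not the paper's implementation: the LA term here penalizes frame-to-frame changes in stroke length and enclosed area, and the ARAP term penalizes edge-length distortion relative to the first frame. The SDS loss requires a pre-trained text-to-video diffusion model and is omitted.

```python
import numpy as np

def stroke_length(pts):
    # Total polyline length over consecutive control points (N x 2 array).
    return np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))

def enclosed_area(pts):
    # Shoelace formula for the area enclosed by the control polygon.
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def la_regularization(frames):
    # Illustrative LA term: penalize frame-to-frame changes in stroke
    # length and enclosed area, encouraging temporally smooth motion.
    lengths = np.array([stroke_length(f) for f in frames])
    areas = np.array([enclosed_area(f) for f in frames])
    return np.mean(np.diff(lengths) ** 2) + np.mean(np.diff(areas) ** 2)

def arap_loss(frames):
    # Simplified rigidity term: penalize deviation of each frame's edge
    # lengths from those of the first (reference) frame.
    ref_edges = np.linalg.norm(np.diff(frames[0], axis=0), axis=1)
    loss = 0.0
    for f in frames[1:]:
        edges = np.linalg.norm(np.diff(f, axis=0), axis=1)
        loss += np.mean((edges - ref_edges) ** 2)
    return loss / (len(frames) - 1)
```

Under this toy formulation, a pure translation of the sketch incurs zero penalty from both terms, while a progressive scaling (a topology-safe but non-rigid deformation) is penalized by both, which matches the roles the abstract assigns to the LA and ARAP losses.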

Comparison

BibTeX

@conference{Rai2025,
  title     = {Enhancing Sketch Animation: Text-to-Video Diffusion Models with Temporal Consistency and Rigidity Constraints},
  author    = {Gaurav Rai and Ojaswa Sharma},
  url       = {https://graphics-research-group.github.io/ESA/},
  year      = {2025},
  date      = {2025-02-28},
  booktitle = {International Conference on Computer Graphics Theory and Applications (GRAPP)},
  pubstate  = {forthcoming},
  tppubtype = {conference}
}