Semantic-Aware, Physics-Informed, Geometry-Grounded
Weather Video Synthesis

Chenghao Qian1, Nedko Savov2, Lingdong Kong3, Yeying Jin3, Rui Song5,
Wenjing Li1,4, Zhun Zhong4, Jiaqi Ma5, Gustav Markkula1, Luc Van Gool2

1 University of Leeds, UK
2 INSAIT, Sofia University “St. Kliment Ohridski”
3 National University of Singapore, Singapore
4 Hefei University of Technology, China
5 University of California, Los Angeles, USA

Weather synthesis examples. Given an original input video, our method synthesizes weather effects with precise, interpretable control over the type (snowy or rainy), event (static or dynamic), duration, wind direction and strength, and overall severity, yielding diverse and realistic atmospheric appearance and particle dynamics.

Weather synthesis aims to add weather effects to input videos while preserving scene identity, structure, and motion. Existing methods lack diversity in weather appearance and effective control over weather dynamics. Most rely on underspecified text prompts, and general-purpose video editors—optimized for clean, aesthetic outputs—tend to suppress heavy weather, making dense particle effects difficult to generate.

We propose a Semantic-Aware, Physics-Informed, and Geometry-Grounded framework that steers an off-the-shelf video editor to synthesize diverse global appearances and detailed particle dynamics. We factorize synthesis into three conditional signals, each a distinct and stable source of guidance: semantics specifies what the weather should look like, dynamics governs how it evolves over time, and geometry determines where it should appear in the scene.

Experiments show our method produces diverse, physically and visually realistic weather effects. Moreover, our synthesized data significantly improves the robustness of autonomous-driving semantic segmentation under adverse weather.

Keywords: Weather Synthesis  ·  Video Editing  ·  Particle Simulation

What if snow fell on Rome and rain fell over the desert?

See it in motion

Diverse, temporally coherent weather effects synthesized on real footage.

A structured tri-prior interface

We decompose weather synthesis into three complementary conditioning signals that jointly steer a frozen video diffusion model—no finetuning required.

🌎
Semantics

What it looks like

A VLM parses scene semantics and an LLM reasons about weather-specific effects from user intent, anchoring the target appearance and producing a refined description for generation.

❄️
Dynamics

How it evolves

A particle field of anisotropic Gaussians is evolved under gravity, wind, and turbulence, yielding physically plausible motion that provides explicit cues for synthesis.
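As a rough illustration of what evolving a particle field under gravity, wind, and turbulence can look like, here is a minimal NumPy sketch. It is not the paper's simulator: the linear drag toward the wind velocity, the Gaussian noise used as a stand-in for turbulence, the function names `step_particles` and `streak_covariance`, and all parameter values are assumptions chosen for illustration. The velocity-elongated covariance is likewise only one plausible way to make each Gaussian anisotropic.

```python
import numpy as np

def step_particles(pos, vel, wind, dt, rng,
                   gravity=(0.0, -9.81, 0.0), drag=2.0, turbulence=0.5):
    """Advance all particles by one time step (hypothetical helper).

    pos, vel: (N, 3) arrays of positions and velocities.
    wind:     (3,) wind velocity that drag pulls the particles toward.
    """
    gravity = np.asarray(gravity, dtype=float)
    wind = np.asarray(wind, dtype=float)
    # Linear drag: accelerate each particle toward the wind velocity.
    drag_acc = drag * (wind - vel)
    # Turbulence stand-in: zero-mean Gaussian acceleration noise.
    noise = turbulence * rng.standard_normal(vel.shape)
    # Update velocity first, then move with the updated velocity.
    vel = vel + dt * (gravity + drag_acc + noise)
    pos = pos + dt * vel
    return pos, vel

def streak_covariance(vel, base_sigma, exposure):
    """Per-particle 3x3 covariance, elongated along the velocity (assumed form).

    Sigma = base_sigma^2 * I + exposure^2 * v v^T, so faster particles
    become longer streaks along their direction of travel.
    """
    eye = np.eye(3)[None, :, :]
    outer = np.einsum("ni,nj->nij", vel, vel)
    return base_sigma**2 * eye + exposure**2 * outer

# Usage: 1000 snow-like particles in a box, blown sideways by a 3 m/s wind.
rng = np.random.default_rng(0)
pos = rng.uniform([-5.0, 0.0, 2.0], [5.0, 10.0, 30.0], size=(1000, 3))
vel = np.zeros_like(pos)
for _ in range(100):
    pos, vel = step_particles(pos, vel, wind=[3.0, 0.0, 0.0], dt=0.01, rng=rng)
cov = streak_covariance(vel, base_sigma=0.01, exposure=0.02)
```

With turbulence switched off, this model settles at a terminal velocity of `wind + gravity / drag`, which gives wind direction and strength a direct, interpretable effect on the trajectories.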

📏
Geometry

Where it appears

Particles are gravity-aligned to the scene and projected with camera intrinsics/extrinsics into particle-augmented depth, ensuring consistent trajectories and accurate placement.
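The projection step can be sketched with a standard pinhole camera model. The snippet below is an assumption-laden illustration rather than the authors' implementation: it reads "particle-augmented depth" as the scene depth map with each visible particle's depth written in wherever the particle is closer to the camera than the scene surface, and the function name `augment_depth`, the nearest-pixel splatting, and the toy camera values are all hypothetical. Gravity alignment of the particle field is assumed to have been done beforehand.

```python
import numpy as np

def augment_depth(scene_depth, points_world, K, R, t):
    """Splat particle depths into a scene depth map (hypothetical helper).

    scene_depth:  (H, W) depth of the original scene, in camera units.
    points_world: (N, 3) particle centers in world coordinates.
    K:            (3, 3) camera intrinsics.
    R, t:         world-to-camera rotation (3, 3) and translation (3,).
    """
    h, w = scene_depth.shape
    # Extrinsics: world coordinates -> camera coordinates.
    cam = points_world @ R.T + t
    z = cam[:, 2]
    # Discard particles behind (or on) the camera plane.
    front = z > 1e-6
    cam, z = cam[front], z[front]
    # Intrinsics: camera coordinates -> pixel coordinates.
    pix = cam @ K.T
    u = np.round(pix[:, 0] / z).astype(int)
    v = np.round(pix[:, 1] / z).astype(int)
    # Keep only particles that land inside the image.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, z = u[inside], v[inside], z[inside]
    out = scene_depth.astype(float).copy()
    # Per-pixel minimum: a particle only shows where it is nearer than the
    # scene, so particles behind walls or cars stay occluded.
    np.minimum.at(out, (v, u), z)
    return out

# Usage: a flat 4x4 scene at depth 10 with an identity camera pose.
K = np.array([[2.0, 0.0, 2.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 1.0]])
scene = np.full((4, 4), 10.0)
particles = np.array([[0.0, 0.0, 5.0],       # in front of the scene: visible
                      [-10.0, -10.0, 20.0],  # behind the scene: occluded
                      [0.0, 0.0, -1.0]])     # behind the camera: dropped
depth = augment_depth(scene, particles, K, np.eye(3), np.zeros(3))
```

Using `np.minimum.at` rather than plain indexed assignment matters when several particles fall on the same pixel, since it keeps the nearest one instead of whichever happens to be written last.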

Architecture. From an input video, three modules build structured conditioning—semantic-aware appearance anchoring (VLM/LLM reasoning into an appearance anchor), physics-informed dynamic simulation (a Gaussian particle field under gravity, wind, and turbulence), and geometry-grounded video synthesis (geometry assets, alignment, and particle projection). The resulting semantics, dynamics, and geometry signals jointly steer the video diffusion model to produce the synthesized frames.

Diverse, realistic — and useful downstream

Beyond visual fidelity, our synthesized data serves as a scalable source of training data for safety-critical corner cases.

Comparisons & downstream benefit. Our method produces denser, more scene-consistent particle effects than text-only baselines, and the resulting synthetic data improves perception robustness under adverse weather.

BibTeX

@inproceedings{qian2026weathervid,
  title     = {Semantic-Aware, Physics-Informed, Geometry-Grounded Weather Video Synthesis},
  author    = {Qian, Chenghao and Savov, Nedko and Kong, Lingdong and Jin, Yeying and
               Song, Rui and Li, Wenjing and Zhong, Zhun and Ma, Jiaqi and
               Markkula, Gustav and Van Gool, Luc},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2026}
}