Semantic-Aware, Physics-Informed, Geometry-Grounded Weather Video Synthesis

Weather synthesis examples. Given an input video, our method synthesizes weather effects with precise, interpretable control over the type (snow or rain), event (static or dynamic), duration, wind direction and strength, and overall severity, yielding diverse and realistic atmospheric appearance and particle dynamics.

Abstract

Weather synthesis aims to add weather effects to input videos while preserving scene identity, structure, and motion. Existing methods lack diversity in weather appearance and effective control over weather dynamics. Most rely on underspecified text prompts, and general-purpose video editors—optimized for clean, aesthetic outputs—tend to suppress heavy weather, making dense particle effects difficult to generate.

We propose a Semantic-Aware, Physics-Informed, and Geometry-Grounded framework that steers an off-the-shelf video editor to synthesize diverse global appearances and detailed particle dynamics. We factorize synthesis into three conditioning signals, each a distinct and stable source of guidance: semantics specifies what the weather should look like, dynamics governs how it evolves over time, and geometry determines where it should appear in the scene.

Experiments show our method produces diverse, physically and visually realistic weather effects. Moreover, our synthesized data significantly improves the robustness of autonomous-driving semantic segmentation under adverse weather.

Keywords: Weather Synthesis · Video Editing · Particle Simulation

What if snow fell on Rome and rain over the desert?

Method

A structured tri-prior interface

We decompose weather synthesis into three complementary conditioning signals that jointly steer a frozen video diffusion model—no finetuning required.

🌎

Semantics

What it looks like

A VLM parses scene semantics and an LLM reasons about weather-specific effects from user intent, anchoring the target appearance and producing a refined description for generation.

❄️

Dynamics

How it evolves

A particle field of anisotropic Gaussians is evolved under gravity, wind, and turbulence, yielding physically plausible motion that serves as an explicit cue for synthesis.
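As a rough illustration of this dynamics signal, the sketch below advances particle centers with semi-implicit Euler steps under gravity, a linear drag toward the wind velocity, and Gaussian velocity jitter as a stand-in for turbulence. It is a minimal sketch under stated assumptions, not the simulator used in this work: it tracks only particle centers rather than anisotropic Gaussians, and the names (`Particle`, `step`, `simulate`), the drag and turbulence coefficients, the spawn volume, and the y-up axis convention are all hypothetical.

```python
import random
from dataclasses import dataclass

# Assumed convention: y points up, units are metres and seconds.
GRAVITY = (0.0, -9.81, 0.0)


@dataclass
class Particle:
    pos: list  # [x, y, z] position of the particle center
    vel: list  # [vx, vy, vz] velocity of the particle center


def step(particles, wind, drag, turbulence, dt, rng):
    """Advance every particle by one semi-implicit Euler step."""
    for p in particles:
        for k in range(3):
            # Linear drag pulls the particle velocity toward the wind velocity.
            accel = GRAVITY[k] + drag * (wind[k] - p.vel[k])
            # Zero-mean Gaussian jitter: a simple placeholder for turbulence.
            jitter = turbulence * rng.gauss(0.0, 1.0) * dt ** 0.5
            p.vel[k] += accel * dt + jitter
            # Position uses the updated velocity (semi-implicit Euler).
            p.pos[k] += p.vel[k] * dt


def simulate(n_particles, n_steps, wind, drag=2.0, turbulence=0.3,
             dt=1.0 / 30.0, seed=0):
    """Spawn particles in a hypothetical box and simulate n_steps frames."""
    rng = random.Random(seed)
    particles = [
        Particle(
            pos=[rng.uniform(-5.0, 5.0), rng.uniform(2.0, 6.0), rng.uniform(1.0, 20.0)],
            vel=[0.0, 0.0, 0.0],
        )
        for _ in range(n_particles)
    ]
    for _ in range(n_steps):
        step(particles, wind, drag, turbulence, dt, rng)
    return particles


if __name__ == "__main__":
    for p in simulate(3, 90, wind=(3.0, 0.0, 0.0)):
        print([round(c, 2) for c in p.pos], [round(c, 2) for c in p.vel])
```

With `turbulence` set to zero, velocities settle at wind + gravity / drag, which gives a quick sanity check on the integrator.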

📏

Geometry

Where it appears

Particles are gravity-aligned to the scene and projected with camera intrinsics/extrinsics into particle-augmented depth, ensuring consistent trajectories and accurate placement.
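The projection step can be sketched with a standard pinhole camera model: each particle center is moved into the camera frame, projected to a pixel, and written into the depth map only where it lies in front of the existing scene depth. This is a minimal sketch, assuming the world frame is already gravity-aligned and that each particle covers a single pixel; the function name `project_particles`, the `(fx, fy, cx, cy)` intrinsics tuple, and the nearest-pixel splatting are assumptions for illustration rather than the rendering used in this work.

```python
def project_particles(points_world, R, t, K, scene_depth):
    """Return a copy of scene_depth in which visible particles overwrite farther pixels.

    points_world: iterable of (x, y, z) particle centers in the world frame
    R, t:         world-to-camera rotation (3x3 nested lists) and translation (3-vector)
    K:            pinhole intrinsics as (fx, fy, cx, cy)
    scene_depth:  depth map as a list of rows (metres along the camera z-axis)
    """
    fx, fy, cx, cy = K
    height, width = len(scene_depth), len(scene_depth[0])
    depth = [row[:] for row in scene_depth]  # leave the input untouched

    for X in points_world:
        # World -> camera coordinates: x_cam = R @ X + t
        xc, yc, zc = (
            sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)
        )
        if zc <= 0:
            continue  # behind the camera
        # Perspective projection to the nearest pixel.
        u = int(round(fx * xc / zc + cx))
        v = int(round(fy * yc / zc + cy))
        # Z-test: keep the particle only if it is inside the image and
        # closer than the scene surface at that pixel.
        if 0 <= u < width and 0 <= v < height and zc < depth[v][u]:
            depth[v][u] = zc
    return depth


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    flat_wall = [[10.0] * 4 for _ in range(4)]
    augmented = project_particles(
        [(0.0, 0.0, 5.0)], identity, (0, 0, 0), (100.0, 100.0, 2.0, 2.0), flat_wall
    )
    for row in augmented:
        print(row)
```

The z-test is what keeps particles from appearing in front of geometry that should occlude them; a fuller version would splat each anisotropic Gaussian over its footprint instead of a single pixel.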

Architecture. From an input video, three modules build structured conditioning—semantic-aware appearance anchoring (VLM/LLM reasoning into an appearance anchor), physics-informed dynamic simulation (a Gaussian particle field under gravity, wind, and turbulence), and geometry-grounded video synthesis (geometry assets, alignment, and particle projection). The resulting semantics, dynamics, and geometry signals jointly steer the video diffusion model to produce the synthesized frames.
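To make the control interface concrete, the user-facing settings described on this page (type, event, duration, wind direction and strength, severity) could be gathered into one validated record that the three modules read from. The sketch below is illustrative only: the class `WeatherControl`, its field names and units, the horizontal-plane wind convention, and the linear severity-to-particle-count mapping are all assumptions, not the interface of the released system.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WeatherControl:
    """Hypothetical container for the user-facing weather controls."""
    kind: str                  # "snow" or "rain"
    event: str                 # "static" or "dynamic"
    duration_s: float          # how long the effect lasts, in seconds
    wind_direction_deg: float  # heading in the horizontal (x-z) plane
    wind_strength: float       # wind speed in metres per second
    severity: float            # 0.0 (light) to 1.0 (heavy)

    def __post_init__(self):
        if self.kind not in ("snow", "rain"):
            raise ValueError(f"unknown weather type: {self.kind!r}")
        if self.event not in ("static", "dynamic"):
            raise ValueError(f"unknown event: {self.event!r}")
        if self.duration_s <= 0:
            raise ValueError("duration_s must be positive")
        if not 0.0 <= self.severity <= 1.0:
            raise ValueError("severity must lie in [0, 1]")

    def wind_vector(self):
        """Wind velocity as (x, y, z) with y up; wind is assumed horizontal."""
        theta = math.radians(self.wind_direction_deg)
        return (
            self.wind_strength * math.cos(theta),
            0.0,
            self.wind_strength * math.sin(theta),
        )

    def particle_count(self, max_particles=2000):
        """Assumed linear mapping from severity to number of particles."""
        return round(max_particles * self.severity)


if __name__ == "__main__":
    control = WeatherControl("snow", "dynamic", 4.0, 0.0, 3.0, 0.5)
    print(control.wind_vector(), control.particle_count())
```

Keeping the controls in one immutable record means the semantics, dynamics, and geometry modules all see the same settings, and invalid combinations fail early rather than producing a silently wrong conditioning signal.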

Results

Diverse, realistic — and useful downstream

Beyond visual fidelity, our synthesized data serves as a scalable source of safety-critical corner cases for training perception models.

Comparisons & downstream benefit. Our method produces denser, more scene-consistent particle effects than text-only baselines, and the resulting synthetic data improves perception robustness under adverse weather.

Citation

BibTeX