StableWorld

Towards Stable and Consistent Long Interactive Video Generation

Ying Yang¹, Zhengyao Lv¹,², Tianlin Pan¹,³, Haofan Wang⁴, Binxin Yang⁵, Hubery Yin⁶, Chen Li⁶, Ziwei Liu⁶, Chenyang Si¹

¹PRLab, Nanjing University · ²The University of Hong Kong · ³University of Chinese Academy of Sciences · ⁴LibLib.ai · ⁵WeChat, Tencent Inc. · ⁶Nanyang Technological University


🎥 Video Demo

Video

If the embedded YouTube player is blocked in your region/network, please visit the project page for mirrored videos.


🚀 Release

  • Paper released
  • Project page released
  • Demo videos released
  • Code released

🔥 Why Does Interactive Video Generation Become Unstable?

Long-horizon interactive video generation often suffers from spatial drift and scene collapse.
We find that a major source of this instability is error accumulation within a single scene: small drifts compound while the viewpoint stays the same and eventually cause the entire scene to collapse.

This differs from the commonly discussed error accumulation caused by the train–inference mismatch. We find that scene collapse is largely driven by same-scene accumulation.

Figure: Error accumulation under a fixed viewpoint

Small drifts may seem negligible at first, but when repeatedly accumulated under the same viewpoint, they gradually grow and eventually lead to severe scene collapse.
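
To make the compounding effect concrete, here is a toy sketch. It is not StableWorld's model of drift: `simulate_drift`, `per_step_error`, and `gain` are hypothetical names, and the recurrence is an assumed simplification in which each new frame inherits the drift of the frame it is conditioned on (optionally amplified) and adds a small fresh error.

```python
from typing import List


def simulate_drift(per_step_error: float, steps: int, gain: float = 0.0) -> List[float]:
    """Toy recurrence (an assumption, not taken from the paper):

        drift[t] = (1 + gain) * drift[t - 1] + per_step_error

    With gain == 0 the drift grows linearly; with gain > 0 it grows
    faster, because earlier errors are amplified at every step.
    """
    drift = 0.0
    history = []
    for _ in range(steps):
        # The new frame inherits the previous drift and adds its own error.
        drift = (1.0 + gain) * drift + per_step_error
        history.append(drift)
    return history


if __name__ == "__main__":
    # A per-frame error that looks negligible still adds up over a long rollout.
    rollout = simulate_drift(per_step_error=0.001, steps=2000)
    print(f"drift after 10 frames:   {rollout[9]:.3f}")
    print(f"drift after 2000 frames: {rollout[-1]:.3f}")
```

For instance, `simulate_drift(1.0, 3, gain=1.0)` returns `[1.0, 3.0, 7.0]`: nothing in the loop ever reduces the drift, which is why the conditioning frames themselves have to be cleaned up.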

StableWorld addresses this issue at the root by continuously filtering out degraded frames while retaining geometrically consistent ones, preventing drift from compounding over time.


🧩 The StableWorld Framework

StableWorld is a simple yet effective Dynamic Frame Eviction Mechanism. It is model-agnostic and can be plugged into different interactive generation frameworks (e.g., Matrix-Game, Open-Oasis, Hunyuan-GameCraft) to improve stability, temporal consistency, and generalization.
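
As a rough illustration of what frame eviction could look like, the sketch below filters a context window before the next generation step. It is a minimal sketch under stated assumptions, not the repository's implementation: `evict_degraded_frames`, the `threshold` value, the choice of `reference` frame, and the use of cosine similarity on flat feature vectors as the consistency score are all hypothetical stand-ins.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical stand-in for a frame: a flat feature vector.
Frame = Sequence[float]


def cosine_similarity(a: Frame, b: Frame) -> float:
    """Placeholder consistency score (an assumption, not the paper's metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def evict_degraded_frames(
    window: List[Frame],
    reference: Frame,
    threshold: float = 0.9,
    score: Callable[[Frame, Frame], float] = cosine_similarity,
) -> List[Frame]:
    """Keep only frames whose score against the reference meets the threshold.

    If every frame would be evicted, fall back to the most recent frame so
    that the generator still has some context to condition on.
    """
    kept = [frame for frame in window if score(frame, reference) >= threshold]
    return kept if kept else window[-1:]


if __name__ == "__main__":
    reference = [1.0, 0.0]
    window = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    # The last frame is orthogonal to the reference, so it is evicted.
    print(evict_degraded_frames(window, reference))
```

Because the filter only touches the list of context frames, a wrapper like this can sit in front of any autoregressive generator without changing the model itself, which is the sense in which such a mechanism is model-agnostic.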


🎬 Visual Results

We provide extensive interactive demonstrations across multiple frameworks and settings:

  • Matrix-Game 2.0
  • Open-Oasis
  • Hunyuan-GameCraft
  • Ultra-long video generation (thousands of frames)
  • Self-Forcing (autoregressive video)

Please see the project page for full videos and side-by-side comparisons:
👉 https://sd-world.github.io/


📚 Citation

If you find this work helpful, please consider citing:

@article{stableworld2026,
  title={StableWorld: Towards Stable and Consistent Long Interactive Video Generation},
  author={Ying Yang and Zhengyao Lv and Tianlin Pan and Haofan Wang and Binxin Yang and Hubery Yin and Chen Li and Ziwei Liu and Chenyang Si},
  journal={arXiv preprint arXiv:2601.15281},
  year={2026}
}
