Junyi Chen SOTAMak1r

Talk is cheap. Show me the code. 🧑‍💻

Image/Video Generation:
- VINO: unified visual generator / omni-model (understanding & generation) / world model
- iMontage: many-to-many image generation
- Infinite-Forcing: long video generation
- Tiny-Sora2: audio-video generation

World Models, 3D/4D Generation:
- DeepVerse: interactive 4D video generation
- Aether: unified 4D reconstruction and generation
- GVGEN: text/image-to-3DGS in 7 seconds
- GST: unified novel view synthesis and camera pose estimation
- MeshCraft: text/image-to-mesh in 3 seconds (800-face level)
- CONE: text-to-NeRF

3D/4D Reconstruction:
- Pi3: feed-forward 3D/4D reconstruction without a reference view
- WinT3R: feed-forward 3D/4D reconstruction in a streaming manner
- GigaGS: city-level 3D surface reconstruction
- CoSurfGS: large-scale 3D surface reconstruction with distributed learning
- OmniWorld: multi-modal 4D dataset
