SOTAMak1r/README.md

Hi, I'm Jacob Junyi Chen 👋

Talk is cheap. Show me the code. 🧑‍💻

  • Image/Video Generation:

    • VINO: unified visual generator / omni-model (understanding & generation) / world model
    • iMontage: many-to-many image generation
    • Infinite-Forcing: long video generation
    • Tiny-Sora2: audio-video generation
  • World Models, 3D/4D Generation:

    • DeepVerse: interactive 4D video generation
    • Aether: unified 4D reconstruction and generation
    • GVGEN: text/image-to-3DGS in 7 seconds
    • GST: unified novel view synthesis and camera pose estimation
    • MeshCraft: text/image-to-mesh in 3 seconds (800-face level)
    • CONE: text-to-NeRF
  • 3D/4D Reconstruction:

    • Pi3: feed-forward 3D/4D reconstruction without a reference view
    • WinT3R: feed-forward 3D/4D reconstruction in a streaming manner
    • GigaGS: city-level 3D surface reconstruction
    • CoSurfGS: large-scale 3D surface reconstruction with distributed learning
    • OmniWorld: multi-modal 4D dataset

Pinned

  1. VINO-code (Public)

    A Unified Visual Generator with Interleaved OmniModal Context

    Python · 231 stars · 3 forks

  2. DeepVerse (Public)

    DeepVerse: 4D Autoregressive Video Generation as a World Model

    Python · 231 stars · 10 forks

  3. Infinite-Forcing (Public)

    Forked from guandeh17/Self-Forcing

    Infinite-Forcing: Towards Infinite-Long Video Generation

    Python · 154 stars · 5 forks

  4. InternRobotics/Aether (Public)

    [ICCV 2025 & ICCV 2025 RIWM Outstanding Paper] Aether: Geometric-Aware Unified World Modeling

    Python · 598 stars · 10 forks