A Datacenter Scale Distributed Inference Serving Framework
Updated Jul 2, 2026 - Rust
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
OpenClaw-RL: Train any agent simply by talking
A GPU cluster manager for high-performance AI model serving (vLLM, SGLang) and on-demand SSH-accessible GPU instances.
A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers.
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enabling zero-shot voice cloning from short audio references.
LLM model quantization (compression) toolkit with HW acceleration support for Nvidia, AMD, Intel GPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.
Open Source Continuous Inference Benchmark Research Platform — Kimi K2.7-Code, MiniMax M3, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 & soon™ TPUv6e/v7/Trainium2/3
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
MOVA: Towards Scalable and Synchronized Video–Audio Generation
UniRL is a Framework for Unified Multimodal Model Reinforcement Learning
Provides high-quality Chinese speech synthesis and voice cloning services based on models such as SparkTTS and OrpheusTTS.
Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton
An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale
Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across vLLM, TRT-LLM, TokenSpeed, SGLang, OpenAI, Gemini & more. Industry-first gRPC pipeline, KV cache-aware routing, chat history, tokenization caching, Responses API, embeddings, WASM plugins, MCP, and multi-tenant auth.
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!