KevinZeng08
🎯
Focusing

Highlights

  • Pro

Pinned

  1. inclusionAI/cuLA Public

    CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.

    Python · 527 stars · 65 forks

  2. SandAI-org/MagiAttention Public

    A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

    Python · 864 stars · 59 forks

  3. flashinfer-ai/flashinfer Public

    FlashInfer: Kernel Library for LLM Serving

    Python · 5.9k stars · 1.1k forks

  4. hao-ai-lab/FastVideo Public

    A unified inference and post-training framework for accelerated video generation.

    Python · 3.8k stars · 370 forks

  5. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 85.1k stars · 18.8k forks

  6. agent-gpu-skills Public

    Forked from slowlyC/agent-gpu-skills

    Personal Extensions for Agentic GPU Programming Skills

    Python · 7 stars · 1 fork