  • Stanford University
  • Santa Clara

Organizations

@mlfoundations @RetroCode-Org

kobe0938/README.md

Hi, I'm Kobe 👋

🚀 Currently Maintaining/Contributing


🛠️ Previous Projects

Agents & Evaluation

LLM Inference & Serving Infra

Others

  • Continuum: Multi-turn LLM agent scheduling with KV-cache time-to-live for efficient serving. Contributor. [paper]
  • VidGen: Diffusion + autoregressive models for interactive video/game generation (Diffusive AI).
  • LAG: Research experiments.
  • citation-verifier: Verifying citations produced by LLM agents (TypeScript).

Pinned

  1. LMCache/LMCache Public

    LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

    Python 10k 1.4k

  2. vllm-project/production-stack Public

    vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

    Python 2.4k 425

  3. LMCache/lmcache-agent-trace Public

    Agent application/benchmark/workload traces should be placed here.

    Python 14 6

  4. Inference-Engine-Arena/inference-engine-arena Public archive

    Postman & Chatbot Arena for inference benchmarking.

    Python 15