What’s your go-to local model right now?
Unsloth AI
Technology, Information and Internet
San Francisco, California
41,624 followers
Making AI accessible for everyone! 🦥
About us
Making open-source AI more accessible.
- Website: https://unsloth.ai
- Industry: Technology, Information and Internet
- Company size: 11-50 employees
- Headquarters: San Francisco, California
- Type: Privately Held
- Founded: 2023
- Specialties: artificial intelligence, ai, llms, language models, finetuning, and open-source
Locations
- Primary: San Francisco, California 94114, US
Updates
- 1-bit GLM-5.2 GGUF vs. Claude 4.8 Opus vs. GPT-5.5: we gave three models the same prompt and compared their one-shot outputs. The 1-bit GLM-5.2 GGUF ran locally on a Mac Studio M3 Ultra with 256GB RAM at ~21.6 tok/s. Which output do you like best? GGUF: https://lnkd.in/gGTazQW5 Guide: https://lnkd.in/grTPKWeY
- You can now run Kimi K2.7 Code locally! 🌘 We shrank the 1T model to 325GB (-48%) via Dynamic 2-bit, where important layers are upcast. Run it at >40 tok/s on 330GB RAM/VRAM setups, or at full precision on 610GB. Guide: https://lnkd.in/gWURBXEn GGUF: https://lnkd.in/gHxy9x8u
- MiniMax M3 can now be run locally! 🔥 The new 428B (23B active) open model has 1M context and performs on par with Gemini 3.1 Pro. Run the Dynamic 2-bit GGUF on 138GB RAM/VRAM or 3-bit on 165GB. GGUFs on Hugging Face: https://lnkd.in/gfA3Z36r Guide: https://lnkd.in/ggkJkXat The MiniMax GGUFs and implementation are experimental.
- Unsloth AI reposted this: Google DiffusionGemma can now run at 2000+ tokens/sec! ⚡ We made local DiffusionGemma inference 1.8× faster. Run it on 18GB RAM via Unsloth AI Studio. GitHub: https://lnkd.in/gyaDBTxK Guide: https://lnkd.in/gTMpbiEH
- 2-bit Google Gemma 4 12B GGUF, only 4.66GB on disk, managed to cite 15 sites from a single prompt. 🔥 Try this locally on >6GB RAM via Unsloth Studio. GitHub: https://lnkd.in/dcqhW9Vv
- You can now train 120B+ parameter models locally on a laptop! 🔥 We collaborated with NVIDIA and Microsoft to bring LLM training to the 128GB unified-memory RTX Spark laptop! GitHub: https://lnkd.in/dcqhW9Vv
- Unsloth AI reposted this: I’ll be speaking and hosting two RL workshops at Microsoft Build next week! 🚀 I’ll cover 2026 RL fundamentals and show how to run RL on your local laptop with AMD and Unsloth AI. RL Workshop - Tue, Jun 2: https://lnkd.in/g9QGcCW7 RL Workshop - Wed, Jun 3: https://lnkd.in/g99V6adm I’ll also be in a panel discussion with Mark Saroufim and Rob Ferguson on Wed, Jun 3.
- 4-bit Qwen3.6 MTP GGUF managed to search 70+ sites from a single prompt. Try this locally on 20GB RAM via Unsloth Studio. Unsloth now supports automatic MTP + speculative decoding for supported models, and auto-selects the best MTP settings for your specific device (Mac, CPU, GPU, etc.). GitHub: https://lnkd.in/dcqhW9Vv
- Unsloth joins the PyTorch Ecosystem. Joining the PyTorch Ecosystem is an exciting step. We’ve already collaborated with the PyTorch team on several projects. The ecosystem will help us reach more people in the PyTorch community and give us greater access to resources, support, and opportunities to collaborate. Beyond that, nothing changes. Unsloth will remain an independent open-source project, separate from the PyTorch Foundation. We’ll keep building open-source projects and releasing new features, models, optimizations, bug fixes, our desktop app, and broader hardware support, all while continuing to listen closely to feedback from you guys. And of course, your contributions will remain an essential part of what we build. Blog: https://lnkd.in/gEneZHyi