Modal

Software Development

New York City, New York 26,342 followers

AI needs a new infrastructure layer. We're building it.

About us

Customers rely on Modal for instant GPU access, sub-second container starts, and native storage, so it's simple to serve low-latency inference, fine-tune models, and access production-ready sandboxes at scale. Every era of computing came with new workloads that previous infrastructure couldn't serve: mainframes, databases, the cloud. Each time, the company that rebuilt the layer underneath defined the decade. AI is no different, except it touches everything instead of one slice. The window to build is open right now.

Website
https://modal.com
Industry
Software Development
Company size
51-200 employees
Headquarters
New York City, New York
Type
Privately Held
Specialties
Serverless GPUs, LLM Inference, LLM Fine-Tuning, Generative Model Inference, Generative Model Training, Computational Biology, Audio Generation, Image Generation, Video Generation, Web Scraping, Batch Jobs, Batch Embeddings, Scaling Out, AI Agents, Reinforcement Learning, Sandboxes, and Background Agents

Employees at Modal

View 187 employees at Modal

Updates

  • Modal

    The most demanding problems in life sciences need more than a capable model; they need infrastructure that scales. Today we're announcing our integration with Claude Science, bringing Modal's elastic compute to researchers when they need it. We're committing up to $100K in compute to support academic life sciences research. Apply by July 15. Read more: https://lnkd.in/gf3X5T26

  • Modal reposted this

    We OCR'd 100,000 pages with open-source vision models on Modal in under an hour for about $225. The same job with GPT-5.5 would have cost close to $6,000. The lower-cost proprietary models were cheaper, but they hallucinated enough that it just wasn’t a worthwhile comparison. We wanted to know what it would take to OCR a large document corpus using open-source vision models instead of paying for proprietary APIs. What surprised us most was that the cheapest GPU per second wasn't the cheapest GPU for the job. L4s and H100s landed at roughly the same cost per page in some runs, but L4s took 5-6x longer. Time to completion should be counted as part of the cost. We’ve got a full write-up with methodology, model comparisons, and a side-by-side viewer to judge output quality yourself: https://lnkd.in/gv7VUyT8

  • Modal

    Low-latency inference demands a new serving primitive: Servers. Modal Servers are designed for applications where every millisecond counts, like LLM inference for interactive agents. Servers give you a regionalized, autoscaling pool of HTTP server replicas behind Modal’s routing layer with the deployment ergonomics, fast feedback loops, and autoscaling you know and love. We get into the specifics of how we built this in our new post: https://lnkd.in/gmDFzzGS

  • Modal reposted this

    We just launched a new product which lets customers set up managed private LLM endpoints in a few clicks (using the UI) or keystrokes (using the CLI). Super excited about this one, in particular because we're taking a slightly different approach than others. We think Modal's big differentiation is in our infrastructure. We built our own file system, container runtime, scheduler, etc. This lets us do some stuff other providers can't do – we can scale up extremely fast (using GPU snapshotting), we can run capacity all over the world for low latency, and many other things. This means we can be super open about the code running inside the containers. When you deploy an endpoint using Modal, you have full access to the code inside. We're running the latest version of vLLM/SGLang rather than a proprietary black box, and you can tweak it yourself. Our approach also lets us work in the open as we improve performance of LLM inference. We are active contributors to projects like vLLM, SGLang, and FA4. We do that because we think everyone benefits from those improvements! However, we do think Modal is the best place to deploy the code, which is why we're launching Auto Endpoints today!

  • Modal reposted this

    📢 We're partnering with Modal to offer a new development and exhibition opportunity for artists with sustained engagements in artificial intelligence and the arts. This global open call seeks proposals for creative projects that demonstrate the intentional use of AI to further artistic expression. Generative media, enabled by models trained at network-scale and proliferated by agentic systems, has opened an expansive toolkit for artists. As aesthetic practices evolve, how can artists help us interpret the new realities unlocked through scaling computation? Against a prevailing narrative where inference equates to automation, homogenization, and slop, we think there’s another story. One where artists harness the inherent creativity of inference to expand perspectives, create novel forms, and suggest alternative ways of sensing the world. Selected artists will receive: 📌 $2,000 honorarium 📌 Up to $5,000 in materials and production stipends 📌 Up to $20,000 value in Modal credits to use for custom model training and cloud GPU compute. DEADLINE: July 15. Full details at https://modal.art/
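
The cost figures in the OCR post above can be checked with a few lines of arithmetic. The sketch below uses only the totals quoted in the post (100,000 pages, about $225, close to $6,000). The per-second GPU prices and run times in the second half are hypothetical placeholders, not Modal's pricing or the post's measurements; they only illustrate how a GPU that is several times cheaper per second can end up at the same job cost if it runs several times longer.

```python
# Totals quoted in the post (approximate figures).
PAGES = 100_000
OPEN_SOURCE_COST_USD = 225.0
PROPRIETARY_COST_USD = 6_000.0


def cost_per_page(total_cost_usd: float, pages: int) -> float:
    """Total spend divided by the number of pages processed."""
    return total_cost_usd / pages


def job_cost(price_per_second_usd: float, gpu_seconds: float) -> float:
    """Cost of a run: price per GPU-second times GPU-seconds consumed."""
    return price_per_second_usd * gpu_seconds


open_cpp = cost_per_page(OPEN_SOURCE_COST_USD, PAGES)
prop_cpp = cost_per_page(PROPRIETARY_COST_USD, PAGES)
ratio = PROPRIETARY_COST_USD / OPEN_SOURCE_COST_USD

print(f"open-source: ${open_cpp:.5f} per page")
print(f"proprietary: ${prop_cpp:.5f} per page")
print(f"proprietary / open-source: {ratio:.1f}x")

# HYPOTHETICAL numbers from here on: made-up prices and durations chosen
# so the slower GPU is 5x cheaper per second and takes 5x longer.
FAST_GPU_PRICE_PER_S = 0.0010
SLOW_GPU_PRICE_PER_S = 0.0002
FAST_GPU_SECONDS = 3_600
SLOW_GPU_SECONDS = 5 * FAST_GPU_SECONDS

fast = job_cost(FAST_GPU_PRICE_PER_S, FAST_GPU_SECONDS)
slow = job_cost(SLOW_GPU_PRICE_PER_S, SLOW_GPU_SECONDS)

print(f"fast GPU job cost: ${fast:.2f} in {FAST_GPU_SECONDS / 3600:.0f} h")
print(f"slow GPU job cost: ${slow:.2f} in {SLOW_GPU_SECONDS / 3600:.0f} h")
```

With equal job cost, the only difference left is the wait, which is the post's point about counting time to completion as part of the cost.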

Funding

Modal 4 total rounds

Last Round

Series B

US$ 87.0M
