DEV Community: Daily Context

Fable is Back, Baby.

dev.to staff — Wed, 01 Jul 2026 14:53:47 +0000

Fable is back. The Commerce Department announced yesterday it has lifted the export controls it slapped on Anthropic's newest model, and access returns this week. However, nobody's spelled out the fine print yet. Early reports point to credit gating and ID checks.

The reprieve isn't the story, though — the speed is. A frontier model vanished from every cloud on an unexpected government order. Washington leaned on OpenAI days earlier, too, pushing it to limit its flagship to vetted partners. That's not a one-off. U.S. policy on AI tools is shifting fast.

Here's the takeaway for builders. The whims of the federal government now outrank any lab's roadmap, and one letter from Commerce proved it. Stop counting on a steady drip of bigger, better models. The open-weight models nobody can revoke start looking a lot more strategic. Power could be drifting away from the big U.S. labs.

So we wait with bated breath for Fable to come back — and our productivity right along with it.

The Vertical Turn

Kara Silverman — Wed, 01 Jul 2026 14:41:07 +0000

Look at tomorrow's track list and count the rooms that have absolutely nothing to do with engineering. Go ahead, I'll wait.

Thursday’s schedule is running 12 parallel tracks. The AI in Healthcare, AI in Finance, and AI in GTM tracks are brand new this year. On top of that, there are entirely dedicated audiences for founders, researchers, and data buyers. Trust me, that is not a scheduling decision. That is a deliberate segmentation decision that reflects where this industry is headed.

For three years, this conference spoke to one very specific audience: the AI engineer. The person writing the agent, wiring the retrieval pipeline, running the eval suite. That audience built a discipline from scratch and gave it a name. But this year, the organizers looked at who was actually hitting the registration page: people from companies like Vanguard, CVS Health, Intuitive Surgical, Two Sigma, Capital One, and more. It’s a marked expansion, and a recognition that AI engineers exist not just at tech companies but across other industries as well.

Tomorrow's Healthcare track is running sessions on diagnostics, drug discovery, clinical workflows, and regulatory constraints. Finance covers trading, risk assessment, fraud detection, and compliance. GTM sits right next to both of them. These aren’t fluff panels about “the future of AI.” They are actual, practical working sessions for people who need AI to clear a compliance review, reduce false positives in a fraud model, or shorten a sales cycle. These are people who do not care what framework you used.

This is what real adoption looks like when it stops being theoretical. It’s not a bigger model or flashier demo. It’s a room full of clinicians, traders, and go-to-market operators who showed up because the tool finally got specific enough to matter to them.

From Harness Engineering to Evals: What’s Trending at AI Engineer

Ben Halpern — Wed, 01 Jul 2026 14:34:35 +0000

I’m at the AI Engineer conference in San Francisco this week. The event has every major brand-name sponsor you’d expect, a lineup of internet-famous project maintainers on stage, and a massive schedule that more or less has something for everyone. It’s easy to get lost in the noise. I spent my time trying to figure out what themes are actually real.

With dozens of tracks and thousands of builders, the ecosystem looks incredibly fractured. But if you look at what engineers are actually putting into production, the chaos collapses into a clear pattern. The industry is moving past simple chat interfaces and treating large language models like central processing units inside a larger, highly structured software architecture—essentially an LLM Operating System.

I cataloged everything I was seeing, dug into the technical tracks, and came away with these six themes. This is not my endorsement, and I have not separated the hype from the real. Take these brief summaries as jumping-off points to help you go deeper if any of these ideas trigger your curiosity.

1. The Shift to Repository-Scale “Software Factories”

For the last few years, AI in development was basically tab-complete. You wrote a line of code, an assistant suggested the next few tokens, and you moved on.

That single-file approach is quickly becoming obsolete. The focus has shifted to repository-scale, multi-agent systems—what people are calling Software Factories.

Instead of writing lines of code alongside an AI assistant, developers are managing fleets of agents that operate across entire codebases. For example, Uber shared details on uReview, their internal code review engine. It uses agents to autonomously review pull requests, spin up localized test suites, catch edge cases, and commit fixes back to the branch before a human even looks at it.

To make this reliable, engineers are plugging compilers and linters directly into the agent’s feedback loop. If the generated code fails to compile, the raw error output is fed right back into the system prompt. The model reads its own error, fixes the bug, and re-runs the check autonomously.
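As a rough illustration of that feedback loop, here is a minimal Python sketch. It uses the built-in compile() as a stand-in for a real compiler or linter, and fake_model is a hypothetical placeholder for a model call; none of these names come from the talks. It also appends the error to the next prompt rather than to a system prompt, which is a simplification.

```python
from typing import Callable, Optional


def check_syntax(source: str) -> Optional[str]:
    """Return None if the source compiles, otherwise the raw error text."""
    try:
        compile(source, "<agent-output>", "exec")
        return None
    except SyntaxError as err:
        return f"{type(err).__name__}: {err}"


def repair_loop(
    generate: Callable[[str], str], task: str, max_attempts: int = 3
) -> Optional[str]:
    """Ask for code, feed any compiler error back, and stop after max_attempts."""
    prompt = task
    for _ in range(max_attempts):
        candidate = generate(prompt)
        error = check_syntax(candidate)
        if error is None:
            return candidate
        # Feed the raw error output back so the next attempt can react to it.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{candidate}\n\n"
            f"Compiler output:\n{error}"
        )
    return None  # Bounded: give up instead of looping forever.


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a model: fixes its code once it sees an error."""
    if "Compiler output:" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b)\n    return a + b\n"  # missing colon


fixed = repair_loop(fake_model, "Write add(a, b).")
```

The attempt cap matters as much as the feedback itself: without it, a model that keeps producing the same broken output would spin indefinitely.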

2. Hardening Systems with “Harness Engineering”

There’s a common realization on the conference floor right now: “Everyone is building an agent harness, but nobody calls it that.”

LLMs are inherently probabilistic and non-deterministic. Software infrastructure, however, requires predictable inputs and outputs. To fix this, teams are formalizing a core systems discipline: Harness Engineering.

The “harness” is the strict software wrapper built around a model to enforce constraints, manage state, and prevent infinite execution loops.

+--------------------------------------------------------+
|                   THE AGENT HARNESS                    |
+--------------------------------------------------------+
| 1. Durable Execution (State preservation & retries)    |
+--------------------------------------------------------+
| 2. Structured Outputs (Schema enforcement / Pydantic)  |
+--------------------------------------------------------+
| 3. Dynamic Guardrails (Input/Output sanitization)      |
+--------------------------------------------------------+

Instead of letting an agent run unmonitored, developers are using toolchains like Temporal or Inngest to implement durable execution. If an agent is running a complex, multi-hour workflow and hits a network timeout, the harness preserves its memory and state. The process can resume exactly where it failed without repeating expensive API calls. Paired with libraries like Pydantic or Instructor to force strict JSON schema compliance, the harness makes unpredictable models behave like stable infrastructure.
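To make the "structured outputs plus bounded retries" part of a harness concrete, here is a small standard-library sketch. It is not the Pydantic or Instructor API, and it does not cover durable execution; the ReviewResult schema, the severity values, and the function names are all assumptions made up for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReviewResult:
    """Hypothetical output schema the harness insists on."""
    file: str
    severity: str
    comment: str


ALLOWED_SEVERITIES = {"info", "warning", "error"}


def parse_review(raw: str) -> ReviewResult:
    """Validate raw model text against the schema; raise ValueError on mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or set(data) != {"file", "severity", "comment"}:
        raise ValueError("expected exactly the keys file, severity, comment")
    if not all(isinstance(value, str) for value in data.values()):
        raise ValueError("all fields must be strings")
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"unknown severity: {data['severity']}")
    return ReviewResult(**data)


def call_with_harness(
    model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> ReviewResult:
    """Call the model, reject anything off-schema, and cap the number of retries."""
    last_error = ""
    for _ in range(max_attempts):
        full_prompt = prompt if not last_error else f"{prompt}\n\nFix this: {last_error}"
        try:
            return parse_review(model(full_prompt))
        except ValueError as err:
            last_error = str(err)
    raise RuntimeError(f"gave up after {max_attempts} attempts: {last_error}")
```

Callers downstream only ever see a ReviewResult or an exception, never free-form text, which is what lets the rest of the system treat the model like ordinary infrastructure.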

3. Computer Use vs. Custom APIs

For decades, integration meant writing custom API connectors or scraping endpoints. A major theme this year is Computer Use—building agents that navigate software exactly like a human operator does: by looking at a screen, moving a mouse, and typing commands.

Enabled by better vision-language models (VLMs), these systems don’t need structured backend APIs. They take continuous screenshots of a graphical user interface (GUI), parse the visual layout to locate fields and buttons, and execute clicks and keystrokes at precise pixel coordinates.

This has forced a shift in local developer setups. Engineers are building isolated, sandboxed terminals and open-source desktop companions (like OpenClaw) that give background agents their own virtual environments. This lets agents spin up local servers and debug files in isolation without taking over the engineer’s active screen and keyboard.

4. Context Engineering & “Tokenmaxxing”

Context windows have scaled to millions of tokens, but dumping an entire codebase into a prompt is an expensive, high-latency anti-pattern.

Time-to-first-token and API costs are the real bottlenecks today. Because of this, developers are focusing heavily on Context Engineering—treating the context window as a highly optimized, dynamic memory cache rather than a static text dump.

The optimization strategy generally follows a three-layer approach:

Prefix Caching: Inference engines like vLLM cache the Key-Value (KV) states of static system instructions or documentation headers. Subsequent requests reuse this cache, significantly cutting down latency and cost.
Context Compression: Middleware layers are introduced to run semantic compression algorithms, pruning irrelevant tokens and summarizing messy chat logs before sending data to the provider.
Graph RAG & Hybrid Retrieval: Instead of pulling raw text blocks indiscriminately, systems use structured knowledge graphs to pass only high-signal data into the active context window.
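The first layer is the easiest to picture in code. The sketch below is a conceptual analogy only: it is not how vLLM implements KV caching and it uses no vLLM API. The PrefixCache class is hypothetical, and whitespace splitting stands in for the expensive prefill work that a real engine would cache.

```python
import hashlib
from typing import Dict, Tuple


class PrefixCache:
    """Toy cache: process a static prefix once, then reuse it across requests."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[str, ...]] = {}
        self.hits = 0
        self.misses = 0

    def _encode(self, text: str) -> Tuple[str, ...]:
        # Stand-in for expensive work (real engines cache attention KV states).
        return tuple(text.split())

    def build_context(self, static_prefix: str, dynamic_suffix: str) -> Tuple[str, ...]:
        """Reuse the cached prefix if present; only the suffix is always recomputed."""
        key = hashlib.sha256(static_prefix.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._encode(static_prefix)
        return self._store[key] + self._encode(dynamic_suffix)


cache = PrefixCache()
first = cache.build_context("You are a code reviewer.", "Review foo.py")
second = cache.build_context("You are a code reviewer.", "Review bar.py")
```

The practical consequence for prompt design is ordering: keep the stable material (system instructions, docs) at the front and the per-request material at the end, so the shared prefix stays identical between calls.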

5. Moving Past “Vibe-Based” Evaluations

If there is one clear operational shift, it’s that vibe-based engineering is dead. Reviewing a few outputs, deciding they look reasonable, and shipping them to production is no longer an acceptable practice.

The core focus of the Evals community is on automated, multi-step simulation benchmarks. Evaluating an agent now requires spinning up an isolated virtual environment—a temporary sandbox with mock databases and network access—and letting the agent attempt a complex task. The evaluation framework doesn't grade the style of the response; it checks if the task was completed successfully, notes how many steps it took, and verifies that no security protocols were broken.
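A grader along those lines might look like the following sketch. The RunTrace and EvalReport types, the forbidden command prefixes, and the step budget are all illustrative assumptions, not any particular eval framework's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RunTrace:
    """What the sandbox recorded: the commands run and the final environment state."""
    steps: List[str] = field(default_factory=list)
    final_state: Dict[str, str] = field(default_factory=dict)


@dataclass
class EvalReport:
    completed: bool
    step_count: int
    violations: List[str]

    @property
    def passed(self) -> bool:
        return self.completed and not self.violations


# Hypothetical security rules for this example.
FORBIDDEN_PREFIXES = ("rm -rf", "curl ")


def evaluate(
    trace: RunTrace,
    success_check: Callable[[Dict[str, str]], bool],
    max_steps: int = 10,
) -> EvalReport:
    """Grade the outcome, the step count, and rule violations; ignore style."""
    violations = [step for step in trace.steps if step.startswith(FORBIDDEN_PREFIXES)]
    if len(trace.steps) > max_steps:
        violations.append(f"step budget exceeded: {len(trace.steps)} > {max_steps}")
    return EvalReport(
        completed=success_check(trace.final_state),
        step_count=len(trace.steps),
        violations=violations,
    )
```

Note that a run can complete the task and still fail the eval: a trace that reaches a green test suite by way of a forbidden command is reported as completed but not passed.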

Engineers are also moving away from the “Persona Trap”—giving a model a prompt like “You are a senior staff engineer.” Studies shared at the event show this approach evaluates a stylistic vibe rather than a rigorous technical capability, often introducing silent biases that degrade performance. The standard now is rigid, task-oriented testing.

6. Secure Micro-Sandboxes for Runtime Safety

Giving an agent the authority to write code, modify files, and run terminal commands introduces severe security risks.

Platform engineers are tackling this by focusing on the underlying execution layer. The industry has standardized on Micro-Sandboxes. Agent-generated code is executed inside lightweight, ephemeral micro-VMs (like those from E2B or Docker) that spin up in milliseconds, handle the specific computation, and are immediately destroyed to prevent container escape or persistent file system tampering.

There is also a major push toward credential masking. When agents need access to enterprise databases or third-party tools, engineers are using new delegation layers like the AAuth protocol. This grants the agent mission-bounded authority to call a tool, but prevents the agent from ever seeing or interacting with the raw API keys, neutralizing prompt injection leaks.
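The general shape of credential masking can be sketched without any particular protocol. The code below is not an implementation of AAuth; ToolBroker and its methods are hypothetical names. The point it illustrates is only that the secret lives in the broker, and the agent-facing call takes a grant and returns a result with the secret scrubbed.

```python
from typing import Callable, Dict, FrozenSet, Tuple


class ToolBroker:
    """Holds secrets on behalf of agents; agents only ever name a tool."""

    def __init__(self, secrets: Dict[str, str]) -> None:
        self._secrets = dict(secrets)
        self._tools: Dict[str, Tuple[str, Callable[[str, str], str]]] = {}

    def register(self, tool: str, secret_name: str, fn: Callable[[str, str], str]) -> None:
        """Bind a tool to the secret it needs. fn receives (secret, argument)."""
        self._tools[tool] = (secret_name, fn)

    def call(self, granted_tools: FrozenSet[str], tool: str, argument: str) -> str:
        """Run a tool for an agent if its grant allows it, redacting the secret."""
        if tool not in granted_tools or tool not in self._tools:
            raise PermissionError(f"tool not granted: {tool}")
        secret_name, fn = self._tools[tool]
        secret = self._secrets[secret_name]
        result = fn(secret, argument)
        # Defensive scrub in case a tool echoes its credential back.
        return result.replace(secret, "[REDACTED]")


broker = ToolBroker({"crm": "sk-test-123"})
broker.register("lookup_customer", "crm", lambda key, arg: f"auth={key} customer={arg}")
mission_grant = frozenset({"lookup_customer"})
output = broker.call(mission_grant, "lookup_customer", "42")
```

Because the key never enters the model's context, a successful prompt injection can at worst misuse the granted tools; it cannot exfiltrate the credential itself.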

The Bottom Line

It’s easy to skim these topics, feel a wave of FOMO, and think you’re already lagging behind if you aren’t running a fleet of micro-sandboxes or an autonomous software factory.

Don’t buy into the hype. You don’t need to overhaul your entire stack by next Monday.

The real takeaway from all the noise at Moscone is actually pretty reassuring: AI is just becoming regular software infrastructure. The developers who build useful things over the next few years won't be the ones chasing every flashy new model drop or complex multi-agent framework. They’ll be the ones applying basic, boring engineering principles: making their inputs predictable, testing their code rigorously, and keeping their environments secure.

If you're looking for a place to start, don’t overcomplicate it. Pick a single, repetitive workflow in your day-to-day. Wrap a clean, defensive code harness around it, build a straightforward evaluation script to check its work, and see what happens. Inspiration is great, but pragmatism is what actually ships.

Optimizing for Agents with llms.txt

Ryan Palo — Wed, 01 Jul 2026 14:14:25 +0000

If you’ve spent any time poking around the AIE World’s Fair 2026 website, you may have come across the llms.md page. If you’ve clicked the link, you may have an idea of the page’s purpose already. There is a distinct shift from energetic copy text, neat layouts, and advanced styling to focused, accurate details and well-labeled links. This whole page, while being a good resource for you, is not designed for you. It’s designed for AI.

It follows a proposed standard called the “/llms.txt file,” which you can read about at llmstxt.org. In fact, even though the AIEWF 2026 site has its page at llms.md, visiting llms.txt works smoothly too: the site redirects to the llms.md page. The idea is that, rather than filling up an AI’s context window with footers, navs, sidebars, and styles, the file offers a focused and simple entry point for AIs visiting a site. The entry llms.txt (or .md) contains the minimum necessary information to use the site as well as clear links pointing to where the AI can get more information specifically about the topics it needs. A common choice, and one the AIEWF 2026 site makes, is to link to “/llms-full.txt,” which has much more detail. Another recommendation from the proposal itself is to have every content page offer a simple text mirror at the same URL with an appended “.md” suffix, for the same reasons (e.g., “posts.html” becomes “posts.html.md”).

The recommended layout looks something like this: one H1 header with the overall site/project title, a blockquote with a short summary of the most important information, zero or more markdown blocks with more details, followed by zero or more markdown sections with H2 headers containing lists of URLs pointing to further detail.
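For a sense of how little is involved, here is a short Python sketch that assembles a file in that layout. The function name, the Link tuple, and the example.com content are made up for illustration; only the overall shape (H1, blockquote, detail blocks, H2 link lists) comes from the layout described above.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str, str]  # (title, url, optional note)


def build_llms_txt(
    title: str, summary: str, details: List[str], sections: Dict[str, List[Link]]
) -> str:
    """Assemble an llms.txt body: H1, blockquote, detail blocks, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for paragraph in details:  # zero or more free-form blocks
        lines += [paragraph, ""]
    for heading, links in sections.items():  # zero or more H2 sections
        lines += [f"## {heading}", ""]
        for name, url, note in links:
            suffix = f": {note}" if note else ""
            lines.append(f"- [{name}]({url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


text = build_llms_txt(
    "Example Conf",
    "A two-day event.",
    ["Sessions run 9am to 5pm."],
    {"Docs": [("Schedule", "https://example.com/schedule.md", "full track list")]},
)
```

Since the output is plain text built from data a site already has, generating it as part of a normal build step is a reasonable default.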

To be clear, these minimal, AI-first text files are simply a proposal, not a standard, and there is mixed adoption. Most chat-based tools have not committed to looking for llms.txt by default. Some people claim that they are “a solution in search of a problem.” This is reasonable, as the primary selling point is reduction of context clutter, and context efficiency is currently a popular research topic. (Look at how many talks at AIE 2026 are about context!) However, these files are relatively painless to autogenerate — although most aren’t quite as pretty as ai.engineer’s — and coding tools like Cursor and others actually do claim to reference them, especially when looking up library documentation. Their use does seem to be trending upward. Google even announced it as a new Lighthouse signal under the new Agentic Browsing category in May of this year, and it seems likely more tooling and standardization are coming in this area.

Every source on the subject, however, seems to agree on one crucial point: llms.txt files are not replacements for current standards like robots.txt or sitemap.xml files. They work best when used together. A quick conversation with Gemini revealed that it was able to make use of the AIEWF 2026 llms.md file because it was already indexed by Google and showed up specifically in the search results. So, actions like listing it in your sitemap, ensuring crawlers can see it, and linking to it from landing pages will go a long way toward helping the AIs that use your site.

Computer Use Is Still The Best Demo In AI. That’s A Problem.

Ryan Swift — Wed, 01 Jul 2026 13:56:06 +0000

Computer use is still the most engaging demo in AI today. Typing a request in plain language and then seeing an agent independently navigate a confusing website, test code end-to-end, or complete a form feels like witnessing an automated future. There's something off to me here, though.

Computer and browser use demos highlight a limitation in how we are currently designing AI interactions. There are undeniable engineering achievements behind browser and computer use models, workflows, and tools. But these workflows feel inherently retrofitted. They're an attempt to force a fundamentally new paradigm into a legacy form factor. Perhaps dramatically, it feels like the modern equivalent of tapping out text messages in Morse code.

For three years, typed chat has served as our default gateway to AI. I'm not convinced that typed chat will be the long-term interface for advanced intelligence. Using voice dictation to control computer use agents feels like a step in the correct direction. But forcing AI agents to mimic human mouse movements, keystrokes, and inputs is a temporary bridge rather than the final destination.

As the AI industry moves beyond initial adoption, new approaches to interface and interaction design should be a focus. As great as the demos are, I hope next year's frontier demos are wildly different from what we have today.

Bottleneck Resolution is, In Fact, All the Rage in AI Engineering

Ben Halpern — Wed, 01 Jul 2026 13:51:51 +0000

The AI Engineer World's Fair here in San Francisco is fundamentally a conference for practitioners — devs who need to be productive today. While it naturally attracts folks operating on the absolute cutting edge, at the end of the day, most of us are just developers trying to ship. I touched on this in my observations yesterday.

Before arriving, I wrote abstractly about navigating unavoidable bottlenecks in an era of infinite code. This week, I'm seeing that theory play out in real time across the keynotes. This morning, OpenClaw creator Peter Steinberger took the stage to talk about bottlenecks in a deeply pragmatic way. The AI space is inherently glitzy, but the dialogue is finally grounding itself in the messy, practical in-between parts where things actually get done.

Bottleneck resolution is boring, but it's good boring. I don't need keynotes that promise a future where friction magically vanishes; I want the patterns and engineering discipline required to solve the constraints we have right now. It's the exact same thing we've always wanted from traditional software eras — and it's exactly where we've arrived with AI.

Visitors get some serious puppy love

Iain Thomson — Wed, 01 Jul 2026 13:47:36 +0000

There may be robots aplenty on the AIE expo floor, but many delegates have been drawn to a more mammalian exhibit. In the expo hall, the charity Puppie Love is letting visitors get some serious petting time with its young canine charges.

Visitors spend time with the dogs, but grabbing one is strictly forbidden. However, the puppies are keen to play, and the fluffball to the right of the picture should be named Hugging Face, given the amount of face licking going on.

Two of the puppies have been adopted, but a representative said the process is more difficult as many people are traveling to the show. Attendees can't just pop a pup into their swag bag, as the adoption paperwork understandably takes a few weeks.

The charity tours trade shows across the U.S. seeking adopters for its young charges, with considerable success in finding happy families for those in need.

Token Town

Ryan Palo — Wed, 01 Jul 2026 13:44:58 +0000

Many of the sessions from Tuesday, especially on the main stage, revolved around the idea of software factories: coordinating agents with other agents that check even more agents' work, with humans in an overseer position. Currently, many organizations’ systems are not quite factory-ready, with agents doing work but only as directed and subsequently verified by humans. A critical part of the software factory paradigm is its ability to provide trust by being able to review itself. Those larger orchestration and evaluation tasks may need to use more powerful models in order to provide autonomous output while building the necessary trust with the humans in and above the loop. But do all of the agents in the factory have the same requirements?

Sarah Sachs from Notion gave an incredible talk about just that, and if you missed it, make a note to go watch the recording. The main stage was livestreamed; you could watch it right now. Her main point was this: Most of us don't work for a frontier AI lab. A large fraction of the things we do, both in our own work and in our products, don't require the biggest, hottest, most token-hungry models. If we lock ourselves into one AI vendor or one hard-coded AI model, we're doing ourselves, our shareholders, and our customers a disservice.

At this point, even the models that aren't bleeding edge from any vendor are pretty decent and good enough to do simple to medium tasks like summarization. Using Opus to summarize a Slack thread is a bad choice. You likely don't need any model to convert a file from one type to another. As AI is getting better, it's also getting more expensive. Price is only going up, as are other social and environmental costs, and "simply throwing more tokens at it" is not a viable move for most companies.

Sarah's advice was to build smartly and modularly enough to avoid vendor and model lock-in. Switch models and vendors on a dime when the billing shifts to make the right choice for your customers. Try out open-weight models, especially for those midrange tasks where "good enough" is good enough. Balance the needs of all the different agents in your “factory,” make the "tokenomics" work for you, and it will help keep the costs a little more balanced for everybody else, too.
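One way to keep that flexibility is to route by task tier through a single table, so swapping a vendor is a data change rather than a code change. This is a sketch under assumptions, not Notion's implementation: the task names, tiers, vendors, and model names below are all placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ModelChoice:
    vendor: str
    model: str


# Hypothetical mapping from task to the tier it actually needs.
TASK_TIERS: Dict[str, str] = {
    "convert_file": "no_model",  # plain code does this; no tokens spent
    "summarize_thread": "midrange",
    "orchestrate_agents": "frontier",
}

# Hypothetical routing table: the only place vendors and models are named.
DEFAULT_ROUTES: Dict[str, ModelChoice] = {
    "midrange": ModelChoice("local", "open-weight-small"),
    "frontier": ModelChoice("vendor-a", "flagship"),
}


def pick_model(
    task: str, routes: Dict[str, ModelChoice] = DEFAULT_ROUTES
) -> Optional[ModelChoice]:
    """Return the model for a task, or None when no model is needed at all."""
    if task not in TASK_TIERS:
        raise ValueError(f"unclassified task: {task}")
    tier = TASK_TIERS[task]
    if tier == "no_model":
        return None
    return routes[tier]


# Repricing day: swap the midrange route without touching any call sites.
cheaper_routes = {**DEFAULT_ROUTES, "midrange": ModelChoice("vendor-b", "budget")}
```

Refusing unclassified tasks, rather than silently defaulting them to the most expensive tier, forces someone to decide what each new task really needs.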

AI is going loopy, but in a good way

Iain Thomson — Wed, 01 Jul 2026 13:38:05 +0000

As you’d expect, the opening keynote of the AI Engineer World’s Fair was kicked off by one of its co-founders, @swyx (Shawn Wang), and he was in a poetic mood.

“In the beginning, there was the token, then there was the chat,” he said. “Then we're going to use tools, then we learn to set goals of skipping a few steps, and these days for all of the automations for all of the products. There's a lot of loops happening.”

At a basic level, AI agent loops work by having the system evaluate its own intermediate output — checking it against the task's success criteria or running it through an evaluator step — rather than simply returning the first response. If the evaluation indicates the task isn't complete, the system makes additional calls to the LLM, incorporating tool results or prior output, and repeats until the task is judged done, without needing a human to intervene at each step.
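Stripped to its skeleton, that loop is only a few lines. In this sketch, step stands in for an LLM call (optionally using tools) and is_done for the evaluator; both are hypothetical callables supplied by the caller, and the iteration cap is an added safety assumption.

```python
from typing import Callable, List, Tuple


def run_loop(
    step: Callable[[str, List[str]], str],
    is_done: Callable[[str], bool],
    task: str,
    max_iterations: int = 5,
) -> Tuple[str, int]:
    """Call step until the evaluator accepts the output or the budget runs out."""
    history: List[str] = []  # prior outputs, passed back in on every call
    for iteration in range(1, max_iterations + 1):
        output = step(task, history)
        if is_done(output):
            return output, iteration
        history.append(output)
    raise RuntimeError(f"task not completed within {max_iterations} iterations")


# Toy stand-ins: each call produces the next draft; the evaluator wants draft 3.
def toy_step(task: str, history: List[str]) -> str:
    return f"{task} v{len(history) + 1}"


result, iterations = run_loop(toy_step, lambda out: out.endswith("v3"), "report")
```

Everything interesting in a production loop lives inside those two callables; the control flow around them stays this simple.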

These loops can also compound over time: As employees correct and refine the system's outputs, those interactions can be captured to improve future performance — not just within a single task, but across the organization's use of the system.

Microsoft CEO Satya Nadella made this case in a LinkedIn post two weeks ago, framing it as something companies should own rather than cede to AI vendors: "This loop becomes the new IP of the firm. I think of it as a hill climbing machine. And unlike most assets, it compounds."

Unsurprisingly, Pablo Castro, a distinguished engineer at Microsoft, agrees with his boss. He claimed that as loops have been deployed within the company, they have produced major increases in the accuracy of AI outputs, and said Microsoft has built a component called Agent Optimizer in its Azure AI Foundry platform for customers to use.

“We want to make it easy for you to integrate them into agents you're building,” he explained.

“We do this from our agent platform that starts in GitHub, where we all go and build as a contextualization system, so you can ground your agents. When it comes to hosting some mobility and management, we do all of these in Foundry. We offer thousands of models in our model catalog there, so you can pick whatever is the right model for the right task, and we will keep adding more every day.”

‘There has never been a better time to be an engineer’

During OpenAI’s keynote, Romain Huet, its head of developer experience, said that technologies like loops have dramatically increased productivity within the company. Previously, OpenAI was putting out new models every 15 months, but now it is taking six weeks.

“There has never been a better time to be an engineer, because engineering was never about writing code. Engineering has always been about solving problems for yourself and for other people as well,” he enthused.

“It's about taking the latest science and combining it with design, taste, judgment, and most of all, imagination to make something that people can actually use. We think it's a return to the roots of engineering and the technology we're building on is accelerating, getting faster and faster.”

Peter Steinberger, founder of OpenClaw, was brought on stage to also sing the praises of loops, saying they have completely changed the way he works. In January, Steinberger recounted, he was juggling 10 terminal windows at a time. Now he has a management system that clears away the easy ones and allows him to deal with harder problems.

This doesn’t entirely solve bottlenecks in a project, however, he explained. “Last year, I was primarily constrained by tokens — I fixed it by joining OpenAI,” which caused some audience mirth. “Now I'm primarily constrained by attention, and unlike tokens or compute, I can't simply add more of it. So the most important skill today is deciding where to spend it.”

Training loops are absolutely crucial, OpenAI’s Alexander Embiricos, the head of product for Codex, argued in his section of the keynote. Users need to train the system about not only the work that is required, but why it has to be done. The end results allow faster prototyping and give a major productivity boost.

Speeding up the learning process

In his keynote, just to show that there were no hard feelings, Thom Wolf, co-founder of Hugging Face, brought on stage someone who he’d tried to hire, but who had turned him down. Olive Son is the research lead at MiniMax, a company Wolf described as “one of the top of what we call the AI dragons in China.”

Last month, MiniMax released M3, an open weight large language model that can work from text, image, and video inputs. Son said that the video angle was particularly important, claiming that M3 could actually process YouTube videos and learn from them.

“We know that a lot of labs run into problems doing that — the model would collapse after a couple of training steps with both text and visual understanding, but we managed to solve that problem,” she explained.

“We did a lot of work on VIP (Value-Implicit Pre-training), and we did a lot of work on the data that we're actually training. For example, we do what we call interleaved data — it's actually natural data — but we keep the images and videos in instead of masking it out, and we do some pretty good cleaning and masking on the data, and we do very good reward modeling, so that we train it from the first step and it scales up a lot.”

This kind of model was envisaged by Yan Junjie before the company even started, Son said. It’s the most recent open weight multimodal model, meaning people can use it but not access the training data, training code, or inference operators. But the support of the open source community was important, she said, and could make the model better, and MiniMax can use that knowledge to improve future model builds.

Security track makes its debut

Finally, the first keynote session concluded with the announcement of an entirely new track for the Fair. Randall Degges, vice president of engineering and developer relations at AI security company Snyk, said the conference would now have a dedicated security track.

He acknowledged that there are security problems with the technology, but said that these are often overhyped — not least by governments. There were cheers from the audience when he cited the recent blocking of Anthropic's Fable 5 and Mythos 5, and the limited release of OpenAI GPT 5.6.

“As part of our ongoing engagement with the U.S. government, we previewed our plans and the models’ capabilities ahead of today’s launch,” OpenAI said, through gritted teeth, one suspects, last Friday. “At their request, we are starting with a limited preview for a small group of trusted partners whose participation has been shared with the government, before releasing more broadly.”

Degges said the talks in the track should reassure many, as well as provide innovative new ways to use AI to lock out security gremlins.

DevRel in the Age of AI Is A Search for Meaning

Sam Bhagwat — Wed, 01 Jul 2026 13:22:46 +0000

When I was in college, I read an essay called "The Work of Art in the Age of Mechanical Reproduction" by a German literary critic named Walter Benjamin.

Written in 1936, Benjamin's essay talks about how new media like photography or film remove an artwork's traditional "aura," by which he meant its unique presence in a particular time and place.

About 20 years after reading it, I became famous (infamous?) by printing out hundreds of thousands of copies of a book I wrote, called "Principles of Building AI Agents," and handing it out at every developer and AI event.

One reason this works: AI drives the cost of content creation to zero. In an age of LinkedIn slop, people react well to words written by a human and printed out on dead trees.

This is also why cinematographic videos, color grading and all, are having a renaissance. Countersignaling works.

So DevRel in the age of AI is about aura-maxxing. Creating art so specific to this time, this place, and this medium that a reader needs no annotation. They know it's for them.

It's this weird barbell world, part Zara, part Tiffany. AI increases the rate of tech change dramatically (at least three to four times by my estimation), so media decays incredibly fast. So, luxury in-person experiences (hackathons, events, meetups) become more valuable than ever.

Why? Well, here's an analogy Benjamin would have appreciated: During periods of high inflation, everyone tries to get their hands on metals like gold or silver that don't depreciate.

A premier live event is hard currency. It only happened once, and you were in the room.

The other thing that's more valuable now than ever is what I'd call informed trend pieces.

In my past life in the React web ecosystem, @swyx always had a reputation for giving the most think-y conference talks. He didn't talk about how to use RSC or how RSC worked under the hood. He talked about what RSC meant about the future of client-server interactions.

And with an increase in the rate of change comes an increase in our human need for meaning.

So here you are: at the premier in-person AI event, listening to talks created for this moment in time, trying to figure out what's going on, and what it all means.

Sam Bhagwat is the founder/CEO of Mastra, author of “Principles of Building AI Agents,” and previously the cofounder of Gatsby. He spoke on Tuesday at 4pm: “Every Harness Will Become A Claw”. Say hi if you see him!

The Agentic, Ironclad Onion

Ryan Palo — Wed, 01 Jul 2026 13:20:55 +0000

As AI agents work under less and less human supervision, a trustworthy, secure work platform and configuration for them becomes critical to fending off ever-evolving security threats. The basics revolve around one tenet: Deny all permissions by default, and give the agent only the permissions it needs, at every level of the system possible. The best security is defense in depth, with each layer being as hardened as possible.

The purpose of this article is not to give you a bulletproof step-by-step hardening plan after which you will have an Unhackable Agent™️. The goal is for this to be an agentic security starting point, to give you an idea of what to consider and what mindsets are helpful, and to spark an interest in all of the extremely bright security-based presentations happening at AI Engineer World's Fair 2026 where you can learn more. If there are any you missed on Tuesday, don't forget to have your agent remind you to check them out after they get uploaded to YouTube.

Here's where we start:

The Agent is an Adversary

This is the mindset you should work from. It's not because the agent is inherently malicious by design (put your tinfoil hat away for now). But, even with the best input filtering and defensive system prompting, there's always the chance that someone will find a way to inject a clever jailbreak into the context of your agent. Much like Dr. Jekyll and Mr. Hyde, your helpful, productive agent is one serum injection attack away from behaving like an attacker instead. As always, hoping that it won't happen doesn't count as a strategy.

Kernel-Based Protections

We'll start at the base: sandboxing the OS runtime. This is important: Containers aren't a sandbox. It's simply not enough to throw your agent into a container and declare victory. Containers provide process isolation and were designed for deployment consistency, not for containing an adversary that is trying to get out. So, a much safer starting point is using a dedicated VM or microVM where only the bare minimum of system calls are safe-listed, and filesystem access can be controlled at the kernel level based on process, not user permissions. Your agent doesn't need to be mounting disks or reconfiguring networks (probably). So let's block that at the kernel level.

Network-Based Protections

Eventually, to be maximally productive, your agent will likely need to reach out to the internet or, at least, an internal network. A network layer of security is the required next step, and, luckily, this layer is more familiar. For a few decades now, web developers have worked with the mindset that the internet is effectively a radioactive, zombie-infested, toxic wasteland that benign users sometimes wander through. By default, all network access should be denied, and specific domains, requests, and patterns should be allowed on an as-needed, controlled basis. Always keep in mind that a prompt injection could live in any text the agent consumes, including the text on the web pages it reads while looking up information, so make sure you actually trust every source you safe-list.
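To make "deny by default" concrete, here is a minimal sketch of an outbound request check, assuming a hypothetical set of safe-listed hosts (the hostnames below are placeholders, not recommendations). In a real deployment this rule would more likely live in a proxy or firewall outside the agent's reach. The point is only the shape of the check: match the parsed hostname exactly, and refuse everything else.

```python
from urllib.parse import urlsplit

# Hypothetical safe-list; these hosts are placeholders for illustration.
ALLOWED_HOSTS = {"api.example.com", "docs.example.com"}
ALLOWED_SCHEMES = {"https"}


def is_request_allowed(url: str) -> bool:
    """Return True only if the URL's scheme and host are explicitly safe-listed."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # unparseable URL: deny
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    # Compare the parsed hostname exactly. Substring or prefix checks would
    # let "api.example.com.evil.test" slip through.
    host = (parts.hostname or "").lower()
    return host in ALLOWED_HOSTS
```

Matching on the parsed hostname rather than the raw string is the design choice that matters here: lookalike domains and credentials smuggled in before an "@" both fail the check.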

Policy-Based Protections

There may be additional safety and business rules you want to enforce on top of both of these lower layers, such as API or tool quotas to avoid cost overruns or accidentally DDoS-ing your own API. Your agent may be able to make network requests, and it may have permission to send POST requests to your API, but it probably shouldn't be able to send unlimited requests by default. As always, the best default is to deny. Only allow agents to make these calls or use these tools if they've been configured according to a policy you're comfortable with. While these policy-based checks can add a few milliseconds of latency to your agent, they give you finer-grained control over its higher-level actions.
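As a rough illustration of a quota policy, the sketch below assumes a hypothetical ToolPolicy class that the agent harness consults before every tool call. The class, the tool names, and the limits are all invented for this example. Tools with no configured quota are denied outright.

```python
from collections import defaultdict


class ToolPolicy:
    """Hypothetical deny-by-default policy with a per-tool call quota."""

    def __init__(self, quotas: dict[str, int]):
        self._quotas = dict(quotas)    # tool name -> max calls allowed
        self._used = defaultdict(int)  # tool name -> calls made so far

    def authorize(self, tool: str) -> bool:
        """Return True and count the call if it is allowed, else False."""
        limit = self._quotas.get(tool)
        if limit is None:
            return False  # tool was never configured: deny
        if self._used[tool] >= limit:
            return False  # quota exhausted: deny
        self._used[tool] += 1
        return True


policy = ToolPolicy({"http_post": 2})
results = [policy.authorize("http_post") for _ in range(3)]
shell_allowed = policy.authorize("shell")
```

Here the third http_post call is refused because the quota of two is spent, and shell is refused because nobody configured it.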

Auth-Based Protections

It's a beautiful thing how short-lived the hype cycle for "It's just OpenClaw, bro, give it all the same permissions you have and let it rip" was. An agent should not have all of the authorization you have, and it shouldn't be able to authenticate as you. If you have a personal agent that summarizes your emails and responds to bug reports, it doesn't need your bank account credentials or your AWS token. Treat it as its own entity, but don't extend it the trust you would give a human teammate. Treat it like a sleeper agent that could activate and become evil at any given moment. Give it its own accounts and tokens, and, again, scope those accounts and tokens to the bare minimum it needs to perform its functions. Many OS sandbox solutions take this down to the OS level and give the agent in the sandbox its own user account as an added layer of security.
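One way to picture a narrowly scoped agent identity is the sketch below. AgentToken, its fields, and the scope strings are hypothetical names for illustration, not any real provider's API. The token belongs to the agent rather than to you, lists only the scopes it needs, and expires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentToken:
    """Hypothetical credential issued to the agent itself, never to its owner."""

    subject: str
    scopes: frozenset[str]
    expires_at: float  # Unix timestamp

    def allows(self, scope: str, now: float) -> bool:
        """Allow an action only if the token is unexpired and holds the scope."""
        return now < self.expires_at and scope in self.scopes


token = AgentToken(
    subject="agent:inbox-summarizer",
    scopes=frozenset({"mail:read", "issues:comment"}),
    expires_at=1_000.0,
)
```

With this token, reading mail is allowed until the expiry time, while anything outside the two listed scopes (deploying to a cloud account, say) is refused no matter when it is attempted.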

Ephemeral Runtimes

At a certain point, the security-grounded mindset is one that thinks, "You know, it's eventually a certainty that this agent will get prompt-injected or otherwise download something malicious from the internet." It doesn't even have to be your agent's fault. A package on npm could get hacked (pause for shocked silence). An email attachment could have something malicious tucked away in an image.

Runtimes should be as ephemeral as possible. It should be easy, and possibly even routine, to throw away your agent along with its runtime and spin up a fresh one.

If something bad does happen, being able to nuke it into oblivion (or package it up in deep freeze for security analysis) is a pretty good starting point for a mitigation strategy. Ideally, you would spin up the environment, have the agent do its task, save out any artifacts, and then destroy the environment once it's done.
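The spin up, work, save, destroy lifecycle can be sketched with a context manager. This is only an analogy under a big assumption: a temporary directory stands in for the VM or microVM, and ephemeral_workspace is a hypothetical helper name. A directory is not a security boundary, so read this for the lifecycle and not for the isolation.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace(artifact_dir: Path, keep: tuple[str, ...] = ()):
    """Create a throwaway workspace, copy out named artifacts, then destroy it."""
    workspace = Path(tempfile.mkdtemp(prefix="agent-"))
    try:
        yield workspace
        # Reached only if the task finished without raising: save the
        # explicitly named artifacts and nothing else.
        artifact_dir.mkdir(parents=True, exist_ok=True)
        for name in keep:
            src = workspace / name
            if src.is_file():
                shutil.copy2(src, artifact_dir / src.name)
    finally:
        # Always destroy the workspace, whether the task succeeded or failed.
        shutil.rmtree(workspace, ignore_errors=True)
```

The artifact list is supplied by the caller rather than discovered inside the workspace, so a compromised task can't decide for itself what gets carried out of the environment.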

Monitoring

The sibling of "hope is not a strategy" is "not knowing isn't an excuse." At every stage of operation, all of your agent's actions, decisions, sources, and artifacts should be logged, measured, and, as a callback to the Policy section, possibly have limits enforced on them. If your agent uses the curl tool every 30 seconds, or runs a bash command 90,000 times today, that's something you should know about. If its token, CPU, or memory usage spikes outside of normal ranges, that's a problem. Ideally, your OS and Policy protections will save you from those issues, but you absolutely want to know if something doesn't behave as expected, in the same way you would keep logs and metrics on any other piece of security-critical, high-access software. Metrics, logs, and artifacts are excellent tools to have available for preventive defense as well as disaster recovery and root cause analysis.
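For a sense of what "you should know about it" might look like in code, here is a small sliding-window counter. ToolCallMonitor and its thresholds are assumptions made up for this sketch. A real setup would ship these events to your existing logging and metrics stack instead of keeping them in memory.

```python
from collections import deque


class ToolCallMonitor:
    """Hypothetical monitor that flags a tool called too often within a window."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls: dict[str, deque] = {}  # tool name -> recent timestamps

    def record(self, tool: str, timestamp: float) -> bool:
        """Record one call; return True if the tool now exceeds its limit."""
        calls = self._calls.setdefault(tool, deque())
        calls.append(timestamp)
        # Drop timestamps that have aged out of the window.
        while calls and calls[0] <= timestamp - self.window_seconds:
            calls.popleft()
        return len(calls) > self.max_calls


monitor = ToolCallMonitor(max_calls=2, window_seconds=60)
flags = [monitor.record("curl", t) for t in (0, 30, 45, 200)]
```

In this run the third call is flagged because three curl calls land within 60 seconds, and the call at t=200 is not, because the earlier ones have aged out of the window.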

Paranoia as a Service

In all honesty, the most successful security mindset to adopt is one of slight (nondebilitating) paranoia: At any given time, what's the worst possible thing the agent could do or compromise if an adversary successfully gained control of it? It's our job as builders and users of these new and growing technologies to hope for the best but prepare for the worst, the same way we always have.

Trust but verify when using AI for fixing security flaws

Iain Thomson — Wed, 01 Jul 2026 13:20:20 +0000

AI might seem like a magic bullet for fixing security issues, but it's not that simple, warned Eugene Yan, a member of technical staff at Anthropic, during the newly inaugurated security track at AI Engineer World's Fair. The effectiveness of AI in finding and fixing flaws is doubling every five months, he said, pointing to Mozilla releasing a 423-patch bundle in April. This was more patches than were released in all of 2025.

But while agents are good at finding and fixing flaws, the human element is still needed, say many security professionals. This is both to check that the AI has done the work properly and to make sure that seemingly low-risk bugs can't be strung together to make a serious exploit that AI might not spot.

To fix this, Yan proposed a six-stage program. "We found that most teams converge in approximately these six steps, and a big chunk of my thoughts will be about these," he told the crowd.

First, a threat-finding stage identifies a potential flaw and hands it to stage two, a sandbox, to see if proof-of-concept code can exploit the issue. The third stage is a discovery phase in which the sample is checked against past issues that may have been fixed.

Stage four is an independent verification step, designed to further filter out false positives. In stage five, the results are triaged to avoid flooding human checkers. Finally, in stage six, a patch is developed and the code is kicked back to the discovery engine.

The end result, he argued, will be much more secure code that still maintains human oversight, while making the lives of security staff a lot easier. Of course, if the current rate of improvement in AI systems continues, that need for human oversight may not last forever.