DEV Community

# reliability

General discussions on building and maintaining reliable software systems.

Posts

An AI agent design that refuses to act on what it merely assumes

3 min read
Automatic Error Recovery in AI Agent Networks

2 min read
Survey: 93% of large North-American IT shops have hit an AI-coding incident

2 min read
Chaos Engineering for Teams That Aren't Netflix

3 min read
Error Budgets in Practice: A No-BS Guide

2 min read
relysam v2.0.0: a core Reliability Engineering platform with AI/ML enhancements

2 min read
Choosing Your First SLI: A Decision Framework for New SRE Teams

2 min read
Synthetic Monitoring Best Practices: What to Monitor and How Often

6 min read
Synthetic Monitoring vs Real User Monitoring (RUM): The Difference

4 min read
What Is Synthetic Monitoring? The Complete Guide

6 min read
Ten 95% Reliable Agents Chained Together Give You a 60% System. Microservices Solved This a Decade Ago.

4 min read
Your MCP Agent is Logging "Success: true" While the task never ran

3 min read
Three AI providers went down on the same day. Here's the architecture that didn't care.

5 min read
Surviving the region you run in: failover on Aurora DSQL, and what the demo proves

5 min read
Sliding-Window Spend Guard: the $47K Loop Per-Call Caps Miss

11 min read