SLO Management With AI

SRE

AI can recommend SLO targets. You balance user expectation, engineering capacity, and business risk.

DevOps

Error budgets drive prioritization. AI reports; you decide what to fix first.

TL;DR

  • AI can track SLOs, compute error budgets, and suggest targets from historical data. Useful for visibility.
  • What AI can't do: decide what "good enough" means, negotiate with product, or balance reliability vs. feature velocity. That's human.
  • Use AI for measurement and alerting. You own the policy: when do we stop shipping and fix reliability?

SLOs (Service Level Objectives) and error budgets are a contract between engineering and the business. AI can measure; it can't negotiate.

What AI Handles

  • SLI computation. AI aggregates, segments, and trends availability, latency, and error rate, then feeds the results into dashboards and reports.
  • Error budget tracking. "You've consumed 80% of your budget this month." AI calculates; you act.
  • Target suggestion. "Historical p99 is 200ms; consider 250ms SLO." AI offers a baseline. You validate against user needs.
  • Anomaly vs. SLO. "Last week we breached; this week we're trending the same." AI surfaces patterns. You decide if it's acceptable.
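
The error budget tracking described above comes down to simple arithmetic. The following is a minimal sketch, assuming a request-based SLI and measuring consumption against the traffic seen so far in the window (not a projection for the full month). The names `SloWindow` and `budget_consumed` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    target: float      # e.g. 0.999 means 99.9% of requests should be good
    total_events: int  # all requests observed in the window so far
    bad_events: int    # requests that failed the SLI


def budget_consumed(window: SloWindow) -> float:
    """Fraction of the error budget used so far (1.0 means fully spent)."""
    allowed_bad = (1.0 - window.target) * window.total_events
    if allowed_bad == 0:
        # A 100% target (or no traffic) leaves no budget to spend.
        return float("inf") if window.bad_events else 0.0
    return window.bad_events / allowed_bad


window = SloWindow(target=0.999, total_events=1_000_000, bad_events=800)
print(f"{budget_consumed(window):.0%} of budget consumed")  # prints "80% of budget consumed"
```

The calculation is the automated part. What to do at 80% is still a policy question.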

What Requires Human Judgment

  • Setting targets. 99.9% availability sounds good. So does 99.5%. The difference is 5x in allowable downtime: over a 30-day month, roughly 43 minutes versus 3.6 hours. Product and execs have opinions. AI doesn't.
  • Error budget policy. When we're out of budget, do we stop releases? Slow down? Depends on company culture and risk tolerance.
  • Prioritization. We're over budget. Do we fix the database or the cache? AI can rank by impact; you decide by business priority.
  • SLO scope. What's in scope? What's out? AI can't draw the service boundary. You define what we promise.
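
The gap between two availability targets is easiest to see as allowable downtime. This is a small sketch, assuming a time-based availability SLO over a 30-day window. The function name `allowed_downtime` is hypothetical.

```python
from datetime import timedelta


def allowed_downtime(target: float, window: timedelta = timedelta(days=30)) -> timedelta:
    """Downtime the SLO tolerates over the window, for a target such as 0.999."""
    return window * (1.0 - target)


for target in (0.999, 0.995):
    print(f"{target:.1%} -> {allowed_downtime(target)}")
# 99.9% -> 0:43:12
# 99.5% -> 3:36:00
```

Putting the numbers in front of product and leadership tends to make the target conversation more concrete than debating percentages.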

How to Use AI for SLOs

Measurement layer: Let AI compute SLIs, track error budgets, and alert when we're trending poorly. Automation here is safe.
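
One way to turn "trending poorly" into an alert is a burn-rate check. The sketch below assumes a request-based SLI and requires both a long and a short window to burn fast before paging. The default threshold of 14.4 is an assumption borrowed from commonly cited multi-window burn-rate guidance (it corresponds to spending about 2% of a 30-day budget in one hour), and `Window`, `burn_rate`, and `should_page` are hypothetical names.

```python
from typing import NamedTuple


class Window(NamedTuple):
    bad: int    # events that failed the SLI in this window
    total: int  # all events in this window


def burn_rate(window: Window, target: float) -> float:
    """Observed error rate divided by the allowed error rate.

    1.0 means the budget is being spent exactly on pace to run out
    at the end of the SLO period.
    """
    if window.total == 0:
        return 0.0
    return (window.bad / window.total) / (1.0 - target)


def should_page(long_w: Window, short_w: Window, target: float,
                threshold: float = 14.4) -> bool:
    """Page only when both windows exceed the threshold (assumed value)."""
    return (burn_rate(long_w, target) >= threshold
            and burn_rate(short_w, target) >= threshold)


# The last hour was bad and the last five minutes are still bad: page.
print(should_page(Window(20, 1000), Window(2, 100), target=0.999))  # True
# The last hour was bad but the problem has stopped: do not page.
print(should_page(Window(20, 1000), Window(0, 100), target=0.999))  # False
```

The short window keeps the alert from firing on an incident that has already recovered.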

Policy layer: You define targets, review cadence, and escalation. AI can propose; you approve.
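
A policy that humans own can still be written down as data, so tooling only looks up the agreed action and never invents one. The thresholds and actions below are purely hypothetical placeholders. Yours come from the conversation with product and leadership.

```python
from typing import List, Tuple

# Hypothetical policy, ordered from most to least severe.
# Each entry: (budget consumed at or above this fraction, agreed action).
POLICY: List[Tuple[float, str]] = [
    (1.00, "freeze feature releases; reliability work only"),
    (0.80, "slow releases; require a reliability review"),
    (0.00, "ship normally"),
]


def action_for(consumed: float) -> str:
    """Return the human-approved action for a given budget consumption."""
    for threshold, action in POLICY:
        if consumed >= threshold:
            return action
    return POLICY[-1][1]  # fallback for unexpected negative input


print(action_for(0.85))  # slow releases; require a reliability review
```

Changing the policy then means editing a reviewed table, not retraining or re-prompting anything.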

Optimization layer: AI suggests "if you improve X, you'll gain Y budget." Useful for planning. You decide what to implement.
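
The "improve X, gain Y budget" estimate can be sketched as follows, assuming bad events can be attributed to a source and that a fix removes a fixed share of that source's errors. The function name `budget_gain`, the 50% reduction, and the sample counts are all illustrative assumptions.

```python
from typing import Dict, List, Tuple


def budget_gain(bad_by_source: Dict[str, int], total_events: int, target: float,
                reduction: float = 0.5) -> List[Tuple[str, float]]:
    """Estimate the budget fraction regained per source, largest first.

    `reduction` is the assumed share of each source's errors a fix removes.
    """
    allowed_bad = (1.0 - target) * total_events
    if allowed_bad <= 0:
        raise ValueError("target leaves no error budget to regain")
    gains = [(name, bad * reduction / allowed_bad)
             for name, bad in bad_by_source.items()]
    return sorted(gains, key=lambda item: item[1], reverse=True)


for name, gain in budget_gain({"cache": 200, "database": 600}, 1_000_000, 0.999):
    print(f"{name}: +{gain:.0%} of budget")
# database: +30% of budget
# cache: +10% of budget
```

The ranking reflects measured impact only. Business priority may still put the smaller item first.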

Do This Next

  1. Review your current SLOs. Are they based on data or gut feel? Use AI to analyze historical SLI data and propose evidence-based targets. Then socialize the proposal with stakeholders.
  2. Document your error budget policy in one page: what happens when the budget runs out? Who decides? Share it. Use it as the source of truth when debates arise.
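
For the first step, an evidence-based starting point can be derived from historical latency samples. This is a minimal sketch: it takes the observed p99 and adds headroom. The 25% headroom is an assumption that mirrors the "p99 is 200ms; consider 250ms" example earlier, and `suggest_latency_target` is a hypothetical name. The output is a proposal to validate with stakeholders, not a target.

```python
import statistics
from typing import List


def suggest_latency_target(samples_ms: List[float], headroom: float = 0.25) -> float:
    """Propose a latency SLO threshold: historical p99 plus headroom.

    Needs at least two samples; more history gives a steadier estimate.
    """
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    p99 = statistics.quantiles(samples_ms, n=100)[98]
    return p99 * (1.0 + headroom)


history = [200.0] * 500  # placeholder data: every request took 200 ms
print(suggest_latency_target(history))  # 250.0
```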