
Faster detection & resolution

Observability and AIOps that reduce mean time to detect and mean time to recover: Datadog and resolve.ai for detection, ServiceNow Now Assist for resolution, backed by error budgets, runbooks, and blameless postmortems.

MTTD ↓: Datadog · resolve.ai
MTTR ↓: ServiceNow Now Assist
Practice: Error budgets · runbooks

Overview

Incidents were noticed late and resolved slowly, with on-call engineers piecing together context by hand. The aim was to cut MTTD (mean time to detect) and MTTR (mean time to recover) with AI-assisted operations across the holistic product lifecycle (PLDC).

Datadog and resolve.ai shorten detection and triage; ServiceNow Now Assist accelerates resolution. Around them sit error budgets, runbooks, and blameless postmortems that make the improvement durable, release over release.
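
To make the two metrics concrete, here is a minimal sketch of how MTTD and MTTR could be computed from incident timestamps. The `Incident` record, its field names, and the sample values are illustrative assumptions, not data from any engagement. MTTR is measured here from fault start to resolution; some teams measure it from detection instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """One incident record; the field names are illustrative assumptions."""
    started_at: datetime   # when the fault began
    detected_at: datetime  # when an alert or a person noticed it
    resolved_at: datetime  # when service was restored


def mean_delta(deltas: list[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    if not deltas:
        raise ValueError("need at least one incident")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents: list[Incident]) -> timedelta:
    """Mean time to detect: average of (detected - started)."""
    return mean_delta([i.detected_at - i.started_at for i in incidents])


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time to recover: average of (resolved - started)."""
    return mean_delta([i.resolved_at - i.started_at for i in incidents])


# Made-up sample values, for illustration only.
sample = [
    Incident(datetime(2024, 1, 1, 10, 0),
             datetime(2024, 1, 1, 10, 10),
             datetime(2024, 1, 1, 11, 0)),
    Incident(datetime(2024, 1, 2, 14, 0),
             datetime(2024, 1, 2, 14, 20),
             datetime(2024, 1, 2, 14, 40)),
]
print(mttd(sample))  # 0:15:00
print(mttr(sample))  # 0:50:00
```

Tracking both numbers per release makes it visible whether detection or resolution is the slower half of an incident.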


The challenge

Our approach

  1. Instrumented services with Datadog for end-to-end observability — metrics, traces, and logs in one place
  2. Used resolve.ai to assist detection and root-cause analysis, cutting mean time to detect
  3. Brought in ServiceNow Now Assist to accelerate triage and resolution, cutting mean time to recover
  4. Set error budgets and on-call practices, with runbooks for common failure modes
  5. Ran blameless postmortems and fed fixes back into the platform and the golden path
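
Step 4 above can be sketched in a few lines, assuming an availability SLO expressed as a fraction over a rolling window. The function names, the 99.9% target, and the 30-day window are hypothetical choices for illustration, not settings taken from this reference architecture.

```python
from datetime import timedelta


def error_budget(slo_target: float, window: timedelta) -> timedelta:
    """Total downtime the SLO tolerates over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return window * (1.0 - slo_target)


def budget_remaining(slo_target: float, window: timedelta,
                     downtime: timedelta) -> float:
    """Fraction of the budget still unspent; negative once overspent."""
    budget = error_budget(slo_target, window)
    return 1.0 - downtime / budget


# Hypothetical inputs: a 99.9% target over a 30-day window.
window = timedelta(days=30)
print(error_budget(0.999, window))
print(budget_remaining(0.999, window, timedelta(minutes=10, seconds=48)))
```

A 99.9% target over 30 days allows roughly 43 minutes of downtime. A team could use the remaining fraction to decide whether to keep shipping features or to pause and spend the next cycle on reliability work.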

Results & business impact

Tools & technology

Datadog · resolve.ai · ServiceNow Now Assist · MTTD · MTTR · Observability · Error budgets · Runbooks · SRE

Representative reference architecture from the NovasIQ developer-experience practice, illustrating how we approach this pattern across the holistic product lifecycle (PLDC). It reflects standard, proven engineering practice rather than a specific named client engagement, and outcomes are described qualitatively. MTTD and MTTR are core DevOps performance metrics. Delivery metrics follow public research: DORA / Google Cloud State of DevOps and Stack Overflow Developer Survey.
