Enterprise SaaS | 6-week engagement | 80–150 employees

SaaS Provider: Full 99.9% Uptime Visibility

An enterprise SaaS provider had SLA commitments with no monitoring to back them up. We built an end-to-end observability platform so they could prove — and protect — their uptime.

New Relic | Synthetic Monitoring | NRQL | PagerDuty Integration

Key Results

99.97%
Measured uptime (first quarter)
4 min
Mean time to detect incidents
Faster incident response
Problem

SLA commitments with no way to measure compliance

The company had signed enterprise contracts promising 99.9% uptime but had no external synthetic monitoring, only internal health checks. When a customer reported an outage, the team had no data to establish when it started or how widespread it was.

No synthetic monitoring — uptime measured only by internal checks

SLA breach disputes with no independent data to reference

On-call rotation receiving alerts 12+ minutes after user-visible degradation

No service dependency map — cascading failures were invisible

Zero integration between monitoring and incident management (PagerDuty)
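The detection delay in the list above can be quantified as mean time to detect (MTTD): the average gap between when degradation starts and when the on-call engineer is alerted. The following is a minimal sketch under stated assumptions. The function name and the timestamps are illustrative only and are not data from this engagement.

```python
from datetime import datetime, timedelta


def mean_time_to_detect(incidents):
    """Average delay between incident start and first alert.

    `incidents` is a list of (started_at, detected_at) datetime pairs.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    delays = [detected - started for started, detected in incidents]
    if any(delay < timedelta(0) for delay in delays):
        raise ValueError("detection cannot precede the incident start")
    # Summing timedeltas needs an explicit timedelta start value.
    return sum(delays, timedelta(0)) / len(delays)


# Made-up timestamps for illustration (delays of 10, 14 and 12 minutes).
incidents = [
    (datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 10)),
    (datetime(2024, 2, 11, 14, 30), datetime(2024, 2, 11, 14, 44)),
    (datetime(2024, 3, 2, 22, 15), datetime(2024, 3, 2, 22, 27)),
]
mttd = mean_time_to_detect(incidents)
print(mttd)
```

The start time is the hard part to obtain without external monitoring, which is why independent synthetic checks matter for this metric.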

Solution

New Relic-first observability platform with external validation

We built a New Relic-centred platform: synthetic monitors running from 5 geographic locations, plus distributed tracing across all customer-facing services, which powered a service map that visualised dependencies. The PagerDuty integration routed alerts to the correct on-call tier within seconds.
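Running the same check from several locations makes it possible to separate a regional network blip from a real outage. The sketch below illustrates one way to do that, assuming a quorum rule in which an outage is declared only when at least two locations fail at once. The rule, the threshold, the location labels and every name in the code are hypothetical and do not describe the configuration used in this engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    location: str       # hypothetical location label
    succeeded: bool     # did the scripted check pass from this location?
    duration_ms: float  # how long the check took


def failing_locations(results):
    """Return the sorted labels of locations whose check failed."""
    return sorted(r.location for r in results if not r.succeeded)


def is_user_visible_outage(results, min_failing=2):
    """Assumed quorum rule: alert only if enough locations fail together."""
    return len(failing_locations(results)) >= min_failing


# One round of results from 5 locations (illustrative values).
results = [
    CheckResult("us-east", True, 412.0),
    CheckResult("us-west", True, 388.0),
    CheckResult("eu-west", False, 30000.0),
    CheckResult("ap-south", True, 501.0),
    CheckResult("sa-east", True, 467.0),
]
print(failing_locations(results))
print(is_user_visible_outage(results))
```

With a single failing location the quorum rule stays quiet. Lowering `min_failing` to 1 trades fewer missed regional problems for more noise.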

1

Synthetic monitor network

22 synthetic scripts across 5 locations, covering login, core workflows, and API endpoints

2

Full tracing instrumentation

New Relic agents on all 8 customer-facing services; distributed tracing enabled

3

Service map configuration

Dependency graph visualising 34 services and their health relationships

4

Alert policy redesign

Symptom-based NRQL policies with AI anomaly detection; 3-tier escalation to PagerDuty

5

SLA reporting dashboard

Automated weekly SLA reports per customer segment; shareable with enterprise accounts
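A weekly uptime figure per customer segment could be derived from synthetic check results with an NRQL query. The sketch below builds such a query in Python. It assumes that synthetic results are stored in the `SyntheticCheck` event type with `result` and `monitorName` attributes, which should be verified against your own account's data. The segment-to-monitor mapping and the monitor names are hypothetical.

```python
def weekly_uptime_nrql(monitor_names):
    """Build an NRQL query for the past week's synthetic success rate."""
    if not monitor_names:
        raise ValueError("at least one monitor name is required")
    for name in monitor_names:
        # Reject characters that would need escaping inside an NRQL string.
        if "'" in name or "\\" in name:
            raise ValueError(f"unsupported character in monitor name: {name!r}")
    quoted = ", ".join(f"'{name}'" for name in monitor_names)
    return (
        "SELECT percentage(count(*), WHERE result = 'SUCCESS') AS 'uptime' "
        "FROM SyntheticCheck "
        f"WHERE monitorName IN ({quoted}) "
        "SINCE 1 week ago"
    )


# Hypothetical mapping from customer segment to the monitors that cover it.
SEGMENT_MONITORS = {
    "enterprise": ["login", "core-workflow", "public-api"],
}
query = weekly_uptime_nrql(SEGMENT_MONITORS["enterprise"])
print(query)
```

Rejecting quotes outright, rather than trying to escape them, keeps the sketch simple and avoids building a malformed query from an unexpected monitor name.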

Output

From no visibility to provable SLA compliance

In the first quarter after deployment, the team detected and resolved 3 production incidents before any customer reported an issue. Measured uptime was 99.97%, exceeding the 99.9% SLA commitment. One enterprise customer specifically cited the new SLA reporting as a factor in its contract renewal.

99.97%
Measured uptime Q1
4 min
MTTD (down from 12+ min)
22
Synthetic monitors active
0
Customer-reported incidents (Q1)
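To put the uptime percentages in concrete terms, they can be converted into minutes of downtime. The sketch below assumes a 90-day quarter, which is an assumption of this example and not a figure from the engagement. Under that assumption, a 99.9% commitment allows about 129.6 minutes of downtime per quarter, and a measured 99.97% corresponds to about 38.9 minutes, or roughly 30% of that budget.

```python
def downtime_minutes(uptime_percent, period_days=90):
    """Minutes of downtime implied by an uptime percentage over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


# Assumes a 90-day quarter; adjust period_days for the real reporting window.
allowed = downtime_minutes(99.9)    # budget under the SLA commitment
actual = downtime_minutes(99.97)    # downtime implied by measured uptime
budget_used = actual / allowed

print(round(allowed, 1), round(actual, 2), round(budget_used, 2))
```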
