Case Studies

Real migrations. Real savings.

Platform teams are leaving expensive observability platforms — not because the tools are bad, but because the pricing model works against them at scale. Here's how three organizations made the switch.

Annual savings delivered

$2.6M+

Average cost reduction

72%

Dashboards preserved

100%

Fastest implementation

6 weeks

Defense & Intelligence

Defense Contractor · Air-Gapped Environments

Sentinel Defense Systems: “The observability tool itself was never the latest and greatest.”

How a defense contractor replaced painful whole-version upgrades with an adaptive platform that improves in place.

Upgrade cycle

Eliminated

From quarterly 6-week cycles to continuous in-place refinement

Time to deploy

3 weeks

vs. 4+ months for previous commercial tool upgrades

Data leaving perimeter

Zero

100% customer-owned infrastructure, air-gapped by design

The Challenge

Sentinel's classified networks ran a commercial observability platform that was always two or three major versions behind. Every upgrade meant importing an entire monolithic software package through a one-way sneakernet transfer on USB drives and DVDs, even when only a single feature was needed. Splunk's own engineers, presenting at .conf19, summed up the most common method of getting data across an air gap as "A resounding no!" Sentinel's team spent weeks per upgrade cycle coordinating physical media transfers, re-validating security controls, and re-certifying the environment, only to discover that the new version introduced cloud-dependent "phone home" features that broke in disconnected mode. Meanwhile, known vulnerabilities sat unpatched because the next full version wasn't approved yet.

The Solution

ExitGraph embedded forward-deployed engineers with active security clearances directly in Sentinel's classified environment. Instead of importing a monolithic commercial platform, the team built an open-source observability stack (OpenTelemetry collectors, Prometheus-compatible metrics, and structured log pipelines) that could be tailored and improved incrementally while running on the air-gapped network. Need a new alerting rule? Add it. Need a different retention policy? Change it. No waiting for the vendor's next major release. No importing 40 GB of software to get one minor feature. The platform is refined over time until it's exactly what the mission requires.

The Outcome

Because the stack was deployed on infrastructure the customer already controlled, it went live in 3 weeks, compared with 4+ months for previous commercial tool upgrades. The team eliminated the quarterly "upgrade dread" cycle entirely. Sentinel now runs observability that evolves continuously inside the perimeter: no data leaves, no vendor dependencies, no forced version upgrades that break air-gapped operations.

“We used to dread every upgrade. Import the whole package, pray nothing breaks in disconnected mode, spend weeks re-certifying. Now we tune exactly what we need, when we need it — without ever crossing the air gap.”
— Platform Engineering Lead, Sentinel Defense Systems

Air-Gapped · Classified · IL4/IL5 · Forward-Deployed Engineers

Industry sources

Splunk .conf19: "Running Splunk in an Air-gapped Environment"

FedInsider: "The State of Air-Gapped Networks in Government"

Carahsoft: "Securing Air-Gapped and Classified Environments"

Healthcare

Healthcare · 200+ Microservices · HIPAA-Regulated

Meridian Health Network: “We were spending more on observability than on the infrastructure we were observing.”

How a healthcare network cut observability costs 72% while maintaining full compliance and zero visibility gaps.

Annual savings

$613K

From $847K/yr to $234K/yr (72% reduction)

Migration duration

4 months

Phased parallel deployment with zero downtime

Dashboards preserved

100%

Every dashboard, alert, and search workflow migrated

The Challenge

Meridian's Datadog bill had crossed $847,000 per year and was growing faster than their infrastructure. Custom metrics from high-cardinality Prometheus-style labels generated thousands of unique time series — each one billed individually. Log volume spiked unpredictably during incident response, and APM costs scaled with every new service onboarded. As industry analysts have documented, "Datadog bill shock has become such a common experience that it's practically a rite of passage for engineering teams." Meridian's finance team flagged that observability had become the third-largest line item in their cloud budget, behind only compute and storage. The team was trapped: migration felt risky, dashboards were embedded in workflows, and alerts were relied upon for HIPAA-regulated uptime SLAs.
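To see why label cardinality drives the number of billable series, here is a minimal sketch. It assumes the worst case, in which every combination of label values actually occurs. The 200 services come from the profile above; the `endpoint` and `status_code` labels and their counts are hypothetical, chosen only for illustration.

```python
from math import prod


def series_count(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on unique time series for one metric.

    Each distinct combination of label values is its own series, so the
    bound is the product of the number of distinct values per label.
    """
    return prod(label_cardinalities.values())


# Hypothetical label set for a single request-latency metric.
labels = {"service": 200, "endpoint": 10, "status_code": 5}

print(series_count(labels))  # 10000
```

Adding one more label with even a handful of values multiplies the total, which is how a single metric can grow into thousands of individually billed series.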

The Solution

ExitGraph ran a phased migration over four months, starting with a parallel deployment that dual-shipped telemetry to both Datadog and the new open-source stack. The architecture shift was fundamental: instead of per-ingest pricing, Meridian moved to a pay-for-infrastructure model where Prometheus-compatible metrics storage has no per-custom-metric fees, log storage uses label-indexed object storage at a fraction of full-text-indexed costs, and trace storage requires minimal indexing. ExitGraph preserved every dashboard, alert, and search workflow — translating Datadog-specific queries to PromQL and LogQL, recreating alert rules in Alertmanager, and migrating log pipelines to OpenTelemetry collectors.

The Outcome

First-year savings reached $613,000, a 72% cost reduction from $847,000 to $234,000 per year. The cost advantage is expected to grow as Meridian scales, because object storage costs grow roughly in proportion to retained data, while per-ingest and per-custom-metric charges compound as new services and label combinations are added. The migration maintained full HIPAA compliance throughout, with zero downtime and no gaps in monitoring coverage during the transition.
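The headline figures can be checked with simple arithmetic. The sketch below uses the before and after annual costs reported for Meridian ($847K and $234K); the function name is illustrative only.

```python
def cost_reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute savings, fractional reduction) for two annual costs."""
    savings = before - after
    return savings, savings / before


# Annual observability spend before and after the migration, in dollars.
savings, fraction = cost_reduction(847_000, 234_000)

print(f"${savings:,.0f} saved, {fraction:.0%} reduction")
# $613,000 saved, 72% reduction
```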

“The bill doubled. Then doubled again. Nobody made a bad decision — the pricing model just works against you at scale. ExitGraph showed us there was another way.”
— VP of Platform Engineering, Meridian Health Network

Datadog Migration · Cost Reduction · HIPAA · OpenTelemetry

Industry sources

Gudimetla (2025): "Transitioning to Open-Source Observability" (68% cost reduction in an 800 GB/day deployment)

KubeWright: "How Platform Teams End Up With Six-Figure Observability Bills"

OneUptime: "Datadog Bill Shock Is Real" ($100K/yr typical mid-market spend)

Financial Services

Financial Services · Multi-Cloud · SOC 2 Type II

Atlas Financial Group: “We already had the cloud. We just needed someone to build on it.”

How deploying on the customer's existing infrastructure cut implementation time by 60% and eliminated vendor lock-in.

Implementation time

6 weeks

60% faster than previous platform migrations

License savings

$1.2M/yr

Eliminated annual Splunk Enterprise licensing

New infrastructure

Zero

Deployed entirely on existing EKS clusters and S3 storage

The Challenge

Atlas ran Splunk Enterprise across AWS and Azure for log management and security monitoring. Their annual Splunk license renewal was approaching $1.2 million, and the platform team had been told to find alternatives. But every vendor they evaluated wanted to deploy their own infrastructure — new clusters, new storage backends, new networking rules — adding months of procurement, security review, and compliance certification. Atlas already had well-provisioned Kubernetes clusters, S3-compatible object storage, and a mature DevOps team. They didn't need another vendor's infrastructure. They needed someone to build the right observability stack on what they already had.

The Solution

ExitGraph's approach was fundamentally different: deploy on the customer's existing cloud. The team provisioned the open-source stack directly on Atlas's existing EKS clusters and S3 buckets — no new infrastructure procurement, no new vendor security reviews, no additional cloud accounts. Because the target-state platform ran on infrastructure Atlas already owned and operated, the security and compliance teams had far less to review. The SOC 2 Type II audit scope didn't expand because no new third-party data processors were introduced. Atlas's existing IAM policies, VPC configurations, and encryption standards applied automatically.

The Outcome

Implementation took six weeks — 60% faster than Atlas's previous platform migrations. The team eliminated $1.2 million in annual Splunk licensing while gaining full ownership of their observability data. Because the stack runs on Atlas's own infrastructure, there are no per-GB ingest fees, no per-host charges, and no surprise invoices. Costs scale with the infrastructure they already budget for.

“Every other vendor wanted to sell us their cloud. ExitGraph just built on ours. That's why it took weeks instead of months.”
— CTO, Atlas Financial Group

Splunk Migration · Multi-Cloud · SOC 2 · Existing Infrastructure

Industry sources

r/devops: "Observability platform for an air-gapped system" (practitioners discuss self-hosted alternatives)

OpenObserve: "Top 11 Splunk Alternatives" (60-90% cost reduction with open-source stacks)

The New Stack: "A Guide to Safe, Incremental Open Source Observability Migration"

Research & Data

The industry data behind these results

Our case studies are grounded in publicly available industry research, peer-reviewed publications, and vendor documentation. The cost patterns described here are consistent with findings across multiple independent sources.

$1.5–2.5M

Annual cost of commercial observability platforms at 1 TB daily telemetry, according to peer-reviewed research.

Gudimetla, S. (2025). Journal of Computer Science and Technology Studies, 7(12), 495-512.

60–80%

Typical cost reduction when migrating from commercial to open-source observability stacks at scale.

KubeWright (2026). "How Platform Teams End Up With Six-Figure Observability Bills."

~$100K/yr

Typical Datadog spend for a mid-market team with 50 hosts and 100 GB/day log volume.

OneUptime (2026). "Datadog Bill Shock Is Real."

"A resounding no!"

Splunk's own engineers on the most common method of getting data across an air gap, presented at .conf19.

Schohn, S. (2019). Splunk .conf19, Session FN1190.

Ready to write your own case study?

Start with a free assessment. We'll analyze your current observability spend, map your dashboards and alerts, and show you exactly what the migration looks like — with a timeline and projected savings.