CVS Health

Principal Architect – Cloud and Observability

Full-time

Location Type: Remote

Location: Illinois, United States

Salary

💰 $144,200 - $288,400 per year

About the role

  • Own the enterprise observability reference architecture covering metrics, logs, traces, and events across all environments (cloud and on-prem).
  • Drive the OpenTelemetry-first instrumentation strategy -- standard libraries, semantic conventions, collector topologies (DaemonSet, gateway, sidecar), and pipeline design.
  • Build and operate telemetry pipelines on Grafana Mimir, Loki, and Tempo, including multi-tenant configurations, retention policies, and capacity planning.
  • Define how we measure reliability: SLOs, SLIs, error budgets, and alerting frameworks -- consistently across all lines of business.
  • Own the integration between observability tooling and incident management (ServiceNow ITOM, xMatters).
  • Drive telemetry schema standards to ensure teams emit data that is useful downstream, not just technically compliant.
  • Build and maintain reference architectures for our hybrid footprint: OpenShift on-prem with KVM/libvirt and Dell PowerFlex storage, plus Azure, AWS, and GCP.
  • Lead standards work around workload identity and federation using SPIFFE/SPIRE and cloud-native IAM patterns.
  • Provide guidance on compute runtime selection -- containers vs. VMs vs. bare metal vs. serverless.
  • Help teams connect autoscaling and capacity planning behavior to actual telemetry signals.
  • Push FinOps maturity forward by integrating cost data into the observability stack, establishing unit economics, and working toward open billing standards like FOCUS.
  • Identify where AI/ML adds practical value in our observability stack.
  • Define observability standards for AI-powered systems (agents, RAG pipelines).
  • Ensure new AI-powered platforms are instrumented correctly from day one.
  • Participate in cross-functional architecture working groups focused on observability and hybrid cloud standards.
  • Publish architecture decision records and reference implementations that teams can actually use.
  • Mentor architects and platform engineers; conduct architecture reviews to raise the bar across the org.
  • Work with security and compliance on HIPAA, SOX, and PCI requirements as they apply to telemetry and cloud infrastructure.
  • Represent CVS Health in vendor evaluations and stay connected to the open-source ecosystem (CNCF, OpenTelemetry, Grafana Labs).

Requirements

  • 10+ years in infrastructure, cloud architecture, platform engineering, or SRE
  • 8+ years of architecture work in observability, cloud infrastructure, or both at a large enterprise
  • Solid experience with at least two of Azure, AWS, or GCP -- including networking, identity, compute, and storage
  • 5+ years with Kubernetes in production (OpenShift, EKS, AKS, or GKE)
  • 5+ years with OpenTelemetry or similar frameworks (collectors, SDKs, semantic conventions, pipeline design)
  • 5+ years with observability platforms: Grafana/Mimir/Loki/Tempo, Prometheus, Datadog, Splunk, Dynatrace, or comparable tools
  • Experience defining SLOs/SLIs and building alerting strategies at an organizational level
  • Proven track record of writing architecture standards that other teams have adopted and followed
  • Able to communicate clearly with both engineers and senior leadership

Benefits

  • Affordable medical plan options
  • 401(k) plan (including matching company contributions)
  • Employee stock purchase plan
  • No-cost programs for all colleagues, including wellness screenings, tobacco cessation and weight management programs, confidential counseling, and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
  • Colleague assistance programs
  • Tuition assistance
  • Retiree medical access
  • Many other benefits depending on eligibility

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

OpenTelemetry, Kubernetes, Grafana Mimir, Loki, Tempo, Azure, AWS, GCP, SLOs, SLIs

Soft Skills

communication, mentoring, leadership, collaboration, guidance