CVS Health

Senior Site Reliability Engineer – Metrics and Observability

full-time

Location Type: Remote

Location: Louisiana, Mississippi, United States

Salary

💰 $83,430 - $203,940 per year

About the role

  • Define, implement, and maintain key performance metrics, SLOs, and SLIs to measure system reliability and performance
  • Manage error budgets effectively, collaborating with development teams to balance reliability and feature delivery
  • Design and implement comprehensive monitoring solutions to provide real-time visibility into system health
  • Develop and implement automated quality gates that ensure all releases meet defined reliability and performance standards
  • Assist in incident response efforts by providing insights from metrics and monitoring tools
  • Drive initiatives to enhance monitoring, observability, and reliability practices

Requirements

  • 5+ years of experience in Site Reliability Engineering and/or DevOps
  • 3+ years of experience defining and implementing metrics, SLOs, and SLIs
  • 2+ years of experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack)
  • 2+ years of experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Docker, Kubernetes)

Benefits

  • Affordable medical plan options
  • 401(k) plan (including matching company contributions)
  • Employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation, and weight management programs
  • Confidential counseling and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
  • Colleague assistance programs
  • Tuition assistance
  • Retiree medical access

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Site Reliability Engineering, DevOps, metrics, SLOs, SLIs, monitoring tools, observability tools, cloud platforms, container orchestration

Soft Skills

collaboration, incident response, initiative driving