CVS Health

Staff Engineer – SRE, Retail & Pharmacy

full-time

Location Type: Remote

Location: Massachusetts, Texas, United States

Salary

💰 $118,450 - $284,280 per year

About the role

  • Implement and maintain comprehensive observability solutions, providing real-time insights into system performance and overall health
  • Investigate and resolve incidents quickly during critical situations and perform root cause analysis
  • Collaborate with cross-functional teams to build robust monitoring, alerting, and telemetry solutions
  • Design and implement observability solutions tailored for edge computing environments
  • Define and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and business KPIs
  • Build and optimize dashboards, visualizations, and alerting systems
  • Implement distributed tracing and log aggregation systems
  • Collaborate with engineering teams to ensure applications and infrastructure at edge locations are designed with observability in mind
  • Drive proactive identification of issues in edge facilities through advanced observability tools
  • Lead incident postmortems and implement observability-driven improvements
  • Develop and maintain tools, scripts, and automation to enhance observability pipelines
  • Evaluate and integrate industry-standard observability tools
  • Optimize observability data storage, retention, and querying
  • Mentor and guide junior SREs and engineers on observability best practices
  • Partner with solution, engineering, and business teams to align observability efforts with business objectives
  • Stay current with emerging observability trends, tools, and methodologies
  • Contribute to the development of observability standards, runbooks, and documentation
  • Drive cost optimization for observability infrastructure while maintaining high-quality monitoring

Requirements

  • 8+ years of experience in SRE, DevOps, or related technology roles
  • 5+ years of experience delivering software in large-scale environments, applying reliability and resilience concepts (multi-region, multi-cloud, containerization, etc.)
  • 5+ years of experience with observability and monitoring tools such as Splunk, Dynatrace, Datadog, Prometheus, Grafana, etc.
  • 3+ years of experience with programming/scripting languages (e.g., Python, Java) for automation and tooling in distributed environments
  • 3+ years of experience with cloud technologies (AWS, Microsoft Azure, Google Cloud)
  • 3+ years of experience with source control and continuous integration tools such as Git/Stash, Bitbucket, or Jenkins
  • 2+ years of engineering team leadership or management experience
  • Experience using customer feedback tools such as Quantum Metric, Medallia, and Adobe Analytics
  • Deep understanding of microservices architecture and cloud-native technologies
  • Experience in configuring, supporting, and managing Rancher, Kubernetes, and/or Docker
  • Experience in Incident Management, Change Management, Infrastructure Support, and Problem Management concepts and processes
  • Excellent interpersonal and communication skills, including the ability to engage technical and non-technical stakeholders

Benefits

  • Affordable medical plan options
  • 401(k) plan (including matching company contributions)
  • Employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation, and weight management programs
  • Confidential counseling and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
  • Colleague assistance programs
  • Tuition assistance
  • Retiree medical access
  • Many other benefits depending on eligibility

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

observability solutions, root cause analysis, monitoring tools, alerting systems, distributed tracing, log aggregation, automation, programming languages, cloud technologies, microservices architecture

Soft Skills

interpersonal skills, communication skills, leadership, collaboration, mentoring, problem-solving, proactive identification, documentation, stakeholder engagement, incident postmortems