
Senior Site Reliability Engineer
BetterUp
full-time
Location Type: Hybrid
Location: California, Illinois, New York, Texas, Virginia • 🇺🇸 United States
Salary
💰 $164,000 - $205,000 per year
Job Level
Senior
Tech Stack
AWS, Cloud, Distributed Systems, Kubernetes, Prometheus, Terraform
About the role
- Leverage AI-powered tools and automation to transform how we monitor, troubleshoot, and maintain production systems
- Build and operate cloud infrastructure on AWS, using Terraform to codify and version-control our entire environment
- Manage and scale Kubernetes clusters that power BetterUp's platform, ensuring high availability and performance
- Design intelligent alerting and observability systems
- Collaborate with engineering teams to embed reliability into the development lifecycle, shifting left on operational concerns
- Automate incident response workflows and build self-healing infrastructure
- Experiment with and adopt emerging AI tools for log analysis, anomaly detection, and predictive maintenance
- Drive continuous improvement through data-driven retrospectives and reliability metrics
Requirements
- 4+ years of experience in SRE or infrastructure roles
- Genuine excitement about AI tooling: you're already using copilots, AI assistants, or LLM-based tools in your workflow and are excited to push your skillset further in this area
- Deep experience with AWS
- Hands-on Kubernetes experience: deploying, scaling, debugging, and securing clusters
- Strong Terraform skills with experience managing complex, multi-environment infrastructure
- Familiarity with modern observability stacks (Datadog, Prometheus, OpenTelemetry)
- Strong debugging instincts and comfort navigating distributed systems
- Clear communication skills - you can explain a production incident to engineers and executives alike
- A builder's mindset: you see manual processes as opportunities for automation
Benefits
- Access to BetterUp coaching: one coach for you and one for a friend or family member
- A competitive compensation plan with opportunity for advancement
- Medical, dental, and vision insurance
- Flexible paid time off
- All federal/statutory holidays observed
- 4 BetterUp Inner Workdays
- 5 Volunteer Days to give back
- Learning and Development stipend
- Company-wide Summer & Winter breaks
- Year-round charitable contribution of your choice on behalf of BetterUp
- 401(k) self-contribution
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
AWS, Terraform, Kubernetes, AI tools, log analysis, anomaly detection, predictive maintenance, observability, incident response automation, data-driven retrospectives
Soft skills
clear communication, collaboration, builder's mindset, debugging instincts, problem-solving