Heidi Health

Site Reliability Engineer – Mid-Senior, Operations-Focused

full-time

Location Type: Hybrid

Location: London, United Kingdom

About the role

  • Participate in on-call and incident response: Respond to production incidents, contribute to service restoration, and support clear communication during incidents. Over time, take increasing responsibility for leading incidents end-to-end.
  • Improve operational reliability: Identify recurring issues and reliability risks, and drive fixes through better alerting, automation, system changes, or process improvements.
  • Own parts of the production environment: Operate and improve Kubernetes clusters, cloud infrastructure, and core platform services, with growing ownership as familiarity increases.
  • Strengthen observability: Improve dashboards, alerts, logs, and traces so issues are detected earlier and diagnosed faster, with a strong focus on actionable signals.
  • Reduce operational toil: Automate repetitive tasks, simplify runbooks, and improve tooling to make on-call and day-to-day operations easier and safer.
  • Support safe change: Improve deployments, rollback mechanisms, and operational readiness to reduce the risk of incidents caused by change.
  • Contribute to operational practices: Write and maintain runbooks, participate in blameless post-mortems, and help improve incident response processes over time.
  • Collaborate closely with engineers: Work with product and feature teams to improve production readiness, service ownership, and reliability expectations.

Requirements

  • 3–6+ years in SRE, DevOps, Platform, or operations-heavy engineering roles.
  • Experience supporting production systems and participating in on-call rotations.
  • Comfortable debugging live systems under pressure.
  • Experience operating cloud infrastructure (AWS preferred).
  • Working knowledge of Kubernetes and containerised workloads.
  • Infrastructure as Code experience (Terraform or similar).
  • Familiarity with monitoring and alerting tools (Datadog, Prometheus, etc.).
  • Scripting or automation experience (Python, Bash, or similar).

Benefits

  • Real product momentum. We’re not trying to generate interest; we’re channeling it.
  • Equity from day one. When Heidi wins, you win. You’ll share directly in the success you help create.
  • Unmatched impact. Play a pivotal role in defining and scaling customer success at a critical growth moment - all while working on a product that delivers tangible value to clinicians and patients every day.
  • Work alongside world-class talent. Join a team of operators and builders who’ve scaled unicorns.
  • Global reach. Help shape our international expansion as we bring Heidi to key international markets.
  • Growth and balance. Enjoy a personal development budget, work from anywhere for a month, dedicated wellness days, and your birthday off to recharge.
  • Flexibility that works. A hybrid environment, with 3 days in the office.