Site Reliability Engineer – SRE

Sonio

Site Reliability Engineer ensuring platform stability and incident response for a leading healthcare AI firm. Collaborates with a global DevOps team on infrastructure and code management in a hybrid setup.

Posted 5/1/2026 | Full-time | 🇺🇸 United States | Mid-Level, Senior | 💰 $165,000 - $190,000 per year

Tech Stack

Tools & technologies

AWS, Elixir, Kubernetes, Terraform

About the role

Key responsibilities & impact

Own US coverage for releases and incidents as the first responder during PST hours.
Bridge infra and code by working hand-in-hand with our DevOps team on Kubernetes, Terraform, and AWS, while being able to read and patch Elixir code to unblock yourself without waiting for a backend engineer.
Drive incident response end-to-end, managing triage, mitigation, and blameless post-mortems with real follow-through.
Improve the platform’s operability by defining SLOs, tuning alerts to reduce toil, and pushing observability (metrics, logs, tracing) where it’s lacking.
Transfer operational knowledge from France to the US by authoring runbooks and documenting procedures so local teams are empowered to act when something breaks.
Support compliance and security in our regulated medical-device environment, maintaining HIPAA-aligned controls and an audit-ready infrastructure.

Requirements

What you’ll need

4+ years of experience in SRE, DevOps, or Production Engineering, including significant on-call experience on a 24/7 product.
You possess a hybrid "code-literate" mindset, acting as an infrastructure expert who can also navigate a backend codebase to triage and patch issues independently.
You bring strong technical foundations in Kubernetes, Terraform, and AWS, along with the ability to architect and tune your own observability signals.
You are highly autonomous and comfortable making technical decisions with limited supervision, which is essential given the timezone difference with France.
You maintain operational rigor and stay calm under pressure, with the written English skills necessary to produce high-quality runbooks and handle async handoffs.

Benefits

Comp & perks

Health Insurance (medical, vision, dental) - up to $30,000 per year + FSA & HSA
401(k) - up to 4% of your salary matched
Life Insurance - covering 2 times your salary, up to $200k
An attractive Parental Policy for primary and secondary caregivers
20 PTO days + 1 week off between Christmas and New Year
Offices in Boston (HQ) & New York (incl. free breakfast, drinks & gym)
Flexible hours & remote policies
Commuter Benefits
One offsite per year in France & regular team building with US team
Ongoing training and continuous opportunities for professional growth and development, including unlimited access to coaching

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Kubernetes, Terraform, AWS, Elixir, incident response, SLOs, observability, triage, mitigation, post-mortems

Soft Skills

autonomy, technical decision-making, calm under pressure, written communication, documentation, collaboration, problem-solving, knowledge transfer, operational rigor, adaptability