Site Reliability Engineer – SRE
Sonio
Site Reliability Engineer ensuring platform stability and incident response for a leading healthcare AI firm. Collaborates with a global DevOps team on infrastructure and code management in a hybrid setup.
Tech Stack
Tools & technologies
- AWS
- Elixir
- Kubernetes
- Terraform
About the role
Key responsibilities & impact
- Own US coverage for releases and incidents as the first responder during PST hours.
- Bridge infra and code by working hand-in-hand with our DevOps team on Kubernetes, Terraform, and AWS, while being able to read and patch Elixir code to unblock yourself without waiting for a backend engineer.
- Drive incident response end-to-end, managing triage, mitigation, and blameless post-mortems with real follow-through.
- Improve the platform’s operability by defining SLOs, tuning alerts to reduce toil, and pushing observability (metrics, logs, tracing) where it’s lacking.
- Transfer operational knowledge from France to the US by authoring runbooks and documenting procedures so local teams are empowered to act when something breaks.
- Support compliance and security in our regulated medical-device environment, maintaining HIPAA-aligned controls and an audit-ready infrastructure.
Requirements
What you’ll need
- 4+ years of experience in SRE, DevOps, or Production Engineering, including significant on-call experience on a 24/7 product.
- You possess a hybrid "code-literate" mindset, acting as an infrastructure expert who can also navigate a backend codebase to triage and patch issues independently.
- You bring strong technical foundations in Kubernetes, Terraform, and AWS, along with the ability to architect and tune your own observability signals.
- You are highly autonomous and comfortable making technical decisions with limited supervision, which is essential given the timezone difference with France.
- You maintain operational rigor and stay calm under pressure, with the written English skills necessary to produce high-quality runbooks and handle async handoffs.
Benefits
Comp & perks
- Health Insurance (medical, vision, and dental plans) - up to $30,000 per year + FSA & HSA
- 401(k) - up to 4% of your salary matched
- Life Insurance - covering 2 times your salary, up to $200k
- An attractive Parental Policy for primary and secondary caregivers
- 20 PTO days + 1 week off between Christmas and New Year
- Offices in Boston (HQ) & New York (incl. free breakfast, drinks & gym)
- Flexible hours & remote policies
- Commuter Benefits
- One offsite per year in France & regular team building with US team
- Ongoing training and continuous opportunities for professional growth and development, including unlimited access to coaching
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Kubernetes, Terraform, AWS, Elixir, incident response, SLOs, observability, triage, mitigation, post-mortems
Soft Skills
autonomy, technical decision-making, calm under pressure, written communication, documentation, collaboration, problem-solving, knowledge transfer, operational rigor, adaptability