Senior Site Reliability Engineer, DevOps

Alphatec Spine

full-time

Posted on: 9/20/2025

Origin: 🇺🇸 United States, California

💰 $135,000 - $145,000 per year

Senior

Azure, Cloud, Grafana, Prometheus, Python, Terraform

About the role

Ensure availability, performance, scalability, and operational efficiency of the Informatix cloud platform.
Reduce manual operational toil through automation and engineering solutions.
Serve as a primary contributor to the on-call rotation to maintain 24/7 uptime for production systems.
Proactively monitor and continuously improve SLAs, SLOs, and SLIs across critical services.
Develop and maintain observability tooling including logging, metrics, and tracing (e.g., Azure Monitor, OpenTelemetry, Prometheus).
Conduct postmortems and root cause analysis; implement fixes to prevent repeat incidents.
Design and maintain automated incident detection and response systems; establish runbooks and escalation protocols.
Identify and eliminate manual operational toil through scripting and automation.
Contribute to chaos testing and failure injection to proactively uncover weaknesses.
Promote a culture of operational excellence through data-driven reliability practices.
Proactively communicate status.

Requirements

5+ years of experience in Site Reliability Engineering, systems engineering, or DevOps roles.
Expertise in monitoring and observability platforms (e.g., Grafana, Prometheus, ELK, Azure Monitor).
Solid background in incident response, root cause analysis, and on-call rotations.
Deep knowledge of Microsoft Azure, including containerized services (AKS), networking, and storage.
Strong automation and scripting experience (e.g., Python, Bash, PowerShell).
Familiarity with IaC tools such as Terraform, Bicep, or ARM templates.
Experience implementing SLIs/SLOs, operational dashboards, and error budgets.
Comfortable designing for resiliency, failover, and graceful degradation.
Knowledge of compliance frameworks (e.g., SOC 2, HITRUST, IEC 62304) is a plus.
Strong written and verbal communication with a focus on transparency and learning.
BS/MS in Computer Science, Engineering, or related technical field preferred.
5+ years in production engineering roles with direct ownership of critical systems.
Microsoft certifications are a plus.
For US roles requiring hospital access: must be eligible for and maintain hospital credentials and applicable vaccination requirements.