DevOps Engineer

• Act as a technical authority, mentoring senior engineers and guiding design choices to improve service reliability and resilience
• Lead the definition and enforcement of SLIs, SLOs, and error budgets, and drive adherence across engineering teams
• Collaborate with Staff peers and partner with development and product teams to design for failure and operationalize reliability from the start
• Drive company-wide adoption of observability best practices and tooling; ensure metrics, logs, and traces provide deep, actionable insights
• Lead complex incident responses, postmortems, and systemic reliability improvements while promoting a blameless culture
• Lead initiatives in infrastructure as code, deployment automation, and resilience testing; influence chaos engineering and release validation frameworks
• Partner with platform and security teams to ensure production readiness; represent the SRE team in technical leadership forums and product planning

Staff Site Reliability Engineer

Job Level

Tech Stack

About the role

Requirements