Lead, coach and develop a high‑performing SRE team, fostering autonomy, inclusion and continuous improvement.
Partner with Product Owners and Engineering Leads to embed reliability into roadmaps, backlogs and delivery decisions.
Apply SRE principles (SLIs, SLOs, error budgets) to ensure our services remain highly reliable, performant and scalable.
Drive improvements in observability—across metrics, logs, traces and events—ensuring services are observable by design.
Use Dynatrace as the primary observability platform for meaningful dashboards and customer‑centric alerting.
Own Infrastructure‑as‑Code and CI/CD‑based environments, implementing enhancements and responding to operational change.
Lead coordination of incident response and root cause analysis, supporting teams through major incidents, post‑incident reviews and prevention of recurrence.
Collaborate with multi‑disciplinary engineering teams to remove technical impediments, reduce toil and improve service operability.
Contribute hands‑on engineering where needed, validating technical decisions and guiding best practice.
Bring curiosity, experimentation and first‑principles thinking to evolve our engineering culture.

Requirements

Proven experience applying SRE practices within Azure, GCP, or both.
Strong understanding of SLIs, SLOs, error budgets, and how to use these to guide product and engineering decisions.
Experience ensuring reliability of production services, including availability, performance and recoverability.
Hands‑on or leadership experience in incident and problem management, focused on reducing MTTR and avoiding repeat issues.
Background in software engineering or cloud engineering, with a good understanding of modern SDLC practices.
Practical experience with DevOps, CI/CD and automation to improve service reliability.
Experience improving observability in complex, distributed systems.
Ability to use data to influence prioritisation and balance reliability with feature delivery.
Collaboration and communication skills, working effectively with product, engineering and platform teams.
Experience mentoring engineers and promoting an inclusive, supportive team culture.

Benefits


Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

SRE principles, SLIs, SLOs, error budgets, Infrastructure-as-Code, CI/CD, DevOps, observability, cloud engineering, software engineering

Soft Skills

leadership, coaching, collaboration, communication, curiosity, experimentation, problem management, mentoring, inclusivity, continuous improvement