Senior Site Reliability Engineer – SRE

QAD

Senior Site Reliability Engineer at Redzone, responsible for the reliability and performance of mission-critical services. The role evolves SRE practices while driving automation and operational excellence within the team.

Posted 6/19/2026 • Full-time • Remote • 🇪🇸 Spain • Senior

Tech Stack

Tools & technologies

Distributed Systems

About the role

Key responsibilities & impact

Drive Operational Excellence: Design, implement, and maintain highly available, scalable, and resilient systems that deliver exceptional customer experience
Datadog Expert: Be one of the go-to experts for Datadog, responsible for defining and implementing best practices
Software Development for Reliability: Develop robust, well-tested, and maintainable software to automate operational tasks
Toil Reduction Champion: Identify and eliminate toil through automation and process improvements
Incident Management & Post-Mortems: Lead blameless post-mortems and contribute to the incident response framework
Reliability Metrics & Goals: Collaborate to define, implement, and track Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets
Infrastructure as Code: Leverage and contribute to infrastructure as code efforts
System Design & Architecture: Provide SRE expertise in system design reviews
Knowledge Sharing & Mentorship: Document processes and share expertise with the team

Requirements

What you’ll need

Demonstrated experience operating and improving production systems at scale in an SRE, Production Engineering, or Platform Engineering role
Proven ability to rapidly build accurate mental models of complex distributed systems across infrastructure, applications, networking, identity, and observability domains
Strong troubleshooting skills with a methodical, evidence-driven approach to incident response and root cause analysis
Experience defining, implementing, and using Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets to guide reliability decisions
Excellent written and verbal communication skills, with the ability to explain complex technical issues clearly to both technical and non-technical audiences

Benefits

Comp & perks

Flexible work arrangements
Professional development opportunities
Continuous improvement culture
Mentorship opportunities

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

software development, automation, incident management, root cause analysis, infrastructure as code, system design, reliability metrics, troubleshooting, process improvements, scalable systems

Soft Skills

communication, mentorship, collaboration, problem-solving, evidence-driven approach