tombola

Site Reliability Engineer, SRE

The Site Reliability Engineer ensures that critical systems are reliable and perform well, and is responsible for automation, monitoring, incident response, and collaboration with development teams.

Posted 4/14/2026 • Full-time • Sunderland, 🇬🇧 United Kingdom • Mid-Level / Senior

Tech Stack

Tools & technologies
  • AWS
  • Cloud
  • Terraform

About the role

Key responsibilities & impact
  • Ensure critical systems are always reliable, available, and performing
  • Implement smart automation, monitoring, and incident response strategies
  • Lead incident management and root cause analysis
  • Set up and maintain monitoring and alerting systems
  • Optimize resource usage for scalability and performance
  • Collaborate with development teams to ensure new features are reliable
  • Document infrastructure and procedures

Requirements

What you’ll need
  • Experienced SRE with a passion for building reliable, scalable, and efficient systems
  • Strong knowledge of systems reliability and availability
  • Proficient with monitoring systems such as Dynatrace
  • Familiar with incident management processes
  • Experience in automation with tools like Terraform, Git, and TeamCity
  • Skilled in performance optimization and capacity planning
  • Knowledgeable about AWS cloud resources and disaster recovery plans
  • Strong understanding of security best practices and compliance
  • Excellent documentation skills
  • Continuous improvement mindset

Benefits

Comp & perks
  • Flexible work arrangements
  • Professional development opportunities

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
  • systems reliability
  • availability
  • automation
  • performance optimization
  • capacity planning
  • incident management
  • root cause analysis
  • monitoring systems
  • disaster recovery
  • security best practices
Soft Skills
  • collaboration
  • documentation
  • continuous improvement mindset