Matillion

Site Reliability Engineer

full-time

Location Type: Hybrid

Location: Manchester, United Kingdom

Salary

💰 £49,600 - £74,400 per year

About the role

  • Engineering Reliability: Designing and implementing self-healing infrastructure using Kubernetes to maintain high uptime and system integrity.
  • Scaling Cloud Ecosystems: Optimizing our cloud footprint (AWS/GCP/Azure) to ensure our platforms can handle rapid growth without breaking a sweat.
  • Innovating with AI: Proactively identifying opportunities to integrate AI tools into our observability stack to automate incident detection and root-cause analysis.
  • Eliminating Toil: Writing clean, efficient code to automate repetitive operational tasks, turning manual workflows into seamless "set and forget" processes.
  • Defining Observability: Building advanced monitoring and alerting frameworks that provide deep insights into system health and performance.

Requirements

  • Kubernetes Power User: Extensive experience managing production-grade K8s environments, including ingress, service mesh, and container security.
  • Cloud Infrastructure Expert: A deep understanding of cloud networking, storage, and compute services within a major provider (AWS, Azure, or GCP).
  • Proactive Mindset: An engineer who doesn't wait for a ticket; you naturally seek out system weaknesses and build solutions to strengthen them.
  • AI Curiosity: An active interest in the AI landscape and a desire to leverage LLMs or machine learning to improve SRE workflows.
  • Programming Literacy: Ideally, experience with at least one programming language (such as Java, Python, Go, or Ruby) to bridge the gap between software engineering and operations.

Benefits

  • Company Equity
  • 30 days holiday + bank holidays
  • 5 days paid volunteering leave
  • Health insurance
  • Life insurance
  • Pension
  • Access to mental health support

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Kubernetes, AWS, GCP, Azure, AI tools, incident detection, root-cause analysis, programming (Java, Python, Go, Ruby), monitoring frameworks, alerting frameworks
Soft Skills
proactive mindset, problem-solving, curiosity