Anaplan

Senior Site Reliability Engineer

full-time

Origin: 🇬🇧 United Kingdom

Job Level

Senior

Tech Stack

Apache, AWS, Azure, Distributed Systems, Google Cloud Platform, Kubernetes, Linux, Pulsar, Python

About the role

  • Support production platforms and participate in on-call rotation
  • Work closely with service teams to continuously improve reliability, scalability, and performance of systems
  • Develop automation solutions and drive improvements in automation, observability, and reliability practices
  • Troubleshoot and resolve production incidents and contribute to sustainable long-term solutions
  • Mentor and support other SRE team members and provide expert guidance to development teams
  • Lead complex changes, influence operating strategy, and identify and deliver impactful SRE-led projects
  • Partner with infrastructure teams to evolve and strengthen the platform

Requirements

  • 6+ years of experience in SRE or equivalent operationally focussed engineering roles
  • Linux administration experience, needed as a day-one skill
  • Experience of operating live, production-grade Kubernetes environments
  • Expertise in problem diagnosis across complex, distributed systems
  • Proficiency in a scripting language suited to automation (e.g., Python, Bash)
  • Experience with Git version control and modern CI/CD and DevOps practices
  • Willingness to participate in the on-call rotation
  • Hands-on experience with one or more public clouds (AWS, GCP, Azure) (desirable)
  • Experience with event streaming, exception management, and integration technologies such as Apache Pulsar (desirable)
  • Experience with stream-processing and batch-processing frameworks such as Apache Flink (desirable)
  • Experience with configuration management and infrastructure as code (desirable)
  • Knowledge of observability and monitoring best practices (desirable)
  • Prior experience mentoring or coaching other engineers (desirable)