Tech Stack
AWS, Azure, Cloud, Docker, Google Cloud Platform, Java, JavaScript, Kotlin, Kubernetes, MongoDB, Node.js, Postgres, Terraform
About the role
- Help develop and maintain our observability stack, and define and work towards our SLOs.
- Automate operational processes through infrastructure as code and CI/CD pipelines to improve efficiency and reduce manual toil.
- Support our development teams in setting up monitoring and establishing best practices, and assist with infrastructure and traffic-management tasks.
- Participate in on-call rotations, troubleshoot production issues, and conduct post-incident reviews to identify root causes.
- Develop internal tools and guidelines for our delivery teams to establish a well-lit path for them to follow.
- Mentor and support teammates in best practices for observability, automation, and infrastructure management.
Requirements
- 3+ years' experience in a similar SRE position, with a background in software development and a strong bias towards quality and operational excellence.
- Experience in developing, deploying and monitoring production applications used by real users.
- Background in working with scalable software architectures and modern software tools as well as a strong interest in infrastructure and cloud technologies.
- A good understanding of cloud infrastructure and infrastructure as code, as well as basic knowledge of agile development (Scrum or Kanban).
- Familiarity with Cloudflare, Honeycomb, Sumo Logic, Docker, Kubernetes, GCP, Azure, AWS, Terraform, Java/Kotlin, Node.js, MongoDB, and PostgreSQL.
- Experience in proactively engaging with development teams to guide and support them in a hands-on capacity.
- Willingness to pair, learn, teach, share, communicate and document things every day, and to participate in on-call rotations.