Ada Health

Site Reliability Engineer

full-time

Origin: 🇩🇪 Germany

Job Level

Mid-Level, Senior

Tech Stack

AWS, Azure, Cloud, Docker, Google Cloud Platform, Java, JavaScript, Kotlin, Kubernetes, MongoDB, Node.js, Postgres, Terraform

About the role

  • Help develop and maintain our observability stack, and define and work on our SLOs.
  • Automate operational processes through infrastructure as code and CI/CD pipelines to improve efficiency and reduce manual toil.
  • Support our development teams in setting up their monitoring and establishing best practices, and assist with infrastructure and traffic-management tasks.
  • Participate in on-call rotations, troubleshooting production issues, and conducting post-incident reviews to identify root causes.
  • Develop internal tools and guidelines for our delivery teams to establish a well-lit path for them to follow.
  • Mentor and support teammates in best practices for observability, automation, and infrastructure management.

Requirements

  • 3+ years' experience in a similar SRE position, with a background in software development and a strong bias towards quality and operational excellence.
  • Experience in developing, deploying and monitoring production applications used by real users.
  • Background in working with scalable software architectures and modern software tools as well as a strong interest in infrastructure and cloud technologies.
  • A good understanding of cloud infrastructure and Infrastructure as Code, as well as basic knowledge of agile development (Scrum or Kanban).
  • Familiarity with Cloudflare, Honeycomb, Sumo Logic, Docker, Kubernetes, GCP, Azure, AWS, Terraform, Java/Kotlin, Node.js, MongoDB, and PostgreSQL.
  • Experience in proactively engaging with development teams to guide and support them in a hands-on capacity.
  • Willingness to pair, learn, teach, share, communicate and document things every day, and to participate in the on-call rotations.