Staff Site Reliability Engineer – d/f/m

Site Reliability Engineer designing and maintaining infrastructure automation and monitoring solutions for Personio's HR platform. Collaborating across teams to improve system reliability and scalability.

Posted 4/16/2026 • Full-time • Munich • 🇩🇪 Germany • Lead

Tech Stack

Tools & technologies
AWS, Distributed Systems, Docker, Java, JavaScript, Kafka, Kotlin, Kubernetes, Node.js, Python, TypeScript

About the role

Key responsibilities & impact
  • Engage in and improve the full service lifecycle from initial design through deployment, operation, and continuous improvement.
  • Prepare services for production by engaging in system design reviews, developing shared frameworks and platforms, planning capacity and conducting launch assessments.
  • Operate, monitor, and maintain live services, designing observability stacks and dashboards to track key metrics and improve operational insight.
  • Ensure sustainable scalability through automation, driving continuous evolution to increase reliability and delivery speed.
  • Collaborate with product and engineering teams to define SLOs and error budgets and to ensure services are reliable, scalable and observable.
  • Lead incident management processes, including on-call rotations, managing outages, driving post-mortems and conducting root cause analysis.
  • Identify and reduce toil through process automation, creating playbooks and automated runbooks to reduce MTTR.
  • Define resilience strategies and implement chaos testing to proactively uncover weaknesses and validate recovery strategies.
  • Mentor, train and grow the community. Guide engineers across teams in reliability best practices and tooling.

Requirements

What you’ll need
  • Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
  • 8+ years of experience with SaaS software development in distributed systems using languages such as Kotlin/Java, TypeScript, and Python, and technologies such as IaC, Docker, and Kubernetes.
  • 2+ years’ experience in an SRE or similar role designing, operating, analyzing and troubleshooting distributed systems in agile environments.
  • Strong knowledge of modern application and infrastructure monitoring concepts (Datadog and/or AWS experience advantageous).
  • Systematic problem solving and debugging skills with a strong sense of ownership and bias towards establishing mechanisms which can scale across the entire company.
  • Excellent written, verbal, and documentation skills.
  • Collaborative team player, able to communicate effectively across disciplines.
  • Experience with CI/CD tooling (GitHub Actions/GitOps tools) (Nice to Have/Bonus).
  • Experience tuning JVM-based services and Node.js runtimes (Nice to Have/Bonus).
  • Experience with event-driven architectures (Kafka, SNS/SQS) (Nice to Have/Bonus).

Benefits

Comp & perks
  • Receive a competitive reward package – reevaluated each year – that includes salary, benefits, and pre-IPO equity.
  • Enjoy 28 days of paid vacation, plus an additional day after 2 and 4 years.
  • Make an impact on the environment and society with 1 (fully paid) Impact Day.
  • Receive generous family leave, child support, mental health support, and sabbatical opportunities.
  • We enjoy gathering for meals, cultural initiatives, and events like local Summer Sessions and year-end celebrations. There are also healthy snacks, drinks, and a weekly catered lunch.

ATS Keywords

Applicant Tracking System Keywords

Hard Skills & Tools
Kotlin, Java, TypeScript, Python, Infrastructure as Code (IaC), Docker, Kubernetes, CI/CD, JVM tuning, Node.js
Soft Skills
systematic problem solving, debugging, ownership, collaboration, communication, mentoring, training, documentation, leadership, team player
Certifications
Bachelor’s degree in Computer Science, related field, equivalent practical experience