
Site Reliability Engineer
Unikraft
Full-time
Location Type: Remote
Location: Germany
About the role
- Maintain and operate customer on-prem and cloud deployments of our platform, ensuring reliability and rapid troubleshooting of technical issues.
- Plan, package, and roll out software updates both internally and to customers, including testing and validation.
- Collaborate with engineering to ensure quality deployments and maintain a high standard of product reliability.
- Deploy, manage, and troubleshoot Kubernetes clusters for reliable, scalable infrastructure.
- Set up and manage monitoring systems to proactively detect and resolve issues in production environments.
- Write scripts and automation for deployment, infrastructure management, and CI/CD workflows.
- Build tooling and automation to streamline deployment and platform integration.
- Contribute to continuous integration pipelines that catch regressions across components and system integrations.
- Create and maintain clear documentation for systems, processes, and tools to support team effectiveness.
Requirements
- At least 2 years of experience working in high-pressure production environments.
- Proven experience in Linux system administration, software packaging, and delivery.
- Solid understanding of Linux networking fundamentals, including firewalls, DNS, proxies, and best practices.
- Experience managing and troubleshooting Kubernetes clusters in production.
- Good understanding of the CNCF/cloud-native landscape and associated tools.
- Familiarity with observability tools such as Prometheus and Grafana.
- Basic scripting skills (e.g., Bash, Python).
- Familiarity with cloud platforms (e.g., AWS, GCP, Azure).
- Interest in automation tools like Ansible, Terraform, or similar.
- Exposure to CI/CD pipelines (e.g., GitHub Actions, Jenkins, GitLab CI).
- Familiarity with microservice architectures, serverless, and DevOps best practices.
- [BONUS] Familiarity with virtualization solutions like QEMU/KVM; experience with micro-VMMs like Cloud Hypervisor or Firecracker is a big plus.
Benefits
- Competitive salary
- 6 weeks of vacation
- Development opportunities
- Fully Remote, Fully Flexible
Applicant Tracking System Keywords
Hard Skills & Tools
Linux system administration, software packaging, Kubernetes, scripting, CI/CD, cloud platforms, networking fundamentals, automation tools, microservice architectures, virtualization solutions
Soft Skills
collaboration, troubleshooting, documentation, problem-solving, team effectiveness