CD Operations Engineer
Interval Group
Site Reliability Engineer managing and scaling a production Kubernetes platform for innovative companies, with a focus on automation, CI/CD pipelines, and operational excellence.
Tech Stack
Tools & technologies: Grafana, ITSM, Jenkins, Kubernetes, Prometheus
About the role
Key responsibilities & impact
- Maintain and optimise CI/CD pipelines to ensure deployment readiness and validate all deployment artifacts from an operational perspective.
- Define and enforce quality assurance measures, including standard operating procedures and successful test reporting.
- Implement rollback strategies and comprehensive operational monitoring for all production deployments.
- Manage monitoring, incident, problem, and change management within a multi-tenant managed Kubernetes environment.
- Monitor system health, performance metrics, and service availability, resolving incidents to minimise service disruption.
- Perform root cause analysis and implement corrective and preventive actions to enhance platform stability.
- Automate recurring operational tasks and critical processes to reduce toil and improve service reliability.
- Validate automated procedures through the full software development lifecycle, including staging and testing.
- Implement logging and monitoring strategies to adhere to security and audit compliance standards.
- Conduct routine security scans and remediate vulnerabilities across the platform.
Requirements
What you’ll need
- Professional proficiency in both English and German (C1 level minimum)
- At least 3 years of hands-on operational experience with self-managed Kubernetes clusters and production applications in on-premises environments
- Deep understanding of networking concepts, including protocols, load balancing, and security
- Extensive experience with CI/CD processes and tooling, such as GitLab, Jenkins, Tekton, or ArgoCD
- Fundamental understanding of core operations processes including incident, change, and problem management (ITSM) alongside SRE concepts
- Experience gathering operational insights from monitoring and observability tools, including managing SLIs, SLOs, and SLAs
- Proven ability to document procedures and enforce clear runbooks or playbooks
- Practical experience with monitoring and logging stacks such as Prometheus, Grafana, Mimir, or Loki
Benefits
Comp & perks
- Flexible working hours
- Freedom to choose your own projects
- Access to exciting projects in various industries
- Competitive pay
- Dedicated team support
ATS (Applicant Tracking System) Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
CI/CD pipelines, Kubernetes, root cause analysis, networking concepts, incident management, change management, problem management, monitoring, observability, security compliance
Soft Skills
documentation, communication, problem-solving, analytical thinking, collaboration, attention to detail, proactive mindset, adaptability, organizational skills, critical thinking