
Operations SME
Interval Group
contract
Location Type: Remote
Location: Remote • 🇩🇪 Germany
Job Level
Mid-Level, Senior
Tech Stack
Grafana, ITSM, Kubernetes, Prometheus
About the role
- Consult on CI/CD and operational readiness
- Validate deployment artifacts from a critical operations perspective
- Define and enforce quality assurance measures for product quality
- Ensure robust rollback strategies and comprehensive operational monitoring for all production deployments
- Monitor system health, performance metrics, and service availability across multi-tenant environments
- Identify, analyze, and resolve incidents swiftly to minimize service disruption
- Trigger root cause analysis and oversee the implementation of corrective and preventive actions
- Reduce operational toil and enhance service reliability by addressing recurring operational issues through automation
- Validate all automated procedures following established software development lifecycles
- Implement monitoring and logging strategies to meet audit and compliance requirements
- Perform routine security scans and actively remediate identified vulnerabilities
- Ensure operational documentation is accurate and up to date by providing feedback on and improvements to existing runbooks
Requirements
- At least 5 years of operational experience
- Experience operating self-managed Kubernetes clusters
- Experience running self-managed services that provide Kubernetes clusters and production applications/systems in on-premises environments
- Deep understanding and expertise in networking concepts, including protocols, load balancing, and security
- Profound knowledge and implementation experience with CI/CD processes, tooling, concepts, and associated quality/security assurance
- Fundamental understanding of core operations processes (Incident, Change, Problem Management, IT Service Management) and SRE concepts
- Experience in gathering operational insights from monitoring or observability, including SLI/SLA/SLO management and tracking
- Hands-on experience with documenting procedures properly and enforcing clear runbooks or playbooks
- Hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog)
- Proficiency in spoken and written English (at least C1)
Benefits
- Flexible working hours
- Freedom to choose projects
- Access to exciting projects in various industries
- Competitive pay
- Dedicated team support
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
CI/CD, Kubernetes, networking concepts, load balancing, security, quality assurance, Incident Management, Change Management, Problem Management, SRE concepts
Soft skills
self-managed, analytical skills, problem-solving, communication, documentation