Tech Stack
Tools & technologies: Ansible, AWS, Azure, Cloud, Docker, Go, Grafana, Kubernetes, Prometheus, Puppet, Python, Terraform
About the role
Key responsibilities & impact
- Lead and mentor the SRE team, fostering a culture of reliability, accountability, and continuous improvement.
- Develop and implement strategies for multi-cloud, system reliability, monitoring, and incident response.
- Collaborate with development and other SRE teams to enhance system performance and scalability.
- Drive automation efforts to improve deployment processes, infrastructure as code (IaC), and operational efficiency.
- Manage and optimize observability tools for logging, metrics, and alerting.
- Establish and refine service-level objectives (SLOs), service-level indicators (SLIs), and service-level agreements (SLAs).
- Oversee post-mortem processes and, where applicable, root cause analysis (RCA) to ensure continuous learning and improvement.
- Implement and advocate for best practices in site reliability engineering and DevOps methodologies.
- Ensure high-availability and disaster recovery strategies are in place and regularly tested.
- Drive cost optimization efforts for cloud infrastructure across all cloud providers.
Requirements
What you’ll need
- Bachelor’s or master’s degree in computer science, engineering, or a related field.
- 7+ years of experience in software engineering, site reliability engineering, or DevOps, with at least 3 years in a managerial or leadership role.
- Strong knowledge of cloud platforms (Azure, AWS, IBM Cloud) and containerization technologies (Docker, Kubernetes).
- Proficiency in automation and configuration management tools (Terraform, Ansible, Puppet, The Foreman).
- Experience with monitoring and observability tools (Prometheus, Grafana, PagerDuty, Graylog, etc.).
- Solid programming and scripting skills in Python, Go, Bash, or similar languages.
- Expertise in CI/CD pipelines and modern deployment strategies.
- Strong analytical and problem-solving skills with a focus on performance optimization.
- Excellent communication and leadership abilities, with a track record of fostering collaboration between teams.
Benefits
Comp & perks
- Flexible ways of working
- Continuous learning opportunities
ATS Keywords
Hard Skills & Tools
Software Engineering, DevOps, Programming (Python, Go, Bash), CI/CD Pipelines, Performance Optimization
Soft Skills
Leadership, Communication, Analytical Problem-Solving
