Tech Stack
AWS, Azure, Cloud, DNS, Docker, Java, Kubernetes, Python, SQL, Terraform
About the role
- Architect, build, and maintain secure, scalable, and highly available cloud infrastructure in Azure, with some support for AWS.
- Lead the design and implementation of Infrastructure as Code (IaC) using Bicep or Terraform.
- Champion DevSecOps practices and CI/CD pipeline improvements across teams.
- Integrate security and compliance into infrastructure and deployment workflows from the ground up.
- Own observability and reliability strategies, applying SRE principles such as incident response, SLOs, and postmortems.
- Mentor and advise other engineers and peers, influencing architectural decisions.
- Own delivery of key DevOps initiatives from planning through execution, collaborating across engineering, security, and product teams to align infrastructure with business goals.
- Participate in Agile ceremonies and influence release planning and architectural decisions.
- Evaluate and implement new tools and technologies to enhance system performance, reliability, and developer experience.
- Define and track KPIs and SLOs to measure system health and team effectiveness.
- Lead incident response efforts and conduct blameless postmortems to drive continuous improvement.
- Partner with security and compliance teams to ensure infrastructure meets regulatory and organizational standards.
- Contribute to the development of internal documentation, runbooks, and knowledge-sharing initiatives.
- Participate in on-call rotations and provide Level 3 support for production systems.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- At least 8 years of hands-on experience in a DevOps, Cloud Engineering, or Site Reliability Engineering role.
- Deep expertise in Microsoft Azure; working knowledge of AWS is a plus.
- Strong experience with web-based environments and SQL or cloud-native databases.
- Advanced proficiency with IaC tools such as Bicep or Terraform.
- In-depth knowledge of Azure services including Azure AD, networking, DNS, load balancing, and VPNs.
- Experience with containerization and orchestration (e.g., Docker, Kubernetes).
- Familiarity with automated testing, security scanning, and compliance tools.
- Proficient in scripting or programming languages (e.g., PowerShell, Python, Java).
- Strong understanding of Git workflows and version control best practices.
- Hands-on experience with Azure DevOps or similar CI/CD platforms.
- Proven track record of applying SRE and DevOps principles to improve system reliability and team efficiency.
- Excellent communication skills, with the ability to tailor messaging to engineering and business audiences.
- Strategic thinking, with the ability to assess risk and design scalable solutions.
- Demonstrated ability to lead and deliver complex initiatives involving multiple stakeholders.
- Proven track record of influencing decisions and mentoring engineers.
- Comfortable working autonomously in ambiguous situations and driving clarity through experimentation and iteration.