Salary
💰 $160,000 - $175,000 per year
Tech Stack
Azure, Cloud, Distributed Systems, Flux, Grafana, Kubernetes, Prometheus, Python, SDLC, Terraform
About the role
- Azure Cloud & Kubernetes (AKS): Architect, deploy, and manage highly available Kubernetes clusters in Azure Kubernetes Service (AKS), ensuring scalability, performance, and security for production workloads.
- GitOps Enablement: Design and implement GitOps workflows leveraging GitHub and related tooling to enable declarative, version-controlled infrastructure and application deployments.
- CI/CD Modernization: Build and optimize CI/CD pipelines (GitHub Actions, Azure DevOps, or similar) to accelerate delivery while maintaining quality, security, and compliance.
- Observability & Reliability: Expand and refine observability platforms (Prometheus, Grafana, ELK, Azure Monitor) to deliver actionable metrics, logs, and traces for proactive system health management.
- Infrastructure as Code: Use Terraform, Helm, and related tooling to automate infrastructure provisioning and configuration in Azure.
- Collaboration & Enablement: Partner with engineering, QA, and product teams to embed DevOps best practices into the SDLC, promoting automation, repeatability, and continuous improvement.
- Incident Response: Participate in high-severity incident response, conduct blameless postmortems, and implement preventative measures to strengthen system resilience.
- Mentorship & Technical Leadership: Coach and guide DevOps engineers and developers in cloud-native, GitOps, and automation practices.
Requirements
- 7+ years of DevOps, SRE, or cloud engineering experience, with significant hands-on expertise in Azure and Kubernetes (preferably AKS).
- Strong track record designing and managing cloud-native, containerized workloads in production environments.
- Proven experience implementing CI/CD pipelines using GitHub Actions, Azure DevOps, or similar tools, including automated testing, security scanning, and deployment strategies.
- Practical knowledge of GitOps methodologies using ArgoCD, Flux, or similar tools.
- Proficiency in Infrastructure as Code (Terraform, Bicep, Helm) and configuration management.
- Skilled in observability tooling such as Prometheus, Grafana, OpenTelemetry (OTel), Azure Monitor, and log aggregation platforms.
- Strong scripting skills (Python, Bash, PowerShell) for automation and tooling.
- Familiarity with DevSecOps practices, cloud security principles, and compliance considerations for SaaS platforms.
- SaaS environment experience highly preferred.
- Must be able to consistently work EST hours.