Salary
💰 $75,000 - $85,000 per year
Tech Stack
Ansible, Azure, Chef, Cloud, DNS, Grafana, Jenkins, Kubernetes, Puppet, Python, Ruby, Terraform
About the role
- Maintaining high levels of system performance through monitoring and performance tuning
- Implementing scalability and fault tolerance
- Automating processes and improving operational efficiencies
- Troubleshooting application and middleware challenges
- Collaborating with engineering teams to support high-throughput production environments
- Building and maintaining robust deployment pipelines
Requirements
- Proficiency with Microsoft Azure
- Strong expertise in Terraform, App Services, and Kubernetes
- Fluent in both written and spoken English
- A genuine passion for system reliability
- Experience in creating and modifying Terraform deployments
- Prior experience in an operations role, ideally as a Site Reliability Engineer
- Ability to work cross-functionally, take ownership of tasks, and prioritize effectively
- Excellent communication and collaboration skills
- Experience with monitoring solutions (e.g., Datadog, Azure Application Insights, Log Analytics)
- Programming/scripting skills for automation (PowerShell preferred; comfort with Bash, C#, Ruby, or Python is also valued)
- Experience with web-based applications
- Familiarity with Azure DevOps pipelines
- Experience with Microsoft Server Operating Systems
- Understanding of service level objectives and operational requirements for cloud-based solutions
- Comprehensive knowledge of Microsoft Azure Cloud offerings (especially in PaaS)
- Experience with tools such as Terraform, Ansible, VSTS, ARM, Puppet, Chef, Jenkins, ELK, and Grafana
- Understanding of DNS, Load Balancer configuration, Active Directory, and network infrastructure in the cloud
- Experience in agile environments and with practices such as TDD, Scrum, or Kanban
- Knowledge of monitoring and alerting systems for microservice architectures
- Applied knowledge of cloud security best practices