Salary
💰 $120,000 - $135,000 per year
Tech Stack
Ansible, Azure, Chef, Cloud, DNS, Grafana, Jenkins, Kubernetes, Puppet, Python, Ruby, Terraform
About the role
- Monitoring systems to maintain high levels of performance through analysis and tuning
- Implementing scalability and fault tolerance into our solutions
- Automating processes to drive operational efficiencies
- Troubleshooting application and middleware issues
- Collaborating with engineering teams to enhance high-throughput production environments
- Developing and maintaining robust deployment pipelines for seamless code delivery
Requirements
- Experience with Microsoft Azure
- Strong expertise in Terraform, App Services, and Kubernetes
- Fluent in written and spoken English
- A strong passion for reliability and operational excellence
- Hands-on experience in creating and modifying Terraform deployments
- Previous operations experience, ideally in a Site Reliability Engineer role
- Ability to collaborate across multiple teams, take ownership of tasks, and prioritize effectively
- Excellent communication and interpersonal skills
- Familiarity with monitoring solutions such as Datadog, Azure Application Insights, or Log Analytics
- Proficiency in scripting/programming for automation, especially in PowerShell (preferred), Bash, C#, Ruby, or Python
- Experience supporting web-based applications
- (Desirable) Experience with Azure DevOps pipelines
- (Desirable) Familiarity with Microsoft Server Operating Systems
- (Desirable) Understanding of service level objectives and operational requirements for cloud-based solutions
- (Desirable) Deep knowledge of Microsoft Azure Cloud offerings, particularly in PaaS (Web Apps, Storage, Functions)
- (Desirable) Experience with tools like Terraform, Ansible, VSTS, ARM, Puppet, Chef, Jenkins, ELK, and Grafana
- (Desirable) Understanding of DNS, Load Balancer configuration, Active Directory, and cloud-based infrastructure
- (Desirable) Experience working in an agile environment, particularly with methodologies such as TDD, Scrum, or Kanban
- (Desirable) Knowledge of implementing monitoring and alerting systems for microservice architectures
- (Desirable) Applied understanding of cloud security best practices