Tech Stack
Ansible, Azure, Cloud, DNS, Firewalls, Kubernetes, Linux, Python, ServiceNow, SQL, Terraform
About the role
- Consult on and contribute to business process improvements for IT policies, procedures, tools, security controls and infrastructure
- Conduct research and Proof of Concept activities and provide recommendations for new technologies, techniques and tools
- Maintain the operational Azure cloud environment and perform infrastructure deployments
- Ensure the production environment is operating securely and efficiently using automation tools
- Work with Infrastructure Engineering to make sure all deployments follow standards, are compliant and secure
- Partner with other cloud team members, application owners and business owners for faster delivery of resources
- Report to the Cloud Platform Team Lead
- Develop, implement and maintain scalable, high-performance AI infrastructure on the Microsoft Azure platform
- Collaborate with business stakeholders to understand their AI requirements and design effective AI solutions
- Provide technical expertise and support in troubleshooting AI-related issues and performance bottlenecks
- Recommend AI-based solutions, and document and communicate them effectively to technical and non-technical stakeholders
- Design and manage scalable and resilient cloud infrastructure and automation tooling for routine operation tasks
- Use code to deploy enterprise-scale cloud resources that can be used across technology organizations
- Be the point of contact for business application owners and build a bridge between them and the cloud team
- Follow security audit requirements and controls in every action
- Suggest and implement strategies for Infrastructure as Code (IaC) and CI/CD, and develop tooling and associated processes for automated, faster, more efficient, consistent and error-free deployments
- Enhance existing deployment processes and Infrastructure as Code scripts to improve automation, efficiency, and consistency
- Participate in team activities such as peer code reviews, design collaboration efforts and the on-call rotation
- Use configuration management and automated deployment tools to provide cloud-based solutions and services ensuring best practices
- Work with solution architects, tech leads and partners to confirm developed solutions align with existing standards, meet security controls and meet cloud computing best practices
- Collaborate with other engineers, provide guidance and training
- Manage implementation and ongoing operation of cloud services in line with approved designs, patterns and security requirements, and certify that all logging and monitoring are active for security and compliance
- Investigate and monitor current-state cloud usage, solutions in use, risks, gaps and limitations, and recommend solutions, optimizations and remediation
- Maintain comprehensive and systematic configuration, lifecycle and security management of servers and services
- Provide international and 24/7 support via on-call duties and/or working a flexible off-hours schedule for planned and unplanned maintenance
Requirements
- 10+ years of experience deploying resources and managing infrastructure through both manual and automated pipelines in an IT infrastructure and operational environment
- 5+ years of experience maintaining resources in cross platform environments
- 5+ years of experience with Azure cloud infrastructure deployments using IaC, preferably Terraform
- 5+ years of experience building, deploying and managing public cloud IaaS and PaaS services
- 5+ years of experience deploying resources using Terraform
- 4+ years of experience with GitHub and GitHub Actions
- 4+ years of experience building automated cloud infrastructure across multiple Production and Non-Production environments
- 4+ years of experience scripting and automating builds, workflows, tasks and other integration aspects
- 4+ years of experience with Azure CLI
- 3+ years of experience with CI/CD pipelines and YAML/YML for GitHub Actions
- 3+ years of experience troubleshooting in an Azure Cloud environment
- 3+ years of experience designing and managing cloud solutions
- Python programming knowledge, particularly with Azure AI, is highly preferred
- Good knowledge of Azure AI and exposure to Artificial Intelligence technologies, frameworks, or solutions
- Experience with Azure IaaS and PaaS services (Virtual Machines, Storage, Networking, App Services, Azure SQL, Azure Functions, Azure Data Factory)
- Knowledge of identity management, security controls, compliance and encryption
- Knowledge of Azure Kubernetes Service (AKS) and other container platforms
- Experience with Linux and Microsoft Windows server environments
- Experience using Jira, Confluence, or ServiceNow preferred
- Knowledge of Octopus Deploy or Ansible preferred
- Experience using PowerShell
- Solid understanding of DNS, LAN, WAN, Firewalls, File systems, IAM, etc.
- Ability to help build and drive culture in Azure Cloud environments and DevOps
- Ability to plan work of self and others and resolve technical problems
- Bachelor's degree in MIS, computer science, or a related field; advanced degree in a related field