Infrastructure as Code (IaC) Development: Design, implement, and maintain infrastructure as code, primarily with Terraform and also with tools such as CloudFormation or Ansible, to automate the provisioning and management of cloud-based resources.
Cloud Platform Expertise: Develop and maintain infrastructure solutions on one or more major cloud platforms (AWS, Azure, GCP), including compute, storage, networking, and security services.
Configuration Management: Utilize configuration management tools (e.g., Ansible, Puppet, Chef) to automate the configuration of servers, applications, and services.
CI/CD Pipeline Integration: Integrate infrastructure automation processes into CI/CD pipelines to enable automated deployments and infrastructure changes.
Monitoring and Logging: Implement monitoring and logging solutions to track infrastructure performance and health, ensuring reliability and uptime.
Automation Scripting: Develop custom automation scripts using scripting languages (e.g., Python, Bash) to streamline infrastructure-related tasks and workflows.
Linux Administration: Manage and maintain Linux-based servers, including system administration, package management, and performance tuning.
Containerization and Orchestration: Design and implement solutions using containerization technologies (Docker) and orchestration tools (Kubernetes) to improve application deployments and scalability.
Serverless Technologies: Develop and manage serverless solutions, particularly using AWS Lambda, and integrate them into our infrastructure and workflows.
Kubernetes Ecosystem and CNCF Landscape: Stay informed about and work with technologies in the Kubernetes ecosystem and the broader Cloud Native Computing Foundation (CNCF) landscape (e.g., service meshes and operators).
AWS Security and Networking: Implement and maintain secure infrastructure and network configurations within AWS environments, ensuring compliance with security policies and best practices.
Security Best Practices: Implement security best practices in all aspects of infrastructure automation, including access controls, vulnerability management, and compliance.
Performance Optimization: Identify opportunities for infrastructure optimization to improve performance, reduce costs, and enhance scalability.
Application Development with Python: Develop simple tools and applications using Python to enhance our automation efforts and address specific operational needs.
Collaboration: Collaborate closely with development teams, security teams, and other stakeholders to understand their needs and deliver effective solutions.
Documentation: Create and maintain clear, comprehensive documentation for all infrastructure automation processes, scripts, configurations, and serverless deployments.
Troubleshooting and Incident Response: Troubleshoot infrastructure-related issues, participate in incident response, and implement corrective actions.
Continuous Improvement: Continuously identify opportunities to improve infrastructure automation processes and adopt best practices.
Stay Up to Date: Stay informed about new technologies and trends in infrastructure automation, Linux administration, containerization, serverless technologies, Python development, Kubernetes and the broader CNCF landscape, and cloud security and networking.
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
5+ years of experience as a DevOps Engineer, with a specialization in infrastructure automation.
Deep understanding and hands-on expertise with Terraform for Infrastructure as Code (IaC).
Proven experience with one or more major cloud platforms (AWS, Azure, GCP), including core services and best practices.
Proficiency in configuration management tools (Ansible, Puppet, Chef).
Extensive scripting skills in Python and Bash, with proven experience in developing custom automation solutions.
Experience developing simple tools and applications with Python.
Experience with CI/CD pipelines and their integration with infrastructure automation.
Solid understanding of Linux administration, including command-line tools, package management, and performance tuning.
Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK, CloudWatch, Azure Monitor, Google Cloud Monitoring).
Solid understanding of networking concepts (e.g., VPCs, subnets, routing, DNS), and AWS networking specifically.
Experience implementing security best practices in infrastructure automation.
Strong understanding of containerization technologies (Docker) and container orchestration tools (Kubernetes).
Hands-on experience with serverless technologies, particularly AWS Lambda.
Deep understanding of the Kubernetes ecosystem, including various tools, extensions, and the broader CNCF landscape.
Strong working knowledge of AWS security best practices and AWS networking concepts.
Excellent problem-solving, analytical, and troubleshooting skills.
Strong communication and collaboration skills.
Ability to work independently and in a team environment.
Additional Experience Desired:
Experience with serverless technologies on other cloud platforms (e.g., Azure Functions, Google Cloud Functions).
Experience with infrastructure performance analysis and optimization.
Experience with database infrastructure automation.
Knowledge of database technologies (e.g., PostgreSQL, MySQL, MongoDB).