Tech Stack
Ansible, AWS, EC2, Kubernetes, Linux, Python, Terraform
About the role
- Build, maintain, and improve CI/CD pipelines using GitLab CI/CD or similar tools
- Automate infrastructure provisioning, deployment, and maintenance using Terraform, Ansible, or related technologies
- Collaborate with developers and QA to create reliable deployment paths from local dev to production
- Implement infrastructure-as-code practices across environments (e.g., AWS, Kubernetes, bare-metal)
- Design and implement monitoring, alerting, and observability systems to maintain high availability and performance
- Respond to incidents, lead root cause analysis, and implement preventive measures
- Establish and evolve SLOs/SLIs to ensure measurable system reliability
- Participate in on-call rotation and build automation to reduce human intervention
- Drive capacity planning, performance tuning, and cost optimization initiatives
- Administer Linux (Ubuntu/Debian) and Windows-based infrastructure
- Manage self-hosted GitLab instances and ensure secure, performant operation
- Implement and enforce security best practices across infrastructure (IAM, RBAC, least privilege)
- Support both containerized and virtualized workloads across environments
Requirements
- 5+ years in DevOps, SRE, or Infrastructure Engineering roles
- Hands-on proficiency with AWS services (EC2, S3, RDS, VPC, IAM, etc.)
- Experience deploying and managing infrastructure using Terraform and/or Ansible
- Proven experience managing and automating GitLab, including CI/CD pipelines
- Proficiency in at least one programming or scripting language (Python, Bash, etc.)
- Solid knowledge of Linux system administration (Ubuntu/Debian)
- Strong Windows system administration skills
- Experience implementing monitoring, logging, and alerting solutions (CloudWatch, Datadog, CloudTrail)
- Solid understanding of networking, security best practices, and high-availability system design
- Familiarity with version control systems (Git) and GitLab workflows
- Strong troubleshooting and incident response skills, with a focus on automation and root cause analysis
- Ability to travel up to 20%
- Applicants must be U.S. citizens who are willing and eligible to obtain a U.S. security clearance at the Secret or Top Secret level (existing clearance preferred)