Tech Stack
Ansible, AWS, Cloud, Distributed Systems, Docker, EC2, Kubernetes, Linux, Node.js, Packer, Python, Terraform
About the role
- Lead the design and deployment of production cloud infrastructure in AWS, including EC2, S3, RDS, VPC, IAM, Route 53, ELB, Auto Scaling, EBS, EFS, Lambda, CloudWatch, CloudTrail, Config, and SNS/SQS, as well as multi-tier VPCs and hybrid connectivity.
- Implement infrastructure as code (IaC) using Terraform, Packer, and Ansible to deliver modular, version-controlled automation.
- Deploy and operate Kubernetes (EKS) clusters at scale; configure node groups, autoscaling, network policies, ingress, and secrets management.
- Apply security best practices including encryption, secrets management, patching, logging/monitoring, and FedRAMP/DoD IL4/IL5 compliance for GovCloud.
- Bring proficiency in scripting (PowerShell, Bash, Python) and strong troubleshooting skills.
- Bachelor's degree in a relevant field; AWS GovCloud IL4/IL5 experience is a differentiator.
Requirements
- 5+ years hands-on experience designing, implementing, and supporting secure, scalable, highly available AWS infrastructure
- Deep expertise with core AWS services (EC2, S3, RDS, VPC, IAM, Route 53, ELB, Auto Scaling, EBS, EFS, Lambda, CloudWatch, CloudTrail, Config, SNS/SQS)
- Strong experience with multi-tier VPC architectures, site-to-site VPNs, Transit Gateway, Direct Connect, and hybrid connectivity
- IaC experience with Terraform, Packer, and Ansible; modular, version-controlled automation
- Experience deploying and operating Kubernetes clusters in AWS (EKS) at scale (node groups, autoscaling, network policies, ingress, secrets)
- CI/CD integration in AWS (GitLab CI and FluxCD)
- IAM design and governance (custom roles/policies, cross-account roles, OIDC/SAML SSO, service-linked roles)
- AWS security best practices: encryption (KMS, SSL/TLS), Secrets Manager/Parameter Store, patching, logging/alerting (CloudWatch, GuardDuty, Security Hub), security automation
- GovCloud (US) experience and understanding of FedRAMP High, DoD IL4/IL5, NIST 800-53, and related compliance frameworks
- Proficiency with scripting (PowerShell, Bash, Python)
- Troubleshooting distributed systems (network, performance, container, and application issues)
- Familiarity with monitoring/observability stacks (CloudWatch)
- Cost optimization, tagging, right-sizing
- Bachelor's degree in CS/IS/Engineering or related field