Tech Stack
AWS, Cloud, EC2, Grafana, Linux, Node.js, Prometheus, Python, Terraform
About the role
- Architect and deliver AWS infrastructure using Terraform (modular, reusable, versioned IaC) across multiple accounts and environments
- Design, deploy, and harden EKS clusters (networking, autoscaling, security, ingress, and optionally a service mesh) and standardize app deployment patterns
- Implement GitOps with Argo/Argo CD (ApplicationSets, sync policies, RBAC, SSO, secrets management, progressive delivery)
- Build CI/CD pipelines with GitHub Actions (reusable workflows, artifact strategy, rollouts/rollbacks, automated quality gates)
- Define SLI/SLOs and enable observability with metrics, logs, traces, alerting, and runbooks using best-in-class tools
- Drive platform reliability and security: patching, upgrades, backups, disaster recovery, IAM least privilege, secrets, and compliance guardrails
- Optimize performance and cost: autoscaling, right-sizing, spot/on-demand strategies, efficiency dashboards
- Partner with developers to create golden paths, templates, and documentation that accelerate safe delivery
- Participate in on-call, lead incident RCAs, and drive improvements through code and process changes
Requirements
- 5–8+ years building and operating production cloud infrastructure (preferably AWS)
- Strong Terraform expertise: modules, workspaces, state management, CI validation (fmt/validate/tflint), policy as code (Sentinel/OPA)
- GitHub & GitHub Actions: branch protection, environments, reusable workflows, OIDC to AWS, secrets/variables, caching, matrices
- AWS services: VPC, IAM, EKS, ALB/NLB, EC2/EKS node groups, ECR, RDS/ElastiCache, S3, CloudWatch/CloudTrail, KMS, Secrets Manager
- EKS & container orchestration: Helm/Kustomize, rollout strategies, cluster autoscaler, networking (CNI/ingress), HPA/PDBs
- Argo & Argo CD: Applications & ApplicationSets, sync/health checks, drift detection, RBAC, SSO
- Observability tools: metrics/logs/traces (Prometheus/Grafana, OpenTelemetry, Datadog, New Relic, Honeycomb, CloudWatch, ELK/OpenSearch)
- CI/CD strategies: trunk-based, quality gates (tests, SAST/DAST, IaC scans), artifact/versioning, environment promotions, canary/blue-green
- Strong scripting (Bash/Python), configuration tooling (Helm/Kustomize), solid Linux and networking fundamentals
- Clear communicator with a bias for automation, documentation, and collaborative problem-solving
Benefits
- Health insurance
- Professional development
- Flexible work arrangements
- Paid time off
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
AWS, Terraform, GitOps, EKS, CI/CD, Scripting, Observability, Container orchestration, Policy as code, Networking
Soft skills
Clear communicator, Collaborative problem-solving, Bias for automation, Documentation, Incident management