Tech Stack
AWS, Cloud, Cyber Security, DNS, Firewalls, Go, Grafana, Jenkins, Kubernetes, Prometheus, Python, TCP/IP, Terraform
About the role
- Infrastructure Management: Architect, build, and scale AWS infrastructure using Infrastructure as Code (IaC) tools such as Terraform.
- CI/CD & Deployment: Design, implement, and optimize CI/CD pipelines using tools such as GitHub Actions or ArgoCD to streamline deployments and improve release velocity.
- Kubernetes Operations: Manage and optimize Kubernetes-based infrastructure (Amazon EKS) to ensure scalability, reliability, and efficient resource utilization.
- Observability & Incident Response: Build and maintain monitoring, alerting, and logging systems (Prometheus, Grafana, Datadog, Loki) to ensure high availability; participate in the on-call rotation to resolve incidents.
- Security & Compliance: Implement and maintain security controls to meet PCI DSS, HIPAA, GDPR, and SOC 2 standards, and support audit readiness.
- System Architecture: Contribute to designing fault-tolerant architectures with disaster recovery and high-availability strategies, both inside and outside the cardholder data environment (CDE).
- Developer Enablement: Partner with developers to improve deployment workflows, reduce lead time for changes, and provide platform tooling support.
- Documentation & Knowledge Sharing: Create clear runbooks, technical documentation, and knowledge base articles to support team-wide learning and operational excellence.
Requirements
- 3-5 years of experience in SRE, DevOps, or Platform Engineering roles, with at least 2 years in a mid-level or senior capacity.
- Strong hands-on experience with AWS services and IaC tools like Terraform.
- Expertise in Kubernetes operations in production environments (Amazon EKS preferred).
- Proficiency in CI/CD pipeline tools (e.g., GitHub Actions, Jenkins, ArgoCD).
- Strong knowledge of monitoring and observability tooling (Prometheus, Grafana, Datadog, CloudWatch).
- Familiarity with compliance frameworks (PCI DSS, HIPAA, GDPR, SOC 2) and cloud security best practices.
- Excellent problem-solving, troubleshooting, and incident management skills.
- Preferred: Experience supporting developers in platform engineering or internal tooling contexts.
- Familiarity with NIST Cybersecurity Framework (CSF) implementation in SaaS/cloud environments.
- Strong networking fundamentals (TCP/IP, DNS, HTTP, TLS, firewalls).
- Experience with AWS networking services (VPC, Route 53, NAT Gateway, ALB/NLB).
- Background in cost optimization and cloud governance.
- Strong scripting/programming skills (Bash, Python, Go).