
Senior Cloud DevOps Engineer
Digital Zone
Full-time
Location Type: Remote
Location: Egypt
About the role
- Design, architect, and implement scalable, secure, and cost-efficient AWS infrastructure across multiple regions, aligned with the AWS Well-Architected Framework.
- Build, maintain, and evolve all cloud infrastructure using Terraform, enforcing module reusability, remote state management, and IaC best practices across environments.
- Own and optimize PostgreSQL databases, including their performance, scalability, and reliability.
- Deploy, manage, and optimize workloads on Kubernetes clusters using Helm and Kustomize.
- Design, implement, and maintain CI/CD pipelines using GitHub Actions.
- Lead DR planning, runbook creation, and failure scenario modeling — including database backup and recovery strategies.
- Support and unblock engineering teams on infrastructure and database needs.
- Operate and improve monitoring, logging, and alerting systems — including database-specific monitoring (query performance, replication health, connection saturation) — to ensure high availability and fast incident response.
- Participate in and help improve the weekly on-call rotation.
Requirements
- 6–8+ years of professional experience in Cloud Engineering, DevOps, or SRE roles, with a proven track record operating highly scalable, high-availability systems in production.
- Deep, hands-on experience with AWS core services (EKS, ECS, EC2, VPC, IAM, RDS, Amazon Aurora, S3, Route 53, CloudFront, ALB/NLB, etc.) in real production workloads.
- Expert-level proficiency with Terraform, including module design, remote state management, and multi-environment/multi-region setups.
- Strong PostgreSQL expertise in production, including query and index performance tuning, sharding strategies (e.g., application-level sharding or partitioning), replication setup and management (streaming and logical), connection pooling (PgBouncer), vacuum tuning, and planning and executing major version upgrades with minimal downtime.
- Experience managing large-scale PostgreSQL databases (hundreds of GBs to TBs) under high-traffic workloads, with a solid understanding of how schema design, indexing, and partitioning decisions affect performance at scale.
- Strong production experience operating and optimizing Kubernetes clusters (deployments, scaling, RBAC, networking, security policies, cluster upgrades).
- Proven experience designing and maintaining CI/CD pipelines using GitHub Actions.
- Solid experience with GitOps principles and tools; hands-on experience with ArgoCD is strongly preferred.
- Strong understanding of networking fundamentals (DNS, VPC peering, Transit Gateway, VPN, load balancing) and cloud security best practices.
- Experience with logging, monitoring, and alerting stacks (e.g., ELK, EFK, LGTM, CloudWatch) across multiple environments, including database-specific monitoring.
- Proficiency in Bash and Python for automation and tooling.
- Strong Git workflow knowledge, including branching strategies and code review practices.
- Experience designing and implementing multi-region architectures with failover and DR strategies.
Benefits
- Flexible work arrangements
- Professional development opportunities
Applicant Tracking System Keywords
Hard Skills & Tools
AWS, Terraform, PostgreSQL, Kubernetes, CI/CD, GitHub Actions, Bash, Python, GitOps, networking fundamentals
Soft Skills
leadership, problem-solving, collaboration, communication, planning, incident response, optimization, monitoring, scalability, reliability