AlphaSense

Cloud Reliability Engineer – Recovery

The Cloud Reliability & Recovery Engineer designs and improves AWS business continuity planning (BCP) and disaster recovery (DR) capabilities at AlphaSense, a market intelligence company. The role collaborates across teams to keep systems resilient and able to recover from disruptions.

Posted 5/5/2026 • Full-time • Remote • 🇮🇳 India • Mid-Level, Senior

Tech Stack

Tools & technologies
AWS, Cloud, DNS, DynamoDB, EC2, Kubernetes, Python, Terraform

About the role

Key responsibilities & impact
  • Design and implement multi-region, multi-AZ AWS architectures that meet RTO/RPO targets
  • Engineer active-active and active-passive failover patterns using Route 53, Global Accelerator, and CloudFront
  • Build automated DR runbooks and playbooks using AWS Systems Manager Automation and Step Functions
  • Implement chaos engineering practices using AWS Fault Injection Simulator (FIS) to validate resiliency
  • Architect cross-region replication strategies for S3, DynamoDB Global Tables, RDS, and Aurora Global Database
  • Review containerized workloads running on Kubernetes, ensuring resilience through self-healing, auto-scaling, and multi-cluster or multi-region deployments
  • Administer AWS Backup across all services (EC2, EBS, RDS, EFS, FSx, DynamoDB, Aurora) with policy-based automation
  • Design immutable backup vaults and cross-account/cross-region backup replication pipelines
  • Develop and automate data recovery testing procedures, ensuring integrity and meeting defined SLAs
  • Implement point-in-time recovery (PITR) for databases and storage; validate via regular restore drills
  • Maintain Business Continuity Plans (BCP) and Disaster Recovery (DR) strategies, including tracking RTO (Recovery Time Objective) and RPO (Recovery Point Objective)
  • Author and maintain Terraform/CloudFormation templates for all BCP/DR infrastructure components
  • Automate DR testing pipelines through CI/CD (CodePipeline, CodeBuild, GitHub Actions)
  • Write Python/Bash/PowerShell scripts to orchestrate failover, failback, and health-check workflows
  • Manage infrastructure state in AWS Control Tower and implement Landing Zone DR patterns
  • Build CloudWatch dashboards, alarms, and composite alarms for availability and DR-readiness indicators
  • Integrate AWS Health and Personal Health Dashboard events into PagerDuty/OpsGenie alerting workflows
  • Participate in on-call rotations and lead DR incident response; conduct post-incident reviews (PIRs)
  • Develop and maintain runbooks for AWS service degradations, regional outages, and data corruption events
  • Conduct regular BCP/DR tabletop exercises and full failover simulations to validate recovery procedures and improve organizational readiness; document results and action items
  • Ensure DR controls meet SOC 2, ISO 22301, NIST 800-53, and HIPAA/PCI requirements as applicable
  • Maintain current and accurate DR documentation: BIAs, BCPs, DRP runbooks, and recovery evidence
  • Collaborate with audit and compliance teams to provide DR evidence and remediation tracking

Requirements

What you’ll need
  • 5+ years in cloud infrastructure, SRE, or IT disaster recovery engineering roles
  • 3+ years of hands-on AWS experience in production environments at scale
  • Proven delivery of multi-region DR architectures with defined and tested RTO/RPO targets
  • Expert-level proficiency with core AWS resilience services
  • Strong scripting skills: Python, Bash, or PowerShell for automation and orchestration
  • Experience with Infrastructure as Code: Terraform and/or AWS CloudFormation
  • Solid understanding of networking fundamentals: VPC, TGW, Direct Connect, VPN, DNS failover
  • Excellent written and verbal communication; able to produce executive-level DR reports

Benefits

Comp & perks
  • Competitive salary
  • Remote work options

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
AWS, Terraform, CloudFormation, Python, Bash, PowerShell, Route 53, Global Accelerator, CloudFront, Kubernetes
Soft Skills
communication, leadership, organizational, collaboration, incident response, documentation