
Tech Operations Lead – Disaster Recovery
Ameriprise Financial Services, LLC
Full-time
Location Type: Hybrid
Location: Noida • India
About the role
- Design, implement, and maintain disaster recovery (DR) plans for the organization's IT infrastructure, ensuring business continuity.
- Assess and analyze business impact, defining recovery objectives (RTO and RPO) and aligning them with organizational goals.
- Regularly test disaster recovery procedures through simulations and mock drills to ensure operational readiness.
- Work with different teams to identify critical systems and services that need to be included in the disaster recovery plan.
- Evaluate DR tools and solutions, focusing on AWS-based services, to ensure a scalable and cost-effective recovery solution.
- Ensure that all IT systems are designed for resiliency, with high availability and fault tolerance.
- Implement and maintain cloud-based disaster recovery strategies using AWS services such as Amazon EC2, S3, RDS, Route 53, and more.
- Collaborate with architecture teams to ensure resiliency and continuity measures are embedded into infrastructure design.
- Oversee and optimize backup strategies, ensuring that systems can be quickly restored with minimal data loss.
- Automate disaster recovery processes and workflows using modern DevOps tools such as AWS CloudFormation, Tidal, Terraform, Ansible, or other automation frameworks.
- Implement Infrastructure as Code (IaC) practices to streamline the provisioning and management of recovery environments.
- Use Sumo Logic, Dynatrace, AWS Lambda, CloudWatch, and other monitoring and automation tools to proactively monitor and respond to system events or failures.
- Maintain clear and up-to-date documentation of disaster recovery plans, runbooks, and processes.
- Provide detailed post-disaster recovery reports, outlining the effectiveness of the recovery process and any lessons learned.
- Report on resiliency metrics, recovery objectives, and automation progress to senior leadership.
- Lead the response during actual disaster recovery events, coordinating with IT and business units to ensure a smooth recovery process.
- Perform post-incident analysis to identify root causes, implement corrective actions, and improve recovery plans.
- Collaborate closely with cross-functional teams including IT operations, security, engineering, and business continuity.
- Provide training and awareness on disaster recovery procedures to staff, helping them understand the importance of disaster recovery and their roles during recovery scenarios.
Requirements
- Proven experience in designing, implementing, and managing disaster recovery plans for both on-premises and cloud-based infrastructure.
- Experience with automation tools such as Tidal, Terraform, AWS CloudFormation, Ansible, or similar.
- Proficiency in scripting languages (Python, Shell, etc.) to automate processes and workflows.
- Excellent verbal and written communication skills for technical and non-technical stakeholders.
- Ability to lead recovery efforts, coordinate between various teams, and communicate effectively during high-pressure situations.
- AWS Certified Cloud Practitioner and AWS Certified Solutions Architect certifications.
Benefits
- Health insurance
- Flexible work arrangements
- Professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
disaster recovery planning, business impact analysis, recovery time objective (RTO), recovery point objective (RPO), cloud-based disaster recovery, Infrastructure as Code (IaC), scripting (Python, Shell), automation processes, backup strategies, post-incident analysis
Soft Skills
communication skills, leadership, collaboration, training and awareness, coordination, problem-solving, attention to detail, adaptability, critical thinking, time management
Certifications
AWS Certified Practitioner, AWS Certified Solutions Architect