Blink Health

Senior Cloud Resilience Architect

Posted 5/6/2026 • Full-time • Remote • New York • 🇺🇸 United States • Senior

Tech Stack

Tools & technologies
Ansible, AWS, Azure, Cloud, DNS, Google Cloud Platform, Kubernetes, Terraform

About the role

Key responsibilities & impact
  • Evaluate and mature the organization’s disaster recovery posture, including recovery objectives (RTO/RPO), dependency mapping, and failure domain analysis across applications, data, and infrastructure.
  • Define, document, and establish disaster recovery standards and best practices across cloud infrastructure, platforms, and application architectures.
  • Partner with SRE, platform, security, and product engineering teams to design and implement resilient, fault-tolerant systems, progressing from backup-based recovery to multi-region and active-active architectures.
  • Lead the disaster recovery roadmap, balancing technical feasibility, cost, risk, and business priorities.
  • Design and recommend reference architectures for disaster recovery patterns, including pilot-light, warm standby, hot standby, and active-active.
  • Drive adoption of active-active disaster recovery for critical systems, including traffic management, data replication, consistency models, and automated failover.
  • Define and operationalize testing strategies for DR, including game days, chaos testing, and regular recovery exercises.
  • Establish clear documentation, runbooks, and escalation paths to ensure recoverability is well understood and not dependent on individuals.
  • Evaluate and recommend platform upgrades, cloud services, and tooling that improve resilience, recovery speed, and reliability.
  • Serve as a technical authority and advisor on disaster recovery and resilience for leadership and engineering teams.
  • Provide architectural guidance, design reviews, and mentorship to engineers implementing DR-related changes.
  • Partner with security and compliance teams to ensure DR strategies meet regulatory, audit, and data protection requirements.

Requirements

What you’ll need
  • Bachelor’s or Master’s degree in Computer Science or equivalent practical experience.
  • 8+ years of experience in cloud infrastructure, platform engineering, SRE, or reliability-focused architecture roles.
  • Deep understanding of disaster recovery concepts including RTO/RPO, blast radius reduction, failure domains, and dependency isolation.
  • Proven experience designing and implementing multi-region and multi-availability zone architectures.
  • Hands-on experience moving systems toward active-active or highly available architectures.
  • Strong grasp of data replication strategies, consistency tradeoffs, and recovery patterns for databases and stateful systems.
  • Extensive experience with major cloud providers (AWS preferred, GCP/Azure acceptable).
  • Strong understanding of managed cloud services and their DR characteristics and limitations.
  • Experience with Kubernetes-based platforms, including regional failover, workload portability, and cluster recovery strategies.
  • Familiarity with global traffic management, DNS, load balancing, and service mesh patterns.
  • Experience designing and maintaining Infrastructure as Code using tools such as Terraform, Pulumi, CloudFormation, or Ansible.
  • Strong focus on automation for recovery workflows, failover testing, and environment provisioning.
  • Ability to eliminate manual recovery steps and reduce time-to-recovery through software.
  • Experience defining and running DR tests, game days, and failure simulations.
  • Comfortable working across organizational boundaries to influence priorities and standards.
  • Strong documentation and communication skills, with the ability to translate complex technical risk into business impact.

Benefits

Comp & perks
  • Health insurance
  • Remote work flexibility
  • Professional development
  • Paid time off

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
disaster recovery, RTO, RPO, dependency mapping, failure domain analysis, data replication, consistency models, Infrastructure as Code, Kubernetes, cloud architecture
Soft Skills
leadership, communication, documentation, mentorship, influence, collaboration, organizational skills, technical authority, problem-solving, strategic thinking