Ada

Staff DevOps Engineer

full-time

Origin: 🇨🇦 Canada

Job Level

Lead

Tech Stack

AWS, Azure, Cloud, Google Cloud Platform, Kubernetes, MongoDB, Postgres, Python, Redis, Spark, Terraform

About the role

  • Act as a trusted technical leader setting direction on architecture, reliability, scalability, and developer experience.
  • Architect, build, and support scalable and highly reliable software systems that power Ada’s platform growth.
  • Lead the design and implementation of resilient cloud infrastructure (multi-region, multi-cloud where appropriate) to ensure uptime, scalability, and operational safety.
  • Continuously analyze and optimize infrastructure for reliability, performance, and cost—removing bottlenecks, modernizing tooling, and streamlining workflows.
  • Support developer tools and processes (CI/CD pipelines, deployment frameworks, environment provisioning) to maximize engineering velocity.
  • Troubleshoot and resolve complex infrastructure issues; participate in and elevate on-call operations and incident response.
  • Implement advanced DevOps practices across infrastructure as code, deployments, monitoring, and platform abstractions.
  • Establish and maintain reliability standards, define and enforce SLOs/SLAs, and ensure observability is built into all systems.
  • Lead cross-cutting initiatives to improve uptime, resiliency, and incident response processes, driving systemic reliability improvements.
  • Create and evolve platform abstractions, patterns, frameworks, and tooling to accelerate developer velocity and reduce operational toil.
  • Coach and mentor senior and mid-level engineers, contribute to engineering excellence, and represent DevOps in executive and cross-functional forums.
  • Outcomes: scalable, reliable, cost-effective infrastructure supporting rapid growth; measurable improvements in uptime, resiliency, and developer velocity; reduced operational toil.

Requirements

  • 8+ years of experience in DevOps, Site Reliability Engineering (SRE), or platform teams, with at least 2 years operating at a Staff/Principal or equivalent senior technical leadership level.
  • Recognized expertise in building and scaling cloud infrastructure (AWS/Azure/GCP), with proven experience designing multi-region, highly available systems.
  • Deep technical knowledge of Kubernetes and container orchestration at scale (hundreds to thousands of nodes), including performance tuning, cost optimization, and failure mode analysis.
  • Strong experience managing and scaling data infrastructure (e.g., MongoDB, PostgreSQL, Redis), with a focus on horizontal scaling, sharding, and performance optimization.
  • Strong background in Infrastructure as Code (e.g., Terraform) and GitOps tooling (e.g., ArgoCD).
  • Proficiency in Python, Bash, or equivalent scripting languages for automation.
  • Experience creating and supporting cloud-based systems at scale (AWS/Azure/GCP), with a strong emphasis on Infrastructure as Code (IaC).
  • Experience with MongoDB and horizontally scaling data stores (i.e., sharding).
  • Experience leading incident response, root cause analysis, and systemic reliability improvements.
  • A track record of technical leadership: driving cross-team initiatives, mentoring engineers, and shaping long-term infrastructure strategy.
  • Excellent communication skills to translate technical complexity into business impact and influence cross-functional stakeholders.
  • Nice to have: Experience with multi-cloud architecture and hybrid deployments.
  • Nice to have: Familiarity with support tooling (PagerDuty, Datadog, Loft, Doppler) at organizational scale.