Staff DevOps Engineer

Ada

full-time

Posted on: 9/3/2025

Location: 🇨🇦 Canada

Job Level

Lead

Tech Stack

AWS, Azure, Cloud, Google Cloud Platform, Kubernetes, MongoDB, Postgres, Python, Redis, Spark, Terraform

About the role

Act as a trusted technical leader setting direction on architecture, reliability, scalability, and developer experience.
Architect, build, and support scalable and highly reliable software systems that power Ada’s platform growth.
Lead the design and implementation of resilient cloud infrastructure (multi-region, multi-cloud where appropriate) to ensure uptime, scalability, and operational safety.
Continuously analyze and optimize infrastructure for reliability, performance, and cost—removing bottlenecks, modernizing tooling, and streamlining workflows.
Support developer tools and processes (CI/CD pipelines, deployment frameworks, environment provisioning) to maximize engineering velocity.
Troubleshoot and resolve complex infrastructure issues; participate in and elevate on-call operations and incident response.
Implement advanced DevOps practices across infrastructure as code, deployments, monitoring, and platform abstractions.
Establish and maintain reliability standards, define and enforce SLOs/SLAs, and ensure observability is built into all systems.
Lead cross-cutting initiatives to improve uptime, resiliency, and incident response processes, driving systemic reliability improvements.
Create and evolve platform abstractions, patterns, frameworks, and tooling to accelerate developer velocity and reduce operational toil.
Coach and mentor senior and mid-level engineers, contribute to engineering excellence, and represent DevOps in executive and cross-functional forums.
Outcomes: scalable, reliable, cost-effective infrastructure supporting rapid growth; measurable improvements in uptime, resiliency, and developer velocity; reduced operational toil.

Requirements

8+ years of experience in DevOps, Site Reliability Engineering (SRE), or platform teams, with at least 2 years operating at a Staff/Principal or equivalent senior technical leadership level.
Recognized expertise in building and scaling cloud infrastructure (AWS/Azure/GCP), with proven experience designing multi-region, highly available systems.
Deep technical knowledge of Kubernetes and container orchestration at scale (hundreds to thousands of nodes), including performance tuning, cost optimization, and failure mode analysis.
Strong experience managing and scaling data infrastructure (e.g., MongoDB, PostgreSQL, Redis), with a focus on horizontal scaling, sharding, and performance optimization.
Strong background in Infrastructure as Code (e.g., Terraform) and GitOps tooling (e.g., ArgoCD).
Proficiency in Python, Bash, or equivalent scripting languages for automation.
Experience creating and supporting cloud-based systems at scale (AWS/Azure/GCP), with a strong emphasis on Infrastructure as Code (IaC).
Experience with MongoDB and horizontally scaling data stores (i.e., sharding).
Experience leading incident response, root cause analysis, and systemic reliability improvements.
A track record of technical leadership: driving cross-team initiatives, mentoring engineers, and shaping long-term infrastructure strategy.
Excellent communication skills to translate technical complexity into business impact and influence cross-functional stakeholders.
Nice to have: Experience with multi-cloud architecture and hybrid deployments.
Nice to have: Familiarity with support tooling (PagerDuty, Datadog, Loft, Doppler) at organizational scale.
