Own the Quality & Reliability Program: Define and drive the vision for quality across proactive practices (testing, deployment, observability), reactive processes (incident response, external communications), and cultural expectations (quality ownership, readiness).
Lead Cross-Functional Programs: Drive reliability and quality initiatives across Engineering, Product, Operations, and Customer Success.
Production Readiness: Own the Production Readiness Review (PRR) process; ensure all releases meet reliability standards before they go live.
Define and Drive SLOs: Establish and track Service Level Objectives (SLOs). Build visibility into reliability metrics and lead efforts to meet or exceed targets.
Improve Incident Management: Streamline incident response and postmortems. Drive structural improvements in tooling, communication, and ownership.
Scale Tooling & Automation: Collaborate across teams to enhance observability, alerting, testing automation, and response tooling.
Mitigate System Risk: Identify risk vectors early, build mitigation plans, and drive resolution with urgency.
Drive Alignment: Influence Engineering, Product, Operations, and go-to-market (GTM) teams to prioritize reliability and integrate quality into every initiative.
Track Progress: Use tools like Atlas, Jira, and internal dashboards to maintain clarity on goals, risks, and outcomes.
Embed Continuous Learning: Build programs that ensure we learn from every incident, test edge cases, and continuously harden our systems.
Requirements
8+ years of program management experience, with at least 3 years in technical, reliability, or quality-focused domains.
Strong understanding of system architecture, distributed systems, and reliability engineering principles.
Familiarity with SDLC models, CI/CD pipelines, deployment automation, observability, and incident management tooling.
Demonstrated success defining and improving SLOs, SLIs, and production readiness processes.
Proven ability to lead large-scale, cross-functional programs across Engineering, Product, Operations, and Customer Success.
Skilled at translating complex technical goals into clear, actionable, and measurable outcomes.
Experienced in using Atlassian tools (e.g., Jira, Atlas) for program tracking, reporting, and executive communication.
Adept at navigating ambiguity, building alignment, and driving decision-making without formal authority.
Comfortable balancing technical depth with business priorities to influence outcomes.
Bachelor’s degree in Computer Science, Engineering, or related technical field, or equivalent practical experience.
Bonus: Experience in regulated or high-availability industries such as fintech, healthcare, or infrastructure.
Benefits
Base salary per year (paid semi-monthly)
Fast-paced and professional work culture
Stock options with standard startup vesting: 4 years total with a 1-year cliff
$50 monthly communication expense stipend to go towards your phone/internet bill
$250 stipend to enhance your WFH setup
Reimbursement for peripheral equipment: monitor (up to $400), keyboard and mouse (up to $200)
Premium medical benefits including vision and dental (100% coverage for employees)
Company-sponsored life and disability insurance
Paid parental bonding leave
Paid sick, jury duty, and bereavement leave
401(k) plan
Flexible Time Off (our team members typically take off ~3-4 weeks per year)
Volunteer Time Off
13 scheduled holidays
In-person team meet-ups twice a year
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
program management
reliability engineering
system architecture
distributed systems
SLOs
SLIs
production readiness
deployment automation
observability
incident management