Site Reliability Engineer

Supabase

As a Site Reliability Engineer at Supabase, you will enhance reliability practices across engineering teams and collaborate on observability and operational readiness for millions of Postgres instances.

Posted 6/19/2026 • Full-time • Remote • 🌎 Anywhere in the World • Senior • Lead

Tech Stack

Tools & technologies
AWS, Cloud, Postgres, Terraform

About the role

Key responsibilities & impact
  • Partner with service teams to define meaningful SLIs and SLOs grounded in customer experience, and build the error budget policies that turn them into engineering decisions
  • Own and evolve the Operational Readiness Review (ORR) process — conducting reviews for new services and major changes across observability, alerting, runbooks, capacity, and graceful degradation
  • Strengthen the incident-to-improvement pipeline: connecting postmortem findings to operational readiness gaps, identifying repeat failure patterns, and driving systemic fixes
  • Act as the reliability expert teams pull in for architecture reviews, failure mode analysis, dependency mapping, and resilience design
  • Identify and quantify operational toil across the org, and build or advocate for automation that eliminates it
  • Help teams design sustainable on-call practices: alert quality, escalation paths, runbook coverage, and noise reduction
  • Track and report on org-wide operational maturity, surfacing systemic gaps and driving remediation

Requirements

What you’ll need
  • Have 7+ years of experience in SRE, production engineering, or reliability-focused roles, including experience shaping SRE practices and driving adoption across engineering teams
  • Have a software engineering mindset — you write code and build tools, not just configure them
  • Have hands-on experience defining and operationalizing SLOs/SLIs at scale, including error budget policies that actually influenced engineering decisions
  • Have deep experience with incident response, postmortem facilitation, and turning incident learnings into systemic improvements
  • Have worked with large-scale multi-tenant systems (bonus: managed database platforms or Postgres)
  • Are proficient with cloud infrastructure (AWS preferred) and infrastructure-as-code (Pulumi preferred, Terraform/CDK also acceptable)
  • Communicate clearly and persuasively — this role requires influencing without authority across a distributed org
  • Have experience in async or globally distributed teams
  • Are energized by making other teams more effective rather than being the one who fixes everything

Benefits

Comp & perks
  • Fully Remote
  • ESOP
  • Tech Allowance
  • Health Benefits
  • Annual Off-Sites
  • Flexible Work
  • Professional Development

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
SRE, production engineering, reliability engineering, SLOs, SLIs, incident response, postmortem facilitation, infrastructure-as-code, automation, error budget policies
Soft Skills
communication, influencing without authority, collaboration, problem-solving, organizational skills, leadership, adaptability, persuasiveness, team effectiveness, critical thinking